{"id":13583960,"url":"https://github.com/JackonYang/captcha-tensorflow","last_synced_at":"2025-04-06T21:33:42.032Z","repository":{"id":16995416,"uuid":"80978083","full_name":"JackonYang/captcha-tensorflow","owner":"JackonYang","description":"Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+","archived":false,"fork":false,"pushed_at":"2023-03-01T00:27:38.000Z","size":5683,"stargazers_count":997,"open_issues_count":4,"forks_count":272,"subscribers_count":35,"default_branch":"master","last_synced_at":"2024-11-06T00:39:40.388Z","etag":null,"topics":["captcha","captcha-breaking","captcha-generator","captcha-recognition","captcha-solver","captcha-solving","cnn-model","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JackonYang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-02-05T07:55:52.000Z","updated_at":"2024-11-02T13:55:53.000Z","dependencies_parsed_at":"2023-01-11T20:00:54.566Z","dependency_job_id":"1f368309-8ce9-4a76-88f2-7f575ad10ab9","html_url":"https://github.com/JackonYang/captcha-tensorflow","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JackonYang%2Fcaptcha-tensorflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JackonYang%2Fcaptcha-tensorflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JackonYang%2Fcaptcha-tensorflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jacko
nYang%2Fcaptcha-tensorflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JackonYang","download_url":"https://codeload.github.com/JackonYang/captcha-tensorflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247556921,"owners_count":20958034,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captcha","captcha-breaking","captcha-generator","captcha-recognition","captcha-solver","captcha-solving","cnn-model","tensorflow"],"created_at":"2024-08-01T15:03:55.607Z","updated_at":"2025-04-06T21:33:37.022Z","avatar_url":"https://github.com/JackonYang.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","Łamanie","Choose project ideas from below 👇"],"sub_categories":["Ogólne","Deep Learning Projects 🌀"],"readme":"# Captcha Solving Using TensorFlow\n\n\n## Introduction\n\n1. Solve captcha using TensorFlow.\n2. 
Learn CNN and TensorFlow through a practical project.\n\nFollow the steps,\nrun the code,\nand it works!\n\nthe accuracy of the 4-digit version can be as high as 99.8%!\n\nThere are several more steps to put this prototype into production.\n\n**Ping me for paid technical support**.\n\n[i@jackon.me](mailto:i@jackon.me)\n\n\n## Table of Contents\n\n- Solve Captcha Using CNN Model\n\n  - Training: 4-digits Captcha\n  - Training: 4-letters Captcha\n  - Inference: load trained model and predict given images\n\n- Generate DataSet for Training\n\n  - Usage\n  - Example 1: 4 chars per captcha, use digits only\n  - Example 2: sampling random images\n\n## Solve Captcha Using CNN Model\n\n\nold code that uses TensorFlow 1.x has been moved to [tensorflow_v1](tensorflow_v1).\n\n\n#### Training: 4-digits Captcha\n\nthis is a perfect project for beginners.\n\nwe will train a model of ~90% accuracy in 1 minute using a single GPU card (GTX 1080 or above).\n\nif we increase the dataset by 10x, the accuracy increases to 98.8%.\nwe can further increase the accuracy to 99.8% using 1M training images.\n\nhere is the source code and running logs: [captcha-solver-tf2-4digits-AlexNet-98.8.ipynb](captcha-solver-tf2-4digits-AlexNet-98.8.ipynb)\n\nImages, Ground Truth and Predicted Values:\n\nthere is 1 prediction error out of the 20 examples below. 
9871 -\u003e 9821\n\n![](img-doc/result-preview-4digits.png)\n\nAccuracy and Loss History:\n\n![](img-doc/history-4digits.png)\n\nModel Structure:\n\n- 3 convolutional layers, each followed by a 2x2 max pooling layer.\n- 1 flatten layer\n- 2 dense layers\n\n![](img-doc/model-structure-alexnet-for-4digits.png)\n\n\n#### Training: 4-letters Captcha\n\nthis is a more practical project.\n\nthe code is the same as the 4-digits version, but the training dataset is much bigger.\n\nit takes 2-3 hours to generate the training dataset and 30 min to train a model with 95% accuracy.\n\nhere is the source code and running logs: [captcha-solver-tf2-4letters-AlexNet.ipynb](captcha-solver-tf2-4letters-AlexNet.ipynb)\n\n\n#### Inference: load trained model and predict given images\n\nexample: [captcha-solver-model-restore.ipynb](captcha-solver-model-restore.ipynb)\n\n\n## Generate DataSet for Training\n\n#### Usage\n\n```bash\n$ python datasets/gen_captcha.py  -h\nusage: gen_captcha.py [-h] [-n N] [-c C] [-t T] [-d] [-l] [-u] [--npi NPI] [--data_dir DATA_DIR]\n\noptional arguments:\n  -h, --help           show this help message and exit\n  -n N                 epoch number of character permutations.\n  -c C                 max count of images to generate. 
default unlimited\n  -t T                 ratio of test dataset.\n  -d, --digit          use digits in dataset.\n  -l, --lower          use lowercase in dataset.\n  -u, --upper          use uppercase in dataset.\n  --npi NPI            number of characters per image.\n  --data_dir DATA_DIR  where data will be saved.\n```\n\nexamples:\n\n![](img-doc/data-set-example.png)\n\n#### Example 1: 4 chars per captcha, use digits only\n\n1 epoch has `10*9*8*7=5040` images; generate 6 epochs for training.\n\ngenerating the dataset:\n\n```bash\n$ python datasets/gen_captcha.py -d --npi=4 -n 6\n10 choices: 0123456789\ngenerating 6 epoches of captchas in ./images/char-4-epoch-6/train\ngenerating 1 epoches of captchas in ./images/char-4-epoch-6/test\nwrite meta info in ./images/char-4-epoch-6/meta.json\n```\n\npreview the dataset:\n\n```bash\n$ python datasets/base.py images/char-4-epoch-6/\n========== Meta Info ==========\nnum_per_image: 4\nlabel_choices: 0123456789\nheight: 100\nwidth: 120\nn_epoch: 6\nlabel_size: 10\n==============================\ntrain images: (30240, 100, 120), labels: (30240, 40)\ntest images: (5040, 100, 120), labels: (5040, 40)\n```\n\n#### Example 2: sampling random images\n\nscenario: use digits and uppercase letters, 4 chars per captcha image.\n\n1 epoch will have `36*35*34*33=1.4M` images. the dataset is too big for debugging.\n\nuse the `-c 10000` param to sample 10k *random* images.\n\ngenerating the dataset:\n\n```bash\n$ python3 datasets/gen_captcha.py -du --npi 4 -n 1 -c 10000\n36 choices: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\ngenerating 1 epoches of captchas in ./images/char-4-epoch-1/train.\nonly 10000 records used in epoche 1. 
epoche_count: 1413720\n```\n\n\n## Running Jupyter in docker\n\ntensorflow image: [https://hub.docker.com/r/jackon/tensorflow-2.1-gpu](https://hub.docker.com/r/jackon/tensorflow-2.1-gpu)\n\n```bash\ndocker pull jackon/tensorflow-2.1-gpu\n# check if gpu works in docker container\ndocker run --rm --gpus all -t jackon/tensorflow-2.1-gpu /usr/bin/nvidia-smi\n# start jupyter server in docker container\ndocker run --rm --gpus all -p 8899:8899 -v $(realpath .):/tf/notebooks -t jackon/tensorflow-2.1-gpu\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJackonYang%2Fcaptcha-tensorflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJackonYang%2Fcaptcha-tensorflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJackonYang%2Fcaptcha-tensorflow/lists"}