https://github.com/rom1504/tensorflow_captcha_solver
Captcha solver based on https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710
- Host: GitHub
- URL: https://github.com/rom1504/tensorflow_captcha_solver
- Owner: rom1504
- Created: 2018-10-13T20:18:49.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-11-09T23:08:25.000Z (over 7 years ago)
- Last Synced: 2025-02-08T10:33:27.410Z (over 1 year ago)
- Topics: captcha-solving, deep-learning, preprocessing, tensorflow, vision
- Language: Python
- Homepage:
- Size: 11 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Before you get started
This code is based on the article at https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710
### New work compared to article
Unlike the article, the model is trained on fully synthetic data, generated with all the fonts present on the system.
It generalizes to data not seen in the training set.
For example, it fully works on the data from the original article without having been trained
at all on data generated by the article's captcha system.
### Installation
To run these scripts, you need the following installed:
1. Python 3
2. The Python libraries listed in requirements.txt, which can be installed in a virtualenv:
- `virtualenv --python=/usr/bin/python3 venv`
- `source venv/bin/activate`
- `pip3 install -r requirements.txt`
You may regenerate the list of acceptable fonts by running `python3 generate_acceptable_fontlist.py`.
Installing `msttcorefonts` provides more fonts and hence better results
(that was useful in my case to distinguish O and 0).
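Since training relies on the fonts installed on the machine, it can help to check which font files are actually available. The sketch below is not the repository's `generate_acceptable_fontlist.py` (its contents are not shown here); `list_font_files` and the default directories are hypothetical names chosen for illustration, and the directories assume a typical Linux layout.

```python
import os
from pathlib import Path

# Assumed search locations for a typical Linux system; adjust as needed.
DEFAULT_FONT_DIRS = [
    Path("/usr/share/fonts"),
    Path("/usr/local/share/fonts"),
    Path(os.path.expanduser("~/.fonts")),
]


def list_font_files(font_dirs=DEFAULT_FONT_DIRS, extensions=(".ttf", ".otf")):
    """Return a sorted list of font file paths found under font_dirs."""
    found = set()
    for directory in font_dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # skip locations that do not exist on this machine
        for path in directory.rglob("*"):
            # Compare suffixes case-insensitively (e.g. "Arial.TTF").
            if path.is_file() and path.suffix.lower() in extensions:
                found.add(path)
    return sorted(found)


if __name__ == "__main__":
    fonts = list_font_files()
    print(f"{len(fonts)} font files found")
```

Running it before and after installing `msttcorefonts` gives a quick indication of how many additional fonts became available.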
Run `pipeline.sh`, or run the following steps one by one:
### Step 0: Generate images
python3 generate_image.py
### Step 1: Extract single letters from CAPTCHA images
Run:
python3 extract_single_letters_from_captchas.py
The results will be stored in the "extracted_letter_images" folder.
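To illustrate the idea of cutting a CAPTCHA into single letters, here is a deliberately simplified sketch. It is not the method used by `extract_single_letters_from_captchas.py`: it assumes the image has already been thresholded into rows of 0/1 pixels and splits it wherever a column contains no ink, which only works when letters do not touch. `split_columns` and `crop_letters` are hypothetical helper names.

```python
def split_columns(binary_image):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    if not binary_image:
        return []
    width = len(binary_image[0])
    # A column has ink if any row has a non-zero pixel in it.
    has_ink = [any(row[x] for row in binary_image) for x in range(width)]
    ranges = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a new letter begins here
        elif not ink and start is not None:
            ranges.append((start, x))  # the letter ended at the previous column
            start = None
    if start is not None:
        ranges.append((start, width))  # letter touching the right edge
    return ranges


def crop_letters(binary_image):
    """Cut the image into one sub-image per ink-containing column range."""
    return [
        [row[start:end] for row in binary_image]
        for start, end in split_columns(binary_image)
    ]


image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0],
]
print(split_columns(image))  # [(1, 3), (5, 7)]
```

Each cropped letter would then be saved as its own training image, grouped by the character it represents.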
### Step 2: Train the neural network to recognize single letters
Run:
python3 train_model.py
This will write out "captcha_model.hdf5" and "model_labels.dat".
### Step 3: Use the model to solve CAPTCHAs!
Run:
python3 solve_captchas_with_model.py
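Steps 0 to 3 can also be chained from a small Python driver. This is only a sketch: the script names come from the steps above, but the assumption that `pipeline.sh` runs exactly these scripts in this order is mine, and `pipeline_commands` and `run_pipeline` are hypothetical names.

```python
import subprocess

# Script names taken from Steps 0 to 3; the order is assumed to match pipeline.sh.
PIPELINE_SCRIPTS = [
    "generate_image.py",
    "extract_single_letters_from_captchas.py",
    "train_model.py",
    "solve_captchas_with_model.py",
]


def pipeline_commands(python="python3"):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(dry_run=False):
    """Run every step in order, or only print the commands if dry_run is set."""
    for command in pipeline_commands():
        print("Running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Stopping at the first failure matters here because each step consumes the output of the previous one.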
### Step 4: Use the Flask endpoint
Run:
python3 flask_endpoint.py
Then go to `http://127.0.0.1:5000/?url=http://somecaptcha.url/`,
or to `http://127.0.0.1:5000/?url=http://somecaptcha.url/&show_image=1` to see the image too.
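When calling the endpoint from a script, it is safer to percent-encode the captcha URL, because a captcha URL with its own query string (containing `&` or `=`) would otherwise be mixed up with the endpoint's parameters. The sketch below only builds the request URL; `build_solver_url` is a hypothetical helper, and the format of the endpoint's response is not assumed here.

```python
from urllib.parse import urlencode


def build_solver_url(captcha_url, show_image=False, base="http://127.0.0.1:5000/"):
    """Build the endpoint URL with the captcha URL safely percent-encoded."""
    params = {"url": captcha_url}
    if show_image:
        params["show_image"] = "1"
    return base + "?" + urlencode(params)


print(build_solver_url("http://somecaptcha.url/"))
# http://127.0.0.1:5000/?url=http%3A%2F%2Fsomecaptcha.url%2F
print(build_solver_url("http://somecaptcha.url/", show_image=True))
# http://127.0.0.1:5000/?url=http%3A%2F%2Fsomecaptcha.url%2F&show_image=1
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the query string is decoded back to the original captcha URL on the server side.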