https://github.com/rom1504/tensorflow_captcha_solver

Captcha solver based on https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710


Host: GitHub
URL: https://github.com/rom1504/tensorflow_captcha_solver
Owner: rom1504
Created: 2018-10-13T20:18:49.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2018-11-09T23:08:25.000Z (over 7 years ago)
Last Synced: 2025-02-08T10:33:27.410Z (over 1 year ago)
Topics: captcha-solving, deep-learning, preprocessing, tensorflow, vision
Language: Python
Homepage:
Size: 11 MB
Stars: 4
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

### Before you get started

This code is based on https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710

### New work compared to article

Unlike the article, the model is trained on fully synthetic data, generated with all fonts present on the system.

It generalizes to new data not seen in the training set.
For example, it fully works on the data of the initial article without having been trained
at all on data generated by the captcha system from the article.
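
As a rough illustration of the idea, here is a minimal sketch of how a synthetic training set could be planned so that every font is covered. It is not the code of `generate_image.py`: the alphabet, the captcha length, and the helper names `find_fonts` and `plan_samples` are assumptions made for this example, and the actual rendering of each string with its font is left to an imaging library.

```python
import random
import string
from pathlib import Path

# Assumption: alphabet and captcha length are illustrative values,
# not taken from the repository.
ALPHABET = string.ascii_uppercase + string.digits
CAPTCHA_LENGTH = 4


def find_fonts(font_dirs):
    """Return sorted paths of the TrueType fonts found under the given directories."""
    fonts = []
    for directory in font_dirs:
        root = Path(directory)
        if root.is_dir():
            fonts.extend(str(path) for path in root.rglob("*.ttf"))
    return sorted(fonts)


def plan_samples(fonts, samples_per_font, seed=0):
    """Pair random captcha strings with every font so each typeface is covered."""
    rng = random.Random(seed)
    plan = []
    for font in fonts:
        for _ in range(samples_per_font):
            text = "".join(rng.choice(ALPHABET) for _ in range(CAPTCHA_LENGTH))
            plan.append((text, font))
    return plan
```

Each `(text, font)` pair would then be rendered to an image whose label is known by construction, which is what makes training without any real captcha possible.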

### Installation
To run these scripts, you need the following installed:

1. Python 3
2. The Python libraries listed in requirements.txt, which can be installed in a virtual environment:
   - `virtualenv --python=/usr/bin/python3 venv`
   - `source venv/bin/activate`
   - `pip3 install -r requirements.txt`

You may regenerate the list of acceptable fonts by running `python3 generate_acceptable_fontlist.py`.

Installing `msttcorefonts` provides more fonts and hence better results
(in my case it helped to distinguish O from 0).

Run `pipeline.sh`, or run the following steps one by one:

### Step 0: Generate images

`python3 generate_image.py`

### Step 1: Extract single letters from CAPTCHA images

Run:

`python3 extract_single_letters_from_captchas.py`

The results will be stored in the "extracted_letter_images" folder.
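
To give an idea of what this step does, here is a minimal sketch of letter segmentation on a binary image. It is an assumption-laden stand-in, not the script's method: the original article finds letters with OpenCV contour detection, while this example simply splits the image at empty columns, so it fails when letters touch or overlap. The function names `split_columns` and `crop_letters` are hypothetical.

```python
def split_columns(image):
    """Return (start, end) column ranges that contain ink.

    `image` is a list of rows, each a list of 0/1 values where 1 means ink.
    """
    width = len(image[0]) if image else 0
    has_ink = [any(row[x] for row in image) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a letter begins at the first inked column
        elif not ink and start is not None:
            spans.append((start, x))  # an empty column closes the letter
            start = None
    if start is not None:
        spans.append((start, width))  # letter touching the right edge
    return spans


def crop_letters(image):
    """Cut the image into one sub-image per detected letter, left to right."""
    return [[row[a:b] for row in image] for a, b in split_columns(image)]
```

Each cropped sub-image would then be saved under the folder of the character it represents, so that the training step gets one labeled image per letter.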

### Step 2: Train the neural network to recognize single letters

Run:

`python3 train_model.py`

This will write out "captcha_model.hdf5" and "model_labels.dat".
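
A classifier of this kind outputs one score per class, so a mapping between class indices and characters has to be stored next to the model weights. The sketch below shows that idea with the standard library only. The exact content of "model_labels.dat" is not described here; this example assumes a pickled list of class labels, and the helpers `fit_labels`, `one_hot` and `decode` are hypothetical names.

```python
import pickle


def fit_labels(labels):
    """Build the ordered list of distinct classes seen in the training labels."""
    return sorted(set(labels))


def one_hot(label, classes):
    """Encode a label as a vector with a single 1 at the index of its class."""
    vector = [0] * len(classes)
    vector[classes.index(label)] = 1
    return vector


def decode(scores, classes):
    """Map a vector of per-class scores back to the most likely character."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return classes[best]


# Assumption: the label mapping is serialized with pickle, shown here in memory.
classes = fit_labels(["B", "A", "0", "A"])
restored = pickle.loads(pickle.dumps(classes))
```

At solving time, the restored mapping is what turns the network's output for each letter back into a character.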

### Step 3: Use the model to solve CAPTCHAs!

Run:

`python3 solve_captchas_with_model.py`

### Step 4: Use the Flask endpoint

Run:

`python3 flask_endpoint.py`

Then go to `http://127.0.0.1:5000/?url=http://somecaptcha.url/`,
or to `http://127.0.0.1:5000/?url=http://somecaptcha.url/&show_image=1` to see the image too.
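
When the captcha URL carries its own query string, it should be percent-encoded before being passed as the `url` parameter, otherwise its `&` and `=` characters are read as parameters of the endpoint itself. Here is a minimal sketch of building such a request URL from a script; `build_solver_url` is a hypothetical helper, and the host and port are the ones shown above.

```python
from urllib.parse import urlencode


def build_solver_url(captcha_url, show_image=False, base="http://127.0.0.1:5000/"):
    """Build the endpoint URL, percent-encoding the captcha URL."""
    params = {"url": captcha_url}
    if show_image:
        params["show_image"] = "1"
    return base + "?" + urlencode(params)


print(build_solver_url("http://somecaptcha.url/?id=42&size=big", show_image=True))
```

The resulting URL can be opened in a browser or fetched with any HTTP client while `flask_endpoint.py` is running.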
