https://github.com/vicariousinc/science_rcn
Reference implementation of a two-level RCN model
- Host: GitHub
- URL: https://github.com/vicariousinc/science_rcn
- Owner: vicariousinc
- License: mit
- Created: 2017-10-24T22:50:46.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-10-03T23:59:24.000Z (over 1 year ago)
- Last Synced: 2024-08-03T04:05:45.467Z (6 months ago)
- Language: Python
- Homepage: https://www.vicarious.com/Common_Sense_Cortex_and_CAPTCHA.html
- Size: 39 MB
- Stars: 665
- Watchers: 52
- Forks: 196
- Open Issues: 24
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ocr - A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
[![](data/vicarious_logo.png)](https://www.vicarious.com)
# Reference implementation of Recursive Cortical Network (RCN)
Reference implementation of a two-level RCN model on MNIST classification. See the *Science* article "A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs" and [Vicarious Blog](https://www.vicarious.com/posts/common-sense-cortex-and-captcha) for details.
> Note: this is an unoptimized reference implementation and is not intended for production.
## Setup
Note: Python 3.9 is supported. The code was tested on macOS 12.3.1; it may work on other platforms, but this is not guaranteed. You will need to install the packages listed in `requirements.txt`.
Clone the repository:
```
git clone https://github.com/vicariousinc/science_rcn.git
```

The code is pure Python, so you can run it right away, although you will have to uncompress the ZIP in the data folder manually.
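
If you would rather script the manual unzip step, the following is a minimal sketch using only the standard library. It assumes the archive sits directly inside a `data` directory and should be extracted in place; the helper name `extract_archives` is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .zip file found directly inside data_dir, in place.

    Assumption: the archives should be unpacked next to themselves.
    Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives("data")` from the repository root would unpack whatever ZIP files are present there and return their names; adjust the path if your checkout is laid out differently.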
Alternatively, install with (setting up a virtual environment beforehand is recommended):
```
python setup.py install
```

## Run
If you installed into a virtual environment (for example via `make`), activate it first:
```
source venv/bin/activate
```

To run a small unit test that trains and tests on 20 MNIST images using one CPU (takes ~2 minutes, accuracy is ~60%):
```
python science_rcn/run.py
```

To run a slightly more interesting experiment that trains on 100 images and tests on 20 MNIST images using multiple CPUs (takes <1 min using 7 CPUs, accuracy is ~90%):
```
python science_rcn/run.py --train_size 100 --test_size 20 --parallel
```

To test on the full 10k MNIST test set, training on 1000 examples (could take hours depending on the number of available CPUs; average accuracy is ~97.7% or higher):
```
python science_rcn/run.py --full_test_set --train_size 1000 --parallel --pool_shape 25 --perturb_factor 2.0
```

## Blog post
Check out our related [blog post](https://www.vicarious.com/Common_Sense_Cortex_and_CAPTCHA.html).
## Datasets
We used the following datasets for the Science paper:
CAPTCHA datasets
- [reCAPTCHA](http://datasets.vicarious.com/recaptcha.zip) (from [google.com](http://google.com))
- [BotDetect](http://datasets.vicarious.com/botdetect.zip) (from [captcha.com](http://captcha.com))
- [Paypal](http://datasets.vicarious.com/paypal.zip) (from [paypal.com](http://paypal.com))
- [Yahoo](http://datasets.vicarious.com/yahoo.zip) (from [yahoo.com](http://yahoo.com))

MNIST datasets
- Original (available at [http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/))
- [With occlusions](http://datasets.vicarious.com/mnist-multioccluded.zip) (by us)
- [With noise](http://datasets.vicarious.com/noisyMNIST_tests.zip) (by us)

## MNIST licensing
Yann LeCun (Courant Institute, NYU) and Corinna Cortes (Google Labs, New York) hold the copyright of the MNIST dataset, which is a derivative work of the original NIST datasets. The MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.