https://github.com/j03-dev/tarzan
Python package for optical character recognition, CNN
cnn keras ocr opencv tensorflow
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/j03-dev/tarzan
- Owner: j03-dev
- License: mit
- Created: 2022-10-18T11:26:30.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-09-29T17:45:32.000Z (over 2 years ago)
- Last Synced: 2025-03-16T01:41:51.593Z (over 1 year ago)
- Topics: cnn, keras, ocr, opencv, tensorflow
- Language: Python
- Homepage:
- Size: 14.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Tarzan
Tarzan is a simple Python package for optical character recognition (OCR).
It is built with [TensorFlow](https://www.tensorflow.org), developed by Google,
and [opencv-python](https://docs.opencv.org).
It uses a CNN (convolutional neural network) to recognize characters in images.
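To give a feel for what the convolutional part of a CNN does, here is a minimal pure-Python sketch of a single convolution step followed by a ReLU activation. It is not taken from Tarzan's source: the functions `conv2d_valid` and `relu`, the toy image, and the hand-written filter are all hypothetical and only for illustration. In a real network, TensorFlow performs this operation with many filters whose values are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero, as a ReLU activation does."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)
```

The resulting feature map is non-zero only in the column where the edge sits, which is the kind of local pattern a character classifier builds on.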
## Build this package
```bash
python -m pip install --upgrade build
python -m build
```
## Install package
```bash
python -m pip install dist/tarzan-0.0.1-py3-none-any.whl
```
## Train your own OCR model with a dataset
* #### Example
```python
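from tarzan import OcrModel

# Point the model at the training and testing image folders.
ocr_model = OcrModel(
    'dataset-a-z/data/training_data',
    'dataset-a-z/data/testing_data'
)

# Train the model and save it to the given path.
ocr_model.train_and_save(path="model_ocr_v1.model")
# Save the class labels to the given path.
ocr_model.save_classes(path="classes")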
```
### The dataset is [ocr_dataset](https://www.kaggle.com/datasets/preatcher/standard-ocr-dataset) on Kaggle
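Before training, it can help to check that the training and testing folders contain the same classes. The sketch below assumes the dataset is laid out with one sub-folder per character class inside each split; that layout, and the helper names `list_classes` and `check_splits`, are assumptions for illustration and are not part of the Tarzan API.

```python
from pathlib import Path


def list_classes(data_dir):
    """Return the sorted names of the sub-folders of `data_dir` (one per class)."""
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())


def check_splits(train_dir, test_dir):
    """Return the classes that appear in only one of the two splits."""
    train = set(list_classes(train_dir))
    test = set(list_classes(test_dir))
    return sorted(train ^ test)
```

For example, `check_splits('dataset-a-z/data/training_data', 'dataset-a-z/data/testing_data')` returns an empty list when both folders hold the same set of class sub-folders.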