Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/watersink/ocrsegment
a deep learning model for page layout analysis / segmentation.
- Host: GitHub
- URL: https://github.com/watersink/ocrsegment
- Owner: watersink
- Created: 2018-07-31T09:10:12.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-11-04T02:53:58.000Z (about 5 years ago)
- Last Synced: 2024-08-02T11:14:26.700Z (3 months ago)
- Language: Python
- Size: 2.19 MB
- Stars: 100
- Watchers: 6
- Forks: 27
- Open Issues: 6
Metadata Files:
- Readme: README.md
# OCR Segmentation
A deep learning model for page layout analysis / segmentation.

## dependencies
- TensorFlow 1.8
- Python 3

## dataset
[uw3-framed-lines-degraded-000](https://storage.googleapis.com/tmb-ocr/uw3-framed-lines-degraded-000.tgz)

## make training labels
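Label generation needs the dataset archive on disk first. The following is a minimal sketch of fetching and unpacking it with the Python standard library. The URL is the dataset link given in this README; the helper names (`download`, `extract`, `fetch_dataset`) and the `data` target directory are hypothetical and not part of the repository, which may expect the files in a different location.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the dataset link in this README.
DATASET_URL = "https://storage.googleapis.com/tmb-ocr/uw3-framed-lines-degraded-000.tgz"


def download(url, dest):
    """Download url to dest unless the file is already present."""
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir):
    """Unpack a gzip-compressed tar archive and return its member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(out_dir)
    return names


def fetch_dataset(target_dir="data"):
    """Download and unpack the dataset (target_dir is an assumed location)."""
    archive = download(DATASET_URL, Path(target_dir) / "uw3-framed-lines-degraded-000.tgz")
    return extract(archive, target_dir)
```

Calling `fetch_dataset()` downloads the archive once and skips the download on later runs if the file already exists.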
python3 data_pre_process.py

## train
python3 train_test.py
## test
python3 segmentation.py
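
The label generation, training, and test commands above run in a fixed order, so they can be chained from one script. This is only a sketch: the script names come from this README, but the `run_pipeline` and `build_commands` helpers are hypothetical, and it assumes each script runs without arguments from a single working directory, which the repository may not guarantee.

```python
import subprocess
import sys

# Script names are the ones listed in this README, in the order given there.
STEPS = [
    ("make training labels", "data_pre_process.py"),
    ("train", "train_test.py"),
    ("test", "segmentation.py"),
]


def build_commands(python=sys.executable):
    """Return one argument list per step, e.g. [python, "train_test.py"]."""
    return [[python, script] for _, script in STEPS]


def run_pipeline(repo_dir=".", runner=subprocess.run):
    """Run each step from repo_dir (assumed working directory).

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so later steps do not run after a failure.
    """
    for (name, _), cmd in zip(STEPS, build_commands()):
        print("running step:", name)
        runner(cmd, cwd=repo_dir, check=True)
```

Calling `run_pipeline("path/to/ocrsegment")` would run the three steps in sequence; the `runner` parameter exists only so the order can be checked without launching the scripts.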

![image](https://github.com/watersink/ocrsegment/blob/master/make_training_labels/W001.png)
![image](https://github.com/watersink/ocrsegment/blob/master/make_training_labels/out.png)
![image](https://github.com/watersink/ocrsegment/blob/master/lines/0.png)

## references
- [Multi-Dimensional Recurrent Neural Networks](https://arxiv.org/abs/0705.2011)
- [Robust, Simple Page Segmentation Using Hybrid Convolutional MDLSTM Networks](https://github.com/wanghaisheng/awesome-ocr/files/2042377/Robust_.Simple.Page.Segmentation.Using.Hybrid.Convolutional.MDLSTM.Networks.pdf)
- [https://github.com/NVlabs/ocroseg](https://github.com/NVlabs/ocroseg)
- [https://github.com/philipperemy/tensorflow-multi-dimensional-lstm](https://github.com/philipperemy/tensorflow-multi-dimensional-lstm)