# Real-time Scene Text Detection with Differentiable Binarization

**note**: some code is inherited from [MhLiao/DB](https://github.com/MhLiao/DB)

[中文解读](https://zhuanlan.zhihu.com/p/94677957)

![network](imgs/paper/db.jpg)

## update
2020-06-07: Added support for training on grayscale images. When training on grayscale images, remove `dataset.args.transforms.Normalize` from the config.

## Install Using Conda
```bash
git clone https://github.com/WenmuZhou/DBNet.pytorch.git
cd DBNet.pytorch/
conda env create -f environment.yml
```

or
## Install Manually
```bash
conda create -n dbnet python=3.6
conda activate dbnet

conda install ipython pip

# install PyTorch with cuda-10.1
# Note that you can change the cudatoolkit version to the version you want.
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

# clone repo
git clone https://github.com/WenmuZhou/DBNet.pytorch.git
cd DBNet.pytorch/

# python dependencies
pip install -r requirement.txt
```

## Requirements
* pytorch 1.4+
* torchvision 0.5+
* gcc 4.9+
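
A quick way to check that the installed environment meets these requirements (a minimal sketch using standard PyTorch calls, not a script from this repository):

```python
# check_env.py -- sanity-check the Python environment (illustrative, not part of this repo)
import torch
import torchvision

print("pytorch     :", torch.__version__)        # expect 1.4 or newer
print("torchvision :", torchvision.__version__)  # expect 0.5 or newer
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA device :", torch.cuda.get_device_name(0))
```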

## Download

TBD

## Data Preparation

Training data: prepare a text file `train.txt` in the following format, using `\t` as the separator:
```
./datasets/train/img/001.jpg ./datasets/train/gt/001.txt
```

Validation data: prepare a text file `test.txt` in the following format, using `\t` as the separator:
```
./datasets/test/img/001.jpg ./datasets/test/gt/001.txt
```
- Store images in the `img` folder
- Store ground truth files in the `gt` folder

The ground truth files are `.txt` files with the following format:
```
x1, y1, x2, y2, x3, y3, x4, y4, annotation
```
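
A small helper like the one below can generate the list files from the folder layout above. This is an illustrative sketch: the helper name and the `.jpg`/`.txt` pairing by file stem are assumptions; only the tab-separated `image<TAB>gt` line format comes from this README.

```python
# make_list.py -- write tab-separated image/gt list files (illustrative helper, not part of this repo)
import glob
import os

def make_list(img_dir, gt_dir, out_file):
    """Pair every image with the ground-truth file of the same name and write one line per pair."""
    with open(out_file, "w", encoding="utf-8") as f:
        for img_path in sorted(glob.glob(os.path.join(img_dir, "*.jpg"))):
            name = os.path.splitext(os.path.basename(img_path))[0]
            gt_path = os.path.join(gt_dir, name + ".txt")
            if os.path.exists(gt_path):
                f.write(f"{img_path}\t{gt_path}\n")  # '\t' separator, as required above

if __name__ == "__main__":
    make_list("./datasets/train/img", "./datasets/train/gt", "train.txt")
    make_list("./datasets/test/img", "./datasets/test/gt", "test.txt")
```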

## Train
1. Configure `dataset['train']['dataset']['data_path']` and `dataset['validate']['dataset']['data_path']` in [config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml](config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml); a quick way to verify these paths is sketched after this list.
2. Single-GPU training:
```bash
bash single_gpu_train.sh
```
3. Multi-GPU training:
```bash
bash multi_gpu_train.sh
```
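
To double-check which data paths a config resolves to before launching training, a minimal sketch like the following works, assuming the keys sit directly in this YAML file (the actual config may split settings across base files or nest them under `args`):

```python
# show_data_path.py -- print the configured data_path entries (illustrative sketch, not part of this repo)
import yaml  # pip install pyyaml

with open("config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

for split in ("train", "validate"):
    entry = cfg["dataset"][split]["dataset"]
    # fall back to an `args` sub-dict if data_path is nested there
    data_path = entry.get("data_path", entry.get("args", {}).get("data_path"))
    print(f"{split}: {data_path}")
```
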
## Test

[eval.py](tools/eval.py) is used to evaluate the model on the test dataset.

1. Configure `model_path` in [eval.sh](eval.sh)
2. Use the following script to test:
```bash
bash eval.sh
```

## Predict
[predict.py](tools/predict.py) can be used to run inference on all images in a folder.
1. Configure `model_path`, `input_folder`, and `output_folder` in [predict.sh](predict.sh)
2. Use the following script to predict:
```bash
bash predict.sh
```
You can change `model_path` in `predict.sh` to point to your model location.

Tip: if the results are not good, you can adjust `thre` in [predict.sh](predict.sh)
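
To eyeball the detections, you can overlay the boxes on the source image. The sketch below assumes the result files use the same comma-separated 8-coordinate quadrilateral format as the ground-truth files described in Data Preparation; if `predict.py` writes a different format, adapt the parsing accordingly.

```python
# draw_boxes.py -- overlay quadrilateral boxes from a gt/result-style .txt onto an image
# (illustrative sketch; assumes the x1,y1,...,x4,y4 format shown in Data Preparation)
import cv2
import numpy as np

def draw_boxes(img_path, txt_path, out_path):
    img = cv2.imread(img_path)
    with open(txt_path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(",")
            if len(parts) < 8:
                continue  # skip malformed lines
            pts = np.array([float(p) for p in parts[:8]], dtype=np.int32).reshape(-1, 2)
            cv2.polylines(img, [pts], isClosed=True, color=(0, 0, 255), thickness=2)
    cv2.imwrite(out_path, img)

draw_boxes("./datasets/test/img/001.jpg", "./datasets/test/gt/001.txt", "vis_001.jpg")
```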

The project is still under development.

## Performance

### [ICDAR 2015](http://rrc.cvc.uab.es/?ch=4)
Trained only on the ICDAR 2015 dataset.

| Method | image size (short size) |learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS |
|:--------------------------:|:-------:|:--------:|:--------:|:------------:|:---------------:|:-----:|
| SynthText-Deform-ResNet-18 (paper) | 736 |0.007 | 86.8 | 78.4 | 82.3 | 48 |
| ImageNet-resnet18-FPN-DBHead |736 |1e-3| 87.03 | 75.06 | 80.6 | 43 |
| ImageNet-Deform-Resnet18-FPN-DBHead |736 |1e-3| 88.61 | 73.84 | 80.56 | 36 |
| ImageNet-resnet50-FPN-DBHead |736 |1e-3| 88.06 | 77.14 | 82.24 | 27 |
| ImageNet-resnest50-FPN-DBHead |736 |1e-3| 88.18 | 76.27 | 81.78 | 27 |

### examples
TBD

### todo
- [x] multi-GPU training

### reference
1. https://arxiv.org/pdf/1911.08947.pdf
2. https://github.com/WenmuZhou/PANet.pytorch
3. https://github.com/MhLiao/DB

**If this repository helps you, please star it. Thanks.**