{"id":18735375,"url":"https://github.com/wenmuzhou/dbnet.pytorch","last_synced_at":"2025-05-16T18:04:15.997Z","repository":{"id":46176874,"uuid":"224771387","full_name":"WenmuZhou/DBNet.pytorch","owner":"WenmuZhou","description":"A pytorch re-implementation of Real-time Scene Text Detection with Differentiable Binarization","archived":false,"fork":false,"pushed_at":"2022-12-29T04:26:21.000Z","size":1380,"stargazers_count":978,"open_issues_count":103,"forks_count":253,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-04-12T16:53:39.533Z","etag":null,"topics":["ocr","python3","pytorch","text-detection"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WenmuZhou.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-29T03:38:37.000Z","updated_at":"2025-04-03T11:24:01.000Z","dependencies_parsed_at":"2023-01-31T08:01:04.864Z","dependency_job_id":null,"html_url":"https://github.com/WenmuZhou/DBNet.pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WenmuZhou%2FDBNet.pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WenmuZhou%2FDBNet.pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WenmuZhou%2FDBNet.pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WenmuZhou%2FDBNet.pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WenmuZhou","download_url":"https://codeload.github
.com/WenmuZhou/DBNet.pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254582902,"owners_count":22095518,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ocr","python3","pytorch","text-detection"],"created_at":"2024-11-07T15:16:46.399Z","updated_at":"2025-05-16T18:04:15.946Z","avatar_url":"https://github.com/WenmuZhou.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Real-time Scene Text Detection with Differentiable Binarization\n\n**Note**: some code is inherited from [MhLiao/DB](https://github.com/MhLiao/DB)\n\n[Explanation in Chinese](https://zhuanlan.zhihu.com/p/94677957)\n\n![network](imgs/paper/db.jpg)\n\n## update \n2020-06-07: added grayscale image training. When training on grayscale images, remove `dataset.args.transforms.Normalize` from the config.\n\n## Install Using Conda\n```\n# clone the repo first: environment.yml lives inside it\ngit clone https://github.com/WenmuZhou/DBNet.pytorch.git\ncd DBNet.pytorch/\nconda env create -f environment.yml\n```\n\nor\n## Install Manually \n```bash\nconda create -n dbnet python=3.6\nconda activate dbnet\n\nconda install ipython pip\n\n# clone repo\ngit clone https://github.com/WenmuZhou/DBNet.pytorch.git\ncd DBNet.pytorch/\n\n# python dependencies (requirement.txt is in the repo root)\npip install -r requirement.txt\n\n# install PyTorch with cuda-10.1\n# Note that you can change the cudatoolkit version to the version you want.\nconda install pytorch torchvision cudatoolkit=10.1 -c pytorch\n\n```\n\n## Requirements\n* pytorch 1.4+\n* torchvision 0.5+\n* gcc 4.9+\n\n## Download\n\nTBD\n\n## Data Preparation\n\nTraining data: prepare a text file `train.txt` in the following format, using '\\t' as a 
separator\n```\n./datasets/train/img/001.jpg\t./datasets/train/gt/001.txt\n```\n\nValidation data: prepare a text file `test.txt` in the following format, using '\\t' as a separator\n```\n./datasets/test/img/001.jpg\t./datasets/test/gt/001.txt\n```\n- Store images in the `img` folder\n- Store ground truth in the `gt` folder\n\nThe ground truth can be `.txt` files, with the following format:\n```\nx1, y1, x2, y2, x3, y3, x4, y4, annotation\n```\n\n\n## Train\n1. Set `dataset['train']['dataset']['data_path']` and `dataset['validate']['dataset']['data_path']` in [config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml](config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml)\n* Single-GPU training\n```bash\nbash singlel_gpu_train.sh\n```\n* Multi-GPU training\n```bash\nbash multi_gpu_train.sh\n```\n## Test\n\n[eval.py](tools/eval.py) evaluates the model on the test dataset\n\n1. Set `model_path` in [eval.sh](eval.sh)\n2. Run the following script to test\n```bash\nbash eval.sh\n```\n\n## Predict \n[predict.py](tools/predict.py) can be used to run inference on all images in a folder\n1. Set `model_path`, `input_folder`, and `output_folder` in [predict.sh](predict.sh)\n2. Run the following script to predict\n```\nbash predict.sh\n```\nYou can change the `model_path` in the `predict.sh` file to your model location. 
\n\nTip: if the results are poor, try adjusting `thre` in [predict.sh](predict.sh) \n    \nThe project is still under development.\n\n\u003ch2 id=\"Performance\"\u003ePerformance\u003c/h2\u003e\n\n### [ICDAR 2015](http://rrc.cvc.uab.es/?ch=4)\nTrained only on the ICDAR2015 dataset\n\n| Method                   | image size (short side) |learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS |\n|:--------------------------:|:-------:|:--------:|:--------:|:------------:|:---------------:|:-----:|\n| SynthText-Deform-ResNet-18(paper)  | 736 |0.007 | 86.8 | 78.4 | 82.3 | 48 |\n| ImageNet-resnet18-FPN-DBHead  |736 |1e-3| 87.03 | 75.06 | 80.6 | 43 |\n| ImageNet-Deform-Resnet18-FPN-DBHead  |736 |1e-3| 88.61 | 73.84 | 80.56 | 36 |\n| ImageNet-resnet50-FPN-DBHead  |736 |1e-3| 88.06 | 77.14 | 82.24 | 27 |\n| ImageNet-resnest50-FPN-DBHead  |736 |1e-3| 88.18 | 76.27 | 81.78 | 27 |\n\n\n### examples\nTBD\n\n\n### todo\n- [x] multi-GPU training\n\n### reference\n1. https://arxiv.org/pdf/1911.08947.pdf\n2. https://github.com/WenmuZhou/PANet.pytorch\n3. https://github.com/MhLiao/DB\n\n**If this repository helps you, please star it. Thanks.**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenmuzhou%2Fdbnet.pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwenmuzhou%2Fdbnet.pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenmuzhou%2Fdbnet.pytorch/lists"}