{"id":20009067,"url":"https://github.com/stvir/pmtd","last_synced_at":"2025-09-16T05:02:27.535Z","repository":{"id":110283483,"uuid":"186947810","full_name":"STVIR/PMTD","owner":"STVIR","description":"Pyramid Mask Text Detector designed by SenseTime Video Intelligence Research team.","archived":false,"fork":false,"pushed_at":"2019-08-02T05:22:09.000Z","size":6729,"stargazers_count":215,"open_issues_count":20,"forks_count":223,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-09-16T05:02:07.062Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/STVIR.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-16T03:56:10.000Z","updated_at":"2025-03-19T02:17:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"e42335fb-7a1b-4c16-94bd-3bcf1bef9a38","html_url":"https://github.com/STVIR/PMTD","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/STVIR/PMTD","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/STVIR%2FPMTD","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/STVIR%2FPMTD/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/STVIR%2FPMTD/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/STVIR%2FPMTD/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/STVIR","download_url":"https://codeload.github.com/STVIR/PMTD/tar.gz/refs/heads/master"
,"sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/STVIR%2FPMTD/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275364760,"owners_count":25451517,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-16T02:00:10.229Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T07:14:12.404Z","updated_at":"2025-09-16T05:02:27.473Z","avatar_url":"https://github.com/STVIR.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# PMTD: Pyramid Mask Text Detector\nThis project hosts the inference code for implementing the PMTD algorithm for text detection, as presented in our paper:\n\n    Pyramid Mask Text Detector;\n    Liu Jingchao, Liu Xuebo, Sheng Jie, Liang Ding, Li Xin and Liu Qingjie;\n    arXiv preprint arXiv:1903.11800 (2019).\n\nThe full paper is available at: [https://arxiv.org/abs/1903.11800](https://arxiv.org/abs/1903.11800).\n\n![](./pmtd.png)\n\n## Installation\nCheck [INSTALL.md](INSTALL.md) for installation instructions.\n\n## Trained model\nWe provide trained models for the ICDAR 2017 MLT dataset [here](https://drive.google.com/open?id=1kh5wXqvD1KkaSLtyEG8RUDUfSK1CHnQT) and the ICDAR 2015 dataset [here](https://drive.google.com/open?id=1hI6uDaUefCrD1oYoKMdflTY6Ocl2Y46-) for download. 
Note that the results differ slightly from those reported in the paper: PMTD was developed on a private codebase, so we reimplemented the inference code on top of [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark).\n\nICDAR 2017  \n\nMethod|Precision|Recall|F-measure\n---|---|---|---\nThis project|85.13%|72.85%|78.51%\nPaper reported|85.15%|72.77%|78.48%\n\nICDAR 2015  \n\nMethod|Precision|Recall|F-measure\n---|---|---|---\nThis project|87.48%|91.26%|89.33%\nPaper reported|87.43%|91.30%|89.33%\n\n## A quick demo\n\n```bash\ncd PROJECT_ROOT\npython demo/PMTD_demo.py \\\n--image_path=datasets/icdar2017mlt/ch8_validation_images/img_1.jpg \\\n--model_path=models/PMTD_ICDAR2017MLT.pth\n```\n\n## Perform testing on ICDAR 2017 MLT dataset\n\n### Prepare dataset\nWe recommend symlinking the [ICDAR 2017 MLT](http://rrc.cvc.uab.es/?ch=8) dataset to `datasets/` as follows:\n```bash\n# e.g.: ~/Projects/PMTD\ncd PROJECT_ROOT\n\nmkdir -p datasets/icdar2017mlt\ncd datasets/icdar2017mlt\n\n# symlink for images and annotations\nln -s /path_to_icdar2017mlt_dataset/ch8_test_images\n```\n\n### Generate COCO labels for the dataset\n```bash\n# ${PWD} = datasets/icdar2017mlt\nmkdir annotations\ncd PROJECT_ROOT\npython demo/utils/generate_icdar2017.py\n# labels are written to PROJECT_ROOT/datasets/icdar2017mlt/annotations/test_coco.json\n```\n\n### Test images\nIn the test stage, we use a single TITANX 11G GPU with a batch size of 4. 
If you encounter an out-of-memory (OOM) error, you may need to lower `TEST.IMS_PER_BATCH` in `configs/e2e_PMTD_R_50_FPN_1x_test.yaml`.\n```bash\n# the downloaded model should be placed at models/PMTD_ICDAR2017MLT.pth\npython tools/test_net.py --config=configs/e2e_PMTD_R_50_FPN_1x_ICDAR2017MLT_test.yaml\n# results are written to PROJECT_ROOT/inference/icdar_2017_mlt_test/\n# - bbox.json // when using coco evaluation criterion\n# - segm.json // when using coco evaluation criterion\n# - dataset.pth\n# - predictions.pth\n# - results_{scale}.pth, with the default settings, scale=1600\n```\n\n### Convert results to ICDAR 2017 submission format\n```bash\npython demo/utils/convert_results_to_icdar.py\n# results are written to PROJECT_ROOT/inference/icdar_2017_mlt_test/\n# - icdar.zip\n```\n\n### Submit icdar.zip to [ICDAR 2017 MLT](http://rrc.cvc.uab.es/?ch=8)\n\n## Citations\nPlease consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows.\n```bibtex\n@article{liu2019pyramid,\n  title={Pyramid Mask Text Detector},\n  author={Liu, Jingchao and Liu, Xuebo and Sheng, Jie and Liang, Ding and Li, Xin and Liu, Qingjie},\n  journal={arXiv preprint arXiv:1903.11800},\n  year={2019}\n}\n```\n\n## Contributors\n\n- [Jingchao Liu](https://github.com/JingChaoLiu)\n- [Xuebo Liu](https://github.com/liuxuebo0)\n\n## License\nMaskrcnn-benchmark is released under the MIT license. PMTD is released under the [Apache 2.0 license](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstvir%2Fpmtd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstvir%2Fpmtd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstvir%2Fpmtd/lists"}