{"id":20046613,"url":"https://github.com/ncsoft/rotated-box-is-back","last_synced_at":"2025-05-05T09:31:44.803Z","repository":{"id":152713284,"uuid":"438814420","full_name":"ncsoft/rotated-box-is-back","owner":"ncsoft","description":"Accurate Box Proposal Network for Scene Text Detection","archived":false,"fork":false,"pushed_at":"2022-02-23T08:33:01.000Z","size":5304,"stargazers_count":31,"open_issues_count":0,"forks_count":4,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-08T20:51:36.967Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ncsoft.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-16T00:40:03.000Z","updated_at":"2024-09-24T00:28:18.000Z","dependencies_parsed_at":"2024-06-24T23:00:12.075Z","dependency_job_id":null,"html_url":"https://github.com/ncsoft/rotated-box-is-back","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2Frotated-box-is-back","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2Frotated-box-is-back/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2Frotated-box-is-back/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2Frotated-box-is-back/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ncsoft","download_url":"https://codel
oad.github.com/ncsoft/rotated-box-is-back/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252471526,"owners_count":21753204,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T11:25:14.337Z","updated_at":"2025-05-05T09:31:44.789Z","avatar_url":"https://github.com/ncsoft.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Rotated Box Is Back: Accurate Box Proposal Network for Scene Text Detection #\n![overview](./images/overview.jpg)\n## This material is supplementary code for the paper accepted at ICDAR 2021 ##\n1. We highly recommend using the Docker image, because our model contains custom operations that depend on the framework and CUDA versions.\n2. We provide trained models: final_checkpoint_ch8 for ICDAR 2017 and ICDAR 2013, and final_checkpoint_ch4 for ICDAR 2015.\n3. This code is mainly focused on inference. Training our model requires a training-class GPU such as the V100. Please see our paper for details. \n\n### REQUIREMENTS ###\n1. Nvidia-docker \n2. TensorFlow 1.14\n3. 
Minimum GPU requirement: NVIDIA GTX 1080 Ti\n\n### INSTALLATION ###\n* Build the Docker image and create the container\n``` \ndocker build --tag rbimage ./dockerfile\ndocker run --runtime=nvidia --name rbcontainer -v /rotated-box-is-back-path:/rotated-box-is-back -i -t rbimage /bin/bash\n```\n\n* Build the custom operations inside the container\n``` \ncd /rotated-box-is-back/nms \ncmake ./\nmake\n./shell.sh\n```\n\n\n### SAMPLE IMAGE INFERENCE ###\n\n```\ncd /rotated-box-is-back/\npython viz.py --test_data_path=./sample --checkpoint_path=./final_checkpoint_ch8 --output_dir=./sample_result  --thres 0.6 --min_size=1600 --max_size=2000\n```\n\n\u003cdiv\u003e\n\u003cimg src=\"./images/ts_img_00416_resized_result.jpg\" height=\"1000\"\u003e\n\u003cimg src=\"./images/ts_img_01304_resized_result.jpg\" height=\"1000\"\u003e\n\u003c/div\u003e\n\n### ICDAR 2017 INFERENCE ###\n1. Replace icdar_testset_path with the path to your ICDAR 2017 test set folder.\n```\npython viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic17  --thres 0.6 --min_size=1600 --max_size=2000\n```\n\n### ICDAR 2015 INFERENCE ###\n1. Replace icdar_testset_path with the path to your ICDAR 2015 test set folder.\n2. The results must be converted to the evaluation format. Run inference, then convert the result text files as shown below:\n```\npython viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch4 --output_dir=./ic15  --thres 0.7 --min_size=1100 --max_size=2000\npython text_postprocessing.py -i=./ic15/ -o=./ic15_format/ -e True\n```\n\n### ICDAR 2013 INFERENCE ###\n1. Replace icdar_testset_path with the path to your ICDAR 2013 test set folder.\n2. The results must be converted to the evaluation format. 
Run inference, then convert the result text files as shown below:\n```\npython viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic13  --thres 0.55 --min_size=700 --max_size=900\npython text_postprocessing.py -i=./ic13/ -o=./ic13_format/ -e True -m rec\n```\n\n### EVALUATION TABLE ###\n\u003cdiv align=\"center\"\u003e\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd colspan=\"3\"\u003eIC13\u003c/td\u003e\n    \u003ctd colspan=\"3\"\u003eIC15\u003c/td\u003e\n    \u003ctd colspan=\"3\"\u003eIC17\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eP\u003c/td\u003e\n    \u003ctd\u003eR\u003c/td\u003e\n    \u003ctd\u003eF\u003c/td\u003e\n    \u003ctd\u003eP\u003c/td\u003e\n    \u003ctd\u003eR\u003c/td\u003e\n    \u003ctd\u003eF\u003c/td\u003e\n    \u003ctd\u003eP\u003c/td\u003e\n    \u003ctd\u003eR\u003c/td\u003e\n    \u003ctd\u003eF\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e95.9\u003c/td\u003e\n    \u003ctd\u003e89.1\u003c/td\u003e\n    \u003ctd\u003e92.4\u003c/td\u003e\n    \u003ctd\u003e89.7\u003c/td\u003e\n    \u003ctd\u003e84.2\u003c/td\u003e\n    \u003ctd\u003e86.9\u003c/td\u003e\n    \u003ctd\u003e83.4\u003c/td\u003e\n    \u003ctd\u003e68.2\u003c/td\u003e\n    \u003ctd\u003e75.0\u003c/td\u003e\n  \u003c/tr\u003e\n\n\n\u003c/table\u003e\n\u003c/div\u003e\n\n### TRAINING ###\n1. The model can be trained with the command below:\n```\npython train_refine_estimator.py --input_size=1024 --batch_size=2 --checkpoint_path=./finetuning --training_data_path=your-image-path --training_gt_path=your-gt-path  --learning_rate=0.00001 --max_epochs=500  --save_summary_steps=1000 --warmup_path=./final_checkpoint_ch8\n```\n### ACKNOWLEDGEMENT ###\nThis work was supported by Institute of Information \u0026 communications Technology Planning \u0026 Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 
1711125972, Audio-Visual Perception for Autonomous Rescue Drones).\n\n### CITATION ###\nIf you find this work helpful for your research, please cite: \n\n\u003e Lee J., Lee J., Yang C., Lee Y., Lee J. (2021) Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection. In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_4\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncsoft%2Frotated-box-is-back","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fncsoft%2Frotated-box-is-back","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncsoft%2Frotated-box-is-back/lists"}