{"id":13543066,"url":"https://github.com/beacandler/R2CNN","last_synced_at":"2025-04-02T12:31:11.524Z","repository":{"id":201273925,"uuid":"125308825","full_name":"beacandler/R2CNN","owner":"beacandler","description":"caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection","archived":false,"fork":false,"pushed_at":"2018-04-21T09:48:30.000Z","size":884,"stargazers_count":80,"open_issues_count":5,"forks_count":27,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-11-03T09:33:40.870Z","etag":null,"topics":["caffe","deep-learning","ocr","scene-text-detection"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/beacandler.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-03-15T03:44:43.000Z","updated_at":"2024-07-17T06:33:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"2968a96d-713c-490e-9abf-359d0d9472c5","html_url":"https://github.com/beacandler/R2CNN","commit_stats":null,"previous_names":["beacandler/r2cnn"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beacandler%2FR2CNN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beacandler%2FR2CNN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beacandler%2FR2CNN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beacandler%2FR2CNN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/beacandler","download_url":"https://codeload.github.com/b
eacandler/R2CNN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246815451,"owners_count":20838441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["caffe","deep-learning","ocr","scene-text-detection"],"created_at":"2024-08-01T11:00:22.383Z","updated_at":"2025-04-02T12:31:06.516Z","avatar_url":"https://github.com/beacandler.png","language":"C++","funding_links":[],"categories":["Text detection and localization"],"sub_categories":["Form Segmentation"],"readme":"\n# R\u003csup\u003e2\u003c/sup\u003eCNN: Rotational Region CNN for Orientation Robust Scene Text Detection\n\n## Abstract\nThis is a Caffe re-implementation of [R\u003csup\u003e2\u003c/sup\u003eCNN: Rotational Region CNN for Orientation Robust Scene Text Detection](https://arxiv.org/abs/1706.09579).\n\nThis project is modified from [py-R-FCN](https://github.com/YuwenXiong/py-R-FCN). The [inclined NMS](./src/lib/lanms) and [rotated box generation](./src/lib/fast_rcnn/icdar.py) components are imported from the [EAST project](https://github.com/argman/EAST).\nThanks to the authors ([@zxytim](https://github.com/zxytim), [@argman](https://github.com/argman)) for their help. Please cite [this paper](https://arxiv.org/abs/1704.03155v2) if you find this project useful.\n\n## Contents\n1. [Abstract](#Abstract)\n2. [Structor](#Structor)\n3. [Installation](#Installation)\n4. [Demo](#Demo)\n5. [Test](#Test)\n6. [Train](#Train)\n7. [Experiments](#Experiments)\n8. 
[Furthermore](#Furthermore)\n\n\n## Structor\n### Code structure\n```\n.\n├── docker-compose.yml\n├── docker // docker deps file\n├── Dockerfile // docker build file\n├── model // model directory\n│   ├── caffemodel // trained caffe model\n│   ├── icdar15_gt // ICDAR2015 groundtruth\n│   ├── prototxt // caffe prototxt file\n│   └── imagenet_models // pretrained on imagenet\n├── nvidia-docker-compose.yml\n├── logs\n│   ├── submit // original submit file\n│   ├── submit_zip // zip submit file\n│   ├── snapshots\n│   └── train\n│       ├── VGG16.txt.*\n│       └── snapshots\n├── README.md\n├── requirements.txt // python packages\n├── src\n│   ├── cfgs // train config yml\n│   ├── data // cache file\n│   ├── lib\n│   ├── _init_path.py\n│   ├── demo.py\n│   ├── eval_icdar15.py // evaluate F-measure on the ICDAR2015 dataset\n│   ├── test_net.py\n│   └── train_net.py\n├── demo.sh\n├── train.sh\n├── images // test images\n│   ├── img_1.jpg\n│   ├── img_2.jpg\n│   ├── img_3.jpg\n│   ├── img_4.jpg\n│   └── img_5.jpg\n└── test.sh // test script\n```\n### Data structure\nThe dataset root should have this basic structure:\n```\nICDARdevkit_Root\n.\n├── ICDAR2013\n├── merge_train.txt  // image list covering the ICDAR2013 + ICDAR2015 training sets, with the same raw data augmentation as the paper\n├── ICDAR2015\n│   ├── augmentation // contains all augmented images\n│   └── ImageSets/Main/test.txt // ICDAR2015 test image list\n```\n## Installation\n### Install Caffe\nIt is highly recommended to use Docker to build the environment. For details on configuring Docker, see [Running with Docker](https://github.com/beacandler/tf-slim-demo#Running).\nIf you are familiar with Docker, run:\n```\n    # 1. start a shell in the container\n    nvidia-docker-compose run --rm --service-ports rrcnn bash\n    # 2. inside the container, run the demo\n    bash ./demo.sh\n```\nIf you are not familiar with Docker, follow the [py-R-FCN](https://github.com/YuwenXiong/py-R-FCN) instructions to install Caffe.\n### Build\n```\n    cd src/lib \u0026\u0026 make\n```\n### Download Model\n1. 
Download the [VGG16 pre-trained model](https://pan.baidu.com/s/1Pok-AYU0Jl-DNKrSqF3vNg#list/path=%2FRRCNN%2Fmodel%2Fimagenet_models) (pre-trained on ImageNet) and place it at model/imagenet_models/VGG16.v2.caffemodel.\n2. Download the [VGG16 trained model](https://pan.baidu.com/s/1Pok-AYU0Jl-DNKrSqF3vNg#list/path=%2FRRCNN%2Fmodel%2Fcaffemodel) trained by this project and place it at model/caffemodel/TextBoxes-v2_iter_12w.caffemodel.\n\n## Demo\nIt is recommended to use a UNIX socket to support the GUI for Docker. Open another terminal and type:\n```bash\n    xhost + # may be needed each time you open a new terminal\n    # docker-compose.yml mounts the host volume /tmp/.X11-unix to the container volume /tmp/.X11-unix\n    # pass the DISPLAY variable to the container so the host X server can display images from Docker\n    docker exec -it -e DISPLAY=$DISPLAY ${CURRENT_CONTAINER_ID} bash\n    bash ./demo.sh\n```\n\n## Test\n### Single Test\n```bash\n    bash ./test.sh\n```\n### Multi-scale Test\n\n```bash\n    # uncomment these two lines in src/cfgs/faster_rcnn_end2end.yml:\n    #   SCALES: [720, 1200]\n    #   MULTI_SCALES_NOC: True\n    # modify src/lib/datasets/icdar.py so it can find the ICDAR2015 test data (see commit @bbac1cf)\n    # then run\n    bash ./test.sh\n```\n## Train\n### Train data\n\u003e * Mine: ICDAR2013 + ICDAR2015 training sets with raw data augmentation, 15977 images in total.\n\u003e * Paper: ICDAR2015 + 2000 focused scene text images collected by the authors.\n\n### Train commands\n1. In ./src/lib/datasets/icdar.py, modify the image paths so that the training script can find the merge_train.txt image list.\n2. Remove the cached files in src/data/*.pkl, or download the cached [roidb data](https://pan.baidu.com/s/1Pok-AYU0Jl-DNKrSqF3vNg#list/path=%2FRRCNN%2Fcache_roidb_data\u0026parentPath=%2F) of this project and place it in src/data/.\n3. Run the training script:
\n```bash\n    # Train for RRCNN4-TextBoxes-v2-OHEM\n    bash ./train.sh\n```\nNote: with USE_FLIPPED=True and USE_FLIPPED_QUAD=True, you will get almost 31200 roidb entries.\n## Experiments\n\n### Mine VS Paper\n\n|Approaches|Anchor Scales|Pooled sizes|Inclined NMS|Test scales(short side)|F-measure(Mine VS paper)|\n|-------------------|:---------------------:|:-----:|:------------------:|:------------------:|:------------------:|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-2 | (4, 8, 16)   | (7, 7) |Y|(720)|71.12% VS 68.49%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-3 | (4, 8, 16)   | (7, 7) |Y|(720)|73.10% VS 74.29%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-4 | (4, 8, 16, 32)| (7, 7) |Y|(720)|74.14% VS 74.36%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-4 | (4, 8, 16, 32)| (7, 7) |Y|(720, 1200)|79.05% VS 81.80%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-5 | (4, 8, 16, 32)| (7, 7) (11, 3) (3, 11) |Y|(720)|74.34% VS 75.34%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-5 | (4, 8, 16, 32)| (7, 7) (11, 3) (3, 11) |Y|(720, 1200)|78.70% VS 82.54%|\n\n### Appendix\n\n\n|Approaches      | Anchor Scales | Aspect ratios| Pooled sizes | Inclined NMS| Test scales(short side)| F-measure|\n|-------------------|:-------------------:|:---------------------:|:-----:|:------------------:|:------------------:|:------------------:|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-4 | (4, 8, 16, 32)|(0.5, 1, 2)| (7, 7) |Y|(720)|74.36%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-4 | (4, 8, 16, 32)|(0.5, 1, 2)| (7, 7) |Y|(720, 1200)|81.80%|\n|R\u003csup\u003e2\u003c/sup\u003eCNN-4-TextBoxes-OHEM | (4, 8, 16, 32)|(0.5, 1, 2, 3, 5, 7, 10)| (7, 7) |Y|(720)|76.53%|\n\n## Furthermore\n\nYou can try ResNet-50, ResNet-101 and so 
on.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbeacandler%2FR2CNN","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbeacandler%2FR2CNN","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbeacandler%2FR2CNN/lists"}