{"id":13543053,"url":"https://github.com/tonghe90/textspotter","last_synced_at":"2025-04-02T12:31:12.461Z","repository":{"id":56909139,"uuid":"124620422","full_name":"tonghe90/textspotter","owner":"tonghe90","description":null,"archived":false,"fork":false,"pushed_at":"2018-12-15T01:58:34.000Z","size":47199,"stargazers_count":324,"open_issues_count":18,"forks_count":112,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-11-03T09:33:39.249Z","etag":null,"topics":["caffe","cvpr2018","end-to-end","text-detection-recognition","textspotter"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tonghe90.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-10T03:59:06.000Z","updated_at":"2024-04-28T01:58:30.000Z","dependencies_parsed_at":"2022-08-21T02:50:45.244Z","dependency_job_id":null,"html_url":"https://github.com/tonghe90/textspotter","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonghe90%2Ftextspotter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonghe90%2Ftextspotter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonghe90%2Ftextspotter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonghe90%2Ftextspotter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tonghe90","download_url":"https://codeload.github.com/tonghe90/textspotter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https
://github.com","kind":"github","repositories_count":246815453,"owners_count":20838441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["caffe","cvpr2018","end-to-end","text-detection-recognition","textspotter"],"created_at":"2024-08-01T11:00:22.122Z","updated_at":"2025-04-02T12:31:07.452Z","avatar_url":"https://github.com/tonghe90.png","language":"Jupyter Notebook","funding_links":[],"categories":["Text detection and localization"],"sub_categories":["Form Segmentation"],"readme":"# An End-to-End TextSpotter with Explicit Alignment and Attention\n\nThis work was initially described in our [CVPR 2018 paper](https://arxiv.org/abs/1803.03474).\n\n\u003cimg src='imgs/screenshot.png' height=\"350px\"\u003e\n\n\n## Getting Started\n### Installation\n- Clone the code\n```bash\ngit clone https://github.com/tonghe90/textspotter\ncd textspotter\n```\n\n- Install Caffe. You can follow [this tutorial](http://caffe.berkeleyvision.org/installation.html). \nIf you hit a build problem involving std::allocator, please refer to [issue #3](https://github.com/tonghe90/textspotter/issues/3)\n```bash\n# make sure you set WITH_PYTHON_LAYER := 1\n# change Makefile.config according to your library path\ncp Makefile.config.example Makefile.config\nmake clean\nmake -j8\nmake pycaffe\n```\n### Training\n```\nWe provide part of the training code, but you cannot run it directly. \nWe have added comments in [train.pt](https://github.com/tonghe90/textspotter/models/train.pt).\nYou have to write your own IOUloss layer. We cannot publish it for IP reasons. 
\nPlease note these lines: \n[L6902](https://github.com/tonghe90/textspotter/models/train.pt#L6902) \n[L6947](https://github.com/tonghe90/textspotter/models/train.pt#L6907)\n```\n\n### Testing\n- Install editdistance and pyclipper: `pip install editdistance` and `pip install pyclipper`\n\n- After Caffe is set up, you need to download a trained model (about 40M) from [Google Drive](https://drive.google.com/open?id=1lzM-V1Ec8KHr8fKxeO_d1x3zFaj3bmnU). This model\n  was trained on [VGG800k](http://www.robots.ox.ac.uk/~vgg/data/scenetext/) and fine-tuned on [ICDAR2015](http://rrc.cvc.uab.es/?ch=4\u0026com=introduction).\n- Run `python test.py --img=./imgs/img_105.jpg`\n\n- Hyperparameters:\n\n```\ncfg.py --mean_val ==\u003e mean value used during testing.\n       --max_len ==\u003e maximum length of the text string (here we take 25, meaning a word can contain 25 characters at most.)\n       --recog_th ==\u003e the threshold used during the recognition process. The score for a word is the mean of its character scores.\n       --word_score ==\u003e the threshold for words that contain numbers or symbols, since they are not in the dictionary.\n\ntest.py --weight ==\u003e weights file of the caffemodel\n        --prototxt-iou ==\u003e the prototxt file for detection.\n        --prototxt-lstm ==\u003e the prototxt file for recognition.\n        --img ==\u003e the folder or image file for testing. Supported formats can be added in the is_image function in ./pylayer/tool.\n        --scales-ms ==\u003e multi-scale inputs used during testing.\n        --thresholds-ms ==\u003e corresponding text-region thresholds for the multi-scale inputs.\n        --nms ==\u003e NMS threshold for testing\n        --save-dir ==\u003e the directory for saving results in the ICDAR2015 submission format.\n```\n\n```\nNote: the recognition results are obtained by comparing the direct output with the words in a dictionary of about 90K entries. \nThese entries do not contain any numbers or symbols. 
You can delete the dictionary reference part and output the recognition results directly.\n```\n\n## Citation\nIf you use this code for your research, please cite our paper.\n```\n@inproceedings{tong2018,\n  title={An End-to-End TextSpotter with Explicit Alignment and Attention},\n  author={T. He and Z. Tian and W. Huang and C. Shen and Y. Qiao and C. Sun},\n  booktitle={Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},\n  year={2018}\n}\n\n```\n## License\n\nThis code is for NON-COMMERCIAL purposes only. For commercial purposes, please contact Chunhua Shen \u003cchhshen@gmail.com\u003e.\nThis program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3. Please refer to \u003chttp://www.gnu.org/licenses/\u003e for more details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftonghe90%2Ftextspotter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftonghe90%2Ftextspotter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftonghe90%2Ftextspotter/lists"}