https://github.com/whitelok/image-text-localization-recognition

A general list of resources for image text localization and recognition: a collection of papers and implementations for scene text localization and recognition.
Host: GitHub
URL: https://github.com/whitelok/image-text-localization-recognition
Owner: whitelok
Created: 2017-02-09T07:41:41.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2023-09-17T13:41:49.000Z (over 1 year ago)
Last Synced: 2024-08-01T04:02:07.826Z (9 months ago)
Topics: awesome, convolutional-neural-networks, deep-learning, deep-learning-algorithms, machine-learning, ocr, scene-texts, text-detection, text-extraction, text-recognition
Homepage:
Size: 333 KB
Stars: 937
Watchers: 76
Forks: 237
Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project

awesomeai - Scene Text Localization & Recognition Resources
awesome-ai-awesomeness - Scene Text Localization & Recognition Resources
README

# Scene Text Localization & Recognition Resources

*Read this institute-wise: [English](README.md), [简体中文](README.zh-cn.md).*

*Read this year-wise: [English](README.yearwise.md), [简体中文](README.zh-cn.yearwise.md).*

*Tags: [STL] (Scene Text Localization), [TR] (Text Recognition)*

*[STL] (Scene Text Localization): detect text regions in an input scene image.*

*[TR] (Text Recognition): recognize the text content.*
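
The entries in this list follow a regular pattern: a `[YEAR-VENUE]` prefix, zero or more tags, a title, and one or more labelled links. As a minimal sketch, assuming that pattern holds for the line being read, an entry could be parsed and filtered by tag as below. The names `parse_entry`, `ENTRY_RE`, and `LINK_RE` are hypothetical helpers for illustration and are not part of this repository.

```python
import re

# Hypothetical patterns (assumptions, not part of the repository):
# "- [YEAR-VENUE][TAG][TAG] Title [`label`](url) [`label`](url)"
ENTRY_RE = re.compile(r"^- \[(\d{4})-([^\]]+)\]((?:\[[A-Z]+\])*)\s*(.*)$")
LINK_RE = re.compile(r"\[`([^`]+)`\]\(([^)\s]+)\)")


def parse_entry(line):
    """Parse one list entry; return None if the line is not an entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    year, venue, tag_block, rest = match.groups()
    return {
        "year": int(year),
        "venue": venue,
        # Tags such as STL or TR; survey entries may carry none.
        "tags": re.findall(r"\[([A-Z]+)\]", tag_block),
        # The title is whatever remains once the labelled links are removed.
        "title": LINK_RE.sub("", rest).strip(),
        # List of (label, url) pairs, e.g. ("paper", "https://...").
        "links": LINK_RE.findall(rest),
    }


sample = (
    "- [2017-CVPR][STL] EAST: An Efficient and Accurate Scene Text Detector "
    "[`paper`](https://arxiv.org/abs/1704.03155) "
    "[`code`](https://github.com/argman/EAST)"
)
entry = parse_entry(sample)

# Keep the entry only if it is tagged as a localization paper.
if entry is not None and "STL" in entry["tags"]:
    print(entry["year"], entry["venue"], entry["title"])
```

Entries that are split across lines or have malformed links would not match this sketch and would need to be normalized first.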

**Last update: Sep. 17, 2023**

## 1. Papers & Code

#### Overview

- [2020-arxiv] Text Detection and Recognition in the Wild: A Review [`paper`](https://arxiv.org/pdf/2006.04305.pdf)

- [2020-arxiv] Text Recognition in the Wild: A Survey [`paper`](https://arxiv.org/pdf/2005.03492.pdf)

- [2020-IJCV] Scene Text Detection and Recognition: The Deep Learning Era [`paper`](https://arxiv.org/pdf/1811.04256.pdf)

- [2019-ICCV] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Baek_What_Is_Wrong_With_Scene_Text_Recognition_Model_Comparisons_Dataset_ICCV_2019_paper.html) [`code`](https://github.com/clovaai/deep-text-recognition-benchmark)

- [2016-TIP] Text Detection Tracking and Recognition in Video: A Comprehensive Survey [`paper`](http://ieeexplore.ieee.org/application/enterprise/entconfirmation.jsp?arnumber=7452620&icp=false)

- [2015-PAMI] Text Detection and Recognition in Imagery: A Survey [`paper`](http://lampsrv02.umiacs.umd.edu/pubs/Papers/qixiangye-14/qixiangye-14.pdf)

- [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends [`paper`](http://mc.eistar.net/uploadfiles/Papers/FCS_TextSurvey_2015.pdf)

#### University of Oxford

- [2020-ECCV][STL][TR] Adaptive Text Recognition through Visual Matching [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/2492_ECCV_2020_paper.php) [`code`](https://github.com/Chuhanxx/FontAdaptor)

- [2018-BMVC][TR] Inductive Visual Localisation: Factorised Training for Superior Generalisation [`paper`](https://arxiv.org/abs/1807.08179)

- [2016-IJCV][STL][TR] Reading Text in the Wild with Convolutional Neural Networks [`paper`](http://arxiv.org/abs/1412.1842) [`demo`](http://zeus.robots.ox.ac.uk/textsearch/#/search/) [`homepage`](http://www.robots.ox.ac.uk/~vgg/research/text/)

- [2016-CVPR][STL] Synthetic Data for Text Localisation in Natural Images [`paper`](http://www.robots.ox.ac.uk/~vgg/data/scenetext/gupta16.pdf) [`code`](https://github.com/ankush-me/SynthText) [`data`](http://www.robots.ox.ac.uk/~vgg/data/scenetext/)

- [2015-ICLR][TR] Deep structured output learning for unconstrained text recognition [`paper`](http://arxiv.org/abs/1412.5903)

- [2015-PhD Thesis][STL] Deep Learning for Text Spotting [`paper`](http://www.robots.ox.ac.uk/~vgg/publications/2015/Jaderberg15b/jaderberg15b.pdf) [`code`](https://bitbucket.org/jaderberg/eccv2014_textspotting)

- [2014-ECCV][STL] Deep Features for Text Spotting [`paper`](http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14/jaderberg14.pdf) [`code`](https://bitbucket.org/jaderberg/eccv2014_textspotting) [`model`](https://bitbucket.org/jaderberg/eccv2014_textspotting)

- [2014-NIPS][TR] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [`paper`](http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/jaderberg14c.pdf)  [`homepage`](http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/) [`model`](http://www.robots.ox.ac.uk/~vgg/research/text/model_release.tar.gz)

#### Shenzhen Institutes of Advanced Technology

- [2018-arxiv][STL][TR] FOTS: Fast Oriented Text Spotting with a Unified Network [`paper`](https://arxiv.org/abs/1801.01671)

- [2016-ECCV][STL] CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network [`paper`](https://arxiv.org/abs/1609.03605) [`code`](https://github.com/tianzhi0549/CTPN)

- [2016-CVPR][STL] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [`paper`](http://arxiv.org/abs/1603.09423)

- [2016-AAAI][STL] Reading Scene Text in Deep Convolutional Sequences [`paper`](http://whuang.org/papers/phe2016_aaai.pdf)

- [2016-TIP][STL] Text-Attentional Convolutional Neural Network for Scene Text Detection [`paper`](http://whuang.org/papers/the2016_tip.pdf) [`paper`](https://arxiv.org/pdf/1510.03283.pdf)

- [2014-ECCV][STL] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [`paper`](http://www.whuang.org/papers/whuang2014_eccv.pdf)

#### South China University of Technology

- [2021-IJCV][STL] Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection [`paper`](https://arxiv.org/pdf/1912.09629.pdf) [`code`](https://github.com/Yuliang-Liu/Box_Discretization_Network)

- [2021-CVPR][STL] Fourier Contour Embedding for Arbitrary-Shaped Text Detection [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Zhu_Fourier_Contour_Embedding_for_Arbitrary-Shaped_Text_Detection_CVPR_2021_paper.pdf)

- [2021-CVPR][TR][STL] Implicit Feature Alignment: Learn To Convert Text Recognizer to Text Spotter [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_Implicit_Feature_Alignment_Learn_To_Convert_Text_Recognizer_to_Text_CVPR_2021_paper.pdf) [`code`](https://github.com/Wang-Tianwei/Implicit-feature-alignment)

- [2020-CVPR][TR] Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Luo_Learn_to_Augment_Joint_Data_Augmentation_and_Network_Optimization_for_CVPR_2020_paper.html) [`code`](https://github.com/Canjie-Luo/Text-Image-Augmentation)

- [2020-AAAI][STL][TR] Decoupled Attention Network for Text Recognition [`paper`](https://arxiv.org/pdf/1912.10205.pdf)

- [2020-CVPR][STL][TR] ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network [`paper`](https://arxiv.org/pdf/2002.10200.pdf) [`code`](https://github.com/Yuliang-Liu/bezier_curve_text_spotting)

- [2020-IJCV][TR] Separating Content from Style Using Adversarial Learning for Recognizing Text in the Wild [`paper`](https://arxiv.org/pdf/2001.04189.pdf)

- [2019-Pattern Recognition][TR] A Multi-Object Rectified Attention Network for Scene Text Recognition [`paper`](https://arxiv.org/pdf/1901.03003.pdf) [`code`](https://github.com/Canjie-Luo/MORAN_v2)

- [2019-CVPR][TR] Aggregation Cross-Entropy for Sequence Recognition [`paper`](https://arxiv.org/pdf/1904.08364.pdf) [`code`](https://github.com/summerlvsong/Aggregation-Cross-Entropy)

- [2019-arxiv][STL] Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene Text Detection [`paper`](https://arxiv.org/pdf/1912.09629.pdf) [`code`](https://github.com/Yuliang-Liu/Box_Discretization_Network) [`code`](https://git.io/TextDet)

- [2019-CVPR][STL] Tightness-Aware Evaluation Protocol for Scene Text Detection [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Liu_Tightness-Aware_Evaluation_Protocol_for_Scene_Text_Detection_CVPR_2019_paper.html)

- [2018-AAAI][STL] Feature Enhancement Network: A Refined Scene Text Detector [`paper`](https://arxiv.org/pdf/1711.04249.pdf)

- [2017-arXiv][STL] Detecting Curve Text in the Wild: New Dataset and New Solution [`paper`](https://arxiv.org/pdf/1712.02170)

- [2020-arxiv][TR] Adaptive Embedding Gate for Attention-Based Scene Text Recognition [`paper`](https://arxiv.org/pdf/1908.09475.pdf)

- [2017-PAMI][TR] Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition [`paper`](http://discovery.ucl.ac.uk/1569458/1/TPAMI-2016-08-0656-R2.pdf)

- [2017-CVPR][STL] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [`paper`](https://arxiv.org/abs/1703.01425)

- [2016-arXiv][STL] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [`paper`](http://arxiv.org/abs/1605.07314)

- [2016-IEEE Transactions on Multimedia][STL] A Convolutional Neural Network Based Chinese Text Detection Algorithm Via Text Structure Modeling [`paper`](http://www2.egr.uh.edu/~zhan2/ECE6111_spring2017/A%20Convolutional%20Neural%20Network%20%20Based%20Chinese%20Text%20Detection%20Algorithm%20Via%20Text%20Structure%20Modeling.pdf)

#### Fudan University

- [2022-AAAI][TR] Text Gestalt: Stroke-Aware Scene Text Image Super-resolution [`paper`](https://ojs.aaai.org/index.php/AAAI/article/view/19904) [`code`](https://github.com/FudanVI/FudanOCR)

- [2023-ICCV][TR] Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [`paper`](https://arxiv.org/abs/2309.01083) [`code`](https://github.com/FudanVI/FudanOCR)

- [2023-arxiv][STL][TR] Weakly-Supervised Text Instance Segmentation [`paper`](https://arxiv.org/abs/2303.10848) [`code`](https://github.com/FudanVI/FudanOCR)

- [2023-IJCAI][TR] Orientation-Independent Chinese Text Recognition in Scene Images [`paper`](https://www.ijcai.org/proceedings/2023/0185.pdf)

- [2023-IJCAI][TR] TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition [`paper`](https://www.ijcai.org/proceedings/2023/0197.pdf) [`code`](https://github.com/simplify23/TPS_PP)

- [2023-IJCAI][STL][TR] Towards Accurate Video Text Spotting with Text-wise Semantic Reasoning [`paper`](https://www.ijcai.org/proceedings/2023/0206.pdf) [`code`](https://github.com/FudanVI/FudanOCR)

- [2022-MM][TR] Chinese Character Recognition with Augmented Character Profile Matching [`paper`](https://dl.acm.org/doi/abs/10.1145/3503161.3547827) [`code`](https://github.com/FudanVI/FudanOCR)

- [2022-WACV][TR] Robustly Recognizing Irregular Scene Text by Rectifying Principle Irregularities [`paper`](https://openaccess.thecvf.com/content/WACV2022/papers/Xu_Robustly_Recognizing_Irregular_Scene_Text_by_Rectifying_Principle_Irregularities_WACV_2022_paper.pdf)

- [2021-IJCAI][TR] Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition [`paper`](https://www.ijcai.org/proceedings/2021/0085.pdf) [`code`](https://github.com/FudanVI/FudanOCR)

- [2022-IJCAI][TR] C3-STISR: Scene Text Image Super-resolution with Triple Clues [`paper`](https://www.ijcai.org/proceedings/2022/0238.pdf) [`code`](https://github.com/zhaominyiz/C3-STISR)

- [2021-CVPR][TR] Scene Text Telescope: Text-Focused Scene Image Super-Resolution [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Chen_Scene_Text_Telescope_Text-Focused_Scene_Image_Super-Resolution_CVPR_2021_paper.pdf)

- [2020-arxiv][TR] Text Recognition in Real Scenarios with a Few Labeled Samples [`paper`](https://arxiv.org/pdf/2006.12209.pdf)

- [2018-CVPR][TR] Edit Probability for Scene Text Recognition [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/Bai_Edit_Probability_for_CVPR_2018_paper.pdf)

- [2017-arXiv][STL] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [`paper`](https://arxiv.org/abs/1703.01086) [`code`](https://github.com/mjq11302010044/RRPN)

#### Huazhong University of Science and Technology

- [2021-CVPR][STL][TR] Scene Text Retrieval via Joint Text Detection and Similarity Learning [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_Scene_Text_Retrieval_via_Joint_Text_Detection_and_Similarity_Learning_CVPR_2021_paper.pdf) [`code`](https://github.com/lanfeng4659/STR-TDSL)

- [2021-CVPR][STL] MOST: A Multi-Oriented Scene Text Detector With Localization Refinement [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/He_MOST_A_Multi-Oriented_Scene_Text_Detector_With_Localization_Refinement_CVPR_2021_paper.pdf)

- [2020-ECCV][TR] AutoSTR: Efficient Backbone Search for Scene Text Recognition [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/4796_ECCV_2020_paper.php)

- [2020-AAAI][STL][TR] All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting [`paper`](https://arxiv.org/pdf/1911.09550.pdf)

- [2020-AAAI][STL] Real-time Scene Text Detection with Differentiable Binarization [`paper`](https://arxiv.org/pdf/1911.08947.pdf) [`code`](https://github.com/MhLiao/DB)

- [2020-ECCV][STL][TR] Mask TextSpotter V3: Segmentation Proposal Network for Robust Scene Text Spotting [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/1436_ECCV_2020_paper.php) [`code`](https://github.com/MhLiao/MaskTextSpotterV3)

- [2019-PAMI][TR] ASTER: An Attentional Scene Text Recognizer with Flexible Rectification [`paper`](https://ieeexplore.ieee.org/document/8395027) [`code`](https://github.com/ayumiymk/aster.pytorch)

- [2019-AAAI][TR] Scene Text Recognition from Two-Dimensional Perspective [`paper`](https://arxiv.org/pdf/1809.06508.pdf)

- [2019-PAMI][STL] Gliding vertex on the horizontal bounding box for multi-oriented object detection [`paper`](https://arxiv.org/pdf/1911.09358.pdf) [`code`](https://github.com/MingtaoFu/gliding_vertex)

- [2019-ICCV][TR] Symmetry-Constrained Rectification Network for Scene Text Recognition [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Yang_Symmetry-Constrained_Rectification_Network_for_Scene_Text_Recognition_ICCV_2019_paper.html)

- [2018-arxiv][STL] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [`paper`](https://arxiv.org/pdf/1812.01393.pdf) [`code`](https://github.com/YukangWang/TextField)

- [2018-ECCV][TR][STL] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Pengyuan_Lyu_Mask_TextSpotter_An_ECCV_2018_paper.pdf)

- [2018-ICIP][STL] Feature Fusion Network for Scene Text Detection [`paper`](https://ieeexplore.ieee.org/document/8395194/)

- [2018-CVPR][STL] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/Lyu_Multi-Oriented_Scene_Text_CVPR_2018_paper.pdf)

- [2018-CVPR][STL] Rotation-sensitive Regression for Oriented Scene Text Detection [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/Liao_Rotation-Sensitive_Regression_for_CVPR_2018_paper.pdf)

- [2018-TIP][STL] TextBoxes++: A Single-Shot Oriented Scene Text Detector [`paper`](https://arxiv.org/abs/1801.02765) [`code`](https://github.com/MhLiao/TextBoxes_plusplus)

- [2017-AAAI][STL] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [`paper`](https://arxiv.org/abs/1611.06779) [`code`](https://github.com/MhLiao/TextBoxes)

- [2017-CVPR][STL] Detecting Oriented Text in Natural Images by Linking Segments [`paper`](http://mclab.eic.hust.edu.cn/UpLoadFiles/Papers/SegLink_CVPR17.pdf) [`code`](https://github.com/bgshih/seglink)

- [2016-CVPR][TR] Robust scene text recognition with automatic rectification [`paper`](http://arxiv.org/pdf/1603.03915v2.pdf)

- [2016-arXiv][STL] Scene Text Detection via Holistic, Multi-Channel Prediction [`paper`](https://arxiv.org/abs/1606.09002)

- [2016-CVPR][STL] Multi-oriented text detection with fully convolutional networks [`paper`](http://mclab.eic.hust.edu.cn/UpLoadFiles/Papers/TextDectionFCN_CVPR16.pdf)

- [2015-PAMI][TR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [`paper`](http://arxiv.org/pdf/1507.05717v1.pdf) [`code`](http://mclab.eic.hust.edu.cn/~xbai/CRNN/crnn_code.zip) [`code`](https://github.com/bgshih/crnn)

- [2014-CVPR][TR] Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition [`paper`](https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Yao_Strokelets_A_Learned_2014_CVPR_paper.pdf)

#### Universitat Autònoma de Barcelona

- [2019-ICCV][STL][TR] Scene Text Visual Question Answering [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Biten_Scene_Text_Visual_Question_Answering_ICCV_2019_paper.html)

- [2018-ECCV][STL] Single Shot Scene Text Retrieval [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Lluis_Gomez_Single_Shot_Scene_ECCV_2018_paper.pdf)

- [2017-arXiv][STL] Improving Text Proposal for Scene Images with Fully Convolutional Networks [`paper`](https://arxiv.org/abs/1702.05089)

- [2016-arXiv][STL] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [`paper`](https://arxiv.org/pdf/1604.02619.pdf) [`code`](https://github.com/lluisgomez/TextProposals)

- [2015-ICDAR][STL] Object Proposals for Text Extraction in the Wild [`paper`](http://arxiv.org/abs/1509.02317) [`code`](https://github.com/lluisgomez/TextProposals)

- [2014-PAMI][TR] Word Spotting and Recognition with Embedded Attributes [`paper`](http://www.cvc.uab.es/~afornes/publi/journals/2014_PAMI_Almazan.pdf) [`homepage`](http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html) [`code`](https://github.com/almazan/watts)

#### Stanford University

- [2012-ICPR][TR] End-to-End Text Recognition with Convolutional Neural Networks [`paper`](http://www.cs.stanford.edu/~acoates/papers/wangwucoatesng_icpr2012.pdf) [`code`](http://cs.stanford.edu/people/twangcat/ICPR2012_code/SceneTextCNN_demo.tar) [`SVHN Dataset`](http://ufldl.stanford.edu/housenumbers/)

- [2012-PhD Thesis][TR] End-to-End Text Recognition with Convolutional Neural Networks [`paper`](http://cs.stanford.edu/people/dwu4/HonorThesis.pdf)

#### Seoul National University

- [2017-AAAI][STL][TR] Detection and Recognition of Text Embedded in Online Images via Neural Context Models [`paper`](https://github.com/cmkang/CTSN/blob/master/aaai2017_cameraready.pdf)

#### Megvii Technology Inc: Face++

- [2020-CVPR][TR] On Vocabulary Reliance in Scene Text Recognition [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Wan_On_Vocabulary_Reliance_in_Scene_Text_Recognition_CVPR_2020_paper.html)

- [2020-AAAI][STL][TR] TextScanner: Reading Characters in Order for Robust Scene Text Recognition [`paper`](https://arxiv.org/pdf/1912.12422.pdf)

- [2017-CVPR][STL] EAST: An Efficient and Accurate Scene Text Detector [`paper`](https://arxiv.org/abs/1704.03155) [`code`](https://github.com/argman/EAST) [`code with improvement`](https://github.com/huoyijie/AdvancedEAST)

#### Institute of Automation, Chinese Academy of Sciences

- [2020-IJCV][STL][TR] Residual Dual Scale Scene Text Spotting by Fusing Bottom-Up and Top-Down Processing [`paper`](https://link.springer.com/article/10.1007/s11263-020-01388-x)

- [2019-CVPR][TR] Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition [`paper`](https://ieeexplore.ieee.org/abstract/document/8953495)

- [2019-ICCV][STL][TR] TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Feng_TextDragon_An_End-to-End_Framework_for_Arbitrary_Shaped_Text_Spotting_ICCV_2019_paper.html)

- [2018-arxiv][TR] NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition [`paper`](https://arxiv.org/pdf/1806.00926.pdf) [`code`](https://github.com/Belval/NRTR)

- [2018-arxiv][TR] SCAN: Sliding Convolutional Attention Network for Scene Text Recognition [`paper`](https://arxiv.org/pdf/1806.00578.pdf) [`code`](https://github.com/nameful/SCAN)

- [2018-arxiv][TR] Recurrent Calibration Network for Irregular Text Recognition [`paper`](https://arxiv.org/pdf/1812.07145.pdf)  

- [2017-arxiv][TR] Scene Text Recognition with Sliding Convolutional Character Models [`paper`](https://arxiv.org/pdf/1709.01727.pdf) [`code`](https://github.com/lsvih/Sliding-Convolution)

- [2017-arXiv][STL] Deep Direct Regression for Multi-Oriented Scene Text Detection [`paper`](https://arxiv.org/abs/1703.08289)

- [2017-IAPR][STL] Scene Text Detection with Novel Superpixel Based Character Candidate Extraction [`paper`](https://ieeexplore.ieee.org/abstract/document/8270087)

#### University of California, San Diego

- [2016-CVPR][TR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [`paper`](http://arxiv.org/pdf/1603.03101v1.pdf)

#### University of California, Santa Cruz

- [2017-arXiv][STL] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [`paper`](https://arxiv.org/abs/1704.00834)

#### Cornell University

- [2016-arXiv][STL][TR] COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images [`paper`](http://vision.cornell.edu/se3/wp-content/uploads/2016/01/1601.07140v1.pdf)

#### Pennsylvania State University

- [2017-WACV][STL] TextContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade [`paper`](https://arxiv.org/pdf/1809.03050.pdf)

- [2016-PhD Thesis][STL] Context Modeling for Semantic Text Matching and Scene Text Detection [`paper`](https://etda.libraries.psu.edu/catalog/zw12z528p)

#### University of Science and Technology Beijing

- [2021-ICCV][STL] Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection [`paper`](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Adaptive_Boundary_Proposal_Network_for_Arbitrary_Shape_Text_Detection_ICCV_2021_paper.pdf) [`code`](https://github.com/GXYM/TextBPN)

- [2020-CVPR][STL] Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Zhang_Deep_Relational_Reasoning_Graph_Network_for_Arbitrary_Shape_Text_Detection_CVPR_2020_paper.html)

- [2017-arxiv][TR] AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition [`paper`](https://arxiv.org/pdf/1710.03425.pdf)

- [2016-IJCAI][STL] Scene Text Detection in Video by Learning Locally and Globally [`paper`](https://www.ijcai.org/Proceedings/16/Papers/376.pdf)

- [2014-PAMI][STL] Robust Text Detection in Natural Scene Images [`paper`](http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6613482)

#### Pohang University of Science and Technology

- [2016-CVPR][STL] CannyText Detector: Fast and Robust Scene Text Localization Algorithm [`paper`](http://ieeexplore.ieee.org/document/7780757/)

#### École d'Ingénieurs en Informatique

- [2016-IJDAR][STL] TextCatcher: a method to detect curved and challenging text in natural scenes [`paper`](https://link.springer.com/article/10.1007/s10032-016-0264-4)

#### Czech Technical University in Prague (České vysoké učení technické v Praze)

- [2018-ACCV][STL][TR] E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text [`paper`](https://arxiv.org/pdf/1801.09919.pdf) [`code`](https://github.com/MichalBusta/E2E-MLT)

- [2017-ICCV][STL][TR] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/Busta_Deep_TextSpotter_An_ICCV_2017_paper.pdf) [`code`](https://github.com/MichalBusta/DeepTextSpotter)

- [2015-PAMI][STL][TR] Real-time Lexicon-free Scene Text Localization and Recognition [`paper`](http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7313008)

- [2015-ICCV][STL] FASText: Efficient unconstrained scene text detector [`paper`](https://pdfs.semanticscholar.org/2131/106318d4674bc9260e671c9f427bfc3f1029.pdf) [`code`](https://github.com/MichalBusta/FASText)

- [2012-CVPR][STL][TR] Real-time scene text localization and recognition [`paper`](http://cmp.felk.cvut.cz/~matas/papers/neumann-2012-rt_text-cvpr.pdf) [`code`](http://docs.opencv.org/3.0-beta/modules/text/doc/erfilter.html)

#### Google Inc

- [2019-ICCV][STL] Towards Unconstrained End-to-End Text Spotting [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Qin_Towards_Unconstrained_End-to-End_Text_Spotting_ICCV_2019_paper.html)

- [2013-ICCV][STL][TR] Photo OCR: Reading Text in Uncontrolled Conditions [`paper`](https://ai2-s2-pdfs.s3.amazonaws.com/31a8/803d7e2618bfa44c472d003055bb5961b9de.pdf)

#### Microsoft Inc

- [2010-CVPR][STL] SWT: Detecting Text in Natural Scenes with Stroke Width Transform [`paper`](http://www.math.tau.ac.il/~turkel/imagepapers/text_detection.pdf) [`code`](https://github.com/aperrau/DetectText)

#### Samsung R&D Institute China

- [2019-CVPR][STL] Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Arbitrary_Shape_Scene_Text_Detection_With_Adaptive_Text_Region_Representation_CVPR_2019_paper.html)

- [2017-arXiv][STL] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [`paper`](https://arxiv.org/ftp/arxiv/papers/1706/1706.09579.pdf)

- [2017-IAPR][STL] Deep Residual Text Detection Network for Scene Text [`paper`](https://ieeexplore.ieee.org/document/8270068)

#### Vicarious FPC Inc

- [2016-NIPS][TR] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [`paper`](https://arxiv.org/abs/1611.02788)

#### Chinese State Key Laboratory of Management and Control for Complex Systems

- [2013-CVPR][TR] Scene Text Recognition using Part-based Tree-structured Character Detection [`paper`](http://www.cv-foundation.org/openaccess/content_cvpr_2013/papers/Shi_Scene_Text_Recognition_2013_CVPR_paper.pdf)

#### Visual Computing Department, Institute for Infocomm Research

- [2017-ICCV][STL] WeText: Scene Text Detection under Weak Supervision [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/Tian_WeText_Scene_Text_ICCV_2017_paper.pdf)

#### University of Florida

- [2017-ICCV][STL] Single Shot Text Detector with Regional Attention [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/He_Single_Shot_Text_ICCV_2017_paper.pdf) [`code`](https://github.com/BestSonny/SSTD)

#### University of Southern California

- [2017-ICCV][STL] Self-organized Text Detection with Minimal Post-processing via Border Learning [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/Wu_Self-Organized_Text_Detection_ICCV_2017_paper.pdf)

#### Hikvision Research Institute

- [2021-AAAI][STL][TR] MANGO: A Mask Attention Guided One-Stage Scene Text Spotter [`paper`](https://arxiv.org/pdf/2012.04350.pdf)

- [2020-AAAI][STL][TR] Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting [`paper`](https://arxiv.org/pdf/2002.06820.pdf)

- [2018-CVPR][TR] AON: Towards Arbitrarily-Oriented Text Recognition [`paper`](https://arxiv.org/pdf/1711.04226.pdf) [`code`](https://github.com/huizhang0110/AON)

- [2017-ICCV][TR] Focusing Attention: Towards Accurate Text Recognition in Natural Images [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/Cheng_Focusing_Attention_Towards_ICCV_2017_paper.pdf)

#### University of Adelaide

- [2019-AAAI][TR] Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition [`paper`](https://arxiv.org/pdf/1811.00751.pdf) [`code`](https://github.com/Pay20Y/SAR_TF)

- [2017-ICCV][STL][TR] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks [`paper`](http://openaccess.thecvf.com/content_ICCV_2017/papers/Li_Towards_End-To-End_Text_ICCV_2017_paper.pdf)

#### City University of New York

- [2017-CVPR][STL] Unambiguous Text Localization and Retrieval for Cluttered Scenes [`paper`](http://openaccess.thecvf.com/content_cvpr_2017/papers/Rong_Unambiguous_Text_Localization_CVPR_2017_paper.pdf)

#### The University of Hong Kong

- [2020-ECCV][STL][TR] AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/2183_ECCV_2020_paper.php)

- [2018-AAAI][TR] Char-Net: A Character-Aware Neural Network for Distorted Scene Text [`paper`](http://www.visionlab.cs.hku.hk/publications/wliu_aaai18.pdf)

#### Zhejiang University

- [2021-TIP][STL][TR] FREE: A Fast and Robust End-to-End Video Text Spotter [`paper`](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9266586)

- [2020-arxiv][TR] Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units [`paper`](https://arxiv.org/pdf/2002.11338.pdf)

- [2018-AAAI][STL] PixelLink: Detecting Scene Text via Instance Segmentation [`paper`](https://arxiv.org/pdf/1801.01315.pdf)

#### University of Potsdam

- [2018-AAAI][STL][TR] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [`paper`](https://arxiv.org/pdf/1712.05404.pdf) [`code`](https://github.com/Bartzi/see)

#### Arizona State University

- [2018-AAAI][TR] SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network [`paper`](https://pdfs.semanticscholar.org/9061/47e6eb8e963d9751dda18fb540ed7faeb9fb.pdf)

#### Stevens Institute of Technology

- [2018-CVPR][STL] Geometry-Aware Scene Text Detection with Instance Transformation Network [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Geometry-Aware_Scene_Text_CVPR_2018_paper.pdf)

#### Nanyang Technological University

- [2020-IJCV][STL] Bottom-Up Scene Text Detection with Markov Clustering Networks [`paper`](https://link.springer.com/article/10.1007/s11263-020-01298-y)

- [2020-AAAI][STL][TR] GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition [`paper`](https://arxiv.org/pdf/2002.01276.pdf)

- [2019-ICCV][STL][TR] GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Zhan_GA-DAN_Geometry-Aware_Domain_Adaptation_Network_for_Scene_Text_Detection_and_ICCV_2019_paper.html)

- [2019-CVPR][TR] ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Zhan_ESIR_End-To-End_Scene_Text_Recognition_via_Iterative_Image_Rectification_CVPR_2019_paper.html)

- [2019-CVPR][STL] Towards Robust Curve Text Detection With Conditional Spatial Expansion [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Liu_Towards_Robust_Curve_Text_Detection_With_Conditional_Spatial_Expansion_CVPR_2019_paper.html)

- [2018-ECCV][STL] Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Fangneng_Zhan_Verisimilar_Image_Synthesis_ECCV_2018_paper.pdf)

- [2018-ECCV][STL] Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Chuhui_Xue_Accurate_Scene_Text_ECCV_2018_paper.pdf)

- [2018-ECCV][STL] Using Object Information for Spotting Text [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Shitala_Prasad_Using_Object_Information_ECCV_2018_paper.pdf)

- [2018-CVPR][STL] Learning Markov Clustering Networks for Scene Text Detection [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Learning_Markov_Clustering_CVPR_2018_paper.pdf)

#### Alibaba Group 

- [2018-ICPR][STL][TR] A Novel Integrated Framework for Learning both Text Detection and Recognition [`paper`](https://arxiv.org/pdf/1811.08611.pdf)

- [2018-IJCAI][STL] IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection [`paper`](https://arxiv.org/pdf/1805.01167.pdf)

#### Chinese Academy of Sciences

- [2020-CVPR][STL][TR] Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Gao_Multi-Modal_Graph_Neural_Network_for_Joint_Reasoning_on_Vision_and_CVPR_2020_paper.html)

- [2018-ICIP][STL] Focal Text: An Accurate Text Detection With Focal Loss [`paper`](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8451241)

- [2018-ICIP][STL] Dense Chained Attention Network for Scene Text Recognition [`paper`](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8451273)

#### University of Cambridge

- [2018-ECCV][STL] Synthetically Supervised Feature Learning for Scene Text Recognition [`paper`](http://openaccess.thecvf.com/content_ECCV_2018/papers/Yang_Liu_Synthetically_Supervised_Feature_ECCV_2018_paper.pdf)

#### Peking University

- [2021-NIPS][STL] CentripetalText: An Efficient Text Instance Representation for Scene Text Detection [`paper`](https://arxiv.org/pdf/2107.05945.pdf) [`code`](https://github.com/shengtao96/CentripetalText)

- [2020-ICASSP][TR] A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling [`paper`](https://arxiv.org/pdf/2002.03509.pdf)

- [2020-ICASSP][STL] All you need is a second look: Towards Tighter Arbitrary shape text detection [`paper`](https://arxiv.org/pdf/2004.12436.pdf)

- [2019-WACV][STL] Mask R-CNN with Pyramid Attention Network for Scene Text Detection [`paper`](https://arxiv.org/pdf/1811.09058.pdf)

- [2018-ECCV][STL] TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes [`paper`](https://arxiv.org/pdf/1807.01544.pdf) [`code`](https://github.com/princewang1994/TextSnake.pytorch)

#### SenseTime Research

- [2021-WACV][STL] Disentangled Contour Learning for Quadrilateral Text Detection [`paper`](https://openaccess.thecvf.com/content/WACV2021/papers/Bi_Disentangled_Contour_Learning_for_Quadrilateral_Text_Detection_WACV_2021_paper.pdf) [`code`](https://github.com/SakuraRiven/DCLNet)

- [2020-ECCV][TR] RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/3160_ECCV_2020_paper.php)

- [2020-ECCV][TR] Scene Text Image Super-resolution in the wild [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/1186_ECCV_2020_paper.php)

- [2019-arxiv][STL] Pyramid Mask Text Detector [`paper`](https://arxiv.org/pdf/1903.11800.pdf)

- [2019-ICCV][STL] Geometry Normalization Networks for Accurate Scene Text Detection [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Xu_Geometry_Normalization_Networks_for_Accurate_Scene_Text_Detection_ICCV_2019_paper.html)

- [2018-BMVC][STL] Boosting up Scene Text Detectors with Guided CNN [`paper`](http://bmvc2018.org/contents/papers/0633.pdf)

#### Naver Clova AI Research

- [2020-ECCV][STL] Character Region Attention For Text Spotting [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/6775_ECCV_2020_paper.php)

- [2019-CVPR][STL][TR] Character Region Awareness for Text Detection [`paper`](https://arxiv.org/abs/1904.01941) [`code`](https://github.com/clovaai/CRAFT-pytorch)

#### Baidu

- [2020-arxiv][STL][TR] PP-OCR: A Practical Ultra Lightweight OCR System [`paper`](https://arxiv.org/pdf/2009.09941.pdf)

- [2019-ICCV][STL][TR] Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Sun_Chinese_Street_View_Text_Large-Scale_Chinese_Text_Reading_With_Partially_ICCV_2019_paper.html)

- [2019-CVPR][STL] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [`paper`](https://arxiv.org/abs/1904.06535)

- [2018-arxiv][STL] Detecting Text in the Wild with Deep Character Embedding Network [`paper`](https://arxiv.org/abs/1801.01671)

- [2018-ACCV][STL][TR] TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network [`paper`](https://arxiv.org/pdf/1812.09900.pdf)

#### University of Adelaide 

- [2018-CVPR][STL][TR] An End-to-End TextSpotter with Explicit Alignment and Attention [`paper`](http://openaccess.thecvf.com/content_cvpr_2018/papers/He_An_End-to-End_TextSpotter_CVPR_2018_paper.pdf) [`code`](https://github.com/tonghe90/textspotter)

#### Nanjing University

- [2020-BMVC][TR] Robust Scene Text Recognition Through Adaptive Image Enhancement [`paper`](https://www.bmvc2020-conference.com/assets/papers/0257.pdf)

- [2019-ICCV][STL] Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Efficient_and_Accurate_Arbitrary-Shaped_Text_Detection_With_Pixel_Aggregation_Network_ICCV_2019_paper.html) [`code`](https://github.com/WenmuZhou/PAN.pytorch)

- [2019-CVPR][STL] Shape Robust Text Detection With Progressive Scale Expansion Network [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Shape_Robust_Text_Detection_With_Progressive_Scale_Expansion_Network_CVPR_2019_paper.html) [`code`](https://github.com/whai362/PSENet)

#### The Chinese University of Hong Kong

- [2022-AAAI][TR] Context-based Contrastive Learning for Scene Text Recognition [`paper`](https://www.cse.cuhk.edu.hk/~byu/papers/C139-AAAI2022-ConCLR.pdf)

- [2019-CVPR][STL] Learning Shape-Aware Embedding for Scene Text Detection [`paper`](http://openaccess.thecvf.com/content_CVPR_2019/html/Tian_Learning_Shape-Aware_Embedding_for_Scene_Text_Detection_CVPR_2019_paper.html)

#### Malong Technologies

- [2019-ICCV][STL][TR] Convolutional Character Networks [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Xing_Convolutional_Character_Networks_ICCV_2019_paper.html) [`code`](https://github.com/MalongTech/research-charnet)

#### University of Rochester

- [2019-ICCV][TR] Large-Scale Tag-Based Font Retrieval With Generative Feature Learning [`paper`](http://openaccess.thecvf.com/content_ICCV_2019/html/Chen_Large-Scale_Tag-Based_Font_Retrieval_With_Generative_Feature_Learning_ICCV_2019_paper.html)

#### Facebook AI Research

- [2021-CVPR][STL][TR] TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Singh_TextOCR_Towards_Large-Scale_End-to-End_Reasoning_for_Arbitrary-Shaped_Scene_Text_CVPR_2021_paper.pdf) [`code`](https://textvqa.org/textocr/code)

- [2020-CVPR][STL][TR] Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA [`paper`](https://arxiv.org/pdf/1911.06258.pdf)

- [2018-arxiv][STL] Improving Rotated Text Detection with Rotation Region Proposal Networks [`paper`](https://arxiv.org/pdf/1811.07031.pdf) 

#### University of Maryland

- [2020-WACV][TR] Adapting Style and Content for Attended Text Sequence Recognition [`paper`](http://openaccess.thecvf.com/content_WACV_2020/papers/Schwarcz_Adapting_Style_and_Content_for_Attended_Text_Sequence_Recognition_WACV_2020_paper.pdf)

#### Penta-AI

- [2020-WACV][STL] It’s All About The Scale - Efficient Text Detection Using Adaptive Scaling [`paper`](http://openaccess.thecvf.com/content_WACV_2020/papers/Richardson_Its_All_About_The_Scale_-_Efficient_Text_Detection_Using_WACV_2020_paper.pdf)

#### Central China Normal University

- [2020-ECCV][STL][TR] PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/2318_ECCV_2020_paper.php)

#### Tencent

- [2022-AAAI][TR] Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition [`paper`](https://www.aaai.org/AAAI22Papers/AAAI-785.LiuH.pdf)

- [2020-arxiv][STL] PuzzleNet: Scene Text Detection by Segment Context Graph Learning [`paper`](https://arxiv.org/pdf/2002.11371.pdf)

- [2020-AAAI][STL][TR] Accurate Structured-Text Spotting for Arithmetical Exercise Correction [`paper`](https://www.researchgate.net/publication/341891992_Accurate_Structured-Text_Spotting_for_Arithmetical_Exercise_Correction)

- [2019-arxiv][TR] 2D Attentional Irregular Scene Text Recognizer [`paper`](https://arxiv.org/pdf/1906.05708.pdf) [`code`](https://github.com/chenjun2hao/Bert_OCR.pytorch)

#### Tsinghua University

- [2023-IJCAI][TR] Towards Robust Scene Text Image Super-resolution via Explicit Location Enhancement [`paper`](https://www.ijcai.org/proceedings/2023/0087.pdf) [`code`](https://github.com/csguoh/LEMMA)

- [2021-CVPR][STL] Primitive Representation Learning for Scene Text Recognition [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Yan_Primitive_Representation_Learning_for_Scene_Text_Recognition_CVPR_2021_paper.pdf)

- [2020-ECCV][STL] Sequential Deformation for Accurate Scene Text Detection [`paper`](http://www.ecva.net/papers/eccv_2020/papers_ECCV/html/6576_ECCV_2020_paper.php)

#### University of Science and Technology of China

- [2023-IJCAI][TR] Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition [`paper`](https://www.ijcai.org/proceedings/2023/0189.pdf) [`code`](https://github.com/CyrilSterling/LPV)

- [2021-ICCV][TR] From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network [`paper`](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_From_Two_to_One_A_New_Scene_Text_Recognizer_With_ICCV_2021_paper.pdf)

- [2021-CVPR][STL] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Fang_Read_Like_Humans_Autonomous_Bidirectional_and_Iterative_Language_Modeling_for_CVPR_2021_paper.pdf) [`code`](https://github.com/FangShancheng/ABINet)

- [2020-CVPR][STL] ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Wang_ContourNet_Taking_a_Further_Step_Toward_Accurate_Arbitrary-Shaped_Scene_Text_CVPR_2020_paper.html) [`code`](https://github.com/wangyuxin87/ContourNet)

- [2020-arxiv][TR] Focus-Enhanced Scene Text Recognition with Deformable Convolutions [`paper`](https://arxiv.org/pdf/1908.10998.pdf) [`code`](https://github.com/Alpaca07/dtr)

- [2018-Pattern Recognition][STL] TextMountain: Accurate Scene Text Detection via Instance Segmentation [`paper`](https://arxiv.org/pdf/1811.12786.pdf)

#### University of Electronic Science and Technology of China

- [2020-CVPR][TR] What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Xu_What_Machines_See_Is_Not_What_They_Get_Fooling_Scene_CVPR_2020_paper.html)

#### Indian Statistical Institute

- [2020-CVPR][STL][TR] STEFANN: Scene Text Editor Using Font Adaptive Neural Network [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Roy_STEFANN_Scene_Text_Editor_Using_Font_Adaptive_Neural_Network_CVPR_2020_paper.html)

#### Institute of Information Engineering, Chinese Academy of Sciences

- [2021-CVPR][STL] Progressive Contour Regression for Arbitrary-Shape Scene Text Detection [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Dai_Progressive_Contour_Regression_for_Arbitrary-Shape_Scene_Text_Detection_CVPR_2021_paper.pdf) [`code`](https://github.com/dpengwen/PCR)

- [2020-CVPR][TR] SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Qiao_SEED_Semantics_Enhanced_Encoder-Decoder_Framework_for_Scene_Text_Recognition_CVPR_2020_paper.html)

- [2020-ICPR][TR] Gaussian Constrained Attention Network for Scene Text Recognition [`paper`](https://arxiv.org/pdf/2010.09169.pdf)

- [2020-arxiv][STL] Self-Training for Domain Adaptive Scene Text Detection [`paper`](https://arxiv.org/pdf/2005.11487.pdf)

- [2019-ICDAR][STL] Curved Text Detection in Natural Scene Images with Semi- and Weakly-Supervised Learning [`paper`](https://arxiv.org/pdf/1908.09990.pdf)

- [2019-BMVC][TR] Text Recognition using local correlation [`paper`](https://bmvc2019.org/wp-content/uploads/papers/0469-paper.pdf)

#### University of Chinese Academy of Sciences

- [2020-CVPR][STL][TR] Towards Accurate Scene Text Recognition With Semantic Reasoning Networks [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Yu_Towards_Accurate_Scene_Text_Recognition_With_Semantic_Reasoning_Networks_CVPR_2020_paper.html)

#### Amazon

- [2020-CVPR][STL] SCATTER: Selective Context Attentional Scene Text Recognizer [`paper`](https://openaccess.thecvf.com/content_CVPR_2020/html/Litman_SCATTER_Selective_Context_Attentional_Scene_Text_Recognizer_CVPR_2020_paper.html)

#### Heritage Institute of Technology

- [2020-ICIP][STL] Scale-invariant Multi-oriented Text Detection in Wild Scene Images [`paper`](https://arxiv.org/pdf/2002.06423.pdf)

#### Indian Institute of Technology

- [2020-arxiv][STL] NENET: An Edge Learnable Network for Link Prediction in Scene Text [`paper`](https://arxiv.org/pdf/2005.12147.pdf)

#### Xidian University

- [2021-AAAI][STL][TR] PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network [`paper`](https://arxiv.org/pdf/2104.05458.pdf) [`code`](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/pgnet_en.md)

- [2020-ICASSP][STL] Efficient Scene Text Detection with Textual Attention Tower [`paper`](https://arxiv.org/pdf/2002.03741.pdf)

- [2019-ACM-MM][STL] A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning [`paper`](https://arxiv.org/pdf/1908.05498.pdf)

#### Tongji University

- [2019-AAAI][STL] Scene Text Detection with Supervised Pyramid Context Network [`paper`](https://arxiv.org/pdf/1811.08605.pdf) [`code`](https://github.com/AirBernard/Scene-Text-Detection-with-SPCNET)

#### Harbin Institute of Technology

- [2017-TIP][STL] Scene text detection and segmentation based on cascaded convolution neural networks [`paper`](https://ieeexplore.ieee.org/document/7828014)

#### Shanghai Jiao Tong University

- [2018-ICPR][STL] Fused Text Segmentation Networks for Multi-oriented Scene Text Detection [`paper`](https://arxiv.org/pdf/1709.03272.pdf)  

#### Ping An Property & Casualty Insurance

- [2020-arxiv][TR] Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition [`paper`](https://arxiv.org/pdf/2009.10874.pdf)

#### Hefei University of Technology

- [2020-arxiv][TR] Fast Dense Residual Network: Enhancing Global Dense Feature Flow for Text Recognition [`paper`](https://arxiv.org/pdf/2001.09021v1.pdf)

#### Beihang University

- [2020-arxiv][TR] A Feasible Framework for Arbitrary-Shaped Scene Text Recognition [`paper`](https://arxiv.org/pdf/1912.04561.pdf) [`code`](https://github.com/zhang0jhon/AttentionOCR)

#### Boston University

- [2020-arxiv][TR] Deep Neural Network for Semantic-based Text Recognition in Images [`paper`](https://arxiv.org/pdf/1908.01403.pdf)

#### Carnegie Mellon University

- [2019-ICDAR][TR] Rethinking Irregular Scene Text Recognition [`paper`](https://arxiv.org/pdf/1908.11834.pdf) [`code`](https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy)

#### Northwestern Polytechnical University

- [2019-CVPR][STL][TR] Towards End-to-End Text Spotting in Natural Scenes [`paper`](https://arxiv.org/pdf/1906.06013.pdf)

#### VinAI Research

- [2021-CVPR][STL] Dictionary-Guided Scene Text Recognition [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Nguyen_Dictionary-Guided_Scene_Text_Recognition_CVPR_2021_paper.pdf) [`code`](https://github.com/VinAIResearch/dict-guided)

#### University of Tokyo

- [2021-CVPR][TR] What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Baek_What_if_We_Only_Use_Real_Datasets_for_Scene_Text_CVPR_2021_paper.pdf) [`code`](https://github.com/ku21fan/STR-Fewer-Labels)

#### University of Surrey

- [2021-ICCV][TR] Towards the Unseen: Iterative Text Recognition by Distilling from Errors [`paper`](https://openaccess.thecvf.com/content/ICCV2021/papers/Bhunia_Towards_the_Unseen_Iterative_Text_Recognition_by_Distilling_From_Errors_ICCV_2021_paper.pdf)

- [2021-ICCV][TR] Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition [`paper`](https://openaccess.thecvf.com/content/ICCV2021/papers/Bhunia_Joint_Visual_Semantic_Reasoning_Multi-Stage_Decoder_for_Text_Recognition_ICCV_2021_paper.pdf)

- [2021-CVPR][TR] MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Bhunia_MetaHTR_Towards_Writer-Adaptive_Handwritten_Text_Recognition_CVPR_2021_paper.pdf)

#### The Technion – Israel Institute of Technology

- [2021-CVPR][TR] Sequence-to-Sequence Contrastive Learning for Text Recognition [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Aberdam_Sequence-to-Sequence_Contrastive_Learning_for_Text_Recognition_CVPR_2021_paper.pdf)

#### University of Illinois at Urbana-Champaign

- [2021-CVPR][TR] Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Xu_Rethinking_Text_Segmentation_A_Novel_Dataset_and_a_Text-Specific_Refinement_CVPR_2021_paper.pdf) [`code`](https://github.com/SHI-Labs/Rethinking-Text-Segmentation)

#### National Laboratory of Pattern Recognition

- [2021-CVPR][STL] Semantic-Aware Video Text Detection [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Feng_Semantic-Aware_Video_Text_Detection_CVPR_2021_paper.pdf)

#### Shenzhen University

- [2021-CVPR][STL][TR] Self-Attention Based Text Knowledge Mining for Text Detection [`paper`](https://openaccess.thecvf.com/content/CVPR2021/papers/Wan_Self-Attention_Based_Text_Knowledge_Mining_for_Text_Detection_CVPR_2021_paper.pdf) [`code`](https://github.com/CVI-SZU/STKM)

#### University of the Philippines

- [2021-ICDAR][TR] Vision Transformer for Fast and Efficient Scene Text Recognition [`paper`](https://arxiv.org/pdf/2105.08582.pdf) [`code`](https://github.com/roatienza/deep-text-recognition-benchmark)

#### Beijing Jiaotong University

- [2022-IJCAI][TR] SVTR: Scene Text Recognition with a Single Visual Model [`paper`](https://arxiv.org/pdf/2205.00159.pdf) [`code`](https://github.com/PaddlePaddle/PaddleOCR)

#### Wuhan University

- [2022-AAAI][TR] Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition [`paper`](https://arxiv.org/pdf/2112.12916.pdf) [`code`](https://github.com/adeline-cs/GTR)

#### Helsing AI

- [2022-WACV][TR] One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [`paper`](https://openaccess.thecvf.com/content/WACV2022/papers/Souibgui_One-Shot_Compositional_Data_Generation_for_Low_Resource_Handwritten_Text_Recognition_WACV_2022_paper.pdf)

#### Purdue University

- [2023-WACV][TR] Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition [`paper`](https://openaccess.thecvf.com/content/WACV2023/papers/Patel_Seq-UPS_Sequential_Uncertainty-Aware_Pseudo-Label_Selection_for_Semi-Supervised_Text_Recognition_WACV_2023_paper.pdf)

## 2. Datasets

#### [`SCUT-CTW1500`](https://github.com/Yuliang-Liu/Curve-Text-Detector) `2018`

Task: text location (with different styles) and recognition

[`download`](https://github.com/Yuliang-Liu/Curve-Text-Detector)

#### [`Total Text Dataset`](https://github.com/cs-chan/Total-Text-Dataset) `2017`

1,555 images covering three text orientations: horizontal, multi-oriented, and curved

Task: text location (with different styles) and recognition

[`download`](https://github.com/cs-chan/Total-Text-Dataset)

#### [`PowerPoint Text Detection and Recognition Dataset`](https://gitlab.com/rex-yue-wu/ISI-PPT-Dataset) `2017`

21,384 images, 21,384+ text instances

Task: text location and recognition

[`download`](https://gitlab.com/rex-yue-wu/ISI-PPT-Dataset)

#### [`COCO-Text (Computer Vision Group, Cornell)`](http://vision.cornell.edu/se3/coco-text/)   `2016`

63,686 images, 173,589 text instances, 3 fine-grained text attributes.

Task: text location and recognition

[`download`](https://github.com/andreasveit/coco-text)

#### [`Synthetic Word Dataset (Oxford, VGG)`](http://www.robots.ox.ac.uk/~vgg/data/text/)   `2014`

9 million images covering 90k English words

Task: text recognition, segmentation

[`download`](http://www.robots.ox.ac.uk/~vgg/data/text/mjsynth.tar.gz)

#### [`The Street View House Number Dataset (SVHN)`](http://ufldl.stanford.edu/housenumbers)   `2012`

Real-world street view house number images with position and class labels.

Task: number location detection, text recognition

[`download`](http://ufldl.stanford.edu/housenumbers)

#### [`IIIT 5K-Words`](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html)   `2012`

5,000 images from scene text and born-digital sources (2k training and 3k testing images)

Each image is a cropped word image of scene text with case-insensitive labels

Task: text recognition

[`download`](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz)
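
Because the labels are case-insensitive, recognition results on cropped-word datasets like this one are usually compared after case folding. A minimal sketch of such a word-accuracy check follows; the function name `word_accuracy` and the sample strings are hypothetical and not part of the dataset or any toolkit.

```python
def word_accuracy(predictions, labels):
    """Return the fraction of predictions that match their label, ignoring case."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # Compare lowercased strings so "Hotel" and "HOTEL" count as a match.
    correct = sum(p.lower() == g.lower() for p, g in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical recognizer outputs and ground-truth labels, for illustration only.
preds = ["Hotel", "EXIT", "cofee"]
truth = ["HOTEL", "exit", "COFFEE"]
accuracy = word_accuracy(preds, truth)
```

Here two of the three sample words match once case is ignored; the misspelled third word does not.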

#### [`StanfordSynth(Stanford, AI Group)`](http://cs.stanford.edu/people/twangcat/#research)   `2012`

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

[`download`](http://cs.stanford.edu/people/twangcat/ICPR2012_code/syntheticData.tar)
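
The 62-character set (0-9, a-z, A-Z) maps naturally to a 62-class label space. A minimal sketch of such a mapping is shown here; the class ordering (digits, then lowercase, then uppercase) and the helper names are assumptions for illustration, and the dataset's own label files may use a different order.

```python
import string

# Assumed ordering: digits first, then lowercase, then uppercase (62 classes).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(text):
    """Map each character to its class index; raises KeyError for unsupported characters."""
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode(indices):
    """Map class indices back to a string."""
    return "".join(ALPHABET[i] for i in indices)
```

Unlike the case-insensitive word labels above, this mapping keeps upper and lower case as separate classes.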

#### [`MSRA Text Detection 500 Database (MSRA-TD500)`](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500))   `2012`

500 natural images (resolutions vary from 1296x864 to 1920x1280)

Chinese, English or mixture of both

Task: text detection

#### [`Street View Text (SVT)`](http://tc11.cvc.uab.es/datasets/SVT_1)   `2010`

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)

Only word level bounding boxes are provided with case-insensitive labels

Task: text location

#### [`KAIST Scene_Text Database`](http://www.iapr-tc11.org/mediawiki/index.php/KAIST_Scene_Text_Database)   `2010`

3000 images of indoor and outdoor scenes containing text

Korean, English (Number), and Mixed (Korean + English + Number)

Task: text location, segmentation and recognition

#### [`Chars74k`](http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/)   `2009`

Over 74K images from natural images, as well as a set of synthetically generated characters

Small single-character images of 62 characters (0-9, a-z, A-Z)

Task: text recognition

#### `ICDAR Benchmark Datasets`

|Dataset| Description | Competition Paper |
|---|---|----|
|[ICDAR 2017](http://rrc.cvc.uab.es/)| over 173,589 labeled text regions in over 63,686 images |`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](https://arxiv.org/abs/1601.07140)|
|[ICDAR 2015](http://rrc.cvc.uab.es/)| 1000 training images and 500 testing images|`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](http://rrc.cvc.uab.es/files/Robust-Reading-Competition-Karatzas.pdf)|
|[ICDAR 2013](http://dagdata.cvc.uab.es/icdar2013competition/)| 229 training images and 233 testing images |`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](http://dagdata.cvc.uab.es/icdar2013competition/files/icdar2013_competition_report.pdf)|
|[ICDAR 2011](http://robustreading.opendfki.de/trac/)| 229 training images and 255 testing images |`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](http://www.iapr-tc11.org/archive/icdar2011/fileup/PDF/4520b491.pdf)|
|[ICDAR 2005](http://www.iapr-tc11.org/mediawiki/index.php/ICDAR_2005_Robust_Reading_Competitions)| 1001 training images and 489 testing images |`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](http://www.academia.edu/download/30700479/10.1.1.96.4332.pdf)|
|[ICDAR 2003](http://www.iapr-tc11.org/mediawiki/index.php/ICDAR_2003_Robust_Reading_Competitions)| 181 training images and 251 testing images (word level and character level) |`paper`  [![link](https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/10735/paper-icon_1150845_tmb.jpg)](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.332.3461&rep=rep1&type=pdf)|

## 3. Competitions

- [ICDAR - Robust Reading Competitions](http://rrc.cvc.uab.es/?com=introduction)

## 4. Online OCR Service

| Name | Description |
|---|----|
|[Tesseract OCR](https://github.com/tesseract-ocr/tesseract)| API, free |
|[Online OCR](https://www.onlineocr.net/)| API, free |
|[Free OCR](http://www.free-ocr.com/)| API, free |
|[New OCR](http://www.newocr.com/)| API, free |
|[ABBYY FineReader Online](https://finereaderonline.com)| No API, not free |
|[Super Online Transfer Tools (Chinese)](http://www.wdku.net/)| API, free |
|[Online Chinese Recognition](http://chongdata.com/ocr/)| API, free |

## 5. Blogs

- [Scene Text Detection with OpenCV 3](http://docs.opencv.org/3.0-beta/modules/text/doc/erfilter.html)

- [Handwritten numbers detection and recognition](https://medium.com/@o.kroeger/recognize-your-handwritten-numbers-3f007cbe46ff#.8hg7vl6mo)

- [Applying OCR Technology for Receipt Recognition](http://rnd.azoft.com/applying-ocr-technology-receipt-recognition/)

- [Convolutional Neural Networks for Object(Car License) Detection](http://rnd.azoft.com/convolutional-neural-networks-object-detection/)

- [Extracting text from an image using Ocropus](http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html)

- [Number plate recognition with Tensorflow](http://matthewearl.github.io/2016/05/06/cnn-anpr/) [`github`](https://github.com/matthewearl/deep-anpr)

- [Using deep learning to break a Captcha system](https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/) [`report`](http://web.stanford.edu/~jurafsky/burszstein_2010_captcha.pdf) [`github`](https://github.com/arunpatala/captcha)

- [Breaking reddit captcha with 96% accuracy](https://deepmlblog.wordpress.com/2016/01/05/breaking-reddit-captcha-with-96-accuracy/) [`github`](https://github.com/arunpatala/reddit.captcha)

- [文字检测与识别资源-1](http://blog.csdn.net/peaceinmind/article/details/51387367)

- [文字的检测与识别资源-2](http://blog.csdn.net/u010183397/article/details/56497303?locationNum=12&fps=1)

- Scene Text Recognition in iOS [`blog`](https://medium.com/@khurram.pak522/scene-text-recognition-in-ios-11-2d0df8412151) [`github`](https://github.com/khurram18/SceneTextRecognitioniOS)