{"id":21971101,"url":"https://github.com/lsvih/sliding-convolution","last_synced_at":"2025-04-28T11:42:10.438Z","repository":{"id":39626433,"uuid":"182108360","full_name":"lsvih/Sliding-Convolution","owner":"lsvih","description":"Pytorch implementation of \"Scene Text Recognition with Sliding Convolutional Character Models\"","archived":false,"fork":false,"pushed_at":"2019-04-22T05:19:52.000Z","size":27667,"stargazers_count":14,"open_issues_count":1,"forks_count":8,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-30T09:11:43.127Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lsvih.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-04-18T15:00:43.000Z","updated_at":"2022-10-25T19:16:28.000Z","dependencies_parsed_at":"2022-09-06T02:42:41.954Z","dependency_job_id":null,"html_url":"https://github.com/lsvih/Sliding-Convolution","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsvih%2FSliding-Convolution","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsvih%2FSliding-Convolution/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsvih%2FSliding-Convolution/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsvih%2FSliding-Convolution/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lsvih","download_url":"https://codeload.github.com/lsvih/Sliding-Convolution/tar.gz/refs/heads/mast
er","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251306906,"owners_count":21568353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-29T14:45:46.548Z","updated_at":"2025-04-28T11:42:10.413Z","avatar_url":"https://github.com/lsvih.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sliding Convolution CTC for Scene Text Recognition\n\nImplementation of 'Scene Text Recognition with Sliding Convolutional Character Models' ([pdf](https://arxiv.org/pdf/1709.01727))\n\n### Model\n\nSliding windows + CNN + CTC\n\n\u003cdiv align=center\u003e\n\u003cimg src=\"./resource/architecture.png\" width=\"500px\" /\u003e\n\u003c/div\u003e\n\n\n### Dependency\n\nWhile this implementation might work in many cases, it has only been tested in the environment below:\n\n```\npython == 3.7.0\ntorch == 0.4.1\ntqdm\nnumpy\n```\n\n```\nwarp-ctc(for pytorch 0.4)\n```\n\n```\nCUDA 9.0.1\nCUDNN 7.0.5\n```\n\n#### Install warp-ctc\n\nFollow these [instructions](https://github.com/SeanNaren/warp-ctc/tree/0.4.1).\n\n\u003e **Note**: The version of warp-ctc must match the version of PyTorch. 
[Related issue](https://github.com/SeanNaren/warp-ctc/issues/101)\n\n### Usage\n\nDownload the [IIIT5K dataset](https://cdn.iiit.ac.in/cdn/cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz) and extract the files into the dataset folder.\n\nPreprocess the IIIT5K dataset:\n```bash\npython3 prepare_IIIT5K_dataset.py\n```\n\nTrain the model:\n```bash\npython3 main.py --cuda=True --mode=train\n```\nResume training:\n```bash\npython3 main.py --cuda=True --wram-up=True --mode=train\n```\nTest the model:\n```bash\npython3 main.py --cuda=True --mode=test\n```\n\n\u003e **Note**: The `model.bin` file is a pre-trained model that achieves about 53% accuracy (the accuracy is low because the training dataset is small).\n\n### Citation\n\nIf you find this work useful in your research, please consider citing:\n\n```\n@article{yin2017scene,\n  title={Scene text recognition with sliding convolutional character models},\n  author={Yin, Fei and Wu, Yi-Chao and Zhang, Xu-Yao and Liu, Cheng-Lin},\n  journal={arXiv preprint arXiv:1709.01727},\n  year={2017}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsvih%2Fsliding-convolution","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flsvih%2Fsliding-convolution","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsvih%2Fsliding-convolution/lists"}