{"id":15787243,"url":"https://github.com/ysig/learnable-typewriter","last_synced_at":"2025-04-26T15:10:43.261Z","repository":{"id":120701483,"uuid":"585110573","full_name":"ysig/learnable-typewriter","owner":"ysig","description":"The Learnable Typewriter: A Generative Approach to Text Line Analysis","archived":false,"fork":false,"pushed_at":"2024-10-31T14:12:15.000Z","size":383,"stargazers_count":33,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-26T10:11:40.300Z","etag":null,"topics":["computer-vision","ctc-loss","deep-learning","htr","image-decomposition","ocr","paleography","sprites","supervised-learning","unsupervised-learning"],"latest_commit_sha":null,"homepage":"http://imagine.enpc.fr/~siglidii/learnable-typewriter/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ysig.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-01-04T10:47:58.000Z","updated_at":"2025-04-19T17:54:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"db7711d2-45c3-42ac-a0a1-df2fe46bd8a8","html_url":"https://github.com/ysig/learnable-typewriter","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysig%2Flearnable-typewriter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysig%2Flearnable-typewriter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysig%2Flearnable-typewriter
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ysig%2Flearnable-typewriter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ysig","download_url":"https://codeload.github.com/ysig/learnable-typewriter/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250976100,"owners_count":21516878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","ctc-loss","deep-learning","htr","image-decomposition","ocr","paleography","sprites","supervised-learning","unsupervised-learning"],"created_at":"2024-10-04T21:06:46.256Z","updated_at":"2025-04-26T15:10:43.234Z","avatar_url":"https://github.com/ysig.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![teaser.png](./.media/teaser.png)\n\n# The Learnable Typewriter \u003cbr\u003e\u003csub\u003eA Generative Approach to Text Analysis\u003c/sub\u003e\nOfficial PyTorch implementation of [The Learnable Typewriter: A Generative Approach to Text Analysis](https://imagine.enpc.fr/~siglidii/learnable-typewriter/).  \nAuthors: [Yannis Siglidis](https://imagine.enpc.fr/~siglidii/), [Nicolas Gonthier](https://perso.telecom-paristech.fr/gonthier/), [Julien Gaubil](https://juliengaubil.github.io/), [Tom Monnier](https://www.tmonnier.com/), [Mathieu Aubry](http://imagine.enpc.fr/~aubrym/).  
\nResearch Institute: [Imagine](https://imagine.enpc.fr/), _LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France_  \n[ICDAR 2024 (Best Paper Award)](https://icdar2024.net/).\n\n## Install :seedling:\n```shell\nconda create --name ltw pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge\nconda activate ltw\npython -m pip install -r requirements.txt\n```\n\n### Datasets :sunny: Models :hammer:\n**Dropbox**: Download \u0026 extract [datasets.zip](https://www.dropbox.com/s/0fa9hcbfu9vr3t2/datasets.zip?dl=0) and [runs.zip](https://www.dropbox.com/s/c4c7lbp1ydqs9dj/runs.zip?dl=0) in the parent folder.  \n**Huggingface**: `python scripts/download-hf.py`\n\n## Inference :peach:\nFor minimal inference and plotting, we provide a [standalone notebook. ![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yDL_HGncDiMzShA7c-OZYOgYrb1_qqGf)\n\nTo reproduce the figures of the paper, run the `scripts/figures.ipynb` notebook.\n\nHelper scripts are also provided to perform evaluation on the corresponding datasets:\n\n```shell\npython scripts/eval.py -i \u003cMODEL-PATH\u003e {--eval, --eval_best}\n```\n\nand produce figures and sprites for certain samples:\n\n```shell\npython scripts/eval.py -i \u003cMODEL-PATH\u003e -s {train, val, test} -id 0 0 0 -is 1 2 3 --plot_sprites\n```\n\n## Training :blossom:\nTraining and model configuration are handled through Hydra.\nWe supply the corresponding config files for all our baseline experiments.\n\n### Google :newspaper:\n```shell\npython scripts/train.py supervised-google.yaml\npython scripts/train.py unsupervised-google.yaml\n```\n\n### Copiale :scroll:\n```shell\npython scripts/train.py supervised-copiale.yaml\npython scripts/train.py unsupervised-copiale.yaml\n```\n\n### Fontenay :church:\n```shell\npython scripts/train.py supervised-fontenay.yaml\n```\n\nand fine-tune with:\n\n```shell\npython scripts/fontenay.py -i 
fontenay/fontenay/\u003cMODEL_NAME\u003e -o fontenay/fontenay-ft/ --max_epochs 150 -k \"training.optimizer.lr=0.001\"\n```\n\n\u003e All of the experiment config files above can be further modified with command-line overrides using the [hydra syntax](https://hydra.cc/docs/advanced/override_grammar/basic/).\n\n### Custom Dataset :floppy_disk:\nTrying the LT on a new dataset is dead easy.\n\nFirst create a config file:\n\n```\nconfigs/\u003cDATASET_ID\u003e.yaml\n\n...\n\nDATASET-TAG:\n  path: \u003cDATASET-NAME\u003e/\n  sep: ''                    # How the character separator is denoted in the annotation. \n  space: ' '                 # How the space is denoted in the annotation.\n```\n\nThen create the dataset folder:\n\n```\ndatasets/\u003cDATASET-NAME\u003e\n├── annotation.json\n└── images\n  ├── \u003cimage_id\u003e.jpg\n  └── ...\n```\n\nThe `annotation.json` file should be a dictionary with entries of the form:\n```\n    \"\u003cimage_id\u003e\": {\n        \"split\": \"train\",                            # {\"train\", \"val\", \"test\"} - \"val\" is ignored in the unsupervised case.\n        \"label\": \"A beautiful calico cat.\"           # The text that corresponds to this line.\n    },\n```\n\nYou can omit the `annotation.json` file entirely for unsupervised training without evaluation.\n\n\n### Logging :chart_with_downwards_trend:\nLogging is done through TensorBoard. 
To visualize results, run:\n\n```bash\ntensorboard --logdir ./\u003crun_dir\u003e/\n```\n\n_If you want to dive in deeper, check out our [experimental features](https://github.com/ysig/learnable-typewriter/blob/main/EXPERIMENTAL.md)._\n\n### Citing :dizzy:\n\n```bibtex\n@misc{the-learnable-typewriter,\n\ttitle = {The Learnable Typewriter: A Generative Approach to Text Line Analysis},\n\tauthor = {Siglidis, Ioannis and Gonthier, Nicolas and Gaubil, Julien and Monnier, Tom and Aubry, Mathieu},\n\tpublisher = {arXiv},\n\tyear = {2023},\n\turl = {https://arxiv.org/abs/2302.01660},\n\tkeywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},\n\tdoi = {10.48550/ARXIV.2302.01660},\n\tcopyright = {Creative Commons Attribution 4.0 International}\n}\n```\n\n## Also check out :rainbow:\nIf you like this project, also have a look at related work by our team:\n\n- [Efstathiou et al. - An Interpretable Deep Learning Approach for Morphological Script Type Analysis (ICWP 2024)](https://learnable-typewriter-pal.github.io/)\n- [Monnier et al. - Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency (ECCV 2022)](https://www.tmonnier.com/UNICORN/)\n- [Loiseau et al. - Representing Shape Collections with Alignment-Aware Linear Models (3DV 2021)](https://romainloiseau.github.io/deep-linear-shapes/)\n- [Monnier et al. - Unsupervised Layered Image Decomposition into Object Prototypes (ICCV 2021)](https://arxiv.org/abs/2006.11132)\n- [Monnier et al. - Deep Transformation Invariant Clustering (NeurIPS 2020)](https://arxiv.org/abs/2006.11132)\n- [Deprelle et al. - Learning elementary structures for 3D shape generation and matching (NeurIPS 2019)](https://arxiv.org/abs/1908.04725)\n- [Groueix et al. - 3D-CODED: 3D Correspondences by Deep Deformation (ECCV 2018)](https://arxiv.org/abs/1806.05228)\n- [Groueix et al. 
- AtlasNet: A Papier-Mache Approach to Learning 3D Surface Generation (CVPR 2018)](https://arxiv.org/abs/1802.05384)\n\n\n## Acknowledgements :sparkles:\nWe would like to thank Malamatenia Vlachou and Dominique Stutzmann for sharing ideas, insights and data for applying our method in paleography; Vickie Ye and Dmitriy Smirnov for useful insights and discussions; Romain Loiseau, Mathis Petrovich, Elliot Vincent, Sonat Baltacı for manuscript feedback and constructive insights. This work was partly supported by the European Research Council (ERC project DISCOVER, number 101076028), ANR project EnHerit ANR-17-CE23-0008, ANR project VHS ANR-21-CE38-0008 and HPC resources from GENCI-IDRIS (2022-AD011012780R1, AD011012905).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fysig%2Flearnable-typewriter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fysig%2Flearnable-typewriter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fysig%2Flearnable-typewriter/lists"}