{"id":24618352,"url":"https://github.com/ryanlinjui/menu-text-detection","last_synced_at":"2026-04-20T14:03:26.376Z","repository":{"id":209775096,"uuid":"691093603","full_name":"ryanlinjui/menu-text-detection","owner":"ryanlinjui","description":"Extract structured menu information from images into JSON by E2E Vision-Language model fine-tuning pipeline or LLM.","archived":false,"fork":false,"pushed_at":"2026-04-12T06:18:32.000Z","size":5693,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-12T08:15:14.071Z","etag":null,"topics":["document-understanding","donut","fine-tuning","image-text-to-text","transformer"],"latest_commit_sha":null,"homepage":"https://huggingface.co/spaces/ryanlinjui/menu-text-detection","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ryanlinjui.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-09-13T13:40:46.000Z","updated_at":"2026-04-12T06:18:37.000Z","dependencies_parsed_at":"2023-11-29T04:33:15.376Z","dependency_job_id":"7c706647-a52a-4e30-9538-8bf858aebd0b","html_url":"https://github.com/ryanlinjui/menu-text-detection","commit_stats":null,"previous_names":["ryanlinjui/menu-text-detection"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ryanlinjui/menu-text-detection","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanlinjui%2Fmenu-text-detection","
tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanlinjui%2Fmenu-text-detection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanlinjui%2Fmenu-text-detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanlinjui%2Fmenu-text-detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ryanlinjui","download_url":"https://codeload.github.com/ryanlinjui/menu-text-detection/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ryanlinjui%2Fmenu-text-detection/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32050452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-20T11:35:06.609Z","status":"ssl_error","status_checked_at":"2026-04-20T11:34:48.899Z","response_time":94,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["document-understanding","donut","fine-tuning","image-text-to-text","transformer"],"created_at":"2025-01-24T23:51:43.930Z","updated_at":"2026-04-20T14:03:26.314Z","avatar_url":"https://github.com/ryanlinjui.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Menu Text Detection System\n\nExtract structured menu information from images into JSON using a fine-tuned E2E 
model or LLM.  \n\n[![Gradio Space Demo](https://img.shields.io/badge/GradioSpace-Demo-important?logo=huggingface)](https://huggingface.co/spaces/ryanlinjui/menu-text-detection)\n[![Hugging Face Models \u0026 Datasets](https://img.shields.io/badge/HuggingFace-Models_\u0026_Datasets-important?logo=huggingface)](https://huggingface.co/collections/ryanlinjui/menu-text-detection-670ccf527626bb004bbfb39b)\n\nhttps://github.com/user-attachments/assets/80e5d54c-f2c8-4593-ad9b-499e5b71d8f6\n\n## 🚀 Features\n### Overview\nCurrently extracts the following fields from menu images:\n\n- **Restaurant Name**  \n- **Business Hours**  \n- **Address**  \n- **Phone Number**\n- **Dish Information**\n  - Name  \n  - Price  \n\n\u003e For the JSON schema, see the [tools directory](./tools).\n\n### Supported Methods to Extract Menu Information\n#### Fine-tuned E2E Model and Training Metrics\n- [**Donut (Document Parsing Task)**](https://huggingface.co/ryanlinjui/donut-base-finetuned-menu) - Base model by [Clova AI (ECCV ’22)](https://github.com/clovaai/donut)\n\n#### LLM Function Calling\n- Google Gemini API\n- OpenAI GPT API\n\n## 💻 Training / Fine-Tuning\n### Setup\nUse [uv](https://github.com/astral-sh/uv) to set up the development environment:\n\n```bash\nuv sync\n```\n\n\u003e If `uv sync` fails, use `pip install -r requirements.txt` instead.\n\n### Training Script (Dataset Collection, Fine-Tuning)\nPlease refer to [`train.ipynb`](./train.ipynb). 
Use Jupyter Notebook for training:\n\n```bash\nuv run jupyter-notebook\n```\n\n\u003e VSCode users: install the Jupyter extension, then select `.venv/bin/python` as the kernel.\n\n### Run Demo Locally\n```bash\nuv run python app.py\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryanlinjui%2Fmenu-text-detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fryanlinjui%2Fmenu-text-detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fryanlinjui%2Fmenu-text-detection/lists"}