{"id":30743717,"url":"https://github.com/dmitriiweb/recipes-scanner","last_synced_at":"2025-09-04T02:48:41.627Z","repository":{"id":310394239,"uuid":"1039687909","full_name":"dmitriiweb/recipes-scanner","owner":"dmitriiweb","description":"Convert cookbook PDFs (one recipe per page) into clean Markdown with a local vision LLM (Ollama + pydantic-ai). uv-powered CLI.","archived":false,"fork":false,"pushed_at":"2025-08-17T19:22:45.000Z","size":89,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-17T21:20:36.747Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dmitriiweb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-17T19:14:45.000Z","updated_at":"2025-08-17T19:22:48.000Z","dependencies_parsed_at":"2025-08-17T21:20:39.264Z","dependency_job_id":"70a4286c-14af-4ed7-922e-9a98b76cd2c4","html_url":"https://github.com/dmitriiweb/recipes-scanner","commit_stats":null,"previous_names":["dmitriiweb/recipes-scanner"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/dmitriiweb/recipes-scanner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmitriiweb%2Frecipes-scanner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmitriiweb%2Frecipes-scanner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmitriiweb%2Frecipes-scanner/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmitriiweb%2Frecipes-scanner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dmitriiweb","download_url":"https://codeload.github.com/dmitriiweb/recipes-scanner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmitriiweb%2Frecipes-scanner/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273543836,"owners_count":25124340,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-04T02:48:38.418Z","updated_at":"2025-09-04T02:48:41.609Z","avatar_url":"https://github.com/dmitriiweb.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### recipes-scanner\n\nConvert one-recipe-per-page PDF cookbooks into clean Markdown using a local vision LLM via Ollama + pydantic-ai.\n\n- **Input**: PDF (each page is a single recipe)\n- **Output**: one Markdown file per page in your chosen language\n- **Stack**: uv, pydantic-ai, Ollama (OpenAI-compatible), pdf2image, Pillow\n\n---\n\n## Quick start\n\n1) Install prerequisites\n- **Python**: 3.12+\n- **uv**: see `https://docs.astral.sh/uv/getting-started/`\n- **Ollama**: see `https://ollama.com`, then start the server: `ollama serve`\n\n2) Pull a 
vision model in Ollama (must support image input)\n- Recommended: `ollama pull gemma3:latest`\n\n3) Run the CLI with uv\n```bash\nuv run recipes-scanner \\\n  -i recipes-example.pdf \\\n  -o output \\\n  -m gemma3:latest \\\n  -l English \\\n  -t 0.2 \\\n  --api-base-url http://localhost:11434\n```\n- Output Markdown files are written into `output/`. Each file name is taken from the first `# Heading` line.\n\n---\n\n## CLI\n```text\n-i, --input            Path to the input pdf file (required)\n-o, --output-folder    Path to the output folder (required)\n-m, --model            Model to use (default: gemma3:latest)\n-l, --language         Output language (default: English)\n-t, --temperature      Sampling temperature (default: 0.2)\n--api-key              API key (optional; not required for Ollama)\n--api-base-url         OpenAI-compatible base URL (default: http://localhost:11434)\n```\nNotes:\n- Use a **vision-capable** model. Text-only models will fail to parse images.\n- `--api-key` is accepted for compatibility; Ollama doesn’t need a real key.\n\n---\n\n## How it works\n- `pdf2image` converts each PDF page to a high-resolution image (500 DPI).\n- `pydantic-ai` `Agent` calls the Ollama OpenAI-compatible endpoint, sending the page image as `BinaryContent`.\n- The system prompt instructs the model to return Markdown:\n  - `# Dish Name`\n  - `## Ingredients` (bullet list)\n  - `## Cooking Method` (numbered steps)\n  - `## Notes`\n- The first Markdown heading becomes the output filename.\n\n---\n\n## Example output\n```markdown\n# Margherita Pizza\n\n## Ingredients\n- 250 g pizza dough\n- 100 g tomato sauce\n- 125 g mozzarella\n- Fresh basil leaves\n\n## Cooking Method\n1. Preheat oven to 250°C.\n2. Spread sauce on dough, add mozzarella.\n3. 
Bake 8–10 min, garnish with basil.\n\n## Notes\nFor a crisp base, preheat a pizza stone.\n```\n\n---\n\n## Tips and troubleshooting\n- **Model not found / image unsupported**: Ensure you pulled a vision model (e.g., `llama3.2-vision:11b`).\n- **Poppler missing**: Install Poppler so `pdftoppm` is available.\n- **Slow processing**: 500 DPI yields better OCR but is heavier. Consider downscaling pages before `recognize_image` if you customize the code.\n- **Weird filenames**: Filenames are derived from the first `# Heading` and not sanitized. If your recipes include characters invalid on your OS, rename files or sanitize in code.\n- **Empty outputs**: The agent returns an empty string if it can’t find ingredients and method; such pages are skipped.\n\n---\n\n## Development\n- Run directly with uv (no virtualenv activation needed):\n  - `uv run recipes-scanner -i input.pdf -o output -m llama3.2-vision:11b`\n- Or create a local environment:\n  ```bash\n  uv sync\n  uv run recipes-scanner -i input.pdf -o output -m llama3.2-vision:11b\n  ```\n- Entry point: `recipes_scanner.__main__:main`\n- Key modules:\n  - `recipes_scanner/pdf_scanner.py`: PDF → images via `pdf2image`\n  - `recipes_scanner/ocr.py`: `OcrAgent` using `pydantic-ai` with Ollama (OpenAI-compatible)\n  - `recipes_scanner/args.py`: CLI parsing\n\n---\n\n## License\nMIT \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmitriiweb%2Frecipes-scanner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdmitriiweb%2Frecipes-scanner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmitriiweb%2Frecipes-scanner/lists"}