{"id":24576141,"url":"https://github.com/y1d1r/pyfacture","last_synced_at":"2026-05-19T00:35:21.334Z","repository":{"id":272987820,"uuid":"909693928","full_name":"Y1D1R/PyFacture","owner":"Y1D1R","description":"PyFacture is a Python project designed to automate expense management from receipts. The application utilizes image processing techniques and Optical Character Recognition (OCR) using Tesseract and Llama3.2-vision to extract relevant information from a photo of a receipt.","archived":false,"fork":false,"pushed_at":"2025-01-17T20:47:59.000Z","size":833,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-17T21:29:30.288Z","etag":null,"topics":["image-procesing","lamma","ocr-python","opencv","python","tesseract"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Y1D1R.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-29T14:05:12.000Z","updated_at":"2025-01-17T20:48:01.000Z","dependencies_parsed_at":"2025-01-17T21:39:37.951Z","dependency_job_id":null,"html_url":"https://github.com/Y1D1R/PyFacture","commit_stats":null,"previous_names":["y1d1r/pyfacture"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Y1D1R%2FPyFacture","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Y1D1R%2FPyFacture/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Y1D1R%2FPyFacture/releas
es","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Y1D1R%2FPyFacture/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Y1D1R","download_url":"https://codeload.github.com/Y1D1R/PyFacture/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244036253,"owners_count":20387483,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["image-procesing","lamma","ocr-python","opencv","python","tesseract"],"created_at":"2025-01-23T22:21:50.491Z","updated_at":"2026-05-19T00:35:21.253Z","avatar_url":"https://github.com/Y1D1R.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PyFacture\n\nPyFacture is a Python project designed to automate expense management from receipts. The application utilizes image processing techniques and Optical Character Recognition (OCR) to extract relevant information from a photo of a receipt, such as purchased products, their prices, and the date of purchase.\n\n## Features\n\n- **Image Processing:** Enhances receipt images for better OCR accuracy.\n- **Optical Character Recognition (OCR):** Extracts text from receipt images using Tesseract or Llama.\n- **Data Extraction:** Analyzes OCR text to identify products, prices, and dates.\n- **Excel File Management:** Creates and updates Excel files to store extracted data.\n\n## Installation\n\n### 1. Clone the Repository\n\n```bash\ngit clone https://github.com/Y1D1R/PyFacture.git\ncd PyFacture\n```\n\n### 2. 
Install Dependencies\nInstall the required Python packages using pip:\n\n```bash\npip install -r requirements.txt\n```\n\n### 3. Install Tesseract OCR and Ollama\nPyFacture relies on Tesseract OCR for text extraction.\u003cbr\u003e\nFollow the instructions below based on your operating system.\n\nOnce you have Ollama installed, install the Llama 3.2-Vision model (6 GB):\u003cbr\u003e\n```bash\nollama run llama3.2-vision\n```\nMore information: https://sebastian-petrus.medium.com/build-a-local-ollama-ocr-application-using-llama-3-2-vision-bfc3014e3ad6  \n\n### 4. Usage\n#### 4.1. Prepare Your Data\nPlace your receipt images in the \"data/input/\" directory.\u003cbr\u003e \nEnsure that the images are clear, well-lit, and free from distortions for optimal OCR results.\n\n\n#### 4.2. Run the Application\nExecute the main script, then choose the method from the menu to process the receipts and extract data:\n\n```bash\npython pyfacture/main.py\n```\n\u003cimg src=\"pyfacture/img/Figure_4.png\" alt=\"Menu\" width=\"900\"\u003e\n\n\n#### 4.3. View the Results\n##### 4.3.1 Tesseract OCR\n\u003cimg src=\"pyfacture/img/Figure_1.png\" alt=\"Original Receipt\" width=\"400\"\u003e\n\u003cimg src=\"pyfacture/img/Figure_2.png\" alt=\"Thresholded Receipt\" width=\"400\"\u003e\nThe extracted data will be saved as Excel files in the \"data/output/\" directory. \n\u003cimg src=\"pyfacture/img/Figure_3.png\" alt=\"OCR Result\" width=\"700\"\u003e\n\n##### 4.3.2 Llama OCR\n\u003cimg src=\"pyfacture/img/Figure_5.png\" alt=\"Llama OCR\" width=\"300\"\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fy1d1r%2Fpyfacture","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fy1d1r%2Fpyfacture","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fy1d1r%2Fpyfacture/lists"}