{"id":24041630,"url":"https://github.com/livewithcodeankit/ai-ocr","last_synced_at":"2026-04-16T17:02:50.106Z","repository":{"id":270646298,"uuid":"911031214","full_name":"LiveWithCodeAnkit/AI-OCR","owner":"LiveWithCodeAnkit","description":"Advanced AI-OCR with FastAPI and OpenAI Integration","archived":false,"fork":false,"pushed_at":"2025-05-09T09:26:58.000Z","size":4851,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-09T10:35:46.777Z","etag":null,"topics":["ai","ml","ocr","ocr-text-reader","openai","pydantic","pytesseract-ocr","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LiveWithCodeAnkit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-02T04:57:13.000Z","updated_at":"2025-05-09T09:27:01.000Z","dependencies_parsed_at":"2025-01-02T05:35:59.815Z","dependency_job_id":null,"html_url":"https://github.com/LiveWithCodeAnkit/AI-OCR","commit_stats":null,"previous_names":["livewithcodeankit/ai-ocr"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/LiveWithCodeAnkit/AI-OCR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiveWithCodeAnkit%2FAI-OCR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiveWithCodeAnkit%2FAI-OCR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiveWithCodeAnkit%2FAI-OCR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/LiveWithCodeAnkit%2FAI-OCR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LiveWithCodeAnkit","download_url":"https://codeload.github.com/LiveWithCodeAnkit/AI-OCR/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LiveWithCodeAnkit%2FAI-OCR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31895650,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-16T11:36:10.202Z","status":"ssl_error","status_checked_at":"2026-04-16T11:36:09.652Z","response_time":69,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ml","ocr","ocr-text-reader","openai","pydantic","pytesseract-ocr","python"],"created_at":"2025-01-08T22:11:44.313Z","updated_at":"2026-04-16T17:02:50.096Z","avatar_url":"https://github.com/LiveWithCodeAnkit.png","language":"Python","readme":"# 📄 **Advanced AI-OCR  with FastAPI and OpenAI Integration**\n\n## 🚀 **Project Overview**\nThis project is an advanced Optical Character Recognition (OCR) API built using **FastAPI**. It leverages powerful image processing libraries such as **OpenCV**, **Pillow (PIL)**, and **pytesseract** to extract accurate text from images and PDFs, even with challenges like skewed, rotated, or noisy inputs. 
Additionally, **OpenAI** integration enhances text formatting and intelligent post-processing.\n\n---\n\n## 🛠️ **Key Features**\n1. **Multi-File Support:** Upload and process multiple images or PDFs simultaneously.\n2. **Preprocessing Pipelines:** Noise reduction, deskewing, thresholding, and edge detection for improved OCR accuracy.\n3. **Rotation \u0026 Skew Correction:** Automatically detect and fix image rotation and skew.\n4. **High OCR Accuracy:** Configurable `pytesseract` parameters for optimal text extraction.\n5. **OpenAI Integration:** Intelligent text formatting and validation using OpenAI APIs.\n6. **FastAPI Framework:** Efficient, scalable, and production-ready API services.\n\n---\n\n## 📚 **Technologies Used**\n- **FastAPI:** Backend framework for API development.\n- **OpenCV:** Image processing and preprocessing.\n- **Pillow (PIL):** Image manipulation.\n- **pytesseract:** Python wrapper for the Tesseract OCR engine.\n- **NumPy:** Numerical operations.\n- **OpenAI API:** Intelligent text formatting.\n- **Pydantic:** Data validation.\n\n---\n\n\n## 📥 **Installation \u0026 Setup**\n1. Clone the repository, then create the virtual environment, activate it, install the dependencies, and start the server:\n   ```bash\n   git clone https://github.com/your-repo/ocr-api.git\n   cd ocr-api\n\n   python -m venv venv\n   # Windows activation path; on macOS/Linux use: source venv/bin/activate\n   venv\\Scripts\\activate\n   pip install -r requirements.txt\n   uvicorn main:app --reload\n   ```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flivewithcodeankit%2Fai-ocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flivewithcodeankit%2Fai-ocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flivewithcodeankit%2Fai-ocr/lists"}