{"id":21064136,"url":"https://github.com/somenath203/image-text-extractor","last_synced_at":"2025-05-16T02:32:25.413Z","repository":{"id":239327523,"uuid":"799224982","full_name":"somenath203/Image-Text-Extractor","owner":"somenath203","description":"Click here to checkout the website","archived":false,"fork":false,"pushed_at":"2024-07-28T14:17:27.000Z","size":868,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-03T18:52:49.099Z","etag":null,"topics":["daisyui","fastapi","huggingface","image-text-extractor","next","nextjs","pytesseract","tailwindcss","vercel"],"latest_commit_sha":null,"homepage":"https://image-text-generator.vercel.app/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/somenath203.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-11T14:03:00.000Z","updated_at":"2024-07-31T20:30:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"87e5e03d-3553-495a-b40e-1e094d1e5c3e","html_url":"https://github.com/somenath203/Image-Text-Extractor","commit_stats":null,"previous_names":["somenath203/image-text-extractor"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/somenath203%2FImage-Text-Extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/somenath203%2FImage-Text-Extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/somenath203%2FImage-Text-Extractor/releases","manif
ests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/somenath203%2FImage-Text-Extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/somenath203","download_url":"https://codeload.github.com/somenath203/Image-Text-Extractor/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254456075,"owners_count":22074096,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["daisyui","fastapi","huggingface","image-text-extractor","next","nextjs","pytesseract","tailwindcss","vercel"],"created_at":"2024-11-19T17:48:24.294Z","updated_at":"2025-05-16T02:32:21.096Z","avatar_url":"https://github.com/somenath203.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Image Text Extractor\n\n## Demo video of the whole application\n\nhttps://github.com/user-attachments/assets/d48dcb33-7d6a-40db-b924-d9649279669f\n\n## Introduction\n\nThis is a web app that lets users extract text from any image and copy the extracted text to the clipboard.\n\n## Tech Stack Used\n\nThe frontend is built with NextJS, TailwindCSS and daisyUI. The backend uses FastAPI with the pytesseract package to extract text from the uploaded image.\n\n## Links\n\n1) Live Preview of the whole application: https://image-text-generator.vercel.app/\n2) Deployed FastAPI backend API: https://som11-image-text-extract.hf.space/\n3) Swagger Documentation of the FastAPI backend: 
https://som11-image-text-extract.hf.space/docs\n\n## NOTE\n\nIf you want to deploy the FastAPI backend on a service that runs on Linux, you need to change both the Dockerfile and the app.py file, because `tesseract.exe` is a Windows executable and will not work on Linux servers.\n\nYour Dockerfile should look like this:\n```Dockerfile\n# Use the official Python image\nFROM python:3.9.7\n\n# Set the working directory in the container\nWORKDIR /code\n\n# Copy the requirements file into the container\nCOPY ./requirements.txt /code/requirements.txt\n\n# Install the dependencies\nRUN pip install --no-cache-dir --upgrade -r /code/requirements.txt\n\n# Install Tesseract OCR via the package manager\nRUN apt-get update \u0026\u0026 apt-get install -y tesseract-ocr\n\n# Copy the entire project directory into the container\nCOPY . /code\n\n# Command to run the FastAPI server\nCMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"7860\"]\n```\n\nand app.py should look like this:\n```py\nfrom PIL import Image\nimport pytesseract\nfrom fastapi import FastAPI, UploadFile, File\nfrom fastapi.middleware.cors import CORSMiddleware\nfrom io import BytesIO\n\n\napp = FastAPI()\n\n\norigins = [\"*\"]\n\napp.add_middleware(\n    CORSMiddleware,\n    allow_origins=origins,\n    allow_credentials=True,\n    allow_methods=[\"*\"],\n    allow_headers=[\"*\"],\n)\n\n\n@app.get('/')\ndef welcome():\n    return {\n        'success': True,\n        'message': 'server of \"image text extractor\" is up and running successfully.'\n    }\n\n@app.post('/extract-text-from-image')\nasync def extract_text_from_img(imageUploadedByUser: UploadFile = File(...)):\n    \n    img = await imageUploadedByUser.read()  \n\n    img_bytes_io = Image.open(BytesIO(img))\n\n    gray_scale_img = img_bytes_io.convert('L')\n\n    text = pytesseract.image_to_string(gray_scale_img)\n\n    text_cleaned = ' '.join(text.split())\n\n    return {\n        
'success': True,\n        'message': 'Text has been successfully extracted from the uploaded image',\n        'extracted_text': text_cleaned\n    }\n```\n\nYou are now ready to deploy the FastAPI backend on Linux servers.\n\n## Warning\n\nWhile this application usually extracts text from images accurately, it may occasionally produce incorrect text or fail to extract any text at all.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsomenath203%2Fimage-text-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsomenath203%2Fimage-text-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsomenath203%2Fimage-text-extractor/lists"}