{"id":23428422,"url":"https://github.com/xredax/xtract-text","last_synced_at":"2026-01-04T10:31:22.083Z","repository":{"id":257011580,"uuid":"854694960","full_name":"XredaX/Xtract-Text","owner":"XredaX","description":"A Telegram bot that extracts text from images using pytesseract (OCR). Simply send an image, and the bot will respond with the extracted text","archived":false,"fork":false,"pushed_at":"2024-09-13T18:07:40.000Z","size":9,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-23T07:12:32.274Z","etag":null,"topics":["api","pytesseract-ocr","python","telegram","telegram-bot","text-recognition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/XredaX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-09T16:09:01.000Z","updated_at":"2024-10-15T10:37:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"76c5f3cb-aac6-4622-b2a0-8a00a5f3a7ad","html_url":"https://github.com/XredaX/Xtract-Text","commit_stats":null,"previous_names":["xredax/xtract-text"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XredaX%2FXtract-Text","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XredaX%2FXtract-Text/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XredaX%2FXtract-Text/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XredaX%2F
Xtract-Text/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/XredaX","download_url":"https://codeload.github.com/XredaX/Xtract-Text/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238957335,"owners_count":19558638,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","pytesseract-ocr","python","telegram","telegram-bot","text-recognition"],"created_at":"2024-12-23T07:12:28.985Z","updated_at":"2025-10-30T11:30:56.332Z","avatar_url":"https://github.com/XredaX.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📸 Telegram Image-to-Text OCR Bot\n\nThis is a **Telegram Bot** that extracts text from images using **pytesseract** (Tesseract OCR) and the **Python Telegram API**. Users can send an image to the bot, and it will respond with the extracted text from the image.\n\n## ✨ Features\n- **Image-to-text conversion**: Use `pytesseract` to convert images into text.\n- **Error handling**: Catches and logs errors during image processing.\n- **Instant response**: Quickly processes images and returns extracted text via Telegram.\n\n## 🚀 Getting Started\n\nFollow these steps to set up and run the bot locally.\n\n### 1. Clone the repository\n```bash\ngit clone https://github.com/XredaX/Xtract-Text\ncd Xtract-Text\n```\n\n### 2. Install dependencies\nMake sure you have Python and the necessary libraries installed. Run:\n```bash\npip install -r requirements.txt\n```\n\n### 3. Install Tesseract\nEnsure that Tesseract is installed on your machine. 
You can install it via:\n\n**Linux:**\n```bash\nsudo apt update\nsudo apt install tesseract-ocr\n```\n\nSet the `TESSDATA_PREFIX` environment variable to the directory that holds your Tesseract language data files (the `tessdata` directory).\n\n### 4. Set environment variables\nYou need to set two environment variables for the bot to work:\n\n- `BOT_TOKEN`: Your Telegram bot token.\n- `TESSDATA_PREFIX`: Path to the Tesseract data directory.\n\nYou can set these in your terminal or use a `.env` file.\n\nFor terminal:\n```bash\nexport BOT_TOKEN=\"your_telegram_bot_token\"\nexport TESSDATA_PREFIX=\"/usr/share/tesseract-ocr/5/tessdata\"\n```\n\nFor `.env` file (create this in the root directory):\n```\nBOT_TOKEN=your_telegram_bot_token\nTESSDATA_PREFIX=/usr/share/tesseract-ocr/5/tessdata\n```\n\n### 5. Run the bot\nAfter setting the environment variables, you can run the bot:\n```bash\npython bot.py\n```\n\nThe bot will now be up and running. Send an image to the bot in Telegram, and it will reply with the extracted text.\n\n### Example Commands\n- `/start`: Starts the bot and welcomes the user.\n- Send an image: The bot will reply with the extracted text from the image.\n\n## 🤝 Contributing\nFeel free to open issues or submit pull requests for improvements!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxredax%2Fxtract-text","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxredax%2Fxtract-text","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxredax%2Fxtract-text/lists"}