Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with tesseract

A curated list of projects in awesome lists tagged with tesseract.

https://github.com/tesseract-ocr/tesseract

Tesseract Open Source OCR Engine (main repository)

hacktoberfest lstm machine-learning ocr ocr-engine tesseract tesseract-ocr

Last synced: 29 Sep 2024

https://github.com/naptha/tesseract.js

Pure Javascript OCR for more than 100 Languages 📖🎉🖥

deep-learning javascript ocr tesseract webassembly

Last synced: 29 Sep 2024

https://github.com/ocrmypdf/ocrmypdf

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

image-processing ocr pdf python tesseract

Last synced: 29 Sep 2024

https://github.com/jbarlow83/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

image-processing ocr pdf python tesseract

Last synced: 02 Aug 2024

https://github.com/ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

image-processing ocr pdf python tesseract

Last synced: 30 Jul 2024

https://github.com/tesseract-ocr/tessdata

Trained models with fast variant of the "best" LSTM models + legacy models

ocr tesseract

Last synced: 30 Sep 2024

https://github.com/kelaberetiv/TagUI

Free RPA tool by AI Singapore

ai nlp opencv rpa tesseract

Last synced: 04 Aug 2024

https://github.com/aisingapore/tagui

Free RPA tool by AI Singapore

ai nlp opencv rpa tesseract

Last synced: 30 Sep 2024

https://github.com/aisingapore/TagUI

Free RPA tool by AI Singapore

ai nlp opencv rpa tesseract

Last synced: 30 Jul 2024

https://github.com/pymupdf/pymupdf

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps

Last synced: 29 Sep 2024

https://github.com/tebelorg/rpa-python

Python package for doing RPA

cross-platform opencv python rpa sikuli tagui tesseract

Last synced: 01 Oct 2024

https://github.com/tebelorg/RPA-Python

Python package for doing RPA

cross-platform opencv python rpa sikuli tagui tesseract

Last synced: 31 Jul 2024

https://github.com/pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps

Last synced: 01 Aug 2024

https://github.com/thiagoalessio/tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

image-to-text ocr php tesseract text-recognition

Last synced: 30 Sep 2024

https://github.com/otiai10/gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

go ocr ocr-server tesseract tesseract-ocr

Last synced: 30 Sep 2024

https://github.com/rmtheis/android-ocr

Experimental optical character recognition app

android ocr optical-character-recognition tesseract

Last synced: 25 Sep 2024

https://github.com/dicklesworthstone/llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

ai-assist llama2 llm ocr ocr-correction tesseract

Last synced: 27 Sep 2024

https://github.com/sirfz/tesserocr

A Python wrapper for the tesseract-ocr API

cython ocr optical-character-recognition python-library tesseract

Last synced: 30 Sep 2024

https://github.com/Pulover/PuloversMacroCreator

Automation Utility - Recorder & Script Generator

ahk autohotkey automation robot rpa tesseract

Last synced: 31 Jul 2024

https://github.com/Dicklesworthstone/llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

ai-assist llama2 llm ocr ocr-correction tesseract

Last synced: 31 Aug 2024

https://github.com/gauravsingh9356/j.a.r.v.i.s

Personal assistant built using Python libraries. It handles many tasks, including sending emails, optical text recognition, dynamic news reporting at any time with API integration, to-do list generation, opening any website with a voice command, playing music, Wikipedia searching, a dictionary with intelligent sensing (i.e., automatic spell checking), weather reporting (i.e., temperature, wind speed, humidity), YouTube searching, Google Maps searching, YouTube downloading, etc.

chatgpt dictionary-application difflib hacktoberfest hactober-accepted hactoberfest2021 jarvis jarvis-ai newsapi opencv optical-character-recognition optical-text-recognition python python3 pyttsx3 speech-recognition tesseract tesseract-ocr weather-api webbrowser

Last synced: 01 Oct 2024

https://github.com/GauravSingh9356/J.A.R.V.I.S

Personal assistant built using Python libraries. It handles many tasks, including sending emails, optical text recognition, dynamic news reporting at any time with API integration, to-do list generation, opening any website with a voice command, playing music, Wikipedia searching, a dictionary with intelligent sensing (i.e., automatic spell checking), weather reporting (i.e., temperature, wind speed, humidity), YouTube searching, Google Maps searching, YouTube downloading, etc.

chatgpt dictionary-application difflib hacktoberfest hactober-accepted hactoberfest2021 jarvis jarvis-ai newsapi opencv optical-character-recognition optical-text-recognition python python3 pyttsx3 speech-recognition tesseract tesseract-ocr weather-api webbrowser

Last synced: 03 Aug 2024

https://github.com/tesseract-ocr/tessdata_fast

Fast integer versions of trained LSTM models

ocr tesseract

Last synced: 01 Aug 2024

https://github.com/junhoyeo/betterocr

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.

ai chatgpt chatgpt-api easyocr llm ocr openai openai-api tesseract tesseract-ocr

Last synced: 02 Aug 2024

https://github.com/cseas/ocr-table

Extract tables from scanned image PDFs using Optical Character Recognition.

extract-tables ocr ocr-table optical-character-recognition pdfminer python scanned-image-pdfs shell tesseract

Last synced: 01 Aug 2024

https://github.com/LeoFCardoso/pdf2pdfocr

A free tool to OCR a PDF and add a text "layer" to the original file, making a searchable PDF. Uses only open source tools. Please tip!

docker ocr pdf pdftk python tesseract

Last synced: 30 Jul 2024

https://github.com/ropensci/tesseract

Bindings to Tesseract OCR engine for R

ocr r r-package rstats tesseract tesseract-ocr

Last synced: 02 Aug 2024

https://github.com/SwiftyTesseract/SwiftyTesseract

A Swift wrapper around Tesseract for use in iOS, macOS, and Linux applications

ios ocr optical-character-recognition swift tesseract

Last synced: 09 Aug 2024

https://github.com/scott0123/Tesseract-macOS

Objective C wrapper for the open source OCR Engine Tesseract (macOS)

mac macos objective-c ocr screenshot tesseract tesseract-mac tesseract-macos xcode

Last synced: 31 Jul 2024

https://github.com/Dicklesworthstone/llama2_aided_tesseract

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections, complete with options for text validation and hallucination filtering.

ai-assist hallucinations llama2 llm ocr tesseract

Last synced: 01 Aug 2024

https://github.com/the-black-knight-01/Tabulo

Table Detection and Extraction Using Deep Learning (built in Python, using Luminoth, TensorFlow<2.0 and Sonnet).

deep-learning detection faster-r-cnn luminoth ocr pdf-table-extraction python sonnet ssd table-data-extraction table-detection table-detection-using-deep-learning table-recognition tabulo tensorflow tesseract

Last synced: 03 Aug 2024

https://github.com/trekhleb/links-detector

📖 👆🏻 Links Detector makes printed links clickable via your smartphone camera. No need to type a link in, just scan and click on it.

computer-vision javascript machine-learning object-detection ocr tensorflowjs tesnorflow tesseract typescript

Last synced: 03 Oct 2024

https://github.com/skylander86/lambda-text-extractor

AWS Lambda functions to extract text from various binary formats.

aws-lambda lambda-functions ocr pdf pdf-ocr-extraction searchable-pdfs tesseract text-extraction

Last synced: 04 Aug 2024

https://github.com/alimranahmed/laraocr

Laravel Optical Character Reader (OCR) package using OCR engines (Tesseract)

laravel55 ocr tesseract

Last synced: 27 Sep 2024

https://github.com/koreader/koreader-base

Base framework offering a Lua scriptable environment for creating document readers

djvu emulator epub ffi koreader leptonica lua luajit mupdf pdf sdl tesseract ubuntu

Last synced: 03 Aug 2024

https://github.com/ndavd/ncube

Generalized Hypercube Visualizer

bevy hypercube mathematics rust simulation tesseract

Last synced: 28 Sep 2024

https://github.com/bweigel/aws-lambda-tesseract-layer

A layer for AWS Lambda containing the tesseract C libraries and tesseract executable.

amazon-linux aws-lambda lambda lambda-layer serverless serverless-framework tesseract

Last synced: 01 Aug 2024

https://github.com/008karan/PAN_OCR

Building OCR using YOLO and Tesseract

deep-learning ocr tesseract yolo

Last synced: 02 Aug 2024

https://github.com/SkeathyTomas/genshin_artifact_auxiliary

A Genshin Impact artifact rater that overlays artifacts inside the game window. Keqing's Desk | Genshin Impact | Artifact scoring. An artifact export and scoring tool for Genshin Impact integrated on top of the game window: no need to switch back and forth between the game and external tools to compare, with results calculated and looked up quickly in-game.

genshin-impact ocr paddleocr pyside6 python rapidocr tesseract

Last synced: 02 Aug 2024

https://github.com/shelfio/aws-lambda-tesseract

6 MB Tesseract (with English training data) to fit inside AWS Lambda

aws-lambda node-module nodejs npm-package ocr optical-character-recognition serverless tesseract

Last synced: 02 Oct 2024

https://github.com/lexmartinez/ocr-electron-vue

:card_index: A Simple OCR Application built on Electron, Vue.js & Tesseract.js

electron ocr tesseract vuejs

Last synced: 01 Aug 2024

https://github.com/Monogramm/erpnext_ocr

:snake: :alembic: Optical Character Recognition using tesseract within Frappe.

erpnext frappe ocr python tesseract

Last synced: 01 Aug 2024

https://github.com/Arthelon/imgclip

Command line utility that extracts text from an image into the system clipboard.

cli command-line javascript ocr tesseract

Last synced: 01 Aug 2024

https://github.com/nmapx/revolut-stocks-list

Extract Revolut stocks list from the list screenshot(s).

extract image list ocr revolut screenshot stocks tesseract

Last synced: 09 Aug 2024

https://github.com/farhanchoudhary/PAN_Card_OCR_Project

To extract details from Indian National Identification Cards such as PAN (completed) & Aadhar, Passport, Driving License (WIP) in a structured format

image-processing image-to-text ocr optical-character-recognition pan pan-card pytesseract tesseract

Last synced: 12 Aug 2024

https://github.com/hertzg/tesseract-server

A small lightweight HTTP server that converts photos, images and scanned documents to text using optical character recognition by utilizing the power of Google Tesseract.

api container containers docker docker-compose docker-image hacktoberfest http-server image-processing ocr rest-api tesseract tesseract-server typescript

Last synced: 01 Aug 2024

https://github.com/Lynnesbian/OCRbot

An OCR (Optical Character Recognition) bot for Mastodon (and compatible) instances

mastodon ocr python python3 tesseract

Last synced: 01 Aug 2024

https://github.com/aryaminus/saram

Get OCR in txt form from an image or pdf extension supporting multiple files from directory using pytesseract with auto rotation for wrong orientation. PYPI:

character-recognition chmod image ocr orientation-detection pdf pillow pyocr pytesseract python tesseract wand

Last synced: 07 Aug 2024

https://github.com/GerHobbelt/qiqqa-open-source

The open-sourced version of the award-winning Qiqqa research management tool for Windows (a bleeding edge dev fork). ☞ File any issues you find in the main repo issue tracker at https://github.com/jimmejardine/qiqqa-open-source/issues

citations document-classification document-management meta-analysis metadata mupdf pdf qiqqa tesseract

Last synced: 01 Aug 2024

https://github.com/leverylteam/leveryl

An Advanced & Feature Rich Server Software for MC:PE 1.1.x

mcpe mcpe-server php php7 pmmp pocketmine-mp spoon tesseract

Last synced: 26 Sep 2024

https://github.com/mauvilsa/tesseract-recognize

Tool that does layout analysis and/or text recognition using tesseract and outputs the result in Page XML format

cli docker-image document-recognition ocr optical-character-recognition pagexml tesseract text-detection

Last synced: 30 Jul 2024

https://github.com/bandrel/ocyara

Performs OCR on image files and scans them for matches to YARA rules

ocr optical-character-recognition python python-3 tesseract tesseract-ocr-api yara yara-rules

Last synced: 28 Sep 2024

https://github.com/eddyverbruggen/nativescript-ocr

:newspaper: :mag: Tesseract-powered OCR plugin for NativeScript

nativescript nativescript-plugin ocr optical-character-recognition tesseract tesseract-ocr

Last synced: 01 Oct 2024

https://github.com/bandrel/OCyara

Performs OCR on image files and scans them for matches to YARA rules

ocr optical-character-recognition python python-3 tesseract tesseract-ocr-api yara yara-rules

Last synced: 02 Aug 2024

https://github.com/rinormaloku/textocry

Textocry - Copy text from Images (chrome extension)

chrome-extension javascript ocr tesseract

Last synced: 31 Jul 2024

https://github.com/googlecloudplatform/dlp-pdf-redaction

This solution provides an automated, serverless way to redact sensitive data from PDF files using Google Cloud Services like Data Loss Prevention (DLP), Cloud Workflows, and Cloud Run.

bigquery cloud cloudfunctions cloudrun cloudstorage cloudworkflows datalossprevention dlp documents gcp mask ocr pdf redaction serverless terraform tesseract workflows

Last synced: 28 Sep 2024

https://github.com/schwarzkopfb/tesseract-ocr

Node.js wrapper for Tesseract OCR CLI.

javascript nodejs ocr tesseract

Last synced: 01 Aug 2024

https://github.com/fourdigits/wagtail_textract

Text extraction for Wagtail document search

django search tesseract text-extraction textract wagtail

Last synced: 01 Aug 2024

https://github.com/maxim2266/ocr

A collection of tools for OCR (optical character recognition).

bash-script c extract-text linux ocr ocr-recognition tesseract

Last synced: 26 Sep 2024

https://github.com/SimformSolutionsPvtLtd/tesseract-OCR-iOS-demo

This prototype is to recognize text inside the image and for that it uses Tesseract OCR. The underlying Tesseract engine will process the picture and return anything that it believes is text.

demo example-project ios ocr ocr-library optical-character-recognition sample swift tesseract

Last synced: 31 Jul 2024

https://github.com/incubated-geek-cc/Text-To-Speech-App

A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone, portable and works offline.

data-science javascript machine-learning ocr ocr-recognition tesseract tesseract-ocr tesseract-ocr-api tesseractjs webapp

Last synced: 01 Aug 2024

https://github.com/sivakumar-mahalingam/fastmrz

⚡ Extracting the Machine Readable Zone (MRZ) from passport or any document images

identity-document mrz mrz-scanner ocr opencv opencv-python passport passport-mrz python tesseract tesseract-ocr text-recognition

Last synced: 27 Sep 2024

https://github.com/jeenyuhs/vesseract

A V wrapper for Tesseract-OCR

ocr tesseract v wrapper

Last synced: 04 Aug 2024

https://github.com/JKamlah/tesseractXplore

tesseractXplore: an easy-to-use Tesseract GUI with full control

gui gui-application ocr os-independent tesseract

Last synced: 01 Aug 2024

https://github.com/t0mer/ocr-docker

ocr-docker is a small, Flask-powered web app that helps extract text from images and PDF documents using OCR

docker flask ocr python tesseract

Last synced: 01 Oct 2024

https://github.com/egemenzeytinci/readmrz

Machine readable zone reader on ID cards https://pypi.org/project/readmrz

mrz-scanner opencv pip python tesseract

Last synced: 02 Oct 2024

https://github.com/andrealenzi11/py-poppleract

Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents

ocr optical-character-recognition pdf-reader pdf-splitting pdf-to-text pdf2text pdftotext poppler poppleract py-poppleract tesseract tesseract-ocr text-extraction

Last synced: 31 Jul 2024

https://github.com/rogeraabbccdd/Rekordbox-NowPlaying

Get rekordbox now playing master track for OBS and Discord Rich Presence.

discord ocr python rekordbox tesseract

Last synced: 05 Aug 2024

https://github.com/olololoe110399/fast_screenshot_ai

Fast Screenshot AI: A sleek and intuitive desktop app for capturing screen areas, extracting text with OCR, and querying an LLM for AI-driven insights, all in a polished.

ai desktopapp fast-screenshot-ai gpt-4 ocr ollama productivity pyqt5 python screencapture tesseract ui-ux

Last synced: 27 Sep 2024

https://github.com/richardarpanet/python-tesseract-alpine

🐋 Alpine Linux, Python (latest stable version) and Tesseract (latest version from Git).

docker docker-image ocr python3 tesseract tesseract-ocr

Last synced: 02 Oct 2024

https://github.com/junioralive/discordbotlab

DiscordBotLab is a repository focused on hosting and managing a variety of utility-driven Discord bots.

ai automated-bots bot-development discord discord-bot google-colab langchain llama ocr quiz-bot tesseract text-extraction utility-bots

Last synced: 27 Sep 2024

https://github.com/jyothish-ram/invoice_ocr_api

Invoice OCR Extraction Flask API

flask-api gemma-2b nlp ocr ollama tensorflow tesseract

Last synced: 01 Oct 2024

https://github.com/nenadjakic/ocr-studio

This application is designed for managing OCR (Optical Character Recognition) tasks. It allows users to define, schedule, and execute OCR tasks through a REST API. The core technologies used are Spring Framework, MongoDB, and Tesseract OCR.

apache-tika mongodb ocr restful-api spring-boot spring-framework spring-framework-6 tess4j tesseract

Last synced: 28 Sep 2024

https://github.com/07rinat07/laravel-tesseract-parsing-text-image

Laravel Tesseract. Parsing text from images in laravel and php. Extracting text from images

laravel-framework php82 tesseract

Last synced: 28 Sep 2024

https://github.com/zaidkhalid44/irctc

Technologies used: JavaScript, Puppeteer, Tesseract.js, Jimp. Description: Developed an automated system using Puppeteer and Tesseract.js to streamline IRCTC train ticket booking, handling login, captcha solving, and form filling. Automated login and form-filling tasks, reducing manual effort by approximately 90%.

jimp puppeteer tesseract

Last synced: 27 Sep 2024