Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/adilsezer/pdftextextractor
PdfTextExtractor is a Windows-based VB.NET WinForms application that lets users specify a directory to watch for new PDF files and automatically extracts their text using Tesseract OCR.
pdf tesseract-ocr
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/adilsezer/pdftextextractor
- Owner: adilsezer
- License: mit
- Created: 2022-02-14T08:52:54.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-05-23T23:58:18.000Z (over 1 year ago)
- Last Synced: 2024-01-07T01:57:22.907Z (12 months ago)
- Topics: pdf, tesseract-ocr
- Language: Visual Basic .NET
- Homepage:
- Size: 10.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PdfTextExtractor
PdfTextExtractor is a Windows-based VB.NET WinForms application that lets users specify a directory to watch for new PDF files and automatically extracts their text using Tesseract OCR.

## Basic Features
* Select a folder to monitor for new PDF files
* Start and stop the folder watcher
* Extract text from scanned or digital documents
* Create a .txt file containing the extracted text

## Requirements
* .NET Framework 4.8

## Screenshot
Main Application GUI

![Alt text](https://github.com/sezerad/PdfTextExtractor/blob/main/PdfTextExtractor/Screenshots/PdfTextExtractorGUI.PNG?raw=true "PdfTextExtractor")