An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pdftotext

A curated list of projects in awesome lists tagged with pdftotext.

https://github.com/lu4p/cat

Extract text from plaintext, .docx, .odt and .rtf files. Pure Go.

cat cross-platform docx2txt extract-text go golang odt2txt pdf2txt pdftotext rtf-to-text text-extraction textextracting

Last synced: 19 Dec 2024

https://github.com/iron-software/iron-ocr-image-to-text-in-csharp

Image to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/

csharp csharp-code imagetotext ocr pdftotext

Last synced: 09 Apr 2025

https://github.com/iron-software/Iron-OCR-Image-to-Text-in-CSharp

Image to Text Tutorial in C# - See https://ironsoftware.com/csharp/ocr/tutorials/how-to-read-text-from-an-image-in-csharp-net/

csharp csharp-code imagetotext ocr pdftotext

Last synced: 04 May 2025

https://github.com/ashutoshvarma/pyxpdf

Fast and memory-efficient Python PDF Parser based on xpdf sources

cython pdf pdf-converter pdf-parser pdfparser pdftohtml pdftopng pdftotext python xpdf xpdf-reader

Last synced: 12 Apr 2025

https://github.com/icaropires/pdf2dataset

Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features

data-science distributed-computing distributed-systems ocr pandas-dataframe parallel parquet pdf pdf2image pdftotext pyarrow pytesseract pytesseract-ocr python python3 ray tesseract tesseract-ocr

Last synced: 13 Apr 2025

https://github.com/yedhink/covid19-kerala-api-deprecated

Deprecated - A fast API service for retrieving day-to-day stats about the Coronavirus (COVID-19, SARS-CoV-2) outbreak in Kerala (India).

api coronavirus coronavirus-real-time coronavirus-tracking covid-19-india covid-data covid19 covid19-data covid19dataindia covid19datakerala covid19india covid19kerala gin pdftotext

Last synced: 06 Dec 2024

https://github.com/tecosaur/pdftotext.el

A mirror of https://git.tecosaur.net/tec/pdftotext.el

emacs emacs-pacakge mirror pdftotext

Last synced: 14 Feb 2025

https://github.com/andrealenzi11/py-poppleract

Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents

ocr optical-character-recognition pdf-reader pdf-splitting pdf-to-text pdf2text pdftotext poppler poppleract py-poppleract tesseract tesseract-ocr text-extraction

Last synced: 26 Mar 2025

https://github.com/tmsincomb/imagetocsv

Converts an image to a CSV. This exists because Chorus 3.0 is bat-shit and only shows images for vital metadata.

csv image2csv imagetocsv opencv pdftotext pytesseract python tesseract

Last synced: 17 Jan 2025

https://github.com/chanmo/docker-poppler

A simple RESTful API service for poppler

pdftocairo pdftohtml pdftoppm pdftotext poppler

Last synced: 12 May 2025

https://github.com/euyogi/projeto-anceu-cs50

My CS50 course project: a PDF analyzer that processes the scores of candidates approved through Acesso Enem and organizes everything. Now in C++

acesso-enem-unb cpp cs50 cs50course cs50x customtkinter enem exe extract-text-from-pdf imgui pdftotext portuguese-brazilian project python unb yogi zlib

Last synced: 14 Apr 2025

https://github.com/amitsuthar69/pdf2text

A PDF-to-text extractor web service written in Go.

golang pdftotext

Last synced: 20 Feb 2025

https://github.com/deardurham/ciprs-reader

Python library for reading CIPRS PDFs

codeforamerica coverage docker pdf pdftotext pytest python

Last synced: 28 Dec 2024

https://github.com/views63/pdf2text

pdf to text

pdf2text pdftotext rust

Last synced: 26 Mar 2025

https://github.com/drmccoy/pdftextorizer

Interactively extract text from multi-column PDFs

gui pdf pdf-extractor pdf-files pdf2text pdftotext pyqt5 qt5

Last synced: 27 Mar 2025

https://github.com/farhan0167/bankaiagent

A tool to convert bank statements into Excel files

bank-statement-parser detr object-detection ocr pdftotext tabledetection transformers

Last synced: 12 Jan 2025

https://github.com/bradsec/pdftotext

Client browser tool to extract text from a PDF file using PDF.js

pdf pdftotext

Last synced: 25 Feb 2025

https://github.com/fabriziosalmi/any-to-mp4

Convert any kind of file to video.

ffmpeg gtts imagemagick mp3 pdftotext php polly sox txt2wav video wikitts

Last synced: 09 Feb 2025

https://github.com/zbioe/grapnel

Repository with tools for converting a response body to plain text

pdf pdftotext tools

Last synced: 15 Mar 2025

https://github.com/joeychilson/pdftotext

A Go library for converting PDF files to text using the pdftotext utility.

go pdf pdftotext

Last synced: 07 Apr 2025

https://github.com/jefferis/paperutils

R package with utility functions to support preparation of journal articles

bibtex lyx pdftk pdftotext r

Last synced: 05 Mar 2025