An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with image-extraction

A curated list of projects in awesome lists tagged with image-extraction.

https://github.com/yfedoseev/pdf_oxide

The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.

data-extraction document-processing fast image-extraction llm markdown pdf pdf-editor pdf-generation pdf-library pdf-parser pdf-to-markdown pdf-to-text pyo3 python rag rust text-extraction

Last synced: 13 May 2026

https://github.com/developer0hye/anytomd-rs

Pure Rust document-to-Markdown converter for LLM workflows (DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, images).

anytomd content-extraction converter csv docx html image-extraction json llm markdown pptx rust text-processing xlsx xml

Last synced: 31 May 2026

https://github.com/pavansomisetty21/image-emotion-detection-by-using-llms-and-emotion-analysis-technique-

This project extracts a description from an image supplied as a URL, then applies an emotion analysis technique to that description to infer the emotion of the image.

image image-captioning image-classification image-emotion-classification image-extraction

Last synced: 05 Jul 2025

https://github.com/setasign/setapdf-imageextractor

PoC to extract images from PDFs with SetaPDF-Core

image-extraction pdf php setapdf

Last synced: 04 Jul 2025

https://github.com/hreikin/pdf-toolbox

Extract content from PDFs, then convert it or create new documents from it in multiple output formats.

adobe document-conversion document-converter document-creation document-creator document-extraction image-extraction pandoc pymupdf pypandoc python python3 scrapy text-extraction

Last synced: 09 Jul 2025

https://github.com/m-ah07/pdf-to-images-conversion-php

A lightweight PHP service for converting PDF files into images using pdftoppm. Supports PNG output and generates images for each page in the PDF.

image-extraction image-processing open-source open-source-php pdf-conversion pdf-to-image pdf-tools php-library php-scripts php-tools

Last synced: 13 Feb 2026

https://github.com/joelstephen97/tracr

Useful for recursively searching for images, given a starting URL.

beautifulsoup4 image-extraction python3

Last synced: 28 Mar 2025
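
To illustrate the kind of recursive image search the tracr entry describes, here is a minimal sketch. It is not taken from that repository: it swaps beautifulsoup4 for the standard library's html.parser so the example has no dependencies, and the names crawl_images, fetch, and max_depth are hypothetical. It assumes the caller supplies a fetch function that returns a page's HTML as a string, and it only follows links on the starting host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkAndImageParser(HTMLParser):
    """Collect raw href values from <a> tags and src values from <img> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])


def crawl_images(start_url, fetch, max_depth=2):
    """Return sorted absolute image URLs reachable from start_url.

    fetch is a caller-supplied function (an assumption of this sketch)
    that maps a URL to its HTML text.
    """
    host = urlparse(start_url).netloc
    seen, images = set(), set()

    def visit(url, depth):
        # Stop on pages already visited or beyond the depth limit.
        if url in seen or depth > max_depth:
            return
        seen.add(url)
        parser = _LinkAndImageParser()
        parser.feed(fetch(url))
        # Resolve relative image paths against the current page.
        images.update(urljoin(url, src) for src in parser.images)
        for href in parser.links:
            target = urljoin(url, href)
            # Stay on the starting host so the crawl stays bounded.
            if urlparse(target).netloc == host:
                visit(target, depth + 1)

    visit(start_url, 0)
    return sorted(images)
```

For real pages, fetch could wrap urllib.request.urlopen and decode the response body; passing it in as a parameter keeps the traversal logic testable without network access.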

https://github.com/m-ah07/pdf-to-images-conversion-python

A lightweight Python service for converting PDF files into images using pdftoppm. It generates one PNG image per page in the PDF.

file-conversion image-extraction image-processing open-source open-source-python pdf-to-image pdf-tools python-library python-scripts python-utilities

Last synced: 12 Feb 2026
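
As a rough illustration of the pdftoppm approach the entry above describes, the following sketch shells out to pdftoppm to render one PNG per page. It is not the repository's code: the function names and the 150 DPI default are hypothetical, and it assumes pdftoppm (from poppler-utils) is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Return the argument list for rendering every page of a PDF to PNG."""
    # -png selects PNG output; -r sets the rendering resolution in DPI.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def pdf_to_pngs(pdf_path, output_dir, dpi=150):
    """Render each page of pdf_path into output_dir and return the PNG paths."""
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm not found; install poppler-utils")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    prefix = out / Path(pdf_path).stem
    # check=True raises CalledProcessError if pdftoppm exits with an error.
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # pdftoppm writes files named <prefix>-<page>.png.
    return sorted(out.glob(f"{prefix.name}-*.png"))
```

Keeping the command construction in its own function makes the arguments easy to inspect or log before anything is executed.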