Projects in Awesome Lists tagged with image-extraction
A curated list of projects in awesome lists tagged with image-extraction.
https://github.com/yfedoseev/pdf_oxide
The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.
data-extraction document-processing fast image-extraction llm markdown pdf pdf-editor pdf-generation pdf-library pdf-parser pdf-to-markdown pdf-to-text pyo3 python rag rust text-extraction
Last synced: 13 May 2026
https://github.com/flairnlp/fundus
A very simple news crawler with a funny name
cc-news commoncrawl corpus corpus-tools crawler datasets image-classification image-extraction news-crawler news-scraping nlp python rss scraper sitemap text-extraction web-corpus web-scraping
Last synced: 08 Jan 2026
https://github.com/developer0hye/anytomd-rs
Pure Rust document-to-Markdown converter for LLM workflows (DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, images).
anytomd content-extraction converter csv docx html image-extraction json llm markdown pptx rust text-processing xlsx xml
Last synced: 31 May 2026
https://github.com/landonikko/pilko-frame-capture-studio
A fully local web tool for extracting screenshots from videos in many different ways.
client-side css frame-extraction html image-extraction javascript jszip screenshot-generation screenshot-tool secure-by-default timecode-component video-processing video-tools web-application
Last synced: 28 Feb 2026
https://github.com/pavansomisetty21/image-emotion-detection-by-using-llms-and-emotion-analysis-technique-
Extracts a description from an image supplied as a URL, then uses an emotion analysis technique on that description to determine the emotion of the image.
image image-captioning image-classification image-emotion-classification image-extraction
Last synced: 05 Jul 2025
https://github.com/sourceduty/pdf_image_extractor
🖼️ Extract images from PDF files.
ai ai-image-extraction artificial-intelligence chatgpt custom-gpt gpt gpt-store gpts image image-extraction image-tool images openai pdf pdf-image pdf-image-extractor pdf-images
Last synced: 08 Aug 2025
https://github.com/setasign/setapdf-imageextractor
PoC to extract images from PDFs with SetaPDF-Core
image-extraction pdf php setapdf
Last synced: 04 Jul 2025
https://github.com/hreikin/pdf-toolbox
Extract content from PDFs, then convert it or create new documents from it in multiple output formats.
adobe document-conversion document-converter document-creation document-creator document-extraction image-extraction pandoc pymupdf pypandoc python python3 scrapy text-extraction
Last synced: 09 Jul 2025
https://github.com/m-ah07/pdf-to-images-conversion-php
A lightweight PHP service for converting PDF files into images using pdftoppm. Supports PNG output and generates images for each page in the PDF.
image-extraction image-processing open-source open-source-php pdf-conversion pdf-to-image pdf-tools php-library php-scripts php-tools
Last synced: 13 Feb 2026
https://github.com/joelstephen97/tracr
Recursively searches for images, starting from a given URL.
beautifulsoup4 image-extraction python3
Last synced: 28 Mar 2025
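As a rough illustration of searching for images outward from a starting URL, here is a minimal sketch. It is not tracr's code: it swaps in the standard-library `html.parser` for beautifulsoup4, walks pages iteratively (breadth-first) with a depth limit rather than by recursive calls, and the names `find_images`, `fetch`, and `max_depth` are hypothetical. The `fetch` argument is any callable that returns a page's HTML for a URL.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkAndImageParser(HTMLParser):
    """Collects <a href> and <img src> values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])


def find_images(start_url, fetch, max_depth=1):
    """Return absolute image URLs found within max_depth links of start_url.

    fetch is a hypothetical callable: fetch(url) -> HTML string.
    """
    seen = {start_url}          # pages already queued, to avoid loops
    images = []                 # image URLs in discovery order, no repeats
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        parser = _LinkAndImageParser()
        parser.feed(fetch(url))
        for src in parser.images:
            absolute = urljoin(url, src)
            if absolute not in images:
                images.append(absolute)
        if depth < max_depth:
            for href in parser.links:
                target = urljoin(url, href)
                if target not in seen:
                    seen.add(target)
                    queue.append((target, depth + 1))
    return images
```

Passing `fetch` in as a parameter keeps the traversal separate from networking, so the same function can be driven by a real HTTP client or by a dictionary of canned pages.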
https://github.com/m-ah07/pdf-to-images-conversion-python
A lightweight Python service for converting PDF files into images using pdftoppm. It generates one PNG image per page in the PDF.
file-conversion image-extraction image-processing open-source open-source-python pdf-to-image pdf-tools python-library python-scripts python-utilities
Last synced: 12 Feb 2026
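For readers unfamiliar with the pdftoppm approach mentioned in the entries above, the following is a minimal sketch of rendering one PNG per page from Python. It is not taken from any listed repository: the function names, the `dpi` default, and the output layout are assumptions for illustration. It relies only on the `-png` and `-r` options of the pdftoppm command-line tool (part of Poppler), which must be installed separately.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Return the argument list that renders each PDF page to a PNG file."""
    # -png selects PNG output; -r sets the resolution in DPI.
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def convert_pdf_to_png(pdf_path, output_dir, dpi=150):
    """Run pdftoppm and return the PNG files it wrote, sorted by name."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # pdftoppm appends a page number to this prefix for every page.
    prefix = output_dir / Path(pdf_path).stem
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    return sorted(output_dir.glob(f"{prefix.name}-*.png"))
```

Building the argument list in its own function avoids shell quoting problems and lets the command be inspected without running pdftoppm.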