https://github.com/syaagalib/project_ocr

This code performs OCR.

ocr-python python streamlit ui

Last synced: about 1 year ago

Host: GitHub
URL: https://github.com/syaagalib/project_ocr
Owner: SYAAGalib
Created: 2024-09-19T07:46:32.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2025-06-12T14:20:30.000Z (about 1 year ago)
Last Synced: 2025-06-12T15:32:03.453Z (about 1 year ago)
Topics: ocr-python, python, streamlit, ui
Language: Python
Homepage:
Size: 1.23 MB
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# PDF to Text

[![Open in Streamlit](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://share.streamlit.io/nainiayoub/pdf-text-data-extractor/main/app.py)

![visitor badge](https://visitor-badge.glitch.me/badge?page_id=nainiayoub.pdf-text-data-extractor)

![forks badge](https://img.shields.io/github/forks/nainiayoub/pdf-text-data-extractor)

![stars badge](https://img.shields.io/github/stars/nainiayoub/pdf-text-data-extractor?style=social)

A PDF text extraction app that takes a PDF document as input and returns either a single txt file containing all pages or a compressed folder of txt files, one per page. OCR can also be enabled for scanned documents.

![pdf_text_image](https://user-images.githubusercontent.com/50157142/214037439-448fafb8-5363-46cb-849e-6132f9bc0fb2.PNG)

## How does it work?

```mermaid
flowchart LR
A[PDF] --> |text conversion / OCR| B(Text)
B --> |Option 1| D[txt file]
B --> |Option 2| E[ZIP folder of txt files for pages]
```

1. Upload your PDF.

2. Enable OCR (for scanned documents).

3. Select the PDF language.

4. Download your output file (zip/txt).
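
The two output options from the flowchart can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the app's actual code: the function names `pages_to_txt` and `pages_to_zip`, the `page_N.txt` naming scheme, and the sample page texts are all assumptions, and the text extraction / OCR step is replaced by a placeholder list of strings.

```python
import io
import zipfile


def pages_to_txt(pages: list[str]) -> bytes:
    """Option 1 (hypothetical helper): join all pages into one txt payload."""
    return "\n\n".join(pages).encode("utf-8")


def pages_to_zip(pages: list[str], prefix: str = "page") -> bytes:
    """Option 2 (hypothetical helper): one txt file per page in an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for number, text in enumerate(pages, start=1):
            # File naming is an assumption made for this sketch.
            archive.writestr(f"{prefix}_{number}.txt", text)
    return buffer.getvalue()


# Placeholder page texts standing in for the text conversion / OCR result.
pages = ["First page text", "Second page text"]
txt_bytes = pages_to_txt(pages)
zip_bytes = pages_to_zip(pages)
```

Both helpers return bytes, so either result could be handed to whatever download mechanism the UI provides.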

## How to support the project

You can support the project by giving feedback and/or [buying me a coffee](https://www.buymeacoffee.com/nainiayoub).
