https://github.com/drisskhattabi6/pytesseract-ocr-for-image-and-pdfs

This repo contains an implementation of OCR for images and PDFs using Pytesseract and OpenCV.

Host: GitHub
URL: https://github.com/drisskhattabi6/pytesseract-ocr-for-image-and-pdfs
Owner: drisskhattabi6
Created: 2025-05-10T12:28:18.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-05-10T13:34:33.000Z (5 months ago)
Last Synced: 2025-05-10T14:37:03.181Z (5 months ago)
Language: Jupyter Notebook
Size: 11.1 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🧾 Pytesseract OCR for Images and PDFs

This repository demonstrates how to extract text from **images** and **PDF documents** using **Pytesseract**, a Python wrapper for Google's Tesseract-OCR Engine.

## 📌 Overview

The project showcases:

* How to apply OCR to **images** (JPEG, PNG, etc.)
* How to convert **PDFs** to images and extract text using OCR
* Code examples implemented in Jupyter Notebooks
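
The two workflows listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository's notebooks: the function names `ocr_image`, `ocr_pdf`, and `join_pages` are hypothetical, the grayscale step is just one common preprocessing choice, and the use of `pdf2image` (which needs Poppler installed) to rasterize PDF pages is an assumption, since the README does not say which converter the notebooks use.

```python
def ocr_image(image_path, lang="eng"):
    """Return the text Tesseract recognizes in one image file."""
    # Imported lazily so this module loads even where the OCR stack is absent.
    import cv2
    import pytesseract

    image = cv2.imread(str(image_path))
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # Grayscale conversion is a common, optional preprocessing step.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray, lang=lang)


def ocr_pdf(pdf_path, lang="eng", dpi=300):
    """Rasterize each PDF page, OCR it, and return one string per page."""
    # Assumption: pdf2image (backed by Poppler) does the PDF-to-image step.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    return [pytesseract.image_to_string(page, lang=lang) for page in pages]


def join_pages(page_texts):
    """Combine per-page text into one string with visible page markers."""
    parts = []
    for number, text in enumerate(page_texts, start=1):
        parts.append(f"--- page {number} ---\n{text.strip()}")
    return "\n\n".join(parts)
```

With Tesseract, OpenCV, and Poppler installed, something like `print(join_pages(ocr_pdf("document.pdf")))` would print the recognized text page by page; `document.pdf` is a placeholder path.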

## Example

Source text:

![](imgs/easy_text.png)

Detected text (with bounding boxes):

![](imgs/text_with_boxes.jpg)
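
An image like the one above could plausibly be produced by asking Pytesseract for word-level positions and drawing them with OpenCV. The sketch below is one way to do that and is not taken from the notebooks: `extract_word_boxes` and `draw_word_boxes` are hypothetical names, and the confidence threshold of 60 is an arbitrary illustrative value.

```python
def extract_word_boxes(data, min_conf=60.0):
    """Convert an image_to_data dict into (text, (x, y, w, h)) tuples.

    Entries with empty text or a confidence below min_conf are skipped.
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Confidence may arrive as int, float, or string depending on version.
        conf = float(data["conf"][i])
        if text.strip() and conf >= min_conf:
            box = (data["left"][i], data["top"][i],
                   data["width"][i], data["height"][i])
            boxes.append((text, box))
    return boxes


def draw_word_boxes(image_path, output_path, min_conf=60.0):
    """Run OCR on an image and save a copy with a rectangle around each word."""
    # Imported lazily so extract_word_boxes stays usable without the OCR stack.
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(str(image_path))
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # OpenCV loads BGR; hand Tesseract an RGB copy.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    data = pytesseract.image_to_data(rgb, output_type=Output.DICT)
    for _text, (x, y, w, h) in extract_word_boxes(data, min_conf):
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite(str(output_path), image)
```

Keeping the filtering in `extract_word_boxes` separate from the drawing makes it possible to inspect or test the detected boxes without Tesseract installed.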
