https://github.com/shaukataliii/patwar-pdf-compressor
A FastAPI-based service that extracts images from PDF files and compresses them proportionally so that each image is under 300 KB and all images together are under 3 MB. The service returns the compressed images as a ZIP file.
land-document patwari pdf-compression pdf-compressor
- Host: GitHub
- URL: https://github.com/shaukataliii/patwar-pdf-compressor
- Owner: Shaukataliii
- Created: 2025-03-07T10:54:39.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-04-15T02:05:28.000Z (about 1 year ago)
- Last Synced: 2025-04-15T03:20:35.982Z (about 1 year ago)
- Topics: land-document, patwari, pdf-compression, pdf-compressor
- Language: Python
- Homepage:
- Size: 2.26 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
README
# 📄 PDF Image Extractor & Compressor 🚀
A FastAPI-based service to **extract and compress images** from PDFs while ensuring:
✅ Each image is **< 300 KB**
✅ Total compressed images are **< 3 MB**
✅ Adaptive, proportional compression for best quality
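One way to read "proportional compression" is as a byte budget per image, shrunk by a common factor when the total would be too large. The sketch below is only an illustration of that idea, not the repository's actual code: the function name `target_sizes`, the constants, and the choice of 1 KB = 1024 bytes are all assumptions.

```python
# Assumed limits (exclusive upper bounds); 1 KB is taken as 1024 bytes here.
PER_IMAGE_LIMIT = 300 * 1024
TOTAL_LIMIT = 3 * 1024 * 1024


def target_sizes(original_sizes, per_image_limit=PER_IMAGE_LIMIT,
                 total_limit=TOTAL_LIMIT):
    """Return one byte budget per image (hypothetical helper).

    Every budget stays strictly below per_image_limit, and the budgets
    sum to strictly less than total_limit.
    """
    max_each = per_image_limit - 1
    max_total = total_limit - 1

    # Step 1: cap each image individually; small images keep their size.
    capped = [min(size, max_each) for size in original_sizes]
    total = sum(capped)
    if total <= max_total:
        return capped

    # Step 2: the sum is still too large, so scale every budget by the
    # same factor. Integer arithmetic keeps the result exact.
    return [size * max_total // total for size in capped]
```

With these budgets, a large image gives up more bytes than a small one, but all images are reduced by the same ratio once the total limit becomes the binding constraint.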
---
## 🚀 Features
- **Extracts all images** from a PDF
- **Compresses images adaptively** to fit size constraints
- **Preserves as much image quality as possible** within the size limits
- **Returns a ZIP file** containing the compressed images
- **Fast & scalable** using FastAPI
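As a rough illustration of adaptive compression, the sketch below binary-searches for the highest quality setting whose encoded output stays under a byte limit. It is not taken from the repository: `best_quality` and the `encode` callback are hypothetical names, and the search assumes that output size does not shrink as quality rises.

```python
from typing import Callable, Optional, Tuple


def best_quality(encode: Callable[[int], bytes], max_bytes: int,
                 lo: int = 10, hi: int = 95) -> Optional[Tuple[int, bytes]]:
    """Find the highest quality in [lo, hi] with len(encode(q)) < max_bytes.

    Returns (quality, encoded_bytes), or None if even the lowest
    quality produces output that is too large.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)
        if len(data) < max_bytes:
            best = (mid, data)   # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1         # too large: try a lower quality
    return best


# Hypothetical usage with Pillow (not executed here):
#   from io import BytesIO
#   from PIL import Image
#   img = Image.open("page1.png").convert("RGB")
#   def encode(q):
#       buf = BytesIO()
#       img.save(buf, format="JPEG", quality=q)
#       return buf.getvalue()
#   result = best_quality(encode, 300 * 1024)
```

When `best_quality` returns `None`, lowering the quality alone is not enough, and a real implementation would likely need to downscale the image as well.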
---
## 🛠️ Installation
### **1️⃣ Clone the repository**
```sh
git clone https://github.com/Shaukataliii/patwar-pdf-compressor
cd patwar-pdf-compressor
```
---
### Create a virtual environment (optional but recommended)
```sh
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
### Install dependencies
```sh
pip install -r requirements.txt
```
## ▶️ Usage
### **1️⃣ Run the FastAPI server**
```sh
uvicorn main:app --reload
```
By default, the API runs at http://127.0.0.1:8000.
### **2️⃣ Use the API**
Go to http://127.0.0.1:8000/docs to access Swagger UI.
Use the `/process` endpoint to upload a PDF file.
#### Example API Call (cURL)
```sh
curl -X 'POST' 'http://127.0.0.1:8000/process' \
-H 'accept: application/zip' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@sample.pdf' \
--output compressed_images.zip
```
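The same request can be sent from Python. Below is a minimal sketch using only the standard library; `build_multipart` and `process_pdf` are hypothetical helper names, and the URL and field name `file` are taken from the cURL call above.

```python
import os
import urllib.request
import uuid


def build_multipart(field, filename, content, content_type="application/pdf"):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def process_pdf(pdf_path, out_path="compressed_images.zip",
                url="http://127.0.0.1:8000/process"):
    """Upload a PDF to /process and save the returned ZIP to out_path."""
    with open(pdf_path, "rb") as fh:
        body, ctype = build_multipart("file", os.path.basename(pdf_path), fh.read())
    request = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": ctype, "Accept": "application/zip"},
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

With the server running, `process_pdf("sample.pdf")` should write `compressed_images.zip` to the current directory.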
## 📚 API Documentation
### 🔹 POST /process
Uploads a PDF, extracts images, compresses them, and returns a ZIP file.
#### Request
- **Method:** POST
- **URL:** /process
- **Headers:** Content-Type: multipart/form-data
- **Body:**
  - `file` (PDF file)
#### Response
- **Success:** `200 OK` – Returns a ZIP file containing the compressed images.
- **Errors:**
  - `400 Bad Request` – If the file is not a PDF or has no images.
  - `500 Internal Server Error` – If an unknown error occurs.
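On the client side, a successful response can be checked against the stated limits. The sketch below is an assumption-laden illustration: `check_zip` is a hypothetical helper, 1 KB is taken as 1024 bytes, and the 3 MB limit is assumed to apply to the summed uncompressed image sizes rather than to the ZIP file itself.

```python
import io
import zipfile

# Assumed limits (exclusive), with 1 KB = 1024 bytes.
PER_IMAGE_LIMIT = 300 * 1024
TOTAL_LIMIT = 3 * 1024 * 1024


def check_zip(zip_bytes):
    """Return {entry_name: size_in_bytes} for a ZIP held in memory.

    Raises ValueError if any entry, or the sum of all entries,
    reaches the assumed limits.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        sizes = {
            info.filename: info.file_size   # uncompressed size of the entry
            for info in archive.infolist()
            if not info.is_dir()
        }
    too_big = [name for name, size in sizes.items() if size >= PER_IMAGE_LIMIT]
    if too_big:
        raise ValueError(f"images at or over the per-image limit: {too_big}")
    total = sum(sizes.values())
    if total >= TOTAL_LIMIT:
        raise ValueError(f"total image size {total} is at or over the limit")
    return sizes
```

For example, `check_zip(open("compressed_images.zip", "rb").read())` returns the per-image sizes when the archive satisfies both limits.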
## ⚙️ Tech Stack
- **Python** 🐍
- **FastAPI** 🚀
- **PyMuPDF (fitz)** 📄 (for extracting images)
- **Pillow (PIL)** 🎨 (for image processing)
## 🤝 Contributing
Feel free to contribute! Fork the repository and submit a pull request.
## 🐟 License
This project is licensed under the MIT License.