https://github.com/edujbarrios/pdfreader-with-moondreamai

A simple yet useful tool to read PDF covers and understand them to give brief explanations about the PDF content

# 📄 PDFanalyzer with MoondreamAPI

> An intelligent PDF cover page analyzer powered by MoondreamAPI and Streamlit

![Another preview](images/another_preview.png)

## 🌟 Features

- 📊 Extract cover pages from PDF documents
- 🔍 Detailed visual analysis of cover pages
- 🤖 AI-powered content recognition
- 💫 Real-time processing and results
- 🎨 Clean and intuitive user interface
- 🧪 Comprehensive unit testing

## 🚀 Getting Started

1. Clone the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Get your Moondream API key
4. Run the application:
```bash
streamlit run app.py
```

## 🛠️ Technologies Used

- Streamlit - Web application framework
- MoondreamAPI - AI-powered image analysis
- PyMuPDF - PDF processing
- Pillow - Image handling

## 📁 Project Structure

```
├── .streamlit/             # Streamlit configuration files (generated the first time you execute the program)
├── images/                 # Application screenshots and assets
├── prompts/                # AI analysis prompt templates
├── responses/              # JSON files storing analysis results
├── unit_tests/             # Unit tests and test results
│   ├── test_utils.py       # Test cases for utility functions
│   ├── run_tests.py        # Test runner with JSON reporting
│   └── test_results.json   # Detailed test execution results
├── utils.py                # Utility functions and helpers
└── app.py                  # Main application file
```
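
The `responses/` directory is described as holding JSON files with analysis results. A minimal sketch of how such a file might be written follows. The function name `save_response`, the field names, and the timestamped filename scheme are all assumptions, not the project's actual format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_response(pdf_name: str, analysis: str, out_dir: str = "responses") -> Path:
    """Write one analysis result to a timestamped JSON file and return its path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    now = datetime.now(timezone.utc)
    record = {
        "pdf_name": pdf_name,            # assumed field name
        "analysis": analysis,            # assumed field name
        "created_at": now.isoformat(),   # assumed field name
    }
    # Example filename: book_20240101T120000.json (assumed naming scheme)
    path = folder / f"{Path(pdf_name).stem}_{now:%Y%m%dT%H%M%S}.json"
    path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```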

## 🔑 Configuration

1. Enter your Moondream API key in the application
2. Start analyzing PDF cover pages
3. Customize the analysis prompt in `prompts/prompt.md`
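
Since the prompt lives in `prompts/prompt.md`, the application presumably reads that file before each analysis. One way this could look, with a fallback when the file is missing or blank, is sketched below. `load_prompt` and `DEFAULT_PROMPT` are hypothetical names, and the fallback text is invented for the example.

```python
from pathlib import Path

# Assumed fallback text; the project's real default prompt is not shown in this README
DEFAULT_PROMPT = "Describe this PDF cover page and summarize what the document is about."


def load_prompt(path: str = "prompts/prompt.md") -> str:
    """Return the prompt template from disk, or a default if it is missing or blank."""
    prompt_file = Path(path)
    if prompt_file.is_file():
        text = prompt_file.read_text(encoding="utf-8").strip()
        if text:
            return text
    return DEFAULT_PROMPT
```

Reading the file on every analysis rather than once at startup would let prompt edits take effect without restarting the app.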

## 🧪 Testing

Run the unit tests:
```bash
python unit_tests/run_tests.py
```

Test results will be saved in `unit_tests/test_results.json` with detailed execution information.

## 📝 Contributing

Feel free to contribute to this project! Open an issue or submit a pull request.

To-dos:

- Improve the prompts to get more detailed descriptions of PDFs
- Find an efficient way to analyze the whole content of a PDF without spending too many API calls, e.g. understanding the context of a 300-page book without making one API call per page
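
One possible direction for the second item, offered only as a sketch: cap the number of API calls by analyzing a fixed budget of evenly spaced pages instead of every page. `sample_pages` is a hypothetical helper, and whether such a sample is representative enough for a given book remains an open question.

```python
def sample_pages(page_count: int, max_calls: int) -> list[int]:
    """Pick at most `max_calls` evenly spaced 0-based page indices.

    The cover (index 0) is always included when anything is selected.
    """
    if page_count <= 0 or max_calls <= 0:
        return []
    if page_count <= max_calls:
        return list(range(page_count))  # budget covers every page
    if max_calls == 1:
        return [0]  # only the cover fits in the budget
    # Spread the budget from the first page to the last page
    step = (page_count - 1) / (max_calls - 1)
    return [round(i * step) for i in range(max_calls)]
```

With this approach a 300-page book and a budget of 5 calls would cost 5 requests instead of 300, at the price of skipping most pages.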

## 👨‍💻 Author

Created with ❤️ by [Eduardo Jose Barrios Garcia](https://edujbarrios.com) ([@edujbarrios](https://github.com/edujbarrios))

## 📜 License

This project is licensed under the MIT License - see the LICENSE file for details.