https://github.com/sreejabethu/smart-report-analyzer
An AI-powered app to analyze and summarize Excel, CSV, and PDF reports using Hugging Face language models. Built with Streamlit.
data-analysis huggingface llm nlp pdf-analysis python question-answering streamlit summarization
- Host: GitHub
- URL: https://github.com/sreejabethu/smart-report-analyzer
- Owner: SreejaBethu
- Created: 2025-03-27T16:02:32.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-27T16:41:16.000Z (about 1 year ago)
- Last Synced: 2025-03-27T17:23:17.679Z (about 1 year ago)
- Topics: data-analysis, huggingface, llm, nlp, pdf-analysis, python, question-answering, streamlit, summarization
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Smart Report Analyzer (Hugging Face + Streamlit)
Smart Report Analyzer is a web app built with **Streamlit** and powered by **Hugging Face models**. It summarizes and analyzes structured data (Excel/CSV) and unstructured text (PDF), and it can answer natural-language questions about the uploaded content.

---
## Features
- Upload **PDF**, **CSV**, or **Excel** files
- Automatic **data preview** and **exploratory data analysis (EDA)**
- Generate smart **summaries** from structured or unstructured content
- Ask natural language questions and get answers powered by **Hugging Face LLMs**
- Dynamic visualizations with Plotly
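
The upload step implies some dispatch on file type. The sketch below shows one way that could look; it is not taken from the repository's `utils/file_handler.py`. The names `LOADERS`, `detect_kind`, and `load_report` are hypothetical, and the third-party imports are deferred so the extension check works without them installed.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the kind of loader needed.
LOADERS = {".csv": "csv", ".xlsx": "excel", ".pdf": "pdf"}


def detect_kind(filename: str) -> str:
    """Classify an uploaded file by its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported file type: {filename}")
    return LOADERS[suffix]


def load_report(path: str):
    """Return a DataFrame for CSV/Excel files, or extracted text for PDFs."""
    kind = detect_kind(path)
    if kind == "csv":
        import pandas as pd  # imported lazily; only needed for tabular files
        return pd.read_csv(path)
    if kind == "excel":
        import pandas as pd
        return pd.read_excel(path)  # pandas uses openpyxl to read .xlsx
    import pdfplumber  # imported lazily; only needed for PDFs
    with pdfplumber.open(path) as pdf:
        # extract_text() may yield nothing for pages without a text layer
        return "\n".join(page.extract_text() or "" for page in pdf.pages)
```

Returning a DataFrame for tabular files and a plain string for PDFs keeps the later steps simple: EDA and table Q&A take the DataFrame, while summarization and text Q&A take the string.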
---
## Tech Stack
- [Streamlit](https://streamlit.io/)
- [Transformers (Hugging Face)](https://huggingface.co/docs/transformers/index)
- PyTorch
- Pandas, Plotly, pdfplumber, openpyxl
---
## Project Structure

    Smart-Report-Analyzer/
    ├── app.py                  # Main Streamlit app
    ├── requirements.txt
    ├── .env                    # Contains your Hugging Face token (not to be pushed)
    ├── utils/
    │   ├── file_handler.py     # Handles file upload and parsing
    │   ├── eda.py              # Data visualization (EDA)
    │   └── llm_agent.py        # LLM summarization and Q&A logic
    ├── sample_sales_data.xlsx  # Test Excel file (optional)
    ├── sample_report.pdf       # Test PDF file (optional)
    └── README.md

---
## Setup Instructions
### 1. Clone the Repository

    git clone https://github.com/sreejabethu/smart-report-analyzer.git
    cd smart-report-analyzer

### 2. Create and Activate Virtual Environment

    python -m venv venv
    venv\Scripts\activate      # On Windows
    # OR
    source venv/bin/activate   # On Mac/Linux

### 3. Install Dependencies

    pip install -r requirements.txt

### 4. Create .env File
Create a `.env` file in the root directory and add your Hugging Face token:

    HUGGINGFACE_HUB_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxx

You can generate a free token from your Hugging Face account settings:
https://huggingface.co/settings/tokens
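
How the app reads the token is not shown in this README. Many projects use the `python-dotenv` package for this, but the sketch below sticks to the standard library so it stays self-contained. The helper names `read_env_file` and `get_hf_token` are hypothetical, and the parser only handles plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    name = "HUGGINGFACE_HUB_TOKEN"
    token = os.environ.get(name) or read_env_file(path).get(name)
    if not token:
        raise RuntimeError(f"{name} is not set; add it to your .env file.")
    return token
```

Checking the process environment first means the same code can run on a hosted deployment, where the token is usually supplied as a secret rather than a file.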
### Run the App

    streamlit run app.py

The app will open in your browser at http://localhost:8501.

---
## Sample Files
Use the included sample files to test the app:
- `sample_sales_data.xlsx`: structured Excel data
- `sample_report.pdf`: business-style unstructured report
## Models Used
| Purpose | Model |
| --- | --- |
| Summarization | knkarthick/MEETING_SUMMARY |
| Table Q&A | google/tapas-large-finetuned-wtq |
| PDF/Text Q&A | google/flan-t5-small |
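
The model list can be expressed as a small registry keyed by Transformers pipeline task. This is a sketch under assumptions, not the code in `utils/llm_agent.py`. It assumes a Transformers 4.x release where these task names are available, and that the Flan-T5 model is driven through a text-to-text pipeline. `MODELS`, `build_pipeline`, `stringify_table`, and `format_qa_prompt` are hypothetical names.

```python
# Pipeline task name -> model ID from the "Models Used" list.
MODELS = {
    "summarization": "knkarthick/MEETING_SUMMARY",
    "table-question-answering": "google/tapas-large-finetuned-wtq",
    "text2text-generation": "google/flan-t5-small",  # assumed task for PDF/text Q&A
}


def build_pipeline(task: str):
    """Create a Hugging Face pipeline for one of the registered tasks."""
    if task not in MODELS:
        raise ValueError(f"Unknown task: {task}")
    from transformers import pipeline  # lazy import: loading it is slow
    return pipeline(task, model=MODELS[task])


def stringify_table(columns: dict) -> dict:
    """TAPAS expects every table cell as a string, so convert values up front."""
    return {name: [str(v) for v in values] for name, values in columns.items()}


def format_qa_prompt(question: str, context: str) -> str:
    """Build a plain instruction prompt for a text-to-text model (illustrative wording)."""
    return (
        "Answer the question using only the context.\n\n"
        f"Context: {context}\n\nQuestion: {question}"
    )
```

A table question would then go through `build_pipeline("table-question-answering")` with `table=stringify_table(...)` and `query="..."`. The first call downloads the model weights, so it needs network access and some time.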
## Important Notes
Never commit your `.env` file or token to GitHub. To keep Git from tracking it, add `.env` to `.gitignore`:

    .env

## TODOs / Future Improvements
- Add support for larger files or chunked analysis
- Upload multiple files for comparison
- Deploy to Streamlit Cloud or Hugging Face Spaces
- Improve accuracy using retrieval-augmented generation (RAG)
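
For the chunked-analysis item, one possible starting point is to split long text into overlapping word windows, summarize each window, and then combine the partial summaries. The sketch below covers only the splitting step. `chunk_text` is a hypothetical helper, and word counts are a rough stand-in for the model's real token limit.

```python
def chunk_text(text: str, max_words: int = 300, overlap: int = 30) -> list[str]:
    """Split text into windows of up to max_words words that share `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of processing a few words twice.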
## License
MIT License: free to use and modify.
## Author
Created by Sreeja Bethu
[LinkedIn](https://linkedin.com/in/sreejabethu)