https://github.com/punitkmryh/tablesense-ai-rag-application
Chatbot for scraped Table
agentic-ai gemma3 generative-ai groq huggingface langchain python3 rag-chatbot
- Host: GitHub
- URL: https://github.com/punitkmryh/tablesense-ai-rag-application
- Owner: punitkmryh
- Created: 2025-04-21T05:09:09.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-04-21T12:34:29.000Z (6 months ago)
- Last Synced: 2025-04-21T13:38:59.721Z (6 months ago)
- Topics: agentic-ai, gemma3, generative-ai, groq, huggingface, langchain, python3, rag-chatbot
- Language: Python
- Homepage: https://tablesense-ai-rag-application.streamlit.app/
- Size: 2.62 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# RAG-Application - Scraping-URL-Tables
## Building a **LangChain-powered Table Extractor + LLM Q&A + Downloadable CSV/Excel Generator** deployed on **Streamlit Cloud**.
Here's exactly what we’ll do:
---
### ✅ What This Project Does
1. 🔗 **Takes a user-specified URL** (with tables in HTML)
2. 🧠 Extracts all **HTML tables** using **BeautifulSoup + Pandas**
3. 💬 Chains the **tables with an LLM (via Groq)** for Q&A via RAG
4. 📂 Lets the user:
   - Ask specific queries about the table
   - Select table(s) to **download as CSV or Excel**
5. 🚀 Fully deployable on **Streamlit Cloud**
6. 💾 Uses **FAISS + HuggingFace Embeddings** for vector storage

---
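The extraction and download steps listed above can be sketched roughly as follows. This is a minimal illustration and not the code in `app.py`: the helper names `extract_tables` and `to_csv_bytes` and the sample HTML are hypothetical, and the sketch assumes each table is regular, with its header cells in the first row.

```python
import pandas as pd
from bs4 import BeautifulSoup


def extract_tables(html: str) -> list[pd.DataFrame]:
    """Return one DataFrame per <table> found in the HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    frames = []
    for table in soup.find_all("table"):
        # Collect the text of every header/data cell, row by row.
        rows = [
            [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
            for tr in table.find_all("tr")
        ]
        rows = [row for row in rows if row]  # drop rows without cells
        if len(rows) < 2:
            continue  # need a header row plus at least one data row
        header, *body = rows  # assumption: the first row holds the column names
        frames.append(pd.DataFrame(body, columns=header))
    return frames


def to_csv_bytes(frame: pd.DataFrame) -> bytes:
    """Serialize a table to CSV bytes, e.g. as the payload of a download button."""
    return frame.to_csv(index=False).encode("utf-8")


# Made-up sample input, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Oslo</td><td>700000</td></tr>
  <tr><td>Bergen</td><td>290000</td></tr>
</table>
"""

tables = extract_tables(SAMPLE_HTML)
csv_payload = to_csv_bytes(tables[0])
```

In a Streamlit app, bytes like `csv_payload` could be handed to `st.download_button`. An Excel export would go through `DataFrame.to_excel` in the same way, which needs an engine such as `openpyxl` installed.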
## 📁 Project Structure
```
table-extractor-rag/
│
├── app.py # Streamlit app
├── requirements.txt
├── .env # Contains GROQ API Key
├── vectorstore/ # Stores FAISS index (after ingestion)
│ └── faiss_index/
```

---
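The `.env` file in the tree above holds the Groq API key. One possible way to read it at startup is sketched below, using only the standard library as a stand-in for a package such as `python-dotenv`. The variable name `GROQ_API_KEY` and both helper names are assumptions; the project may load its key differently.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_groq_api_key(env_path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    # GROQ_API_KEY is an assumed variable name, not confirmed by the project.
    key = os.environ.get("GROQ_API_KEY") or load_env_file(env_path).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set in the environment or in .env")
    return key
```

Checking the process environment first means the same code can run in a hosted deployment, where secrets are usually injected as environment variables and no `.env` file is committed to the repository.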