https://github.com/aditya-ranjan1234/finquery
Smart AI Assistant for Financial Document Retrieval
- Host: GitHub
- URL: https://github.com/aditya-ranjan1234/finquery
- Owner: Aditya-Ranjan1234
- Created: 2025-08-03T08:43:32.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-08-04T14:12:21.000Z (2 months ago)
- Last Synced: 2025-08-04T14:17:12.124Z (2 months ago)
- Language: Python
- Homepage: https://fin-query.vercel.app
- Size: 4.99 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# HackRx 6.0 – LLM Document Processing System
An end-to-end, well-documented Python project that demonstrates how to build an **LLM-powered query–retrieval and decision engine** over large, unstructured documents (PDFs, Word, e-mails).
## Features
1. **Multi-format ingestion** – loaders for PDF, DOCX, and E-mail files extract text + metadata.
2. **Semantic vector store** – documents are chunked, embedded via `sentence-transformers`, and indexed with FAISS for fast nearest-neighbour search.
3. **Natural-language query parser** – rule-based + LLM fallback extracts structured fields (age, procedure, location, policy age, …).
4. **Decision engine** – pluggable logic evaluates retrieved clauses and returns JSON containing:
```json
{
  "decision": "approved | rejected",
  "amount": 12345.67,
  "justification": "Text explanation …",
  "clauses": [ {"id": "…", "text": "…"} ]
}
```
5. **CLI** – `python -m hackrx_llm --docs /path/to/folder --ask "46M knee surgery Pune 3-month policy"`.
6. **Extensible** – swap embeddings, LLM provider, or decision logic.
7. **Test-driven** – `pytest` covers parsing and retrieval.
## Quick-start
```bash
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Build vector store (runs ingestion automatically)
python -m hackrx_llm ingest --docs sample_docs
# Ask a question
python -m hackrx_llm ask --query "46-year-old male, knee surgery in Pune, 3-month policy" --top_k 5
```
If an **OpenAI** key is present (`export OPENAI_API_KEY=…`), the parser will enrich/validate fields via GPT automatically; otherwise, rule-based extraction is used.
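The rule-based path can be illustrated with a few regular expressions. The following is a minimal sketch, not the project's actual `parser.py`: `ParsedQuery`, `parse_query`, and the two small vocabularies are hypothetical names introduced here, and a standard-library dataclass stands in for the Pydantic models so the snippet has no dependencies.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedQuery:
    """Hypothetical container for the fields the README lists."""
    age: Optional[int] = None
    gender: Optional[str] = None
    procedure: Optional[str] = None
    location: Optional[str] = None
    policy_age_months: Optional[int] = None


# Assumed toy vocabularies; a real parser would need much larger ones.
KNOWN_PROCEDURES = ("knee surgery", "hip replacement", "cataract surgery")
KNOWN_LOCATIONS = ("pune", "mumbai", "delhi")

# "46M" or "46 F": age and gender in shorthand.
AGE_GENDER_SHORT = re.compile(r"\b(\d{1,3})\s*([MF])\b", re.IGNORECASE)
# "46-year-old" or "46 year old".
AGE_LONG = re.compile(r"\b(\d{1,3})[- ]year[- ]old\b", re.IGNORECASE)
GENDER_WORD = re.compile(r"\b(male|female)\b", re.IGNORECASE)
# "3-month policy", "2 year old policy".
POLICY_AGE = re.compile(
    r"\b(\d+)[- ]?(month|year)s?(?:[- ]old)?\s+policy\b", re.IGNORECASE
)


def parse_query(text: str) -> ParsedQuery:
    """Extract structured fields from a free-text query using rules only."""
    result = ParsedQuery()
    lowered = text.lower()

    short = AGE_GENDER_SHORT.search(text)
    if short:
        result.age = int(short.group(1))
        result.gender = short.group(2).upper()
    else:
        long_age = AGE_LONG.search(text)
        if long_age:
            result.age = int(long_age.group(1))
        word = GENDER_WORD.search(text)
        if word:
            result.gender = word.group(1)[0].upper()

    policy = POLICY_AGE.search(text)
    if policy:
        count = int(policy.group(1))
        is_years = policy.group(2).lower() == "year"
        result.policy_age_months = count * 12 if is_years else count

    # Simple substring lookup against the assumed vocabularies.
    result.procedure = next((p for p in KNOWN_PROCEDURES if p in lowered), None)
    found = next((c for c in KNOWN_LOCATIONS if c in lowered), None)
    result.location = found.title() if found else None
    return result


if __name__ == "__main__":
    print(parse_query("46M knee surgery Pune 3-month policy"))
```

Fields that no rule matches stay `None`, which gives an LLM fallback a clear signal about what still needs to be filled in.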
## Project Structure
```
├── hackrx_llm/              ← Library package
│   ├── ingestion/           ← PDF, Word, e-mail loaders
│   ├── parser.py            ← Query → structured data
│   ├── retriever.py         ← Vector store + semantic search
│   ├── decision_engine.py
│   ├── schema.py            ← Pydantic models
│   ├── cli.py               ← Typer CLI
│   └── __init__.py
├── tests/                   ← Unit tests (pytest)
├── requirements.txt
└── README.md                ← You are here
```
## Design Diagram
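Read as code, the flow in the diagram below is a retriever call feeding a decision call, with the structured query passed to both. The following is a minimal sketch under loud assumptions: `retrieve`, `decide`, `answer`, and the sample clauses are hypothetical, word overlap stands in for the FAISS vector search, and the approval rule is a toy. Only the shape of the JSON response follows the README.

```python
import json
import re
from typing import Dict, List

Clause = Dict[str, str]

# Hypothetical clauses standing in for chunks produced by ingestion.
CLAUSES: List[Clause] = [
    {"id": "C1", "text": "Knee surgery is covered after a waiting period of 2 months."},
    {"id": "C2", "text": "Cosmetic procedures are excluded from coverage."},
]


def retrieve(structured: dict, clauses: List[Clause], top_k: int = 5) -> List[Clause]:
    """Retriever stage: rank clauses by word overlap with the procedure.

    This is a stand-in for embedding-based nearest-neighbour search.
    """
    terms = set((structured.get("procedure") or "").lower().split())
    scored = [
        (len(terms & set(re.findall(r"\w+", clause["text"].lower()))), clause)
        for clause in clauses
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [clause for score, clause in ranked if score > 0][:top_k]


def decide(structured: dict, relevant: List[Clause]) -> dict:
    """Decision stage: toy rule that approves when any clause matched."""
    procedure = structured.get("procedure") or "the requested procedure"
    if relevant:
        decision = "approved"
        justification = f"Found {len(relevant)} clause(s) mentioning {procedure}."
    else:
        decision = "rejected"
        justification = f"No clause mentioning {procedure} was found."
    return {
        "decision": decision,
        "amount": None,  # Not computed in this sketch.
        "justification": justification,
        "clauses": relevant,
    }


def answer(structured: dict, clauses: List[Clause]) -> str:
    """Structured query -> relevant clauses -> decision, returned as JSON."""
    relevant = retrieve(structured, clauses)
    return json.dumps(decide(structured, relevant), indent=2)


if __name__ == "__main__":
    print(answer({"procedure": "knee surgery"}, CLAUSES))
```

Keeping each stage a plain function with dictionary inputs and outputs makes it easy to swap one out, which is the extensibility the feature list describes.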
```mermaid
graph TD
A(User Query) --> B(Parser)
B --> C[Structured Query]
C --> D(Retriever)
D --> E[Relevant Clauses]
C --> F(Decision Engine)
E --> F
F --> G[JSON Response]
```
## Contributing
1. Fork -> git clone -> create feature branch.
2. Ensure `pytest` passes & run `black` / `ruff`.
3. PR with clear description.
## License
MIT © 2025 Bajaj Finserv Health Ltd.