{"id":49954748,"url":"https://github.com/mjdevaccount/financial-rag-demo","last_synced_at":"2026-05-17T22:11:05.803Z","repository":{"id":340451384,"uuid":"1166099344","full_name":"mjdevaccount/financial-rag-demo","owner":"mjdevaccount","description":"Financial document Q\u0026A using RAG, Azure OpenAI, and Azure AI Search. Built in C# / .NET 8.","archived":false,"fork":false,"pushed_at":"2026-02-25T02:24:58.000Z","size":18,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-25T03:09:28.595Z","etag":null,"topics":["azure","azure-ai-search","csharp","dotnet","financial","llm","openai","rag","vector-search"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mjdevaccount.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-24T21:55:51.000Z","updated_at":"2026-02-25T02:24:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mjdevaccount/financial-rag-demo","commit_stats":null,"previous_names":["mjdevaccount/financial-rag-demo"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mjdevaccount/financial-rag-demo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjdevaccount%2Ffinancial-rag-demo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjdevaccount%2Ffinancial-rag-demo/tags","releas
es_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjdevaccount%2Ffinancial-rag-demo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjdevaccount%2Ffinancial-rag-demo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mjdevaccount","download_url":"https://codeload.github.com/mjdevaccount/financial-rag-demo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjdevaccount%2Ffinancial-rag-demo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33157234,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-17T09:28:26.183Z","status":"ssl_error","status_checked_at":"2026-05-17T09:27:52.702Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["azure","azure-ai-search","csharp","dotnet","financial","llm","openai","rag","vector-search"],"created_at":"2026-05-17T22:11:03.666Z","updated_at":"2026-05-17T22:11:05.796Z","avatar_url":"https://github.com/mjdevaccount.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Financial RAG Demo\n\nA production-grade **Retrieval-Augmented Generation (RAG)** system built on the Microsoft Azure stack, designed to answer questions grounded in financial documents such as Federal Reserve 
meeting minutes and SEC filings.\n\nBuilt with **C# / .NET 8**, **Azure OpenAI**, and **Azure AI Search**.\n\n---\n\n## Architecture\n\n```\n┌─────────────────┐     ┌──────────────────────┐     ┌─────────────────────┐\n│   PDF Documents │────▶│  Ingestion Pipeline   │────▶│  Azure AI Search    │\n│  (Fed Minutes,  │     │  - PdfPig extraction  │     │  (Vector Index)     │\n│   SEC Filings)  │     │  - Text chunking      │     │                     │\n└─────────────────┘     │  - Embedding via      │     └────────┬────────────┘\n                        │    Azure OpenAI        │              │\n                        └──────────────────────┘              │\n                                                               │ Vector Search\n┌─────────────────┐     ┌──────────────────────┐              │\n│   REST Client   │────▶│   RagDemo.Api         │◀────────────┘\n│  (Swagger UI)   │     │  - /api/rag/ask       │\n└─────────────────┘     │  - Source citations   │     ┌─────────────────────┐\n                        │  - Grounded answers   │────▶│   Azure OpenAI      │\n                        └──────────────────────┘     │   GPT-4o-mini       │\n                                                      └─────────────────────┘\n```\n\n---\n\n## Features\n\n- **PDF Ingestion Pipeline** — Bulk ingest financial PDFs with automatic text extraction, chunking (512 tokens, 50-token overlap), and vector embedding\n- **Vector Search** — Azure AI Search HNSW vector index for semantic retrieval\n- **Grounded Generation** — GPT-4o-mini answers are constrained to retrieved context, minimizing hallucination\n- **Source Citations** — Every response includes the source documents used to generate the answer\n- **REST API** — Clean ASP.NET Core Web API with Swagger UI for interactive exploration\n\n---\n\n## Tech Stack\n\n| Component | Technology |\n|---|---|\n| Language | C# / .NET 8 |\n| LLM | Azure OpenAI (GPT-4o-mini) |\n| Embeddings | Azure OpenAI (text-embedding-3-small) 
|\n| Vector Store | Azure AI Search (HNSW index) |\n| PDF Extraction | PdfPig |\n| API Framework | ASP.NET Core Web API |\n| Configuration | .NET User Secrets |\n\n---\n\n## Project Structure\n\n```\nRagDemo.sln\n├── RagDemo.Core          # Shared services and models\n│   ├── DocumentChunk.cs       # Search index model\n│   ├── EmbeddingService.cs    # Azure OpenAI embedding client\n│   ├── SearchIndexService.cs  # Index creation and management\n│   ├── DocumentIngester.cs    # Chunking and upsert pipeline\n│   ├── PdfIngester.cs         # PDF extraction and ingestion\n│   ├── RetrievalService.cs    # Vector search retrieval\n│   └── ChatService.cs         # Grounded generation\n├── RagDemo.Ingestion     # Console app for ingesting documents\n└── RagDemo.Api           # ASP.NET Core REST API\n```\n\n---\n\n## Getting Started\n\n### Prerequisites\n\n- .NET 8 SDK\n- Azure subscription\n- Azure OpenAI resource with deployments:\n  - `text-embedding-3-small`\n  - `gpt-4o-mini`\n- Azure AI Search resource (Free tier sufficient)\n\n### Configuration\n\nThis project uses .NET User Secrets to keep credentials out of source control.\n\nIn both `RagDemo.Ingestion` and `RagDemo.Api`, run:\n\n```bash\ndotnet user-secrets set \"AzureOpenAI:Endpoint\" \"https://YOUR_RESOURCE.openai.azure.com/\"\ndotnet user-secrets set \"AzureOpenAI:Key\" \"YOUR_KEY\"\ndotnet user-secrets set \"AzureOpenAI:EmbeddingDeployment\" \"text-embedding-3-small\"\ndotnet user-secrets set \"AzureOpenAI:ChatDeployment\" \"gpt-4o-mini\"\ndotnet user-secrets set \"AzureSearch:Endpoint\" \"https://YOUR_SEARCH.search.windows.net\"\ndotnet user-secrets set \"AzureSearch:Key\" \"YOUR_KEY\"\ndotnet user-secrets set \"AzureSearch:IndexName\" \"financial-docs\"\n```\n\n### Ingest Documents\n\n1. Create a folder for your PDFs (e.g. `C:\\RagDocs`)\n2. 
Drop in financial PDFs — Federal Reserve meeting minutes work great (available free at [federalreserve.gov](https://www.federalreserve.gov/monetarypolicy/fomccalendars.htm))\n3. Update the folder path in `RagDemo.Ingestion/Program.cs`\n4. Run `RagDemo.Ingestion` — it will create the index and ingest all PDFs automatically\n\n### Run the API\n\nSet `RagDemo.Api` as the startup project and run it, then navigate to:\n\n```\nhttps://localhost:{port}/swagger\n```\n\n### Example Request\n\n```http\nPOST /api/rag/ask\nContent-Type: application/json\n\n{\n  \"question\": \"What did the Fed decide about interest rates and how did markets react?\"\n}\n```\n\n### Example Response\n\n```json\n{\n  \"answer\": \"The Federal Reserve maintained the target range for the federal funds rate at 3½ to 3¾ percent, citing solid economic expansion while acknowledging elevated inflation. Two members dissented, advocating for a lower target range. Markets responded with volatility as investors repriced rate expectations.\",\n  \"sources\": [\n    \"fomcminutes20260128\"\n  ]\n}\n```\n\n---\n\n## How It Works\n\n**Ingestion:**\n1. PDFs are extracted to plain text using PdfPig\n2. Text is split into overlapping chunks (512 words, 50-word overlap)\n3. Each chunk is embedded using `text-embedding-3-small` (1536 dimensions)\n4. Chunks and embeddings are upserted into Azure AI Search\n\n**Retrieval + Generation:**\n1. The user's question is embedded using the same model\n2. Azure AI Search performs a k-nearest-neighbor vector search (k=3)\n3. The top matching chunks are injected into the system prompt\n4. GPT-4o-mini generates an answer grounded strictly in the retrieved context\n5. 
Source document names are returned alongside the answer\n\n---\n\n## License\n\nMIT\n\n## Related Projects\n\n- [financial-agent-demo](https://github.com/mjdevaccount/financialagent) — AI financial research agent using Semantic Kernel, Azure OpenAI, and Alpha Vantage with autonomous tool selection\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjdevaccount%2Ffinancial-rag-demo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmjdevaccount%2Ffinancial-rag-demo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjdevaccount%2Ffinancial-rag-demo/lists"}