{"id":50947421,"url":"https://github.com/polarbear333/rag-llm-based-recommender","last_synced_at":"2026-06-17T21:32:57.004Z","repository":{"id":281436816,"uuid":"945271424","full_name":"polarbear333/rag-llm-based-recommender","owner":"polarbear333","description":"Explore a smarter way to shop online with this full-stack project built on the infrastructure of Google Cloud Platform (GCP) for RAG based e-commerce with LLM.","archived":false,"fork":false,"pushed_at":"2025-09-24T07:59:09.000Z","size":4658,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-24T09:33:09.956Z","etag":null,"topics":["bigquery","fastapi","gcp","langchain","llm-inference","llmops","pyspark","rag","react","recommender-system","vertex-ai"],"latest_commit_sha":null,"homepage":"https://polarbear333.github.io/rag-llm-based-recommender/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/polarbear333.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-09T03:20:00.000Z","updated_at":"2025-09-24T07:58:40.000Z","dependencies_parsed_at":"2025-03-09T04:24:32.893Z","dependency_job_id":"7a169c61-39ee-4aaf-a2d5-0a41a869f222","html_url":"https://github.com/polarbear333/rag-llm-based-recommender","commit_stats":null,"previous_names":["polarbear333/rag-llm-based-recommender"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/polarbear333/rag-llm-based-recommender","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polarbear333
%2Frag-llm-based-recommender","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polarbear333%2Frag-llm-based-recommender/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polarbear333%2Frag-llm-based-recommender/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polarbear333%2Frag-llm-based-recommender/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/polarbear333","download_url":"https://codeload.github.com/polarbear333/rag-llm-based-recommender/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/polarbear333%2Frag-llm-based-recommender/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34466928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","fastapi","gcp","langchain","llm-inference","llmops","pyspark","rag","react","recommender-system","vertex-ai"],"created_at":"2026-06-17T21:32:55.749Z","updated_at":"2026-06-17T21:32:56.994Z","avatar_url":"https://github.com/polarbear333.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RAG‑based Semantic Retrieval Recommender for E-commerce\n\n\u003cdiv align=\"center\"\u003e\nA 
semantic retrieval recommender that provides suggestions based on user-driven Amazon reviews.\n\u003c/div\u003e\n\n\u003ch3 align=\"center\"\u003e\n   \u003ca href=\"\"\u003e\u003cb\u003eView Demo\u003c/b\u003e\u003c/a\u003e \u0026vert;\n   \u003ca href=\"\"\u003e\u003cb\u003eWhitepaper\u003c/b\u003e\u003c/a\u003e \u0026vert;\n   \u003ca href=\"\"\u003e\u003cb\u003eDocumentation\u003c/b\u003e\u003c/a\u003e \u0026vert;\n   \u003ca href=\"\"\u003e\u003cb\u003eAPI Reference\u003c/b\u003e\u003c/a\u003e\n\u003c/h3\u003e\n\n![Static Badge](https://img.shields.io/badge/Python-3.9%2B-blue?style=flat\u0026logo=python)\n![Static Badge](https://img.shields.io/badge/GCP-BigQuery-green?style=flat\u0026logo=googlebigquery)\n![Static Badge](https://img.shields.io/badge/React-18.3.1-blue?style=flat\u0026logo=react)\n![Codecov](https://img.shields.io/codecov/c/github/polarbear333/rag-llm-based-recommender)\n![GitHub commit activity](https://img.shields.io/github/commit-activity/m/polarbear333/rag-llm-based-recommender)\n![GitHub Issues or Pull Requests](https://img.shields.io/github/issues/polarbear333/rag-llm-based-recommender)\n![GitHub commit status](https://img.shields.io/github/checks-status/polarbear333/rag-llm-based-recommender/dae61c7570a85f23013da637cd7d9799cd2d08c3)\n![GitHub License](https://img.shields.io/github/license/polarbear333/rag-llm-based-recommender)\n\n## Overview\n\nThis repository implements a Retrieval-Augmented Generation (RAG) driven semantic recommender and information retrieval (IR) system built on Google Cloud infrastructure. Its purpose is to enhance the customer shopping experience by leveraging machine learning techniques to help users discover relevant products along with personalized content and recommendations. 
\n\n\u003eA quick demo showcasing how the recommender works:\n\u003e\n![ezgif-11408d3f4a992c](https://github.com/user-attachments/assets/24c23681-bbb2-46d1-957a-0ff1972dcd88)\n\n\nThe system combines:\n- Semantic product search based on natural language queries\n- Extraction of key product features\n- Sentiment analysis of product reviews\n- Contextual recommendations\n\nThe primary goal is reproducible, high-throughput semantic retrieval for product recommendation and explainable, context-rich natural language responses.\n\n![Website](pages_files/images/image.png)\n\n## Key Technical Highlights\n\n- **Hybrid retrieval**: ANN vector search + metadata filters (category, brand, price ranges) for precise, high-recall retrieval.\n- **Vector store**: BigQuery vector columns and ScaNN for approximate nearest neighbor search; embeddings produced by a configurable embedding model.\n- **RAG pipeline**: Multi-stage retrieval (candidate retrieval → re-ranking → context assembly → LLM prompt + generation). Supports chunking, passage-scoring, and grounding.\n- **Evaluation**: Offline IR metrics (MRR, nDCG@k) and online/latency measurements to balance retrieval depth vs. 
response time.\n\n### Data Pipeline\n\n- Uses Amazon Reviews 2023 dataset from UCSD McAuley Lab\n- Processes data with PySpark for ETL\n- Stores data in Google Cloud Storage and BigQuery\n- Generates vector embeddings using Vertex AI\n\n### Backend\n\n- Built with FastAPI for high-performance API endpoints\n- Implements vector search using BigQuery and ScaNN\n- Uses Retrieval Augmented Generation (RAG) with LangChain\n- Integrates with Vertex AI's Gemini-2.5-pro LLM\n\n### Frontend\n\n- Developed with React and Next.js\n- Features a chat interface for natural language queries\n- Displays product recommendations with extracted features\n- Uses dynamic image scraping for product visuals\n\n## File Structure\n\nThe project is built on Google Cloud Platform (GCP) with the following file structure:\n\n```\nllm-rag-based-ecommerce-recommender/\n├── backend/            # FastAPI service for search and recommendations\n├── bigQuery/           # SQL scripts for BigQuery table creation\n├── final_report_files/ # Supporting files for the project report\n├── frontend/           # React/Next.js web application\n├── infra/              # Terraform files for GCP infrastructure\n├── node_modules/       # Node.js dependencies\n├── .gitignore          # Git ignore file\n├── README.md           # Project documentation\n├── etl_full.py         # ETL script for data processing\n├── final_report.html   # Rendered HTML report\n├── final_report.qmd    # Quarto markdown report\n├── package-lock.json   # Node.js package lock\n├── package.json        # Node.js package configuration\n└── references.bib      # Bibliography for the report\n```\n\n## High-level Architecture \n\n1. 
Data Ingestion \u0026 ETL\n   - Source: Amazon Reviews dataset (preprocessed by product and review).\n   - Pipeline: cleaning, normalization, deduplication, chunking long reviews into passages.\n   - Output: document records with metadata (`product_id`, `category`, `title`, `review_id`, `rating`, `timestamp`) and text passages.\n\n2. Embedding Generation\n   - Embedding model is configurable (Vertex AI embeddings by default). Typical embedding dimension: 1,024 (configurable per model).\n   - Batched embedding generation with retry/backoff and idempotent writes to BigQuery.\n\n3. Vector Indexing \u0026 Storage\n   - Vectors stored as `ARRAY\u003cFLOAT64\u003e` in BigQuery tables with auxiliary metadata columns.\n   - ANN runtime: ScaNN is used for approximate nearest neighbor retrieval where available; BigQuery vector search serves as the primary store for scale and analytical joins.\n\n4. Retrieval Pipeline\n   - Stage 1 (Candidate Retrieval): Query → embedding → ANN k-nearest neighbors (k configurable, default 50) with optional metadata filters.\n   - Stage 2 (Re-ranking): Lightweight BM25-like scoring or cross-encoder scoring applied to top N candidates for precision (N ≪ k).\n   - Stage 3 (Context Assembly): Select top passages/products, deduplicate, and assemble prompt context with provenance (source ids \u0026 snippets).\n   - Stage 4 (Generation): Prompt the LLM (e.g., Gemini or another model) with structured context and instructions to generate recommendations and explanations.\n\n5. 
Serving \u0026 API\n   - FastAPI endpoints: `/search/semantic`, `/rag/query`, `/ingest`, `/metrics`.\n   - Response schema includes candidate ids, scores, provenance snippets, aggregated signals (avg rating, sentiment), and a `generated_answer` field when RAG is used.\n\n## Installation and Setup\n\n### Prerequisites\n\n- Python 3.9+\n- Node.js 16+\n- Google Cloud Platform account with:\n  - Cloud Run enabled\n  - BigQuery enabled\n  - Vertex AI enabled\n  - Cloud Storage configured\n\n\n### Backend Setup\n\n1. Navigate to the backend directory:\n   ```bash\n   cd backend\n   ```\n\n2. Install dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n3. Configure environment variables:\n   ```bash\n   export PROJECT_ID=\"your-gcp-project-id\"\n   export VERTEX_AI_REGION=\"us-central1\"\n   export BIGQUERY_DATASET_ID=\"your-bigquery-dataset\"\n   ```\n\n4. Run the FastAPI server:\n   ```bash\n   uvicorn app.main:app --reload\n   ```\n\n### Frontend Setup\n\n1. Navigate to the frontend directory:\n   ```bash\n   cd frontend\n   ```\n\n2. Install dependencies:\n   ```bash\n   npm install\n   ```\n\n3. Create a `.env.local` file with your API URL:\n   ```\n   NEXT_PUBLIC_API_BASE_URL=http://localhost:8000\n   ```\n\n4. Run the development server:\n   ```bash\n   npm run dev\n   ```\n\n### Data Pipeline Setup\n\n1. Install ETL dependencies:\n   ```bash\n   pip install pyspark pandas datasets google-cloud-storage google-cloud-bigquery\n   ```\n\n2. Authenticate with GCP:\n   ```bash\n   gcloud auth application-default login\n   ```\n\n3. Run the ETL script:\n   ```bash\n   python etl_full.py\n   ```\n\n## Infrastructure Deployment\n\nThe project includes Terraform files for deploying the infrastructure to GCP:\n\n1. Navigate to the infra directory:\n   ```bash\n   cd infra\n   ```\n\n2. Initialize Terraform:\n   ```bash\n   terraform init\n   ```\n\n3. Plan the deployment:\n   ```bash\n   terraform plan\n   ```\n\n4. 
Apply the configuration:\n   ```bash\n   terraform apply\n   ```\n\n## Tech Stack \u0026 System Design\n\n### Tech Stack\n\n**Backend**\n- **FastAPI**: High-performance Python framework for building APIs\n- **LangChain**: Framework for developing applications with LLMs\n- **Vertex AI**: Google Cloud's machine learning platform used for embeddings and LLM access\n- **BigQuery**: Serverless data warehouse used for storing and querying product and review data\n- **ScaNN**: Scalable Nearest Neighbors for vector similarity search\n- **PySpark**: Used for ETL data processing\n- **Google Cloud Storage**: For storing processed datasets\n\n**Frontend**\n- **React**: JavaScript library for building user interfaces\n- **Next.js**: React framework for production-grade applications\n- **Tailwind CSS**: Utility-first CSS framework for styling\n- **Axios**: Promise-based HTTP client for API requests\n\n**Infrastructure**\n- **Terraform**: Infrastructure as Code tool for provisioning GCP resources\n- **Google Cloud Platform**: Cloud services provider\n  - Cloud Run: For deploying containerized applications\n  - BigQuery: For vector search and data storage\n  - Vertex AI: For ML model hosting and inference\n\n### System Design\n\nThe system follows a microservices architecture with these key components:\n\n1. **Data Processing Pipeline**:\n   - Extracts data from the Amazon Reviews dataset\n   - Processes and cleans data with PySpark\n   - Generates embeddings using Vertex AI\n   - Loads data into BigQuery with vector indexes\n\n2. **API Layer**:\n   - RESTful API built with FastAPI\n   - Authentication and rate limiting middleware\n   - Optimized for low-latency responses\n\n3. 
**Search \u0026 Recommendation Engine**:\n   - Hybrid search combining vector similarity and metadata filtering\n   - RAG (Retrieval Augmented Generation) pipeline that:\n     - Converts user queries to vector embeddings\n     - Retrieves relevant products and reviews from BigQuery\n     - Creates context-specific prompts for the LLM\n     - Generates structured recommendations\n\n4. **Web Interface**:\n   - Responsive chat interface for natural language queries\n   - Product card display with dynamic loading\n   - Real-time recommendation display with extracted features\n   - Image scraping functionality for product visuals\n\n5. **Integration Layer**:\n   - Connects frontend to backend services\n   - Handles error states and loading indicators\n   - Manages API response formatting\n\n## Retrieval \u0026 Ranking Details\n\n- **Hybrid scoring**: final score = alpha * vector_score + beta * metadata_boost + gamma * recency_boost (weights tunable).\n- **Chunking**: Long reviews are split into overlapping chunks (stride overlap) so each embedding input stays below the model's maximum token count while preserving continuity.\n- **Deduplication**: Candidates are deduplicated by product id and by a passage-similarity threshold that drops near-identical passages.\n- **Reranking**: Optionally use a cross-encoder on top candidates for stronger relevance signals before generating.\n\n## Prompting \u0026 Grounding (RAG)\n\n- Prompts are programmatically assembled: instruction header, user query, ordered contextual snippets (with source markers), and generation constraints (format, length).\n- Grounding strategy: restrict the LLM to use only provided snippets for factual claims and include provenance links in responses.\n- Safety \u0026 hallucination mitigation: truncate/omit low-confidence sources; include fallback heuristics when context coverage is insufficient.\n\n## Configuration \u0026 Tuning\n\n- `k` (ANN candidates): larger `k` increases recall but costs latency and compute.\n- Re-ranker `N`: tuning tradeoff between precision and LLM 
prompt token cost.\n- Weights (alpha/beta/gamma): tuned on offline validation (MRR, nDCG) and online A/B tests.\n\n## Evaluation\n\n- Offline evaluation uses held-out queries and metrics:\n  - **MRR**: measures how high the first relevant item appears.\n  - **nDCG@k**: evaluates ranking quality with graded relevance.\n  - **Precision/Recall**: for extracted features and sentiment labels.\n- Latency and cost profiling: measure per-request latency (embedding + retrieval + re-ranking + LLM generation) and per-1000-queries cost estimate.\n\n## Contributing\n\nContributions are welcome! Please feel free to fork the repo and submit a pull request.\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.\n\n## Acknowledgments \u0026 References\n\n- Retrieval-Augmented Generation (RAG) literature and LangChain examples.\n- BigQuery vector search and ScaNN documentation.\n- UCSD McAuley Lab for the Amazon Reviews 2023 dataset\n- Google Cloud Platform for infrastructure support\n- LangChain community for RAG implementation resources\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpolarbear333%2Frag-llm-based-recommender","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpolarbear333%2Frag-llm-based-recommender","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpolarbear333%2Frag-llm-based-recommender/lists"}