{"id":26376915,"url":"https://github.com/loftwah/langchain-csv","last_synced_at":"2026-04-18T02:31:09.012Z","repository":{"id":281184881,"uuid":"944459966","full_name":"loftwah/langchain-csv","owner":"loftwah","description":"attempt to read csv with langchain","archived":false,"fork":false,"pushed_at":"2025-03-08T12:02:25.000Z","size":450,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-13T15:07:48.513Z","etag":null,"topics":["csv","langchain","python","rag","uv"],"latest_commit_sha":null,"homepage":"https://blog.deanlofts.xyz/blog/rag-product-catalog/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/loftwah.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-07T11:35:54.000Z","updated_at":"2025-03-22T13:54:08.000Z","dependencies_parsed_at":"2025-03-07T13:29:37.280Z","dependency_job_id":"1e0ee435-4bd3-404b-846c-bf9d9687079b","html_url":"https://github.com/loftwah/langchain-csv","commit_stats":null,"previous_names":["loftwah/langchain-csv"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/loftwah/langchain-csv","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/loftwah%2Flangchain-csv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/loftwah%2Flangchain-csv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/loftwah%2Flangchain-csv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/loftwah%2Flangchain-csv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/loftwah","download_url":"https://codeload.github.com/loftwah/langchain-csv/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/loftwah%2Flangchain-csv/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31953752,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","langchain","python","rag","uv"],"created_at":"2025-03-17T03:19:28.795Z","updated_at":"2026-04-18T02:31:08.996Z","avatar_url":"https://github.com/loftwah.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Product Catalog RAG System\n\nA Retrieval-Augmented Generation (RAG) system that answers natural language questions about product data using local LLMs. This project demonstrates how to build an interactive product catalog explorer using LangChain, Ollama, and Gradio.\n\n## Features\n\n- 🔍 **Interactive Web Interface**: A modern, user-friendly Gradio interface for exploring product data\n- 💬 **Command-line Demo**: A colorful CLI interface with multiple demo modes\n- 🧠 **Hybrid Analysis**: Combines RAG with direct data analysis for precise queries\n- 📊 **Rich Data Support**: Handles complex product data including prices, categories, ratings, stock levels, release dates, discounts, and features\n- 🔄 **Real-time Processing**: Get answers in seconds using local LLMs\n- 🎯 **Specialized Queries**: Advanced handling for price comparisons, discount detection, feature searches, and inventory status\n\n## Prerequisites\n\n- Python 3.8+\n- Ollama installed and running (https://ollama.com)\n- Required Python packages (see Installation)\n\n## Installation\n\n1. Clone the repository:\n\n```bash\ngit clone https://github.com/loftwah/langchain-csv.git\ncd langchain-csv\n```\n\n2. Create and activate a virtual environment:\n\n```bash\n# Create virtual environment with uv\nuv venv\n\n# Activate the virtual environment\nsource .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\n```\n\n3. Install dependencies:\n\n```bash\n# Install with uv\nuv pip install langchain langchain_community langchain_ollama faiss-cpu colorama gradio pandas\n```\n\n4. Ensure Ollama is installed and running:\n\n```bash\n# Install Ollama (if not already installed)\ncurl -fsSL https://ollama.com/install.sh | sh\n\n# Start Ollama server\nollama serve\n```\n\n5. Pull the necessary models:\n\n```bash\nollama pull llama3.2\n```\n\n## Usage\n\n### Web Interface\n\nRun the Gradio demo:\n\n```bash\nuv run gradio_demo.py\n```\n\nThe web interface will be available at `http://localhost:7860`.\n\nSample questions you can ask:\n\n- \"What's the cheapest laptop?\"\n- \"Show me products under $100\"\n- \"Which products have discounts?\"\n- \"What are the newest products?\"\n- \"Which headphones have noise cancellation?\"\n- \"Compare Apple and Samsung products\"\n\n### Command-line Demo\n\nRun the CLI demo:\n\n```bash\nuv run rag_demo.py\n```\n\nThe CLI demo offers three modes:\n\n1. **Sample Queries**: See the system answer preset questions in categories like Price \u0026 Value, Brands \u0026 Categories, Ratings \u0026 Features, and Availability \u0026 Release\n2. **Interactive Mode**: Ask your own questions about the product catalog\n3. **Behind the Scenes**: Learn how RAG works with a step-by-step walkthrough\n\n## How It Works\n\nThis system uses a hybrid approach to answer questions about product data:\n\n1. **Vector Embeddings**: The system creates mathematical representations of your product data\n2. **Semantic Search**: When you ask a question, it finds the most relevant products\n3. **LLM Generation**: It uses a language model to create a natural language answer\n\n### Smart Direct Data Analysis\n\nFor specific types of queries, the system uses direct data analysis instead of relying solely on the LLM:\n\n- **Price Queries**: Finds cheapest/most expensive products, products in specific price ranges\n- **Rating Analysis**: Identifies highest-rated products by category or overall\n- **Discount Detection**: Analyzes products on sale with discount percentages\n- **Inventory Status**: Reports on stock levels and availability\n- **Release Date Analysis**: Identifies newest products or products from specific time periods\n- **Feature Search**: Finds products with specific features mentioned in the query\n\nThis hybrid approach ensures accurate, detailed answers while maintaining the flexibility of natural language understanding.\n\n## Product Data Format\n\nThe system works best with CSV files containing product information. The enhanced version supports the following fields:\n\n- **Basic Fields**: name, price, description, category, brand, rating\n- **Advanced Fields**: stock, release_date, discount_percent, features\n\nExample row:\n\n```\n\"MacBook Pro M3\",1799,\"Powerful laptop with M3 chip...\",Laptop,Apple,4.9,15,2023-11-07,0,\"AI-optimized,Thunderbolt 4,120Hz display\"\n```\n\n## Known Limitations\n\n1. **Numerical Analysis**:\n\n   - Pure RAG approaches may struggle with precise numerical comparisons\n   - The system uses direct data analysis for data-specific queries to ensure accuracy\n   - Complex multi-step numerical reasoning may still be challenging\n\n2. **LLM Limitations**:\n\n   - Local LLMs may have limited context window sizes\n   - The quality of answers depends on the richness of your product data\n\n3. **Data Requirements**:\n   - The system works best with structured CSV data\n   - Missing or inconsistent data may affect answer quality\n   - Large datasets may require more processing time\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request to the [GitHub repository](https://github.com/loftwah/langchain-csv).\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Floftwah%2Flangchain-csv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Floftwah%2Flangchain-csv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Floftwah%2Flangchain-csv/lists"}