{"id":26541382,"url":"https://github.com/adityaadaki21/fastapi-rag","last_synced_at":"2026-04-05T23:02:52.450Z","repository":{"id":281714822,"uuid":"946178244","full_name":"AdityaAdaki21/FASTapi-RAG","owner":"AdityaAdaki21","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-20T18:04:13.000Z","size":451,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-12T02:35:22.512Z","etag":null,"topics":["chromadb","fastapi","llm","llms","ollama","pdf-processing","rag","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AdityaAdaki21.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-03-10T18:25:32.000Z","updated_at":"2025-03-20T18:04:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"82a945e4-90e9-40d1-9ec1-e719efc362ff","html_url":"https://github.com/AdityaAdaki21/FASTapi-RAG","commit_stats":null,"previous_names":["adityaadaki21/fastapi-rag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AdityaAdaki21/FASTapi-RAG","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdityaAdaki21%2FFASTapi-RAG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdityaAdaki21%2FFASTapi-RAG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdityaAdaki21%2FFASTapi-RAG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdityaAdaki21%2FFASTapi-RAG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AdityaAdaki21","download_url":"https://codeload.github.com/AdityaAdaki21/FASTapi-RAG/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdityaAdaki21%2FFASTapi-RAG/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31452901,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T21:22:52.476Z","status":"ssl_error","status_checked_at":"2026-04-05T21:22:51.943Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chromadb","fastapi","llm","llms","ollama","pdf-processing","rag","retrieval-augmented-generation"],"created_at":"2025-03-22T01:20:19.461Z","updated_at":"2026-04-05T23:02:52.414Z","avatar_url":"https://github.com/AdityaAdaki21.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# F1-AI: Formula 1 RAG Application\n\nF1-AI is a Retrieval-Augmented Generation (RAG) application specifically designed for Formula 1 information. It features an intelligent web scraper that automatically discovers and extracts Formula 1-related content from the web, stores it in a vector database, and enables natural language querying of the stored information.\n\n## Features\n\n![Example](image.png)\n\n- Web scraping of Formula 1 content with automatic content extraction\n- Vector database storage using Pinecone for efficient similarity search\n- OpenRouter integration for advanced LLM capabilities\n- RAG-powered question answering with contextual understanding and source citations\n- Command-line interface for automation and scripting\n- User-friendly Streamlit web interface with chat history\n- Asynchronous data ingestion and processing for improved performance\n\n## Architecture\n\nF1-AI is built on a modern tech stack:\n\n- **LangChain**: Orchestrates the RAG pipeline and manages interactions between components\n- **Pinecone**: Vector database for storing and retrieving embeddings\n- **OpenRouter**: Primary LLM provider with Mistral-7B-Instruct model\n- **Ollama**: Alternative local LLM provider for embeddings\n- **Playwright**: Handles web scraping with JavaScript support\n- **BeautifulSoup4**: Processes HTML content and extracts relevant information\n- **Streamlit**: Provides an interactive web interface with chat functionality\n\n## Prerequisites\n\n- Python 3.8 or higher\n- OpenRouter API key (set as OPENROUTER_API_KEY environment variable)\n- Pinecone API key (set as PINECONE_API_KEY environment variable)\n- 8GB RAM minimum (16GB recommended)\n- Internet connection for web scraping\n- Ollama installed (optional, for local embeddings)\n  - Download from [Ollama's website](https://ollama.ai)\n  - Pull the required model: `ollama pull all-minilm-l6-v2`\n\n## Installation\n\n1. Clone the repository:\n   ```bash\n   git clone \u003crepository-url\u003e\n   cd FASTapi-RAG\n   ```\n\n2. Install the required dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n3. Install Playwright browsers:\n   ```bash\n   playwright install\n   ```\n\n4. Set up environment variables:\n   Create a .env file with:\n   ```\n   OPENROUTER_API_KEY=your_api_key_here    # Required for LLM functionality\n   PINECONE_API_KEY=your_api_key_here      # Required for vector storage\n   ```\n\n## Usage\n\n### Command Line Interface\n\n1. Scrape and ingest F1 content:\n   ```bash\n   python f1_scraper.py --start-urls https://www.formula1.com/ --max-pages 100 --depth 2 --ingest\n   ```\n   Options:\n   - `--start-urls`: Space-separated list of URLs to start crawling from\n   - `--max-pages`: Maximum number of pages to crawl (default: 100)\n   - `--depth`: Maximum crawl depth (default: 2)\n   - `--ingest`: Flag to ingest discovered content into RAG system\n   - `--max-chunks`: Maximum chunks per URL for ingestion (default: 50)\n   - `--llm-provider`: Choose LLM provider (openrouter, ollama)\n\n2. Ask questions about Formula 1:\n   ```bash\n   python f1_ai.py ask \"Who won the 2023 F1 World Championship?\"\n   ```\n\n### Streamlit Interface\n\nRun the Streamlit app:\n```bash\nstreamlit run streamlit_app.py\n```\n\nThis will open a web interface where you can:\n- Ask questions about Formula 1\n- View responses in a chat-like interface\n- See source citations for answers\n- Track conversation history\n- Get real-time updates on response generation\n\n## Project Structure\n\n- `f1_scraper.py`: Intelligent web crawler implementation\n  - Automatically discovers F1-related content\n  - Handles content relevance detection\n  - Manages crawling depth and limits\n- `f1_ai.py`: Core RAG application implementation\n  - Handles data ingestion and chunking\n  - Manages vector database operations\n  - Implements question-answering logic\n- `llm_manager.py`: LLM provider management\n  - Integrates with OpenRouter for advanced LLM capabilities\n  - Handles embeddings generation\n  - Manages API interactions\n- `streamlit_app.py`: Streamlit web interface\n  - Provides chat-based UI\n  - Manages conversation history\n  - Handles async operations\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository\n2. Create a feature branch\n3. Commit your changes\n4. Push to the branch\n5. Submit a Pull Request","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadityaadaki21%2Ffastapi-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadityaadaki21%2Ffastapi-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadityaadaki21%2Ffastapi-rag/lists"}