{"id":29664885,"url":"https://github.com/juliopeixoto/softrag","last_synced_at":"2025-07-22T13:07:25.184Z","repository":{"id":294138020,"uuid":"983152435","full_name":"JulioPeixoto/softrag","owner":"JulioPeixoto","description":"Minimal local-first RAG library powered by SQLite + sqlite-vec.","archived":false,"fork":false,"pushed_at":"2025-07-01T02:36:39.000Z","size":824,"stargazers_count":18,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-01T03:36:51.387Z","etag":null,"topics":["agent","chatgpt","generative-ai","image2text","llm","nlp","open-source","openai","rag","retrieval-augmented-generation","sql","sqlite3","text2text","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JulioPeixoto.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-14T00:52:47.000Z","updated_at":"2025-07-01T02:36:40.000Z","dependencies_parsed_at":"2025-05-19T17:48:07.260Z","dependency_job_id":"ce6b05b8-4911-42a8-9095-d3cd05aabd34","html_url":"https://github.com/JulioPeixoto/softrag","commit_stats":null,"previous_names":["softrag/softrag","juliopeixoto/softrag"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/JulioPeixoto/softrag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulioPeixoto%2Fsoftrag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulioPeixoto%2Fsoftrag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/JulioPeixoto%2Fsoftrag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulioPeixoto%2Fsoftrag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JulioPeixoto","download_url":"https://codeload.github.com/JulioPeixoto/softrag/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JulioPeixoto%2Fsoftrag/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266499444,"owners_count":23938866,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","chatgpt","generative-ai","image2text","llm","nlp","open-source","openai","rag","retrieval-augmented-generation","sql","sqlite3","text2text","vector-database"],"created_at":"2025-07-22T13:07:24.437Z","updated_at":"2025-07-22T13:07:25.176Z","avatar_url":"https://github.com/JulioPeixoto.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# softrag [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/) [![PyPI version](https://img.shields.io/pypi/v/softrag.svg)](https://pypi.org/project/softrag/)\r\n\r\n\u003cdiv align=\"center\"\u003e\r\n  
\u003cimg src=\"piriquito.png\" width=\"150\" alt=\"SoftRAG mascot – periquito\"/\u003e\r\n\u003c/div\u003e\r\n\r\nMinimal **local-first** Retrieval-Augmented Generation (RAG) library powered by **SQLite + sqlite-vec**.  \r\nEverything (documents, embeddings, cache) lives in a single `.db` file.\r\n\r\nCreated by [Julio Peixoto](https://github.com/JulioPeixoto).\r\n\r\n---\r\n\r\n## 🌟 Features\r\n\r\n- **Local-first** – All processing happens locally, no external services required for storage\r\n- **SQLite + sqlite-vec** – Documents, embeddings, and cache in a single `.db` file\r\n- **Model-agnostic** – Works with OpenAI, Hugging Face, Ollama, or any compatible model\r\n- **Blazing-fast** – Optimized for minimal overhead and maximum throughput\r\n- **Multi-format support** – PDF, DOCX, Markdown, text files, web pages, and **images**\r\n- **Image understanding** – Uses GPT-4 Vision to analyze and describe images for semantic search\r\n- **Hybrid retrieval** – Combines keyword search (FTS5) and semantic similarity\r\n- **Unified search** – Query across text documents and image descriptions seamlessly\r\n\r\n## 🚀 Quick Start\r\n\r\n```bash\r\npip install softrag\r\n```\r\n\r\n```python\r\nfrom softrag import Rag\r\nfrom langchain_openai import ChatOpenAI, OpenAIEmbeddings\r\n\r\n# Initialize\r\nrag = Rag(\r\n    embed_model=OpenAIEmbeddings(model=\"text-embedding-3-small\"),\r\n    chat_model=ChatOpenAI(model=\"gpt-4o\")\r\n)\r\n\r\n# Add different types of content\r\nrag.add_file(\"document.pdf\")\r\nrag.add_web(\"https://example.com/article\")\r\nrag.add_image(\"photo.jpg\")  # 🆕 Image support!\r\n\r\n# Query across all content types\r\nanswer = rag.query(\"What is shown in the image and how does it relate to the document?\")\r\nprint(answer)\r\n```\r\n\r\n## 📚 Documentation\r\n\r\nFor complete documentation, examples, and advanced usage, see: **[docs/softrag.md](docs/softrag.md)**\r\n\r\n## 🛠️ Next Steps\r\n\r\n- Documentation Creation: Develop comprehensive 
documentation using tools like Sphinx or MkDocs to provide clear guidance on installation, usage, and contribution.\r\n- Image Support in RAG: Integrate capabilities to handle image data, enabling the retrieval and generation of content based on visual inputs. This could involve incorporating models like CLIP for image embeddings.\r\n- Automated Testing: Implement unit and integration tests using frameworks such as pytest to ensure code reliability and facilitate maintenance.\r\n- Support for Multiple LLM Backends: Extend compatibility to include various language model providers, such as OpenAI, Hugging Face Transformers, and local models, offering users flexibility in choosing their preferred backend.\r\n- Enhanced Context Retrieval: Improve the relevance of retrieved documents by integrating reranking techniques or advanced retrieval models, ensuring more accurate and contextually appropriate responses.\r\n- Performance Benchmarking: Conduct performance evaluations to assess Softrag's efficiency and scalability, comparing it with other RAG solutions to identify areas for optimization.\r\n- Monitoring and Logging: Implement logging mechanisms to track system operations and facilitate debugging, as well as monitoring tools to observe performance metrics and system health.\r\n\r\n## 🤝 Contributing\r\n\r\nWe welcome contributions! Here's how to get started:\r\n\r\n### Development Setup\r\n\r\nThis project uses [uv](https://docs.astral.sh/uv/) for dependency management. Make sure you have it installed:\r\n\r\n```bash\r\n# Install uv if you haven't already\r\ncurl -LsSf https://astral.sh/uv/install.sh | sh\r\n```\r\n\r\n### Getting Started\r\n\r\n1. **Fork and clone the repository:**\r\n   ```bash\r\n   git clone https://github.com/yourusername/softrag.git\r\n   cd softrag\r\n   ```\r\n\r\n2. **Install dependencies with uv:**\r\n   ```bash\r\n   uv sync --dev\r\n   ```\r\n\r\n3. 
**Activate the virtual environment:**\r\n   ```bash\r\n   source .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\r\n   ```\r\n\r\n### Making Changes\r\n\r\n1. Create a new branch for your feature/fix\r\n2. Make your changes\r\n3. Add tests if applicable\r\n4. Ensure all tests pass\r\n5. Submit a pull request\r\n\r\n### Project Structure\r\n\r\n- `src/softrag/` - Main library code\r\n- `docs/` - Documentation\r\n- `examples/` - Usage examples\r\n- `tests/` - Test suite\r\n\r\n## 📜 License\r\n\r\nThis project is licensed under the MIT License - see the LICENSE file for details.\r\n\r\n## Give us a star ⭐\r\n\r\nDeveloped with ❤️ for the community\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliopeixoto%2Fsoftrag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuliopeixoto%2Fsoftrag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliopeixoto%2Fsoftrag/lists"}