{"id":22668964,"url":"https://github.com/player29879/postgresml","last_synced_at":"2025-03-29T10:44:00.377Z","repository":{"id":252252384,"uuid":"839855086","full_name":"player29879/postgresml","owner":"player29879","description":"Korvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript, Rust and C.","archived":false,"fork":false,"pushed_at":"2024-08-08T13:11:12.000Z","size":280,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-04T09:33:17.182Z","etag":null,"topics":["ai","embeddings","javascript","llm","ml","python","rag","search","sql"],"latest_commit_sha":null,"homepage":"http://postgresml.org","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/player29879.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-08T13:09:38.000Z","updated_at":"2024-08-08T15:58:09.000Z","dependencies_parsed_at":"2024-08-08T16:58:16.005Z","dependency_job_id":"b2b6ba9a-409a-4a8a-9df0-f6a628f52a83","html_url":"https://github.com/player29879/postgresml","commit_stats":null,"previous_names":["player29879/postgresml"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/player29879%2Fpostgresml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/player29879%2Fpostgresml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/playe
r29879%2Fpostgresml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/player29879%2Fpostgresml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/player29879","download_url":"https://codeload.github.com/player29879/postgresml/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246174470,"owners_count":20735409,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","embeddings","javascript","llm","ml","python","rag","search","sql"],"created_at":"2024-12-09T15:17:36.163Z","updated_at":"2025-03-29T10:44:00.357Z","avatar_url":"https://github.com/player29879.png","language":"Rust","readme":"\u003cdiv align=\"center\"\u003e\n   \u003cpicture\u003e\n     \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://github.com/postgresml/korvus/assets/19626586/54dda262-861b-4751-a3ce-0790762f3cbe\"\u003e\n     \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://github.com/postgresml/korvus/assets/19626586/f567ce57-35b2-4411-8e43-5f0887a938cb\"\u003e\n     \u003cimg alt=\"Logo\" src=\"\" width=\"520\"\u003e\n   \u003c/picture\u003e\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n   \u003cp align=\"center\"\u003e\u003cb\u003eOne query to rule them all\u003c/b\u003e\u003c/p\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n| \u003ca href=\"https://postgresml.org/docs/open-source/korvus/\"\u003e\u003cb\u003eDocumentation\u003c/b\u003e\u003c/a\u003e | \u003ca 
href=\"https://postgresml.org/blog\"\u003e\u003cb\u003eBlog\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://discord.gg/DmyJP3qJ7U\"\u003e\u003cb\u003eDiscord\u003c/b\u003e\u003c/a\u003e |\n\u003c/p\u003e\n\n---\n\nKorvus is a search SDK that unifies the entire RAG pipeline in a single database query. Built on top of Postgres with bindings for Python, JavaScript and Rust, Korvus delivers high-performance, customizable search capabilities with minimal infrastructure concerns.\n\n\u003cdetails open\u003e\n\u003csummary\u003e\u003cb\u003e📕 Table of Contents\u003c/b\u003e\u003c/summary\u003e\n\n- [🦅 What is Korvus?](#-what-is-korvus)\n- [🔠 Languages](#-languages)\n- [🏆 Why Korvus?](#-why-korvus)\n- [⚡ Key Features](#-key-features)\n- [🧩 System Architecture](#-system-architecture)\n- [🚀 Get Started](#-get-started)\n- [🔍 The Power of SQL](#-the-power-of-sql)\n- [📘 Documentation](#-documentation)\n- [🌐 Community](#-community)\n- [🤝 Contributing](#-contributing)\n\n\u003c/details\u003e\n\nhttps://github.com/postgresml/korvus/assets/19626586/2b697dc6-8c38-41a7-8c8e-ef158dacb29b\n\n## 🦅 What is Korvus?\n\nKorvus is an all-in-one, open-source RAG (Retrieval-Augmented Generation) pipeline built for Postgres. 
It combines LLMs, vector memory, embedding generation, reranking, summarization and custom models into a single query, maximizing performance and simplifying your search architecture.\n\n![korvus-demo](https://github.com/postgresml/korvus/assets/19626586/9ee9d695-7630-4da7-ab2a-386e20ae4a68)\n\n## 🔠 Languages\n\nKorvus provides SDK support for multiple programming languages, allowing you to integrate it seamlessly into your existing tech stack:\n\n- Python: [PyPI Package](https://pypi.org/project/korvus/)\n- JavaScript: [npm Package](https://www.npmjs.com/package/korvus)\n- Rust: [Crates.io Package](https://crates.io/crates/korvus)\n- C: [Build from source](https://postgresml.org/docs/api/client-sdk/)\n\n## 🏆 Why Korvus?\n\nKorvus stands out by harnessing the full power of Postgres for RAG operations:\n\n1. **Postgres-Native RAG**: Korvus leverages Postgres' robust capabilities, allowing you to perform complex RAG operations directly within your database. This approach eliminates the need for external services and API calls, which reduces both latency and architectural complexity.\n\n2. **Single Query Efficiency**: With Korvus, your entire RAG pipeline - from embedding generation to text generation - is executed in a single SQL query. This \"one query to rule them all\" approach simplifies your architecture and boosts performance.\n\n3. **Scalability and Performance**: By building on Postgres, Korvus inherits its excellent scalability and performance characteristics. 
As your data grows, Korvus grows with it, maintaining high performance even with large datasets.\n\n## ⚡ Key Features\n\n- **Simplified Architecture**: Replace complex service-oriented architectures with a single, powerful query.\n- **High Performance**: Eliminate API calls and data movement for faster processing and greater reliability.\n- **Open Source**: Improve your developer experience with open source software and models that run locally in Docker too.\n- **Multi-Language Support**: Use Korvus with Python, JavaScript and Rust. Open an issue to vote for other language support.\n- **Unified Pipeline**: Combine embedding generation, vector search, reranking, and text generation in one query.\n- **Postgres-Powered**: Under the hood, Korvus operations are powered by efficient SQL queries on a time-tested database platform.\n\n## 🧩 System Architecture\n\nKorvus utilizes PostgresML's pgml extension and the pgvector extension to compress the entire RAG pipeline inside Postgres.\n\n![PostgresML_Old-V-New_Diagram-Update](https://github.com/postgresml/korvus/assets/19626586/53128313-ded8-4b29-91c4-f585db859c23)\n\n## 🚀 Get Started\n\n### 📋 Prerequisites\n\nTo use Korvus, you need a Postgres database with pgml and pgvector installed. You have two options:\n\n1. **Self-hosted**: Set up your own database with pgml and pgvector.\n   - For instructions, see our [self-hosting guide](https://postgresml.org/docs/resources/developer-docs/quick-start-with-docker).\n\n2. **Hosted Service**: Use our managed Postgres service with pgml and pgvector pre-installed.\n   - [Sign up for PostgresML Cloud](https://postgresml.org/signup).\n\n### 🏁 Quick Start\n\n1. Install Korvus:\n\n```bash\npip install korvus\n```\n\n2. Set the `KORVUS_DATABASE_URL` environment variable:\n\n```bash\nexport KORVUS_DATABASE_URL=\"{YOUR DATABASE CONNECTION STRING}\"\n```\n\n3. 
Initialize a Collection and Pipeline:\n\n```python\nfrom korvus import Collection, Pipeline\nimport asyncio\n\ncollection = Collection(\"korvus-demo-v0\")\npipeline = Pipeline(\n    \"v1\",\n    {\n        \"text\": {\n            \"splitter\": {\"model\": \"recursive_character\"},\n            \"semantic_search\": {\"model\": \"Alibaba-NLP/gte-base-en-v1.5\"},\n        }\n    },\n)\n\nasync def add_pipeline():\n    await collection.add_pipeline(pipeline)\n\nasyncio.run(add_pipeline())\n```\n\n4. Insert documents:\n```python\nasync def upsert_documents():\n    documents = [\n        {\"id\": \"1\", \"text\": \"Korvus is incredibly fast and easy to use.\"},\n        {\"id\": \"2\", \"text\": \"Tomatoes are incredible on burgers.\"},\n    ]\n    await collection.upsert_documents(documents)\n\nasyncio.run(upsert_documents())\n```\n\n5. Perform RAG\n```python\nasync def rag():\n    query = \"Is Korvus fast?\"\n    print(f\"Querying for response to: {query}\")\n    results = await collection.rag(\n        {\n            \"CONTEXT\": {\n                \"vector_search\": {\n                    \"query\": {\n                        \"fields\": {\"text\": {\"query\": query}},\n                    },\n                    \"document\": {\"keys\": [\"id\"]},\n                    \"limit\": 1,\n                },\n                \"aggregate\": {\"join\": \"\\n\"},\n            },\n            \"chat\": {\n                \"model\": \"meta-llama/Meta-Llama-3-8B-Instruct\",\n                \"messages\": [\n                    {\n                        \"role\": \"system\",\n                        \"content\": \"You are a friendly and helpful chatbot\",\n                    },\n                    {\n                        \"role\": \"user\",\n                        \"content\": f\"Given the context\\n:{{CONTEXT}}\\nAnswer the question: {query}\",\n                    },\n                ],\n                \"max_tokens\": 100,\n            },\n        },\n        
pipeline,\n    )\n    print(results)\n\nasyncio.run(rag())\n```\n\n## 🔍 The Power of SQL\n\nWhile Korvus provides a high-level interface in multiple programming languages, its core operations are built on optimized SQL queries. This approach offers several advantages:\n\n- **Transparency**: Advanced users can inspect and understand the underlying queries.\n- **Customizability**: Extend Korvus's capabilities by modifying or adding to its SQL operations.\n- **Performance**: Benefit from PostgreSQL's advanced query optimization capabilities.\n\nDon't worry if you're not a SQL expert - Korvus's intuitive API abstracts away the complexity while still allowing you to harness the full power of SQL-based operations.\n\n## 📘 Documentation\n\nFor comprehensive documentation, including API references, tutorials, and best practices, visit our [official documentation](https://postgresml.org/docs/open-source/korvus/).\n\n## 🌐 Community\n\nJoin our community to get help, share ideas, and contribute:\n\n- [Discord](https://discord.gg/DmyJP3qJ7U)\n- [Twitter](https://x.com/postgresml)\n\n## 🤝 Contributing\n\nWe welcome contributions to Korvus! Please read our [Contribution Guidelines](CONTRIBUTING.md) before submitting pull requests.\n\n---\n\nKorvus is maintained by [PostgresML](https://postgresml.org). For enterprise support and consulting services, please [contact us](https://postgresml.org/contact).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplayer29879%2Fpostgresml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplayer29879%2Fpostgresml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplayer29879%2Fpostgresml/lists"}