{"id":13601311,"url":"https://github.com/devflowinc/trieve","last_synced_at":"2025-05-13T17:10:08.303Z","repository":{"id":156695963,"uuid":"619284964","full_name":"devflowinc/trieve","owner":"devflowinc","description":"All-in-one infrastructure for search, recommendations, RAG, and analytics offered via API","archived":false,"fork":false,"pushed_at":"2025-05-09T07:10:14.000Z","size":54168,"stargazers_count":2114,"open_issues_count":20,"forks_count":184,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-05-09T07:21:35.489Z","etag":null,"topics":["actix","actix-web","ai","artificial-intelligence","diesel","embedding","hacktoberfest","llm","postgresql","qdrant","qdrant-vector-database","rag","retrieval-augmented-generation","rust","search","search-engine","solidjs","tailwindcss","vector-search"],"latest_commit_sha":null,"homepage":"https://trieve.ai","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devflowinc.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-03-26T19:45:53.000Z","updated_at":"2025-05-09T07:10:18.000Z","dependencies_parsed_at":"2024-05-20T01:43:44.986Z","dependency_job_id":"a1b23ea0-bd4b-4dce-8399-01dbf961af6c","html_url":"https://github.com/devflowinc/trieve","commit_stats":{"total_commits":4273,"total_committers":54,"mean_commits":79.12962962962963,"dds":0.6805523051720104,"last_synced_commit":"085e150562d76e7de713fbc57316a3658fa2d37e"},"previous_names":["arguflow/vault-server","arguflow/arguflow"],"tags_count"
:12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devflowinc%2Ftrieve","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devflowinc%2Ftrieve/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devflowinc%2Ftrieve/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devflowinc%2Ftrieve/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devflowinc","download_url":"https://codeload.github.com/devflowinc/trieve/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253990467,"owners_count":21995774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["actix","actix-web","ai","artificial-intelligence","diesel","embedding","hacktoberfest","llm","postgresql","qdrant","qdrant-vector-database","rag","retrieval-augmented-generation","rust","search","search-engine","solidjs","tailwindcss","vector-search"],"created_at":"2024-08-01T18:01:00.292Z","updated_at":"2025-05-13T17:10:03.275Z","avatar_url":"https://github.com/devflowinc.png","language":"Rust","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg height=\"100\" src=\"https://cdn.trieve.ai/trieve-logo.png\" alt=\"Trieve Logo\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003cstrong\u003e\u003ca href=\"https://dashboard.trieve.ai\"\u003eSign Up (1k chunks free)\u003c/a\u003e | \u003ca href=\"https://pdf2md.trieve.ai\"\u003ePDF2MD\u003c/a\u003e | \u003ca 
href=\"https://docs.trieve.ai\"\u003eHacker News Search Engine\u003c/a\u003e | \u003ca href=\"https://docs.trieve.ai\"\u003eDocumentation\u003c/a\u003e | \u003ca href=\"https://cal.com/nick.k/meet\"\u003eMeet a Maintainer\u003c/a\u003e | \u003ca href=\"https://discord.gg/eBJXXZDB8z\"\u003eDiscord\u003c/a\u003e | \u003ca href=\"https://matrix.to/#/#trieve-general:trieve.ai\"\u003eMatrix\u003c/a\u003e\n\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/devflowinc/trieve/stargazers\"\u003e\n        \u003cimg src=\"https://img.shields.io/github/stars/devflowinc/trieve.svg?style=flat\u0026color=yellow\" alt=\"Github stars\"/\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://discord.gg/CuJVfgZf54\"\u003e\n        \u003cimg src=\"https://img.shields.io/discord/1130153053056684123.svg?label=Discord\u0026logo=Discord\u0026colorB=7289da\u0026style=flat\" alt=\"Join Discord\"/\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://matrix.to/#/#trieve-general:trieve.ai\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/matrix-join-purple?style=flat\u0026logo=matrix\u0026logocolor=white\" alt=\"Join Matrix\"/\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://smithery.ai/server/trieve-mcp-server\"\u003e\n        \u003cimg src=\"https://smithery.ai/badge/trieve-mcp-server\" alt=\"smithery badge\"/\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://insiders.vscode.dev/redirect?url=vscode%3Amcp%2Finstall%3F%257B%2522name%2522%253A%2522trieve-mcp-server%2522%252C%2522command%2522%253A%2522npx%2522%252C%2522args%2522%253A%255B%2522more%2520args...%2522%255D%257D\"\u003e\n        \u003cimg 
src=\"https://img.shields.io/badge/vscode-mcp-install?style=flat\u0026logoColor=%230078d4\u0026label=vscode-mcp\u0026labelColor=%230078d4\u0026link=https%3A%2F%2Finsiders.vscode.dev%2Fredirect%3Furl%3Dvscode%253Amcp%252Finstall%253F%25257B%252522name%252522%25253A%252522trieve-mcp-server%252522%25252C%252522command%252522%25253A%252522npx%252522%25252C%252522args%252522%25253A%25255B%252522more%252520args...%252522%25255D%25257D\" alt=\"vscode mcp install badge\"/\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch2 align=\"center\"\u003e\n    \u003cb\u003eAll-in-one solution for search, recommendations, and RAG\u003c/b\u003e\n\u003c/h2\u003e\n\n\u003ca href=\"https://trieve.ai\"\u003e\n  \u003cimg src=\"https://cdn.trieve.ai/landing-tabs/light-api.webp\"\u003e\n\u003c/a\u003e\n\n## Quick Links\n\n- [API Reference + Docs](https://docs.trieve.ai/api-reference)\n- [OpenAPI specification](https://api.trieve.ai/redoc)\n- [Typescript SDK](https://ts-sdk.trieve.ai/)\n- [Python SDK](https://pypi.org/project/trieve-py-client/)\n\n## Features\n\n- **🔒 Self-Hosting in your VPC or on-prem**: We have full self-hosting guides for AWS, GCP, Kubernetes generally, and docker compose available on our [documentation page here](https://docs.trieve.ai/self-hosting/docker-compose).\n- **🧠 Semantic Dense Vector Search**: Integrates with OpenAI or Jina embedding models and [Qdrant](https://qdrant.tech) to provide semantic vector search.\n- **🔍 Typo Tolerant Full-Text/Neural Search**: Every uploaded chunk is vector'ized with [naver/efficient-splade-VI-BT-large-query](https://huggingface.co/naver/efficient-splade-VI-BT-large-query) for typo tolerant, quality neural sparse-vector search.\n- **🖊️ Sub-Sentence Highlighting**: Highlight the matching words or sentences within a chunk and bold them on search to enhance UX for your users. 
Shout out to the [simsearch](https://github.com/smartdatalake/simsearch) crate!\n- **🌟 Recommendations**: Find similar chunks (or files if using grouping) with the recommendation API. This is very helpful if you have a platform where users favorite, bookmark, or upvote content.\n- **🤖 Convenient RAG API Routes**: We integrate with OpenRouter to provide you with access to any LLM you would like for RAG. Try our routes for [fully-managed RAG with topic-based memory management](https://api.trieve.ai/redoc#tag/message/operation/create_message_completion_handler) or [select your own context RAG](https://api.trieve.ai/redoc#tag/chunk/operation/generate_off_chunks).\n- **💼 Bring Your Own Models**: If you'd like, you can bring your own text-embedding, SPLADE, cross-encoder re-ranking, and/or large-language model (LLM) and plug it into our infrastructure.\n- **🔄 Hybrid Search with cross-encoder re-ranking**: For the best results, use hybrid search with [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) re-rank optimization.\n- **📆 Recency Biasing**: Easily bias search results toward the most recent content to prevent staleness\n- **🛠️ Tunable Merchandizing**: Adjust relevance using signals like clicks, add-to-carts, or citations\n- **🕳️ Filtering**: Date-range, substring match, tag, numeric, and other filter types are supported.\n- **👥 Grouping**: Mark multiple chunks as being part of the same file and search on the file-level such that the same top-level result never appears twice\n\n**Are we missing a feature that your use case would need?** - call us at [628-222-4090](mailto:+16282224090), make a [Github issue](https://github.com/devflowinc/trieve/issues), or join the [Matrix community](https://matrix.to/#/#trieve-general:trieve.ai) and tell us! 
We are a small company that is still very hands-on and eager to build what you need; professional services are available.\n\n## Local development with Linux\n\n### Installing via Smithery\n\nTo install Trieve for Claude Desktop automatically via [Smithery](https://smithery.ai/server/trieve-mcp-server):\n\n```bash\nnpx -y @smithery/cli install trieve-mcp-server --client claude\n```\n\n### Debian/Ubuntu Packages needed\n\n```sh\nsudo apt install curl \\\ngcc \\\ng++ \\\nmake \\\npkg-config \\\npython3 \\\npython3-pip \\\nlibpq-dev \\\nlibssl-dev \\\nopenssl\n```\n\n### Arch Packages needed\n\n```sh\nsudo pacman -S base-devel postgresql-libs\n```\n\n### Install NodeJS and Yarn\n\nYou can install [NVM](https://github.com/nvm-sh/nvm) using its install script.\n\n```\ncurl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash\n```\n\nRestart the terminal so that your bash profile picks up NVM. Then, you can install the NodeJS LTS release and Yarn.\n\n```\nnvm install --lts\nnpm install -g yarn\n```\n\n### Make server tmp dir\n\n```\nmkdir server/tmp\n```\n\n### Install rust\n\n```\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\n```\n\n### Install cargo-watch\n\n```\ncargo install cargo-watch\n```\n\n### Setup env's\n\nYou might need to create the `analytics` directory in ./frontends\n\n```\ncp .env.analytics ./frontends/analytics/.env\ncp .env.chat ./frontends/chat/.env\ncp .env.search ./frontends/search/.env\ncp .env.example ./server/.env\ncp .env.dashboard ./frontends/dashboard/.env\n```\n\n### Add your `LLM_API_KEY` to `./server/.env`\n\n[Here is a guide for acquiring that](https://blog.streamlit.io/beginners-guide-to-openai-api/#get-your-own-openai-api-key).\n\n#### Steps once you have the key\n\n1. Open the `./server/.env` file\n2. Replace the value of `LLM_API_KEY` with your own OpenAI API key.\n3. 
Replace the value of `OPENAI_API_KEY` with your own OpenAI API key.\n\n### Export the following keys in your terminal for local dev\n\nThe PAGEFIND_CDN_BASE_URL and S3_SECRET_KEY_CSVJSONL can be set to any comma-separated list of random strings.\n\n```\nexport OPENAI_API_KEY=\"your_OpenAI_api_key\" \\\nLLM_API_KEY=\"your_OpenAI_api_key\" \\\nPAGEFIND_CDN_BASE_URL=\"lZP8X4h0Q5Sj2ZmV,aAmu1W92T6DbFUkJ,DZ5pMvz8P1kKNH0r,QAqwvKh8rI5sPmuW,YMwgsBz7jLfN0oX8\" \\\nS3_SECRET_KEY_CSVJSONL=\"Gq6wzS3mjC5kL7i4KwexnL3gP8Z1a5Xv,V2c4ZnL0uHqBzFvR2NcN8Pb1g6CjmX9J,TfA1h8LgI5zYkH9A9p7NvWlL0sZzF9p8N,pKr81pLq5n6MkNzT1X09R7Qb0Vn5cFr0d,DzYwz82FQiW6T3u9A4z9h7HLOlJb7L2V1\" \\\nGROQ_API_KEY=\"GROQ_API_KEY_if_applicable\"\n\n```\n\n### Start docker container services needed for local dev\n\n```\ncat .env.chat .env.search .env.server .env.docker-compose \u003e .env\n\n./convenience.sh -l\n```\n\n### Install front-end packages for local dev\n\n```\ncd frontends\nyarn\n```\n`cd ..`\n\n```\ncd clients/ts-sdk\nyarn build\n```\n`cd ../..`\n\n### Start services for local dev\n\nIt is recommended to manage services through [tmuxp, see the guide here](https://gist.github.com/skeptrunedev/101c7a13bb9b9242999830655470efac) or terminal tabs.\n\n```\ncd frontends\nyarn\nyarn dev\n```\n\n```\ncd server\ncargo watch -x run\n```\n\n```\ncd server\ncargo run --bin ingestion-worker\n```\n\n```\ncd server\ncargo run --bin file-worker\n```\n\n```\ncd server\ncargo run --bin delete-worker\n```\n\n```\ncd search\nyarn\nyarn dev\n```\n\n### Verify Working Setup\n\nAfter the cargo build has finished (after the `tmuxp load trieve`):\n- check that you can see redoc with the OpenAPI reference at [localhost:8090/redoc](http://localhost:8090/redoc)\n- make an account and create a dataset with test data at [localhost:5173](http://localhost:5173)\n- search that dataset with test data at [localhost:5174](http://localhost:5174)\n\n### Additional Instructions for testing cross encoder reranking models\n\nTo test the Cross Encoder rerankers in 
local dev, \n- click on the dataset, go to the Dataset Settings -\u003e Dataset Options -\u003e Additional Options and uncheck the `Fulltext Enabled` option.\n- in the Embedding Settings, select your reranker model and enter the respective key in the adjacent textbox, and hit save.\n- in the search playground, set Type -\u003e Semantic and select Rerank By -\u003e Cross Encoder\n- if AIMon Reranker is selected in the Embedding Settings, you can enter an optional Task Definition in the search playground to specify the domain of context documents to the AIMon reranker.\n\n\n### Debugging issues with local dev\n\nReach out to us on [discord](https://discord.gg/E9sPRZqpDT) for assistance. We are available and more than happy to assist.\n\n## Debug diesel by getting the exact generated SQL\n\n`diesel::debug_query(\u0026query).to_string();`\n\n## Local Setup for Testing Stripe Features\n\nInstall Stripe CLI.\n\n1. `stripe login`\n2. `stripe listen --forward-to localhost:8090/api/stripe/webhook`\n3. set the `STRIPE_WEBHOOK_SECRET` in the `server/.env` to the resulting webhook signing secret\n4. `stripe products create --name trieve --default-price-data.unit-amount 1200 --default-price-data.currency usd`\n5. 
`stripe plans create --amount=1200 --currency=usd --interval=month --product={id from response of step 4}`\n\n## Contributors\n\n\u003ca href=\"https://github.com/devflowinc/trieve/graphs/contributors\"\u003e\n  \u003cimg alt=\"contributors\" src=\"https://contrib.rocks/image?repo=devflowinc/trieve\"/\u003e\n\u003c/a\u003e\n","funding_links":[],"categories":["Vector Database Engines","A01_文本生成_文本对话","Rust","artificial-intelligence","APIs and HTTP Requests","服务器实现","MCP Servers","Vector Databases \u0026 Retrieval Platforms"],"sub_categories":["大语言对话模型及数据","数据与知识","Knowledge \u0026 Memory"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevflowinc%2Ftrieve","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevflowinc%2Ftrieve","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevflowinc%2Ftrieve/lists"}