{"id":13583883,"url":"https://github.com/ray-project/llm-applications","last_synced_at":"2025-04-11T19:14:28.667Z","repository":{"id":188625184,"uuid":"679091311","full_name":"ray-project/llm-applications","owner":"ray-project","description":"A comprehensive guide to building RAG-based LLM applications for production.","archived":false,"fork":false,"pushed_at":"2024-08-02T00:27:10.000Z","size":31165,"stargazers_count":1786,"open_issues_count":14,"forks_count":250,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-04-11T19:14:16.484Z","etag":null,"topics":["anyscale","fine-tuning","llama2","llms","machine-learning","openai","ray","serving"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc-by-4.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ray-project.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-16T04:41:26.000Z","updated_at":"2025-04-08T21:15:57.000Z","dependencies_parsed_at":"2023-10-15T02:58:44.141Z","dependency_job_id":"6db641aa-c192-47b4-9a7d-b1e369ca26b6","html_url":"https://github.com/ray-project/llm-applications","commit_stats":{"total_commits":86,"total_committers":5,"mean_commits":17.2,"dds":"0.40697674418604646","last_synced_commit":"2044d3faac9489379d3af5f555bbc36c4d7b33f6"},"previous_names":["ray-project/llm-applications"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ray-project%2Fllm-applications","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ray-project%
2Fllm-applications/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ray-project%2Fllm-applications/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ray-project%2Fllm-applications/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ray-project","download_url":"https://codeload.github.com/ray-project/llm-applications/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248465345,"owners_count":21108244,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anyscale","fine-tuning","llama2","llms","machine-learning","openai","ray","serving"],"created_at":"2024-08-01T15:03:52.220Z","updated_at":"2025-04-11T19:14:28.648Z","avatar_url":"https://github.com/ray-project.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","A01_文本生成_文本对话","📋 Playbooks \u0026 Design-Pattern Catalogs","openai","Educational Content"],"sub_categories":["大语言对话模型及数据","T19 · Voice \u0026 Multimodal","Courses and Tutorials"],"readme":"# LLM Applications\n\nA comprehensive guide to building RAG-based LLM applications for production.\n\n- **Blog post**: https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1\n- **GitHub repository**: https://github.com/ray-project/llm-applications\n- **Interactive notebook**: https://github.com/ray-project/llm-applications/blob/main/notebooks/rag.ipynb\n- **Anyscale Endpoints**: https://endpoints.anyscale.com/\n- **Ray documentation**: 
https://docs.ray.io/\n\nIn this guide, we will learn how to:\n\n- 💻 Develop a retrieval augmented generation (RAG) based LLM application from scratch.\n- 🚀 Scale the major components (load, chunk, embed, index, serve, etc.) in our application.\n- ✅ Evaluate different configurations of our application to optimize for both per-component (e.g., retrieval_score) and overall performance (quality_score).\n- 🔀 Implement an LLM hybrid routing approach to bridge the gap between OSS and closed LLMs.\n- 📦 Serve the application in a highly scalable and available manner.\n- 💥 Share the 1st order and 2nd order impacts LLM applications have had on our products.\n\n\u003cbr\u003e\n\u003cimg width=\"800\" src=\"https://images.ctfassets.net/xjan103pcp94/7FWrvPPlIdz5fs8wQgxLFz/fdae368044275028f0544a3d252fcfe4/image15.png\"\u003e\n\n## Setup\n\n### API keys\nWe'll be using [OpenAI](https://platform.openai.com/docs/models/) to access ChatGPT models like `gpt-3.5-turbo`, `gpt-4`, etc. and [Anyscale Endpoints](https://endpoints.anyscale.com/) to access OSS LLMs like `Llama-2-70b`. Be sure to create your accounts for both and have your credentials ready.\n\n### Compute\n\u003cdetails\u003e\n  \u003csummary\u003eLocal\u003c/summary\u003e\n  You could run this on your local laptop but we highly recommend using a setup with access to GPUs. You can set this up on your own or on [Anyscale](http://anyscale.com/).\n\u003c/details\u003e\n\n\u003cdetails open\u003e\n  \u003csummary\u003eAnyscale\u003c/summary\u003e\u003cbr\u003e\n\u003cul\u003e\n\u003cli\u003eStart a new \u003ca href=\"https://console.anyscale-staging.com/o/anyscale-internal/workspaces\"\u003eAnyscale workspace on staging\u003c/a\u003e using an \u003ca href=\"https://instances.vantage.sh/aws/ec2/g3.8xlarge\"\u003e\u003ccode\u003eg3.8xlarge\u003c/code\u003e\u003c/a\u003e head node, which has 2 GPUs and 32 CPUs. We can also add GPU worker nodes to run the workloads faster. 
If you\u0026#39;re not on Anyscale, you can configure a similar instance on your cloud.\u003c/li\u003e\n\u003cli\u003eUse the \u003ca href=\"https://docs.anyscale.com/reference/base-images/ray-262/py39#ray-2-6-2-py39\"\u003e\u003ccode\u003edefault_cluster_env_2.6.2_py39\u003c/code\u003e\u003c/a\u003e cluster environment.\u003c/li\u003e\n\u003cli\u003eUse the \u003ccode\u003eus-west-2\u003c/code\u003e region if you\u0026#39;d like to use the artifacts in our shared storage (source docs, vector DB dumps, etc.).\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003c/details\u003e\n\n### Repository\n```bash\ngit clone https://github.com/ray-project/llm-applications.git .\ngit config --global user.name \u003cGITHUB-USERNAME\u003e\ngit config --global user.email \u003cEMAIL-ADDRESS\u003e\n```\n\n### Data\nOur data is already available at `/efs/shared_storage/goku/docs.ray.io/en/master/` (on Staging, `us-east-1`) but if you want to load it yourself, run this bash command (change `/desired/output/directory`, but make sure it's on the shared storage,\nso that it's accessible to the workers)\n```bash\ngit clone https://github.com/ray-project/llm-applications.git .\n```\n\n### Environment\n\nThen set up the environment correctly by specifying the values in your `.env` file,\nand installing the dependencies:\n\n```bash\npip install --user -r requirements.txt\nexport PYTHONPATH=$PYTHONPATH:$PWD\npre-commit install\npre-commit autoupdate\n```\n\n### Credentials\n```bash\ntouch .env\n# Add environment variables to .env\nOPENAI_API_BASE=\"https://api.openai.com/v1\"\nOPENAI_API_KEY=\"\"  # https://platform.openai.com/account/api-keys\nANYSCALE_API_BASE=\"https://api.endpoints.anyscale.com/v1\"\nANYSCALE_API_KEY=\"\"  # https://app.endpoints.anyscale.com/credentials\nDB_CONNECTION_STRING=\"dbname=postgres user=postgres host=localhost password=postgres\"\nsource .env\n```\n\nNow we're ready to go through the [rag.ipynb](notebooks/rag.ipynb) interactive notebook to develop and serve our LLM 
application!\n\n### Learn more\n- If your team is investing heavily in developing LLM applications, [reach out](mailto:endpoints-help@anyscale.com) to us to learn more about how [Ray](https://github.com/ray-project/ray) and [Anyscale](http://anyscale.com/) can help you scale and productionize everything.\n- Start serving (+fine-tuning) OSS LLMs with [Anyscale Endpoints](https://endpoints.anyscale.com/) ($1/M tokens for `Llama-3-70b`) and private endpoints available upon request (1M free tokens trial).\n- Learn more about how companies like OpenAI, Netflix, Pinterest, Verizon, Instacart and others leverage Ray and Anyscale for their AI workloads at the [Ray Summit 2024](https://raysummit.anyscale.com/) this Sept 18-20 in San Francisco.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fray-project%2Fllm-applications","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fray-project%2Fllm-applications","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fray-project%2Fllm-applications/lists"}