# LLM Engine

[![LICENSE](https://img.shields.io/github/license/scaleapi/llm-engine.svg)](https://github.com/scaleapi/llm-engine/blob/master/LICENSE)
[![Release Notes](https://img.shields.io/github/release/scaleapi/llm-engine)](https://github.com/scaleapi/llm-engine/releases)
[![CircleCI](https://circleci.com/gh/scaleapi/llm-engine.svg?style=shield)](https://circleci.com/gh/scaleapi/llm-engine)

🚀 **The open source engine for fine-tuning and serving large language models.** 🚀

Scale's LLM Engine is the easiest way to customize and serve LLMs. Models can be accessed via Scale's hosted version, or you can use the Helm charts in this repository to run model inference and fine-tuning in your own infrastructure.

## 💻 Quick Install

```commandline
pip install scale-llm-engine
```

## 🤔 About

Foundation models are emerging as the building blocks of AI. However, deploying these models to the cloud and fine-tuning them are expensive operations that require infrastructure and ML expertise. Such deployments are also difficult to maintain over time as new models are released and new techniques for both inference and fine-tuning become available.

LLM Engine is a Python library, CLI, and Helm chart that provides everything you need to serve and fine-tune foundation models, whether you use Scale's hosted infrastructure or run it in your own cloud infrastructure on Kubernetes.

### Key Features

🎁 **Ready-to-use APIs for your favorite models**: Deploy and serve open-source foundation models, including LLaMA, MPT, and Falcon. Use Scale-hosted models or deploy to your own infrastructure.

🔧 **Fine-tune foundation models**: Fine-tune open-source foundation models on your own data for optimized performance.

🎙️ **Optimized inference**: LLM Engine provides inference APIs for streaming responses and dynamically batching inputs for higher throughput and lower latency.

🤗 **Open-source integrations**: Deploy any [Hugging Face](https://huggingface.co/) model with a single command.

### Features Coming Soon

🐳 **Kubernetes installation documentation**: We are working hard to document installation and maintenance of inference and fine-tuning functionality on your own infrastructure. For now, our documentation covers using our client libraries to access Scale's hosted infrastructure.

❄ **Fast cold-start times**: To prevent GPUs from idling, LLM Engine automatically scales your model to zero when it's not in use and scales back up within seconds, even for large foundation models.

💸 **Cost optimization**: Deploy AI models more cheaply than commercial alternatives, accounting for cold-start and warm-down times.

## 🚀 Quick Start

First, create an account at [Scale Spellbook](https://spellbook.scale.com/), then grab your API key from the [Settings](https://spellbook.scale.com/settings) page. Set this API key as the `SCALE_API_KEY` environment variable by adding the following line to your `.zshrc` or `.bash_profile`:

```commandline
export SCALE_API_KEY="[Your API key]"
```

If you run into an "Invalid API Key" error, you may need to run `. ~/.zshrc` to re-read your updated `.zshrc`.

With your API key set, you can now send LLM Engine requests using the Python client. Try out this starter code:

```py
from llmengine import Completion

response = Completion.create(
    model="falcon-7b-instruct",
    prompt="I'm opening a pancake restaurant that specializes in unique pancake shapes, colors, and flavors. List 3 quirky names I could name my restaurant.",
    max_new_tokens=100,
    temperature=0.2,
)

print(response.output.text)
```

You should see a successful completion of your given prompt!

_What's next?_ Visit the [LLM Engine documentation pages](https://scaleapi.github.io/llm-engine/) for more on the `Completion` and `FineTune` APIs and how to use them. Check out this [blog post](https://scale.com/blog/fine-tune-llama-2) for an end-to-end example.
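As a small sketch of the troubleshooting tip above, you can guard the request so a missing `SCALE_API_KEY` fails fast with a clear message instead of an "Invalid API Key" error at request time. This uses only the `Completion.create` call shown in the Quick Start; the `pancake_names` helper name is our own, not part of the library:

```python
import os


def pancake_names() -> str:
    """Send the starter prompt, but only if SCALE_API_KEY is configured."""
    if not os.environ.get("SCALE_API_KEY"):
        # Fail fast with an actionable message rather than an opaque API error.
        return "SCALE_API_KEY is not set; export it as shown in the Quick Start."

    # Imported lazily so the guard works even before `pip install scale-llm-engine`.
    from llmengine import Completion

    response = Completion.create(
        model="falcon-7b-instruct",
        prompt="List 3 quirky names I could name my pancake restaurant.",
        max_new_tokens=100,
        temperature=0.2,
    )
    return response.output.text


print(pancake_names())
```

Without the key set, this prints the hint instead of raising; with the key exported, it behaves like the starter code.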