{"id":13722422,"url":"https://github.com/undreamai/LLMUnity","last_synced_at":"2025-05-07T15:30:45.823Z","repository":{"id":215121373,"uuid":"736173948","full_name":"undreamai/LLMUnity","owner":"undreamai","description":"Create characters in Unity with LLMs!","archived":false,"fork":false,"pushed_at":"2025-04-30T21:21:46.000Z","size":19728,"stargazers_count":1056,"open_issues_count":15,"forks_count":115,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-05-02T15:18:03.246Z","etag":null,"topics":["ai","character","chat","chatbot","conversational-ai","dialogue","game-development","gamedev","generative-ai","llama","llama-cpp","llm","npc","rag","unity","unity2d","unity3d"],"latest_commit_sha":null,"homepage":"https://undream.ai","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/undreamai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["amakropoulos"]}},"created_at":"2023-12-27T07:20:05.000Z","updated_at":"2025-05-02T12:29:44.000Z","dependencies_parsed_at":"2024-04-23T11:56:44.241Z","dependency_job_id":"442b7942-9879-49b4-9fcc-0118a11e136d","html_url":"https://github.com/undreamai/LLMUnity","commit_stats":{"total_commits":1052,"total_committers":10,"mean_commits":105.2,"dds":"0.12357414448669202","last_synced_commit":"5f94629010f8b7f268d77eb9fabff23772cc6fbf"},"previous_names":["undreamai/llmunity"],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/undream
ai%2FLLMUnity","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/undreamai%2FLLMUnity/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/undreamai%2FLLMUnity/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/undreamai%2FLLMUnity/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/undreamai","download_url":"https://codeload.github.com/undreamai/LLMUnity/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252905550,"owners_count":21822823,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","character","chat","chatbot","conversational-ai","dialogue","game-development","gamedev","generative-ai","llama","llama-cpp","llm","npc","rag","unity","unity2d","unity3d"],"created_at":"2024-08-03T01:01:28.570Z","updated_at":"2025-05-07T15:30:40.814Z","avatar_url":"https://github.com/undreamai.png","language":"C#","funding_links":["https://github.com/sponsors/amakropoulos"],"categories":["Project List"],"sub_categories":["\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e"],"readme":"\n\u003cp align=\"center\"\u003e\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\".github/logo_white.png\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\".github/logo.png\"\u003e\n  \u003cimg src=\".github/logo.png\" height=\"150\"/\u003e\n\u003c/picture\u003e\n\u003c/p\u003e\n\n\u003ch3 align=\"center\"\u003eCreate characters in Unity with 
LLMs!\u003c/h3\u003e\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\u003ca href=\"https://discord.gg/RwXKQb6zdv\"\u003e\u003cimg src=\"https://discordapp.com/api/guilds/1194779009284841552/widget.png?style=shield\"/\u003e\u003c/a\u003e\n[![Reddit](https://img.shields.io/badge/Reddit-%23FF4500.svg?style=flat\u0026logo=Reddit\u0026logoColor=white)](https://www.reddit.com/user/UndreamAI)\n[![LinkedIn](https://img.shields.io/badge/LinkedIn-blue?style=flat\u0026logo=linkedin\u0026labelColor=blue)](https://www.linkedin.com/company/undreamai)\n[![Asset Store](https://img.shields.io/badge/Asset%20Store-black.svg?style=flat\u0026logo=unity)](https://assetstore.unity.com/packages/slug/273604)\n[![GitHub Repo stars](https://img.shields.io/github/stars/undreamai/LLMUnity?style=flat\u0026logo=github\u0026color=f5f5f5)](https://github.com/undreamai/LLMUnity)\n[![Documentation](https://img.shields.io/badge/Docs-white.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADAAAAAwEAYAAAAHkiXEAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAAATqSURBVHic7ZtbiE1RGMc349K4M5EwklwjzUhJCMmTJPJAYjQXJJcH8+Blkry4lPJA8aAoJbekDLmUS6E8SHJL5AW5JPf77eHv93C22Wfttc/ee+0zc/4vv+bMXvusvfZa3/q+b33H80oqqaSSSmqrKnPdgXjUvbvYq5f4+7f486eb/rRajRsn7t4tPngg/vol/vkj/vghXr0q7tghzpyZ//79+on79omXLombNondukXrd9GoSxdx8mSxqUm8eVNkgAvl0aPioEFip07i6dP52z15Ig4fbvVY2VVFhbhokXjrlogJiWvAg/jwoXjqVO73+leUny9eiFVV5mfMlLDRBw+KX76ISQ+0LZ8/F00v4uJFsWPHFh83O+rdWzx3TnQ9wCZ+/Sqyl5iux1RmTu3aiYcPi64H1pasALypoOv4/8SJXraEbXc9kLbECxo2TKyuFj9/zt9u+XIvG8LWv3wpuh5QW86f3/JznT+fv93s2S23C1Z72wbhtH692LdvMvdPSgzkhAkiJhT16ZO/PRPOmcr+Rda4aa5nclTeuZP7PDgRpr1g40bPrQYOFF0PYKHEC+raVVy8OFy7R49EArvURU4mrUAqaTY0iB8/2rXD+XCm5mbR9QAWylevorV7/VpkL0ld06eLpkiyWPj9u93179+LpFZwZ1PXtGnitWui64GMStPmG7SH1NSIJBNHjvTSFZvRvHlise0N9JcBtW1/44Y4dqx45IjnU0JxAGLpklPx+9VZFwPp/9v/eZDGjxcZh7dv4+mXtch+up7Rca+MsJvxiRNi6nvBhg25HWprZMaPGeOlqxEjxGKz+XGRTAAmyJnq6sR
370TXA2NLW+8HNjZ62dLOnaLrAQ1r2zmqPH482n0mTfJCKmEvCJHUooNZE/369Elct06kqiKsONRfulTEFDsX8QDlIa5nup9374pE8IiZHPY+ly+LZE/37/cM6mC6IB6Vl4urV6fzfUG6d0/csyf37wsXRFInaM4ckTjGdPg+apTYs6dI3RIWwH//1DV1qkiuxNY2FzrTd+2y6y8z2HQU6efZs+KBAyJZ4v+V0h6ArlwROaQP0uPH4ooV4sqV8Xz/4MF211M2wwoOq1mzRAq5Pnywa5+4KDHE9mI7ly0TO3fOvZ6/eZCoKwB32HS0SMFV1DNtImBKHYstBROoQ4fEQk2RaS+qrxejmj5M7NatIhWARS82xUJfAKahzFcdPnq0GLYgy7Rnbd8e6rGKRyzpuNzPBQty709RcNSZf/KkuHCh2GpMDyKbGNcLYE+YMkVks336NFx7XhTZ3szXiBaqtWvFuAOxM2dEZiyH8UErgc8JLNun7E0aFffSI7RP6owZmz9kSO73HjsmXr8ukppYsybSYyQvBp5QfOjQ3M9tRR496pGgLf1JtLlzRZJzlFzGp4SWDnUxFCrdvy+uWiWa3DJe3N69oj8uSEq8CER88uaNOGBAOv2ILGY69TBBJoM8O0t72zaRoztXBzlLlrT8XARW/IQq82JTMv3mKmv0/9CC4mJMYPwrMSETxAyurRUxQVmXP1fEid7mzeK3b+n2Jzb16CFu2SIWmtNJiriVxANsyq0uoCJfTk4G9y4t24/bSQ0rTkP6gVTG3mz//uKMGSK/ucId5Xe9lZUi5eMMLGUgz56J5Hxu3xZ50Xg3RMIltVn9BRja26PYsBHgAAAAAElFTkSuQmCC)](https://undream.ai/LLMUnity)\n\nLLM for Unity enables seamless integration of Large Language Models (LLMs) within the Unity engine.\u003cbr\u003e\nIt allows to create intelligent characters that your players can interact with for an immersive experience.\u003cbr\u003e\nThe package also features a Retrieval-Augmented Generation (RAG) system that allows to performs semantic search across your data, which can be used to enhance the character's knowledge.\nLLM for Unity is built on top of the awesome [llama.cpp](https://github.com/ggerganov/llama.cpp) library.\n\n\u003csub\u003e\n\u003ca href=\"#at-a-glance\" style=\"color: black\"\u003eAt a glance\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#how-to-help\" style=color: black\u003eHow to help\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#games-using-llm-for-unity\" style=color: black\u003eGames using LLM for Unity\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#setup\" style=color: black\u003eSetup\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#how-to-use\" style=color: black\u003eHow to 
use\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#semantic-search-with-a-retrieval-augmented-generation-(rag)-system\" style=color: black\u003eRAG\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#llm-model-management\" style=color: black\u003eLLM model management\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#examples\" style=color: black\u003eExamples\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#options\" style=color: black\u003eOptions\u003c/a\u003e\u0026nbsp;\u0026nbsp;•\u0026nbsp;\n\u003ca href=\"#license\" style=color: black\u003eLicense\u003c/a\u003e\n\u003c/sub\u003e\n\n## At a glance\n- 💻 Cross-platform! Windows, Linux, macOS and Android\n- 🏠 Runs locally without internet access. No data ever leave the game!\n- ⚡ Blazing fast inference on CPU and GPU (Nvidia, AMD, Apple Metal)\n- 🤗 Supports all major LLM models\n- 🔧 Easy to setup, call with a single line of code\n- 💰 Free to use for both personal and commercial purposes\n\n🧪 Tested on Unity: 2021 LTS, 2022 LTS, 2023\u003cbr\u003e\n🚦 [Upcoming Releases](https://github.com/orgs/undreamai/projects/2/views/10)\n\n## How to help\n- [⭐ Star](https://github.com/undreamai/LLMUnity) the repo, leave us a [review](https://assetstore.unity.com/packages/slug/273604) and spread the word about the project!\n- Join us at [Discord](https://discord.gg/RwXKQb6zdv) and say hi.\n- [Contribute](CONTRIBUTING.md) by submitting feature requests, bugs or even your own PR.\n- [![](https://img.shields.io/static/v1?label=Sponsor\u0026message=%E2%9D%A4\u0026logo=GitHub\u0026color=%23fe8e86)](https://github.com/sponsors/amakropoulos) this work to allow even cooler features!\n\n\n## Games using LLM for Unity\n- [Verbal Verdict](https://store.steampowered.com/app/2778780/Verbal_Verdict/)\n- [I, Chatbot: AISYLUM](https://store.epicgames.com/de/p/i-chatbot-aisylum-83b2b5)\n- [Nameless Souls of the Void](https://unicorninteractive.itch.io/nameless-souls-of-the-void)\n- 
[Murder in Aisle 4](https://roadedlich.itch.io/murder-in-aisle-4)\n- [Finicky Food Delivery AI](https://helixngc7293.itch.io/finicky-food-delivery-ai)\n- [AI Emotional Girlfriend](https://whynames.itch.io/aiemotionalgirlfriend)\n- [Case Closed](https://store.steampowered.com/app/2532160/Case_Closed)\n\nContact us to add your project!\n\n## Setup\n_Method 1: Install using the asset store_\n- Open the [LLM for Unity](https://assetstore.unity.com/packages/slug/273604) asset page and click `Add to My Assets`\n- Open the Package Manager in Unity: `Window \u003e Package Manager`\n- Select the `Packages: My Assets` option from the drop-down\n- Select the `LLM for Unity` package, click `Download` and then `Import`\n\n_Method 2: Install using the GitHub repo:_\n- Open the Package Manager in Unity: `Window \u003e Package Manager`\n- Click the `+` button and select `Add package from git URL`\n- Use the repository URL `https://github.com/undreamai/LLMUnity.git` and click `Add`\n\n## How to use\n\u003cimg height=\"300\" src=\".github/character.png\"/\u003e\n\nFirst you will setup the LLM for your game 🏎:\n- Create an empty GameObject.\u003cbr\u003eIn the GameObject Inspector click `Add Component` and select the LLM script.\n- Download one of the default models with the `Download Model` button (~GBs).\u003cbr\u003eOr load your own .gguf model with the `Load model` button (see [LLM model management](#llm-model-management)).\n\nThen you can setup each of your characters as follows 🙋‍♀️:\n- Create an empty GameObject for the character.\u003cbr\u003eIn the GameObject Inspector click `Add Component` and select the LLMCharacter script.\n- Define the role of your AI in the `Prompt`. 
You can define the name of the AI (`AI Name`) and the player (`Player Name`).\n- (Optional) Select the LLM constructed above in the `LLM` field if you have more than one LLM GameObject.\n\nYou can also adjust the LLM and character settings according to your preference (see [Options](#options)).\n\nIn your script you can then use it as follows 🦄:\n``` c#\nusing UnityEngine;\nusing LLMUnity;\n\npublic class MyScript : MonoBehaviour {\n  public LLMCharacter llmCharacter;\n  \n  void HandleReply(string reply){\n    // do something with the reply from the model\n    Debug.Log(reply);\n  }\n  \n  void Game(){\n    // your game function\n    ...\n    string message = \"Hello bot!\";\n    _ = llmCharacter.Chat(message, HandleReply);\n    ...\n  }\n}\n```\nYou can also specify a function to call when the model reply has been completed.\u003cbr\u003e\nThis is useful if the `Stream` option is enabled for continuous output from the model (default behaviour):\n``` c#\n  void ReplyCompleted(){\n    // do something when the reply from the model is complete\n    Debug.Log(\"The AI replied\");\n  }\n  \n  void Game(){\n    // your game function\n    ...\n    string message = \"Hello bot!\";\n    _ = llmCharacter.Chat(message, HandleReply, ReplyCompleted);\n    ...\n  }\n```\n\nTo stop the chat without waiting for its completion you can use:\n``` c#\n    llmCharacter.CancelRequests();\n```\n\n- Finally, in the Inspector of the GameObject of your script, select the LLMCharacter GameObject created above as the llmCharacter property.\n\nThat's all ✨!\n\u003cbr\u003e\u003cbr\u003e\nYou can also:\n\n\u003cdetails\u003e\n\u003csummary\u003eBuild a mobile app on Android\u003c/summary\u003e\n\nTo build an Android app you need to specify the `IL2CPP` scripting backend and `ARM64` as the target architecture in the player settings.\u003cbr\u003e\nThese settings can be accessed from the `Edit \u003e Project Settings` menu within the `Player \u003e Other Settings` section.\u003cbr\u003e\n\n\u003cimg width=\"400\" 
src=\".github/android.png\"\u003e\n\nIt is also a good idea to enable the `Download on Build` option in the LLM GameObject to download the model on launch in order to keep the app size small.\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eSave / Load your chat history\u003c/summary\u003e\n\nTo automatically save / load your chat history, you can specify the `Save` parameter of the LLMCharacter to the filename (or relative path) of your choice.\nThe file is saved in the [persistentDataPath folder of Unity](https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).\nThis also saves the state of the LLM which means that the previously cached prompt does not need to be recomputed.\n\nTo manually save your chat history, you can use:\n``` c#\n    llmCharacter.Save(\"filename\");\n```\nand to load the history:\n``` c#\n    llmCharacter.Load(\"filename\");\n```\nwhere filename the filename or relative path of your choice.\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eProcess the prompt at the beginning of your app for faster initial processing time\u003c/summary\u003e\n\n``` c#\n  void WarmupCompleted(){\n    // do something when the warmup is complete\n    Debug.Log(\"The AI is nice and ready\");\n  }\n\n  void Game(){\n    // your game function\n    ...\n    _ = llmCharacter.Warmup(WarmupCompleted);\n    ...\n  }\n```\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eDecide whether or not to add the message to the chat/prompt history\u003c/summary\u003e\n\n  The last argument of the `Chat` function is a boolean that specifies whether to add the message to the history (default: true):\n``` c#\n  void Game(){\n    // your game function\n    ...\n    string message = \"Hello bot!\";\n    _ = llmCharacter.Chat(message, HandleReply, ReplyCompleted, false);\n    ...\n  }\n```\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eUse pure text completion\u003c/summary\u003e\n\n``` c#\n  void 
Game(){\n    // your game function\n    ...\n    string message = \"The cat is away\";\n    _ = llmCharacter.Complete(message, HandleReply, ReplyCompleted);\n    ...\n  }\n```\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eWait for the reply before proceeding to the next lines of code\u003c/summary\u003e\n\n  For this you can use the `async`/`await` functionality:\n``` c#\n  async void Game(){\n    // your game function\n    ...\n    string message = \"Hello bot!\";\n    string reply = await llmCharacter.Chat(message, HandleReply, ReplyCompleted);\n    Debug.Log(reply);\n    ...\n  }\n```\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eAdd a LLM / LLMCharacter component programmatically\u003c/summary\u003e\n\n``` c#\nusing UnityEngine;\nusing LLMUnity;\n\npublic class MyScript : MonoBehaviour\n{\n    LLM llm;\n    LLMCharacter llmCharacter;\n\n    async void Start()\n    {\n        // disable gameObject so that Awake is not called immediately\n        gameObject.SetActive(false);\n\n        // Add an LLM object\n        llm = gameObject.AddComponent\u003cLLM\u003e();\n        // set the model using the filename of the model.\n        // The model needs to be added to the LLM model manager (see LLM model management) by loading or downloading it.\n        // Otherwise the model file can be copied directly inside the StreamingAssets folder.\n        llm.SetModel(\"Phi-3-mini-4k-instruct-q4.gguf\");\n        // optional: you can also set loras in a similar fashion and set their weights (if needed)\n        llm.AddLora(\"my-lora.gguf\");\n        llm.SetLoraWeight(0.5f);\n        // optional: you can set the chat template of the model if it is not correctly identified\n        // You can find a list of chat templates in the ChatTemplate.templates.Keys\n        llm.SetTemplate(\"phi-3\");\n        // optional: set number of threads\n        llm.numThreads = -1;\n        // optional: enable GPU by setting the number of model layers to 
offload to it\n        llm.numGPULayers = 10;\n\n        // Add an LLMCharacter object\n        llmCharacter = gameObject.AddComponent\u003cLLMCharacter\u003e();\n        // set the LLM object that handles the model\n        llmCharacter.llm = llm;\n        // set the character prompt\n        llmCharacter.SetPrompt(\"A chat between a curious human and an artificial intelligence assistant.\");\n        // set the AI and player name\n        llmCharacter.AIName = \"AI\";\n        llmCharacter.playerName = \"Human\";\n        // optional: set streaming to false to get the complete result in one go\n        // llmCharacter.stream = false;\n        // optional: set a save path\n        // llmCharacter.save = \"AICharacter1\";\n        // optional: enable the save cache to avoid recomputation when loading a save file (requires ~100 MB)\n        // llmCharacter.saveCache = true;\n        // optional: set a grammar\n        // await llmCharacter.SetGrammar(\"json.gbnf\");\n\n        // re-enable gameObject\n        gameObject.SetActive(true);\n    }\n}\n```\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eUse a remote server\u003c/summary\u003e\n\nYou can use a remote server to carry out the processing and implement characters that interact with it.\n\n**Create the server**\u003cbr\u003e\nTo create the server:\n- Create a project with a GameObject using the `LLM` script as described above\n- Enable the `Remote` option of the `LLM` and optionally configure the server parameters: port, API key, SSL certificate, SSL key\n- Build and run to start the server\n\nAlternatively you can use a server binary for easier deployment:\n- Run the above scene from the Editor and copy the command from the Debug messages (starting with \"Server command:\")\n- Download the [server binaries](https://github.com/undreamai/LlamaLib/releases/download/v1.1.12/undreamai-v1.1.12-server.zip) and 
[DLLs](https://github.com/undreamai/LlamaLib/releases/download/v1.1.12/undreamai-v1.1.12-llamacpp-full.zip) and extract them into the same folder\n- Find the architecture you are interested in from the folder above, e.g. for Windows and CUDA use the `windows-cuda-cu12.2.0` folder.\u003cbr\u003eYou can also check the architecture that works for your system from the Debug messages (starting with \"Using architecture\").\n- From the command line, change directory to the selected architecture folder and start the server by running the command copied above.\n\n**Create the characters**\u003cbr\u003e\nCreate a second project with the game characters using the `LLMCharacter` script as described above.\nEnable the `Remote` option and configure the host with the IP address (starting with \"http://\") and port of the server.\n\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003eCompute embeddings using a LLM\u003c/summary\u003e\n\nThe `Embeddings` function can be used to obtain the embeddings of a phrase:\n``` c#\n    List\u003cfloat\u003e embeddings = await llmCharacter.Embeddings(\"hi, how are you?\");\n```\n\n\u003c/details\u003e\n\n\u003cb\u003eDetailed documentation\u003c/b\u003e at the function level can be found here:\n\u003ca href=\"https://undream.ai/LLMUnity\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/Documentation-white.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAADAAAAAwEAYAAAAHkiXEAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAAATqSURBVHic7ZtbiE1RGMc349K4M5EwklwjzUhJCMmTJPJAYjQXJJcH8+Blkry4lPJA8aAoJbekDLmUS6E8SHJL5AW5JPf77eHv93C22Wfttc/ee+0zc/4vv+bMXvusvfZa3/q+b33H80oqqaSSSmqrKnPdgXjUvbvYq5f4+7f486eb/rRajRsn7t4tPngg/vol/vkj/vghXr0q7tghzpyZ//79+on79omXLombNondukXrd9GoSxdx8mSxqUm8eVNkgAvl0aPioEFip07i6dP52z15Ig4fbvVY2VVFhbhokXjrlogJiWvAg/jwoXjqVO73+leUny9eiFVV5mfMlLDRBw+KX76ISQ+0LZ8/F00v4uJFsWPHFh83O+rdWzx3TnQ9wCZ+/Sqyl5iux1RmTu3aiYcPi64H1pasALypoOv4/8SJXraEbXc9kLbECxo2TKyuFj9/zt9u+XIvG8LWv3wpuh5QW86f3/JznT+fv93s2S23C1Z72wbhtH692LdvMvdPSgzkhAkiJhT16ZO/PRPOmcr+Rda4aa5nclTeuZP7PDgRpr1g40bPrQYOFF0PYKHEC+raVVy8OFy7R49EArvURU4mrUAqaTY0iB8/2rXD+XCm5mbR9QAWylevorV7/VpkL0ld06eLpkiyWPj9u93179+LpFZwZ1PXtGnitWui64GMStPmG7SH1NSIJBNHjvTSFZvRvHlise0N9JcBtW1/44Y4dqx45IjnU0JxAGLpklPx+9VZFwPp/9v/eZDGjxcZh7dv4+mXtch+up7Rca+MsJvxiRNi6nvBhg25HWprZMaPGeOlqxEjxGKz+XGRTAAmyJnq6sR370TXA2NLW+8HNjZ62dLOnaLrAQ1r2zmqPH482n0mTfJCKmEvCJHUooNZE/369Elct06kqiKsONRfulTEFDsX8QDlIa5nup9374pE8IiZHPY+ly+LZE/37/cM6mC6IB6Vl4urV6fzfUG6d0/csyf37wsXRFInaM4ckTjGdPg+apTYs6dI3RIWwH//1DV1qkiuxNY2FzrTd+2y6y8z2HQU6efZs+KBAyJZ4v+V0h6ArlwROaQP0uPH4ooV4sqV8Xz/4MF211M2wwoOq1mzRAq5Pnywa5+4KDHE9mI7ly0TO3fOvZ6/eZCoKwB32HS0SMFV1DNtImBKHYstBROoQ4fEQk2RaS+qrxejmj5M7NatIhWARS82xUJfAKahzFcdPnq0GLYgy7Rnbd8e6rGKRyzpuNzPBQty709RcNSZf/KkuHCh2GpMDyKbGNcLYE+YMkVks336NFx7XhTZ3szXiBaqtWvFuAOxM2dEZiyH8UErgc8JLNun7E0aFffSI7RP6owZmz9kSO73HjsmXr8ukppYsybSYyQvBp5QfOjQ3M9tRR496pGgLf1JtLlzRZJzlFzGp4SWDnUxFCrdvy+uWiWa3DJe3N69oj8uSEq8CER88uaNOGBAOv2ILGY69TBBJoM8O0t72zaRoztXBzlLlrT8XARW/IQq82JTMv3mKmv0/9CC4mJMYPwrMSETxAyurRUxQVmXP1fEid7mzeK3b+n2Jzb16CFu2SIWmtNJiriVxANsyq0uoCJfTk4G9y4t24/bSQ0rTkP6gVTG3mz//uKMGSK/ucId5Xe9lZUi5eMMLGUgz56J5Hxu3xZ50Xg3RMIltVn9BRja26PYsBHgAAAAAElFTkSuQmCC\"/\u003e\u003c/a\u003e\n\n## Semantic search with a 
Retrieval-Augmented Generation (RAG) system\nLLM for Unity implements fast similarity search with a Retrieval-Augmented Generation (RAG) system.\u003cbr\u003e\nIt is based on the LLM functionality and the Approximate Nearest Neighbors (ANN) search from the [usearch](https://github.com/unum-cloud/usearch) library.\u003cbr\u003e\nSemantic search works as follows.\n\n**Building the data** You provide text inputs (a phrase, paragraph, document) to add to the data.\u003cbr\u003e\nEach input is split into chunks (optional) and encoded into embeddings with a LLM.\n\n**Searching** You can then search for a query text input. \u003cbr\u003e\nThe input is again encoded and the most similar text inputs or chunks in the data are retrieved.\n\nTo use semantic search:\n- Create a GameObject for the LLM as described above. Download one of the provided RAG models or load your own (good options can be found at the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard)).\n- Create an empty GameObject. In the GameObject Inspector click `Add Component` and select the `RAG` script.\n- In the Search Type dropdown of the RAG select your preferred search method. `SimpleSearch` is a simple brute-force search, while `DBSearch` is a fast ANN method that should be preferred in most cases.\n- In the Chunking Type dropdown of the RAG you can select a method for splitting the inputs into chunks. This is useful to have a more consistent meaning within each data part. 
Chunking methods for splitting according to tokens, words and sentences are provided.\n\nAlternatively, you can create the RAG from code (where llm is your LLM):\n``` c#\n  RAG rag = gameObject.AddComponent\u003cRAG\u003e();\n  rag.Init(SearchMethods.DBSearch, ChunkingMethods.SentenceSplitter, llm);\n```\n\nIn your script you can then use it as follows :unicorn::\n``` c#\nusing UnityEngine;\nusing LLMUnity;\n\npublic class MyScript : MonoBehaviour\n{\n  RAG rag;\n\n  async void Game(){\n    ...\n    string[] inputs = new string[]{\n      \"Hi! I'm a search system.\",\n      \"the weather is nice. I like it.\",\n      \"I'm a RAG system\"\n    };\n    // add the inputs to the RAG\n    foreach (string input in inputs) await rag.Add(input);\n    // get the 2 most similar inputs and their distance (dissimilarity) to the search query\n    (string[] results, float[] distances) = await rag.Search(\"hello!\", 2);\n    // to get the most similar text parts (chunks) you can enable the returnChunks option\n    rag.ReturnChunks(true);\n    (results, distances) = await rag.Search(\"hello!\", 2);\n    ...\n  }\n}\n```\n\nYou can save the RAG state (stored in the `Assets/StreamingAssets` folder):\n``` c#\nrag.Save(\"rag.zip\");\n```\nand load it from disk:\n``` c#\nawait rag.Load(\"rag.zip\");\n```\n\nYou can use the RAG to feed relevant data to the LLM based on a user message:\n``` c#\n  string message = \"How is the weather?\";\n  (string[] similarPhrases, float[] distances) = await rag.Search(message, 3);\n\n  string prompt = \"Answer the user query based on the provided data.\\n\\n\";\n  prompt += $\"User query: {message}\\n\\n\";\n  prompt += $\"Data:\\n\";\n  foreach (string similarPhrase in similarPhrases) prompt += $\"\\n- {similarPhrase}\";\n\n  _ = llmCharacter.Chat(prompt, HandleReply, ReplyCompleted);\n```\n\nThe `RAG` sample includes an example RAG implementation as well as an example RAG-LLM integration.\n\nThat's all :sparkles:!\n\n## LLM model management\nLLM for Unity uses a model 
manager that allows to load or download LLMs and ship them directly in your game.\u003cbr\u003e\nThe model manager can be found as part of the LLM GameObject:\u003cbr\u003e\n\u003cimg width=\"360\" src=\".github/LLM_manager.png\"\u003e\n\nYou can download models with the `Download model` button.\u003cbr\u003e\nLLM for Unity includes different state of the art models built-in for different model sizes, quantised with the Q4_K_M method.\u003cbr\u003e\nAlternative models can be downloaded from [HuggingFace](https://huggingface.co/models?library=gguf\u0026sort=downloads) in the .gguf format.\u003cbr\u003e\nYou can download a model locally and load it with the `Load model` button, or copy the URL in the `Download model \u003e Custom URL` field to directly download it.\u003cbr\u003e\nIf a HuggingFace model does not provide a gguf file, it can be converted to gguf with this [online converter](https://huggingface.co/spaces/ggml-org/gguf-my-repo).\u003cbr\u003e\n\nThe chat template used for constructing the prompts is determined automatically from the model (if a relevant entry exists) or the model name. 
\u003cbr\u003e\nIf the template is incorrectly identified, you can select another one from the chat template dropdown.\u003cbr\u003e\n\u003cbr\u003e\nModels added in the model manager are copied to the game during the building process.\u003cbr\u003e\nYou can exclude a model from the build by deselecting the \"Build\" checkbox.\u003cbr\u003e\nTo remove the model (but not delete it from disk) you can click the bin button.\u003cbr\u003e\nThe path and URL (if downloaded) of each added model are displayed in the expanded view of the model manager, accessed with the `\u003e\u003e` button:\u003cbr\u003e\n\u003cimg width=\"600\" src=\".github/LLM_manager_expanded.png\"\u003e\n\nYou can create lighter builds by selecting the `Download on Build` option.\u003cbr\u003e\nWith this option the models are downloaded the first time the game starts instead of being copied into the build.\u003cbr\u003e\nIf you have loaded a model locally you need to set its URL through the expanded view, otherwise it will be copied into the build.\u003cbr\u003e\n\n❕ Before using any model make sure you **check its license** ❕\n\n## Examples\nThe [Samples~](Samples~) folder contains several examples of interaction 🤖:\n- [SimpleInteraction](Samples~/SimpleInteraction): Demonstrates a simple interaction with an AI character\n- [MultipleCharacters](Samples~/MultipleCharacters): Demonstrates a simple interaction using multiple AI characters\n- [RAG](Samples~/RAG): RAG sample. 
Includes an example using the RAG to feed information to a LLM\n- [ChatBot](Samples~/ChatBot): Demonstrates interaction between a player and a AI with a UI similar to a messaging app (see image below)\n- [KnowledgeBaseGame](Samples~/KnowledgeBaseGame): Simple detective game using a knowledge base to provide information to the LLM based on [google/mysteryofthreebots](https://github.com/google/mysteryofthreebots)\n- [AndroidDemo](Samples~/AndroidDemo): Example Android app with an initial screen with model download progress\n  \n\u003cimg width=\"400\" src=\".github/demo.gif\"\u003e\n\nTo install a sample:\n- Open the Package Manager: `Window \u003e Package Manager`\n- Select the `LLM for Unity` Package. From the `Samples` Tab, click `Import` next to the sample you want to install.\n\nThe samples can be run with the `Scene.unity` scene they contain inside their folder.\u003cbr\u003e\nIn the scene, select the `LLM` GameObject and click the `Download Model` button to download a default model or `Load model` to load your own model (see [LLM model management](#llm-model-management)).\u003cbr\u003e\nSave the scene, run and enjoy!\n\n## Options\n\n### LLM Settings\n\n- `Show/Hide Advanced Options` Toggle to show/hide advanced options from below\n- `Log Level` select how verbose the log messages are\n- `Use extras` select to install and allow the use of extra features (flash attention and IQ quants)\n\n#### 💻 Setup Settings\n\n\u003cdiv\u003e\n\u003cimg width=\"300\" src=\".github/LLM_GameObject.png\" align=\"right\"/\u003e\n\u003c/div\u003e\n\n- `Remote` select to provide remote access to the LLM\n- `Port` port to run the LLM server (if `Remote` is set)\n- `Num Threads` number of threads to use (default: -1 = all)\n- `Num GPU Layers` number of model layers to offload to the GPU.\nIf set to 0 the GPU is not used. Use a large number i.e. 
\u003e30 to utilise the GPU as much as possible.\nNote that higher values of context size will use more VRAM.\nIf the user's GPU is not supported, the LLM will fall back to the CPU\n- `Debug` select to log the output of the model in the Unity Editor\n- \u003cdetails\u003e\u003csummary\u003eAdvanced options\u003c/summary\u003e\n\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eParallel Prompts\u003c/code\u003e number of prompts / slots that can happen in parallel (default: -1 = number of LLMCharacter objects). Note that the context size is divided among the slots.\u003c/summary\u003e If you want to retain as much context for the LLM and don't need all the characters present at the same time, you can set this number and specify the slot for each LLMCharacter object.\n  e.g. Setting `Parallel Prompts` to 1 and slot 0 for all LLMCharacter objects will use the full context, but the entire prompt will need to be computed (no caching) whenever a LLMCharacter object is used for chat. \u003c/details\u003e\n  - `Dont Destroy On Load` select to not destroy the LLM GameObject when loading a new Scene\n\n\u003c/details\u003e\n\n### Server Security Settings\n\n- `API key` API key to use to allow access to requests from LLMCharacter objects (if `Remote` is set)\n- \u003cdetails\u003e\u003csummary\u003eAdvanced options\u003c/summary\u003e\n\n  - `Load SSL certificate` allows to load a SSL certificate for end-to-end encryption of requests (if `Remote` is set). Requires SSL key as well.\n  - `Load SSL key` allows to load a SSL key for end-to-end encryption of requests (if `Remote` is set). 
Requires SSL certificate as well.\n  - `SSL certificate path` the SSL certificate used for end-to-end encryption of requests (if `Remote` is set).\n  - `SSL key path` the SSL key used for end-to-end encryption of requests (if `Remote` is set).\n\n\u003c/details\u003e\n\n#### 🤗 Model Settings\n- `Download model` click to download one of the default models\n- `Load model` click to load your own model in .gguf format\n- `Download on Start` enable to download the LLM models the first time the game starts. Otherwise, the LLM models will be copied directly into the build\n- \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eContext Size\u003c/code\u003e size of the prompt context (0 = context size of the model)\u003c/summary\u003e This is the number of tokens the model can take as input when generating responses. Higher values use more RAM or VRAM (if using GPU). \u003c/details\u003e\n\n- \u003cdetails\u003e\u003csummary\u003eAdvanced options\u003c/summary\u003e\n\n  - `Download lora` click to download a LoRA model in .gguf format\n  - `Load lora` click to load a LoRA model in .gguf format\n  - `Batch Size` batch size for prompt processing (default: 512)\n  - `Model` the path of the model being used (relative to the Assets/StreamingAssets folder)\n  - `Chat Template` the chat template being used for the LLM\n  - `Lora` the path of the LoRAs being used (relative to the Assets/StreamingAssets folder)\n  - `Lora Weights` the weights of the LoRAs being used\n  - `Flash Attention` click to use flash attention in the model (if `Use extras` is enabled)\n\n\u003c/details\u003e\n\n#### 🗨️ Chat Settings\n- \u003cdetails\u003e\u003csummary\u003eAdvanced options\u003c/summary\u003e\n\n- `Base Prompt` a common base prompt to use across all LLMCharacter objects using the LLM\n\n\u003c/details\u003e\n\n### LLMCharacter Settings\n\n- `Show/Hide Advanced Options` Toggle to show/hide advanced options from below\n- `Log Level` select how verbose the log messages are\n- `Use extras` 
select to install and allow the use of extra features (flash attention and IQ quants)\n\n#### 💻 Setup Settings\n\u003cdiv\u003e\n\u003cimg width=\"300\" src=\".github/LLMCharacter_GameObject.png\" align=\"right\"/\u003e\n\u003c/div\u003e\n\n- `Remote` whether the LLM used is remote or local\n- `LLM` the LLM GameObject (if `Remote` is not set)\n- `Host` IP address of the LLM server (if `Remote` is set)\n- `Port` port of the LLM server (if `Remote` is set)\n- `Num Retries` number of times to retry HTTP requests to the LLM server (if `Remote` is set)\n- `API key` API key of the LLM server (if `Remote` is set)\n- \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eSave\u003c/code\u003e save filename or relative path\u003c/summary\u003e If set, the chat history and LLM state (if save cache is enabled) are automatically saved to the specified file. \u003cbr\u003e The chat history is saved with a json suffix and the LLM state with a cache suffix. \u003cbr\u003e Both files are saved in the [persistentDataPath folder of Unity](https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).\u003c/details\u003e\n- `Save Cache` select to save the LLM state along with the chat history. The LLM state is typically around 100MB+.\n- `Debug Prompt` select to log the constructed prompts in the Unity Editor\n\n#### 🗨️ Chat Settings\n- `Player Name` the name of the player\n- `AI Name` the name of the AI\n- `Prompt` description of the AI role\n\n#### 🤗 Model Settings\n- `Stream` select to receive the reply from the model as it is produced (recommended!).\u003cbr\u003e\nIf it is not selected, the full reply from the model is received in one go\n- \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eNum Predict\u003c/code\u003e maximum number of tokens to predict (default: 256, -1 = infinity, -2 = until context filled)\u003c/summary\u003eThis is the maximum number of tokens the model will predict. When N tokens are reached the model will stop generating. 
This means words / sentences might not get finished if this is too low. \u003c/details\u003e\n\n- \u003cdetails\u003e\u003csummary\u003eAdvanced options\u003c/summary\u003e\n\n  - `Load grammar` click to load a grammar in .gbnf format\n  - `Grammar` the path of the grammar being used (relative to the Assets/StreamingAssets folder)\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eCache Prompt\u003c/code\u003e save the ongoing prompt from the chat (default: true)\u003c/summary\u003e Saves the prompt while it is being created by the chat to avoid reprocessing the entire prompt every time\u003c/details\u003e\n  - `Slot` slot of the server to use for computation. Value can be set from 0 to `Parallel Prompts`-1 (default: -1 = new slot for each character)\n  - `Seed` seed for reproducibility. For random results every time use -1\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eTemperature\u003c/code\u003e LLM temperature, lower values give more deterministic answers (default: 0.2)\u003c/summary\u003eThe temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable. Turning it down makes the generated responses more predictable and focused on the most likely options.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eTop K\u003c/code\u003e top-k sampling (default: 40, 0 = disabled)\u003c/summary\u003eThe top k value restricts sampling to the k most probable tokens at each step of generation. This value can help fine-tune the output and make it adhere to specific patterns or constraints.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eTop P\u003c/code\u003e top-p sampling (default: 0.9, 1.0 = disabled)\u003c/summary\u003eThe top p value controls the cumulative probability of generated tokens. The model will generate tokens until this threshold (p) is reached. 
By lowering this value you can shorten output \u0026 encourage / discourage more diverse outputs.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eMin P\u003c/code\u003e minimum probability for a token to be used (default: 0.05)\u003c/summary\u003e The probability is defined relative to the probability of the most likely token.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eRepeat Penalty\u003c/code\u003e control the repetition of token sequences in the generated text (default: 1.1)\u003c/summary\u003eThe penalty is applied to repeated tokens.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003ePresence Penalty\u003c/code\u003e repeated token presence penalty (default: 0.0, 0.0 = disabled)\u003c/summary\u003e Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.\u003c/details\u003e\n  - \u003cdetails\u003e\u003csummary\u003e\u003ccode\u003eFrequency Penalty\u003c/code\u003e repeated token frequency penalty (default: 0.0, 0.0 = disabled)\u003c/summary\u003e Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.\u003c/details\u003e\n  - `Tfs_z`: enable tail free sampling with parameter z (default: 1.0, 1.0 = disabled).\n  - `Typical P`: enable locally typical sampling with parameter p (default: 1.0, 1.0 = disabled).\n  - `Repeat Last N`: last N tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).\n  - `Penalize Nl`: penalize newline tokens when applying the repeat penalty (default: true).\n  - `Penalty Prompt`: prompt for the purpose of the penalty evaluation. 
Can be either `null`, a string or an array of numbers representing tokens (default: `null` = use original `prompt`).\n  - `Mirostat`: enable Mirostat sampling, controlling perplexity during text generation (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).\n  - `Mirostat Tau`: set the Mirostat target entropy, parameter tau (default: 5.0).\n  - `Mirostat Eta`: set the Mirostat learning rate, parameter eta (default: 0.1).\n  - `N Probs`: if greater than 0, the response also contains the probabilities of top N tokens for each generated token (default: 0)\n  - `Ignore Eos`: enable to ignore end of stream tokens and continue generating (default: false).\n\n\u003c/details\u003e\n\n## License\nThe license of LLM for Unity is MIT ([LICENSE.md](LICENSE.md)) and uses third-party software with MIT and Apache licenses.\nSome models included in the asset define their own license terms, please review them before using each model.\nThird-party licenses can be found in the ([Third Party Notices.md](\u003cThird Party Notices.md\u003e)).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fundreamai%2FLLMUnity","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fundreamai%2FLLMUnity","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fundreamai%2FLLMUnity/lists"}