{"id":13452977,"url":"https://github.com/SciSharp/LLamaSharp","last_synced_at":"2025-03-24T00:32:45.247Z","repository":{"id":163964303,"uuid":"638609341","full_name":"SciSharp/LLamaSharp","owner":"SciSharp","description":"A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.","archived":false,"fork":false,"pushed_at":"2025-03-15T23:31:19.000Z","size":410279,"stargazers_count":3046,"open_issues_count":164,"forks_count":397,"subscribers_count":57,"default_branch":"master","last_synced_at":"2025-03-18T02:25:08.847Z","etag":null,"topics":["chatbot","gpt","llama","llama-cpp","llama2","llama3","llamacpp","llava","llm","multi-modal","semantic-kernel"],"latest_commit_sha":null,"homepage":"https://scisharp.github.io/LLamaSharp","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SciSharp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-09T18:03:21.000Z","updated_at":"2025-03-17T17:55:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"570b0a61-df87-405a-84a3-13942f551eeb","html_url":"https://github.com/SciSharp/LLamaSharp","commit_stats":{"total_commits":1217,"total_committers":63,"mean_commits":"19.317460317460316","dds":0.5488907148726376,"last_synced_commit":"0435c656fcf8696950911db9f0b5491847836900"},"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SciSharp%2FLLamaSharp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SciSharp%2FLLamaSharp/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SciSharp%2FLLamaSharp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SciSharp%2FLLamaSharp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SciSharp","download_url":"https://codeload.github.com/SciSharp/LLamaSharp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245191492,"owners_count":20575246,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatbot","gpt","llama","llama-cpp","llama2","llama3","llamacpp","llava","llm","multi-modal","semantic-kernel"],"created_at":"2024-07-31T08:00:29.852Z","updated_at":"2025-03-24T00:32:45.227Z","avatar_url":"https://github.com/SciSharp.png","language":"C#","readme":"﻿![logo](Assets/LLamaSharpLogo.png)\n\n[![Discord](https://img.shields.io/discord/1106946823282761851?label=Discord)](https://discord.gg/7wNVU65ZDY)\n[![QQ Group](https://img.shields.io/static/v1?label=QQ\u0026message=加入QQ群\u0026color=brightgreen)](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027\u0026k=sN9VVMwbWjs5L0ATpizKKxOcZdEPMrp8\u0026authKey=RLDw41bLTrEyEgZZi%2FzT4pYk%2BwmEFgFcrhs8ZbkiVY7a4JFckzJefaYNW6Lk4yPX\u0026noverify=0\u0026group_code=985366726)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp?label=LLamaSharp)](https://www.nuget.org/packages/LLamaSharp)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.Backend.Cpu?label=LLamaSharp.Backend.Cpu)](https://www.nuget.org/packages/LLamaSharp.Backend.Cpu)\n[![LLamaSharp 
Badge](https://img.shields.io/nuget/v/LLamaSharp.Backend.Cuda11?label=LLamaSharp.Backend.Cuda11)](https://www.nuget.org/packages/LLamaSharp.Backend.Cuda11)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.Backend.Cuda12?label=LLamaSharp.Backend.Cuda12)](https://www.nuget.org/packages/LLamaSharp.Backend.Cuda12)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.semantic-kernel?label=LLamaSharp.semantic-kernel)](https://www.nuget.org/packages/LLamaSharp.semantic-kernel)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.kernel-memory?label=LLamaSharp.kernel-memory)](https://www.nuget.org/packages/LLamaSharp.kernel-memory)\n[![LLamaSharp Badge](https://img.shields.io/nuget/v/LLamaSharp.Backend.Vulkan?label=LLamaSharp.Backend.Vulkan)](https://www.nuget.org/packages/LLamaSharp.Backend.Vulkan)\n\n\n**LLamaSharp is a cross-platform library to run 🦙LLaMA/LLaVA model (and others) on your local device. Based on [llama.cpp](https://github.com/ggerganov/llama.cpp), inference with LLamaSharp is efficient on both CPU and GPU. 
With the higher-level APIs and RAG support, it's convenient to deploy LLMs (Large Language Models) in your application with LLamaSharp.**\n\n**Please star the repo to show your support for this project!🤗**\n\n---\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eTable of Contents\u003c/summary\u003e\n  \u003cul\u003e\n    \u003cli\u003e\u003ca href=\"#Documentation\"\u003eDocumentation\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Console Demo\"\u003eConsole Demo\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Integrations \u0026 Examples\"\u003eIntegrations \u0026 Examples\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Get started\"\u003eGet started\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#FAQ\"\u003eFAQ\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Contributing\"\u003eContributing\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Join the community\"\u003eJoin the community\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Star history\"\u003eStar history\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Contributor wall of fame\"\u003eContributor wall of fame\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#Map of LLamaSharp and llama.cpp versions\"\u003eMap of LLamaSharp and llama.cpp versions\u003c/a\u003e\u003c/li\u003e\n  \u003c/ul\u003e\n\u003c/details\u003e\n\n## 📖Documentation\n\n- [Quick start](https://scisharp.github.io/LLamaSharp/latest/QuickStart/)\n- [FAQ](https://scisharp.github.io/LLamaSharp/latest/FAQ/)\n- [Tutorial](https://scisharp.github.io/LLamaSharp/latest/Tutorials/NativeLibraryConfig/)\n- [Full documentation](https://scisharp.github.io/LLamaSharp/latest/)\n- [API reference](https://scisharp.github.io/LLamaSharp/latest/xmldocs/)\n\n\n## 📌Console Demo\n\n\u003ctable class=\"center\"\u003e\n    \u003ctr style=\"line-height: 0\"\u003e\n    \u003ctd width=50% height=30 style=\"border: none; text-align: 
center\"\u003eLLaMA\u003c/td\u003e\n    \u003ctd width=50% height=30 style=\"border: none; text-align: center\"\u003eLLaVA\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n    \u003ctd width=25% style=\"border: none\"\u003e\u003cimg src=\"Assets/console_demo.gif\" style=\"width:100%\"\u003e\u003c/td\u003e\n    \u003ctd width=25% style=\"border: none\"\u003e\u003cimg src=\"Assets/llava_demo.gif\" style=\"width:100%\"\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\n## 🔗Integrations \u0026 Examples\n\nThere are integrations for the following libraries, making it easier to develop your APP. Integrations for semantic-kernel and kernel-memory are developed in the LLamaSharp repository, while others are developed in their own repositories.\n\n- [semantic-kernel](https://github.com/microsoft/semantic-kernel): an SDK that integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face.\n- [kernel-memory](https://github.com/microsoft/kernel-memory): a multi-modal AI Service specialized in the efficient indexing of datasets through custom continuous data hybrid pipelines, with support for RAG ([Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Prompt_engineering#Retrieval-augmented_generation)), synthetic memory, prompt engineering, and custom semantic memory processing.\n- [BotSharp](https://github.com/SciSharp/BotSharp): an open source machine learning framework for AI Bot platform builder.\n- [Langchain](https://github.com/tryAGI/LangChain): a framework for developing applications powered by language models.\n\n\nThe following examples show how to build APPs with LLamaSharp.\n\n- [Official Console Examples](./LLama.Examples/)\n- [Unity Demo](https://github.com/eublefar/LLAMASharpUnityDemo)\n- [LLamaStack (with WPF and Web demo)](https://github.com/saddam213/LLamaStack)\n- [Blazor Demo (with Model Explorer)](https://github.com/alexhiggins732/BLlamaSharp.ChatGpt.Blazor)\n- [ASP.NET Demo](./LLama.Web/)\n- [LLamaWorker (ASP.NET Web API like 
OAI and Function Calling Support)](https://github.com/sangyuxiaowu/LLamaWorker)\n- [VirtualPet (Desktop Application)](https://github.com/AcoranGonzalezMoray/VirtualPet-WindowsEdition)\n\n![LLamaSharp-Integrations](./Assets/LLamaSharp-Integrations.png)\n\n\n## 🚀Get started\n\n### Installation\n\nTo achieve high performance, LLamaSharp calls native libraries compiled from C++; these are called `backends`. We provide backend packages for Windows, Linux and Mac with CPU, CUDA, Metal and Vulkan. You **don't** need to compile any C++; just install the backend packages.\n\nIf no published backend matches your device, please open an issue to let us know. If you are comfortable compiling C++ code, you can also follow [this guide](./docs/ContributingGuide.md) to compile a backend and run LLamaSharp with it.\n\n1.  Install the [LLamaSharp](https://www.nuget.org/packages/LLamaSharp) package from NuGet:\n\n```\nPM\u003e Install-Package LLamaSharp\n```\n\n2. Install one or more of these backends, or use a self-compiled backend.\n\n   - [`LLamaSharp.Backend.Cpu`](https://www.nuget.org/packages/LLamaSharp.Backend.Cpu): Pure CPU for Windows, Linux \u0026 Mac. Metal (GPU) support for Mac.\n   - [`LLamaSharp.Backend.Cuda11`](https://www.nuget.org/packages/LLamaSharp.Backend.Cuda11): CUDA 11 for Windows \u0026 Linux.\n   - [`LLamaSharp.Backend.Cuda12`](https://www.nuget.org/packages/LLamaSharp.Backend.Cuda12): CUDA 12 for Windows \u0026 Linux.\n   - [`LLamaSharp.Backend.Vulkan`](https://www.nuget.org/packages/LLamaSharp.Backend.Vulkan): Vulkan for Windows \u0026 Linux.\n\n3. (optional) For [Microsoft semantic-kernel](https://github.com/microsoft/semantic-kernel) integration, install the [LLamaSharp.semantic-kernel](https://www.nuget.org/packages/LLamaSharp.semantic-kernel) package.\n4. 
(optional) To enable RAG support, install the [LLamaSharp.kernel-memory](https://www.nuget.org/packages/LLamaSharp.kernel-memory) package (this package currently supports only `net6.0` or higher), which is based on the [Microsoft kernel-memory](https://github.com/microsoft/kernel-memory) integration.\n\n### Model preparation\n\nTwo model file formats are popular for LLMs: the PyTorch format (.pth) and the Huggingface format (.bin). LLamaSharp uses a `GGUF` format file, which can be converted from either of these formats. There are two ways to get a `GGUF` file:\n\n1. Search for the model name plus 'gguf' on [Huggingface](https://huggingface.co); you will find many model files that have already been converted to GGUF format. Please check their publication dates, because some older files may only work with older versions of LLamaSharp.\n\n2. Convert the PyTorch or Huggingface format to GGUF format yourself. Please follow the instructions in [this part of the llama.cpp readme](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#prepare-and-quantize) to convert the files with Python scripts.\n\nGenerally, we recommend downloading quantized models rather than fp16 ones, because quantization significantly reduces the required memory while only slightly affecting generation quality.\n\n\n### Example of LLaMA chat session\n\nHere is a simple example of chatting with a bot based on an LLM in LLamaSharp. Please replace the model path with your own.\n\n```cs\nusing LLama.Common;\nusing LLama.Sampling;\nusing LLama;\n\nstring modelPath = @\"\u003cYour Model Path\u003e\"; // change it to your own model path.\n\nvar parameters = new ModelParams(modelPath)\n{\n    ContextSize = 1024, // The longest length of chat as memory.\n    GpuLayerCount = 5 // How many layers to offload to GPU. 
Please adjust it according to your GPU memory.\n};\nusing var model = LLamaWeights.LoadFromFile(parameters);\nusing var context = model.CreateContext(parameters);\nvar executor = new InteractiveExecutor(context);\n\n// Add chat histories as prompt to tell AI how to act.\nvar chatHistory = new ChatHistory();\nchatHistory.AddMessage(AuthorRole.System, \"Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.\");\nchatHistory.AddMessage(AuthorRole.User, \"Hello, Bob.\");\nchatHistory.AddMessage(AuthorRole.Assistant, \"Hello. How may I help you today?\");\n\nChatSession session = new(executor, chatHistory);\n\nInferenceParams inferenceParams = new InferenceParams()\n{\n    MaxTokens = 256, // No more than 256 tokens should appear in answer. Remove it if antiprompt is enough for control.\n    AntiPrompts = new List\u003cstring\u003e { \"User:\" }, // Stop generation once antiprompts appear.\n\n    SamplingPipeline = new DefaultSamplingPipeline(),\n};\n\nConsole.ForegroundColor = ConsoleColor.Yellow;\nConsole.Write(\"The chat session has started.\\nUser: \");\nConsole.ForegroundColor = ConsoleColor.Green;\nstring userInput = Console.ReadLine() ?? \"\";\n\nwhile (userInput != \"exit\")\n{\n    await foreach ( // Generate the response streamingly.\n        var text\n        in session.ChatAsync(\n            new ChatHistory.Message(AuthorRole.User, userInput),\n            inferenceParams))\n    {\n        Console.ForegroundColor = ConsoleColor.White;\n        Console.Write(text);\n    }\n    Console.ForegroundColor = ConsoleColor.Green;\n    userInput = Console.ReadLine() ?? \"\";\n}\n```\n\nFor more examples, please refer to [LLamaSharp.Examples](./LLama.Examples).\n\n\n## 💡FAQ\n\n#### Why is my GPU not used when I have installed CUDA?\n\n1. 
If you are using backend packages, please make sure you have installed the CUDA backend package that matches the CUDA version installed on your system.\n2. Add the following line to the very beginning of your code. The log will show which native library file is loaded. If the CPU library is loaded, please try to compile the native library yourself and open an issue for that. If the CUDA library is loaded, please check that `GpuLayerCount \u003e 0` when loading the model weights.\n\n```cs\n    NativeLibraryConfig.All.WithLogCallback(delegate (LLamaLogLevel level, string message) { Console.Write($\"{level}: {message}\"); } );\n```\n\n\n#### Why is the inference so slow?\n\nFirstly, because LLMs are large, they require more time to generate output than other models, especially when you are using models larger than 30B parameters.\n\nTo see whether it is a LLamaSharp performance issue, please follow the two tips below.\n\n1. If you are using CUDA, Metal or Vulkan, please set `GpuLayerCount` as large as possible.\n2. If it's still slower than you expect, please try to run the same model with the same settings in the [llama.cpp examples](https://github.com/ggerganov/llama.cpp/tree/master/examples). If llama.cpp significantly outperforms LLamaSharp, it is likely a LLamaSharp bug; please report it to us.\n\n\n#### Why does the program crash before any output is generated?\n\nGenerally, there are two possible causes of this problem:\n\n1. The native library (backend) you are using is not compatible with the LLamaSharp version. If you compiled the native library yourself, please make sure you have checked out llama.cpp at the commit corresponding to your LLamaSharp version, which can be found at the bottom of the README.\n2. The model file you are using is not compatible with the backend. 
If you are using a GGUF file downloaded from huggingface, please check its publishing time.\n\n#### Why is my model generating output infinitely?\n\nPlease set anti-prompt or max-length when executing the inference.\n\n\n## 🙌Contributing\n\nAll contributions are welcome! There's a TODO list in [LLamaSharp Dev Project](https://github.com/orgs/SciSharp/projects/5) and you can pick an interesting one to start. Please read the [contributing guide](./CONTRIBUTING.md) for more information. \n\nYou can also do one of the following to help us make LLamaSharp better:\n\n- Submit a feature request.\n- Star and share LLamaSharp to let others know about it.\n- Write a blog or demo about LLamaSharp.\n- Help to develop Web API and UI integration.\n- Just open an issue about the problem you've found!\n\n## Join the community\n\nJoin our chat on [Discord](https://discord.gg/7wNVU65ZDY) (please contact Rinne to join the dev channel if you want to be a contributor).\n\nJoin [QQ group](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027\u0026k=sN9VVMwbWjs5L0ATpizKKxOcZdEPMrp8\u0026authKey=RLDw41bLTrEyEgZZi%2FzT4pYk%2BwmEFgFcrhs8ZbkiVY7a4JFckzJefaYNW6Lk4yPX\u0026noverify=0\u0026group_code=985366726)\n\n## Star history\n\n[![Star History Chart](https://api.star-history.com/svg?repos=SciSharp/LLamaSharp)](https://star-history.com/#SciSharp/LLamaSharp\u0026Date)\n\n## Contributor wall of fame\n\n[![LLamaSharp Contributors](https://contrib.rocks/image?repo=SciSharp/LLamaSharp)](https://github.com/SciSharp/LLamaSharp/graphs/contributors)\n\n## Map of LLamaSharp and llama.cpp versions\nIf you want to compile llama.cpp yourself you **must** use the exact commit ID listed for each version.\n\n| LLamaSharp | Verified Model Resources | llama.cpp commit id |\n| - | -- | - |\n| v0.2.0 | This version is not recommended to use. 
| - |\n| v0.2.1 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama), [Vicuna (filenames with \"old\")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | - |\n| v0.2.2, v0.2.3 | [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/previous_llama_ggmlv2), [Vicuna (filenames without \"old\")](https://huggingface.co/eachadea/ggml-vicuna-13b-1.1/tree/main) | `63d2046` |\n| v0.3.0, v0.4.0 | [LLamaSharpSamples v0.3.0](https://huggingface.co/AsakusaRinne/LLamaSharpSamples/tree/v0.3.0), [WizardLM](https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main) | `7e4ea5b` |\n| v0.4.1-preview | [Open llama 3b](https://huggingface.co/SlyEcho/open_llama_3b_ggml), [Open Buddy](https://huggingface.co/OpenBuddy/openbuddy-llama-ggml)| `aacdbd4` |\n|v0.4.2-preview | [Llama2 7B (GGML)](https://huggingface.co/TheBloke/llama-2-7B-Guanaco-QLoRA-GGML)| `3323112` |\n| v0.5.1 | [Llama2 7B (GGUF)](https://huggingface.co/TheBloke/llama-2-7B-Guanaco-QLoRA-GGUF)| `6b73ef1` |\n| v0.6.0 | | [`cb33f43`](https://github.com/ggerganov/llama.cpp/commit/cb33f43a2a9f5a5a5f8d290dd97c625d9ba97a2f) |\n| v0.7.0, v0.8.0 | [Thespis-13B](https://huggingface.co/TheBloke/Thespis-13B-v0.5-GGUF/tree/main?not-for-all-audiences=true), [LLaMA2-7B](https://huggingface.co/TheBloke/llama-2-7B-Guanaco-QLoRA-GGUF) | [`207b519`](https://github.com/ggerganov/llama.cpp/commit/207b51900e15cc7f89763a3bb1c565fe11cbb45d) |\n| v0.8.1 | | [`e937066`](https://github.com/ggerganov/llama.cpp/commit/e937066420b79a757bf80e9836eb12b88420a218) |\n| v0.9.0, v0.9.1 | [Mixtral-8x7B](https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF) | [`9fb13f9`](https://github.com/ggerganov/llama.cpp/blob/9fb13f95840c722ad419f390dc8a9c86080a3700) |\n| v0.10.0 | [Phi2](https://huggingface.co/TheBloke/phi-2-GGUF) | [`d71ac90`](https://github.com/ggerganov/llama.cpp/tree/d71ac90985854b0905e1abba778e407e17f9f887) |\n| v0.11.1, v0.11.2 | 
[LLaVA-v1.5](https://hf-mirror.com/jartine/llava-v1.5-7B-GGUF/blob/main/llava-v1.5-7b-mmproj-Q4_0.gguf), [Phi2](https://huggingface.co/TheBloke/phi-2-GGUF)| [`3ab8b3a`](https://github.com/ggerganov/llama.cpp/tree/3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6) |\n| v0.12.0 | LLama3 | [`a743d76`](https://github.com/ggerganov/llama.cpp/tree/a743d76a01f23038b2c85af1e9048ee836767b44) |\n| v0.13.0 | | [`1debe72`](https://github.com/ggerganov/llama.cpp/tree/1debe72737ea131cb52975da3d53ed3a835df3a6) |\n| v0.14.0 | Gemma2 | [`36864569`](https://github.com/ggerganov/llama.cpp/tree/368645698ab648e390dcd7c00a2bf60efa654f57) |\n| v0.15.0 | LLama3.1 | [`345c8c0c`](https://github.com/ggerganov/llama.cpp/tree/345c8c0c87a97c1595f9c8b14833d531c8c7d8df) |\n| v0.16.0 |  | [`11b84eb4`](https://github.com/ggerganov/llama.cpp/tree/11b84eb4578864827afcf956db5b571003f18180) |\n| v0.17.0 |  | [`c35e586e`](https://github.com/ggerganov/llama.cpp/tree/c35e586ea57221844442c65a1172498c54971cb0) |\n| v0.18.0 |  | [`c35e586e`](https://github.com/ggerganov/llama.cpp/tree/c35e586ea57221844442c65a1172498c54971cb0) |\n| v0.19.0 |  | [`958367bf`](https://github.com/ggerganov/llama.cpp/tree/958367bf530d943a902afa1ce1c342476098576b) |\n| v0.20.0 |  | [`0827b2c1`](https://github.com/ggerganov/llama.cpp/tree/0827b2c1da299805288abbd556d869318f2b121e) |\n| v0.21.0 | [DeepSeek R1](https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5) | [`5783575c`](https://github.com/ggerganov/llama.cpp/tree/5783575c9d99c4d9370495800663aa5397ceb0be) |\n| v0.22.0 | Gemma3 | [`be7c3034`](https://github.com/ggerganov/llama.cpp/tree/be7c3034108473beda214fd1d7c98fd6a7a3bdf5) |\n| v0.23.0 | Gemma3 | [`be7c3034`](https://github.com/ggerganov/llama.cpp/tree/be7c3034108473beda214fd1d7c98fd6a7a3bdf5) |\n\n## License\n\nThis project is licensed under the terms of the MIT license.\n\n","funding_links":[],"categories":["others","Plugins","C#","C\\#","Tools","SDK, Libraries, Frameworks","Artificial 
Intelligence","A01_文本生成_文本对话","Open-Source Local LLM Projects"],"sub_categories":["Other","C#","大语言对话模型及数据"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSciSharp%2FLLamaSharp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSciSharp%2FLLamaSharp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSciSharp%2FLLamaSharp/lists"}