{"id":14992151,"url":"https://github.com/edgenai/llama_cpp-rs","last_synced_at":"2025-04-04T07:04:03.836Z","repository":{"id":203507437,"uuid":"707249291","full_name":"edgenai/llama_cpp-rs","owner":"edgenai","description":"High-level, optionally asynchronous Rust bindings to llama.cpp","archived":false,"fork":false,"pushed_at":"2024-06-05T16:11:00.000Z","size":398,"stargazers_count":215,"open_issues_count":17,"forks_count":38,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-28T06:02:51.611Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/edgenai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-19T14:11:55.000Z","updated_at":"2025-03-23T22:57:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"8f1d95ee-8a8a-4c26-be0a-d782b296094e","html_url":"https://github.com/edgenai/llama_cpp-rs","commit_stats":null,"previous_names":["binedge/llama_cpp-rs"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edgenai%2Fllama_cpp-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edgenai%2Fllama_cpp-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edgenai%2Fllama_cpp-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edgenai%2Fllama_cpp-rs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/edgenai","download_url":"
https://codeload.github.com/edgenai/llama_cpp-rs/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247135108,"owners_count":20889419,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-09-24T15:00:46.340Z","updated_at":"2025-04-04T07:04:03.815Z","avatar_url":"https://github.com/edgenai.png","language":"Rust","funding_links":[],"categories":["Machine Learning","Rust"],"sub_categories":[],"readme":"# llama_cpp-rs\n\n[![Documentation](https://docs.rs/llama_cpp/badge.svg)](https://docs.rs/llama_cpp/)\n[![Crate](https://img.shields.io/crates/v/llama_cpp.svg)](https://crates.io/crates/llama_cpp)\n\nSafe, high-level Rust bindings to the C++ project [of the same name](https://github.com/ggerganov/llama.cpp), meant to\nbe as user-friendly as possible. 
Run GGUF-based large language models directly on your CPU in fifteen lines of code, no\nML experience required!\n\n```rust\n// Import paths assume the module layout of the published `llama_cpp` crate.\nuse llama_cpp::standard_sampler::StandardSampler;\nuse llama_cpp::{LlamaModel, LlamaParams, SessionParams};\n// `Write` is needed for `flush()` on stdout below.\nuse std::io::{self, Write};\n\n// Create a model from anything that implements `AsRef\u003cPath\u003e`:\nlet model = LlamaModel::load_from_file(\"path_to_model.gguf\", LlamaParams::default()).expect(\"Could not load model\");\n\n// A `LlamaModel` holds the weights shared across many _sessions_; while your model may be\n// several gigabytes large, a session is typically a few dozen to a hundred megabytes!\nlet mut ctx = model.create_session(SessionParams::default()).expect(\"Failed to create session\");\n\n// You can feed anything that implements `AsRef\u003c[u8]\u003e` into the model's context.\nctx.advance_context(\"This is the story of a man named Stanley.\").unwrap();\n\n// LLMs are typically used to predict the next word in a sequence. Let's generate some tokens!\nlet max_tokens = 1024;\nlet mut decoded_tokens = 0;\n\n// `ctx.start_completing_with` creates a worker thread that generates tokens. When the completion\n// handle is dropped, tokens stop generating!\nlet completions = ctx.start_completing_with(StandardSampler::default(), max_tokens).into_strings();\n\nfor completion in completions {\n    print!(\"{completion}\");\n    let _ = io::stdout().flush();\n\n    decoded_tokens += 1;\n\n    // Stop once exactly `max_tokens` tokens have been printed.\n    if decoded_tokens \u003e= max_tokens {\n        break;\n    }\n}\n```\n\nThis repository hosts the high-level bindings (`crates/llama_cpp`) as well as automatically generated bindings to\nllama.cpp's low-level C API (`crates/llama_cpp_sys`). Contributions are welcome--just keep the UX clean!\n\n## Building\n\nKeep in mind that [llama.cpp](https://github.com/ggerganov/llama.cpp) is very computationally heavy, meaning standard\ndebug builds (running just `cargo build`/`cargo run`) will suffer greatly from the lack of optimisations. 
Therefore,\nunless debugging is really necessary, it is highly recommended to build and run using Cargo's `--release` flag.\n\n### Cargo Features\n\nSeveral of [llama.cpp](https://github.com/ggerganov/llama.cpp)'s backends are supported through features:\n\n- `cuda` - Enables the CUDA backend. The CUDA Toolkit is required for compilation if this feature is enabled.\n- `vulkan` - Enables the Vulkan backend. The Vulkan SDK is required for compilation if this feature is enabled.\n- `metal` - Enables the Metal backend (macOS only).\n- `hipblas` - Enables the hipBLAS/ROCm backend. ROCm is required for compilation if this feature is enabled.\n\n## Experimental\n\nThese bindings can also predict the in-memory size of a context. This is a highly experimental feature, since\n[llama.cpp](https://github.com/ggerganov/llama.cpp) itself does not provide it.\nThe returned values may be highly inaccurate, but an attempt is made to never return values lower than the real\nsize.\n\n## License\n\nMIT or Apache-2.0, at your option (the \"Rust\" license). See `LICENSE-MIT` and `LICENSE-APACHE`.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedgenai%2Fllama_cpp-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fedgenai%2Fllama_cpp-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedgenai%2Fllama_cpp-rs/lists"}