{"id":50677673,"url":"https://github.com/kassane/llama-cpp-d","last_synced_at":"2026-06-08T16:35:02.187Z","repository":{"id":345521803,"uuid":"1186224773","full_name":"kassane/llama-cpp-d","owner":"kassane","description":"D bindings for llama.cpp","archived":false,"fork":false,"pushed_at":"2026-03-31T13:03:28.000Z","size":58,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-08T14:34:25.660Z","etag":null,"topics":["bindings","d","dlang","ggml","llama-cpp"],"latest_commit_sha":null,"homepage":"https://llama-cpp-d.dub.pm","language":"D","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kassane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-19T11:59:10.000Z","updated_at":"2026-04-01T13:31:59.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kassane/llama-cpp-d","commit_stats":null,"previous_names":["kassane/llama-cpp-d"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/kassane/llama-cpp-d","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fllama-cpp-d","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fllama-cpp-d/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fllama-cpp-d/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fllama-cp
p-d/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kassane","download_url":"https://codeload.github.com/kassane/llama-cpp-d/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fllama-cpp-d/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34071657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bindings","d","dlang","ggml","llama-cpp"],"created_at":"2026-06-08T16:35:01.695Z","updated_at":"2026-06-08T16:35:02.180Z","avatar_url":"https://github.com/kassane.png","language":"D","funding_links":[],"categories":[],"sub_categories":[],"readme":"# llama-cpp-d\n\n[![CI Build](https://github.com/kassane/llama-cpp-d/actions/workflows/ci.yml/badge.svg)](https://github.com/kassane/llama-cpp-d/actions/workflows/ci.yml)\n![Latest release](https://img.shields.io/github/v/release/kassane/llama-cpp-d?include_prereleases\u0026label=latest)\n[![Static Badge](https://img.shields.io/badge/v2.111.0%20(stable)-f8240e?logo=d\u0026logoColor=f8240e\u0026label=frontend)](https://dlang.org/download.html)\n[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/kassane/llama-cpp-d)\n\nD bindings for [llama.cpp](https://github.com/ggml-org/llama.cpp).\n\n## 
Requirements\n\n| Tool | Minimum |\n|------|---------|\n| LDC or DMD | ≥ 2.111 (`importC` required) |\n| CMake | ≥ 3.14 |\n| C++17 compiler | GCC / Clang / MSVC |\n\n## How to use\n\n```sh\ndub add llama-cpp-d\n```\n\n## Tools\n\n### hf-download\n\nList and download GGUF files from HuggingFace Hub:\n\n```sh\ncd tools \u0026\u0026 dub build --build=release\n\n# List available .gguf files in a repository\n./build/hf-download -r unsloth/Qwen3.5-0.8B-GGUF\n\n# Download a specific file\n./build/hf-download -r unsloth/Qwen3.5-0.8B-GGUF -f Qwen3.5-0.8B-Q4_K_M.gguf -o ~/models\n\n# With authentication (private repos / higher rate limits)\nHF_TOKEN=hf_xxx ./build/hf-download -r myorg/mymodel -f model.gguf\n```\n\n| Flag | Description |\n|------|-------------|\n| `-r owner/repo` | HuggingFace repository (required) |\n| `-f filename` | File to download; omit to list `.gguf` files |\n| `-o outdir` | Output directory (default: `.`) |\n| `-t token` | HF access token (or `HF_TOKEN` env var) |\n\n## Examples\n\n```sh\n# Text completion\ndub run :simple -- -m model.gguf -n 64 \"Tell me a joke\"\n\n# Tokenization inspector\ndub run :tokenize -- -m model.gguf -s \"Hello, world!\"\n\n# Sentence embeddings (cosine similarity between prompts)\ndub run :embedding -- -m model.gguf\ndub run :embedding -- -m model.gguf -p \"custom sentence\"\n\n# Context state save/load (verifies two runs produce identical output)\ndub run :save-load-state -- -m model.gguf -n 32\n\n# Multimodal (vision/audio) — text only\ndub run :multimodal -c default -- -m model.gguf --mmproj mmproj.gguf -n 200 \"Describe this.\"\n\n# Multimodal with an image\ndub run :multimodal -c default -- -m model.gguf --mmproj mmproj.gguf -i photo.jpg \"What do you see?\"\n```\n\n| Example | Required flags | Optional flags |\n|---------|----------------|----------------|\n| `simple` | `-m \u003cpath\u003e` | `-n \u003ctokens\u003e` (default 32), `-ngl \u003cgpu-layers\u003e` (default 99) |\n| `tokenize` | `-m \u003cpath\u003e` | 
`-s` include BOS/EOS |\n| `embedding` | `-m \u003cpath\u003e` | `-p \u003ctext\u003e`, `-ngl` (default 99) |\n| `save-load-state` | `-m \u003cpath\u003e` | `-n \u003ctokens\u003e` (default 16), `-ngl`, `--state-file \u003cpath\u003e` |\n| `multimodal` | `-m \u003cpath\u003e`, `--mmproj \u003cpath\u003e` | `-i \u003cimage\u003e`, `-n \u003ctokens\u003e` (default 512), `-ngl` (default 99), `--no-gpu` |\n\n### Configurations\n\n| Config | Description |\n|--------|-------------|\n| `default` | CPU only |\n| `mtmd` | CPU multimodal (llama + libmtmd) |\n| `cuda` | CUDA GPU acceleration |\n| `vulkan` | Vulkan GPU acceleration |\n| `metal` | Apple Metal (macOS) |\n| `hipblas` | AMD ROCm/HIP |\n| `openblas` | OpenBLAS |\n| `openmp` | OpenMP threading |\n| `sycl` | Intel oneAPI SYCL |\n\n## Quick start\n\n### Text completion\n\n```d\nimport llama;\n\nvoid main()\n{\n    loadAllBackends();\n\n    // D-string overload; second arg is GPU layer count (0 = CPU only)\n    auto model = LlamaModel.loadFromFile(\"model.gguf\", 99);\n    assert(model);\n\n    // Context window = prompt tokens + 32 tokens of headroom; batch size = number of prompt tokens\n    auto tokens = tokenize(model.vocab, \"Hello\");\n    auto ctx    = LlamaContext.fromModel(model,\n                      cast(uint) tokens.length + 32,  // nCtx\n                      cast(uint) tokens.length);       // nBatch\n    assert(ctx);\n\n    // Two-statement form: SamplerChain is non-copyable, so no chaining on init\n    auto smpl = SamplerChain.create();\n    smpl.topK(40).topP(0.9f).temp(0.8f).dist();\n\n    auto batch = batchGetOne(tokens);\n    ctx.decode(batch);\n\n    auto next = smpl.sample(ctx); // samples from the last output position\n}\n```\n\n### Multimodal (vision/audio)\n\n```d\nimport llama;\n\nvoid main() @trusted\n{\n    loadAllBackends();\n\n    auto model = LlamaModel.loadFromFile(\"model.gguf\", 99);\n    assert(model);\n\n    auto mparams = mtmd_context_params_default();\n    mparams.use_gpu = true;\n\n    auto mtmd = 
MtmdContext.initFromFile(\"mmproj.gguf\", model.ptr, mparams);\n    assert(mtmd);\n\n    // Load an image (or skip for text-only)\n    auto bitmap = mtmd.loadBitmap(\"photo.jpg\");\n    assert(bitmap);\n\n    import std.string : fromStringz, toStringz;\n    string marker    = fromStringz(mtmd_default_marker()).idup;\n    string prompt    = marker ~ \"\\nDescribe the image.\";\n    auto   chunks    = InputChunks.create();\n    // A concatenated D string is not NUL-terminated; toStringz adds the terminator the C API expects\n    auto   inputTxt  = mtmd_input_text(prompt.toStringz, true, true);\n    const(mtmd_bitmap)*[1] bitmaps = [bitmap.ptr];\n    mtmd.tokenize(chunks, inputTxt, bitmaps[]);\n\n    auto ctx = LlamaContext.fromModel(model,\n                   cast(uint)(chunks.nTokens + 256),\n                   512);\n    assert(ctx);\n\n    llama_pos nPast;\n    mtmd.evalChunks(ctx.ptr, chunks, 0, 0, 512, true, nPast);\n\n    auto smpl = SamplerChain.create();\n    smpl.temp(0.8f).topK(40).topP(0.95f).dist();\n\n    // Generation loop\n    llama_token[1] buf;\n    foreach (i; 0 .. 256)\n    {\n        auto tok = smpl.sample(ctx);\n        if (isEog(model.vocab, tok)) break;\n        import std.stdio : write;\n        write(tokenToString(model.vocab, tok));\n        smpl.accept(tok);\n        buf[0] = tok;\n        ctx.decode(batchGetOne(buf[]));\n    }\n}\n```\n\n## License\n\n[MIT](./LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkassane%2Fllama-cpp-d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkassane%2Fllama-cpp-d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkassane%2Fllama-cpp-d/lists"}