Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/withcatai/node-llama-cpp
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
- Host: GitHub
- URL: https://github.com/withcatai/node-llama-cpp
- Owner: withcatai
- License: mit
- Created: 2023-08-12T20:53:16.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-01-23T01:30:55.000Z (10 days ago)
- Last Synced: 2025-01-23T01:35:27.796Z (10 days ago)
- Topics: ai, bindings, catai, cmake, cmake-js, cuda, embedding, function-calling, gguf, gpu, grammar, json-schema, llama, llama-cpp, llm, metal, nodejs, prebuilt-binaries, self-hosted, vulkan
- Language: TypeScript
- Homepage: https://node-llama-cpp.withcat.ai
- Size: 21.1 MB
- Stars: 1,198
- Watchers: 15
- Forks: 103
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - withcatai/node-llama-cpp - node-llama-cpp is a Node.js library that lets you run AI models locally on your machine, providing bindings for llama.cpp. It offers many features, including GPU support (Metal, CUDA, and Vulkan), pre-built binaries (for macOS, Linux, and Windows), automatic hardware adaptation, a complete suite for working with LLMs, a CLI tool, support for the latest llama.cpp releases, JSON output format control, function calling, embedding support, full TypeScript support, and detailed documentation. You can chat with a model from the terminal with a single command, or install it via npm and use it from TypeScript code. (A01_Text_Generation_Text_Chat / Large language chat models and data)
README
node-llama-cpp
Run AI models locally on your machine
Pre-built bindings are provided with a fallback to building from source with cmake
[![Build](https://github.com/withcatai/node-llama-cpp/actions/workflows/build.yml/badge.svg)](https://github.com/withcatai/node-llama-cpp/actions/workflows/build.yml)
[![License](https://badgen.net/badge/color/MIT/green?label=license)](https://www.npmjs.com/package/node-llama-cpp)
[![Types](https://badgen.net/badge/color/TypeScript/blue?label=types)](https://www.npmjs.com/package/node-llama-cpp)
[![Version](https://badgen.net/npm/v/node-llama-cpp)](https://www.npmjs.com/package/node-llama-cpp)

✨ [`v3.0` is here!](https://node-llama-cpp.withcat.ai/blog/v3) ✨
## Features
* Run LLMs locally on your machine
* [Metal, CUDA and Vulkan support](https://node-llama-cpp.withcat.ai/guide/#gpu-support)
* [Pre-built binaries are provided](https://node-llama-cpp.withcat.ai/guide/building-from-source), with a fallback to building from source _**without**_ `node-gyp` or Python
* [Adapts to your hardware automatically](https://node-llama-cpp.withcat.ai/guide/#gpu-support), no need to configure anything
* A complete suite of everything you need to use LLMs in your projects
* [Use the CLI to chat with a model without writing any code](#try-it-without-installing)
* Up-to-date with the latest `llama.cpp`. Download and compile the latest release with a [single CLI command](https://node-llama-cpp.withcat.ai/guide/building-from-source#downloading-a-release)
* Constrain a model to generate output in a parseable format, [like JSON](https://node-llama-cpp.withcat.ai/guide/chat-session#json-response), or even force it to [follow a specific JSON schema](https://node-llama-cpp.withcat.ai/guide/chat-session#response-json-schema) (see the first sketch after this list)
* [Provide a model with functions it can call on demand](https://node-llama-cpp.withcat.ai/guide/chat-session#function-calling) to retrieve information or perform actions (see the second sketch after this list)
* [Embedding and reranking support](https://node-llama-cpp.withcat.ai/guide/embedding)
* [Safe against special token injection attacks](https://node-llama-cpp.withcat.ai/guide/llama-text#input-safety-in-node-llama-cpp)
* Great developer experience with full TypeScript support, and [complete documentation](https://node-llama-cpp.withcat.ai/guide/)
* Much more
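
A minimal sketch of the JSON-schema enforcement mentioned above, based on the project's documented `getLlama`, `createGrammarForJsonSchema`, and `prompt` APIs; the model file, schema, and prompt are illustrative assumptions, not part of this README:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // Hypothetical model file, for illustration only
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// Compile a JSON schema into a grammar; token sampling is then
// constrained so the output always conforms to the schema
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        title: {type: "string"},
        rating: {type: "number"}
    }
});

const res = await session.prompt("Suggest a movie and rate it from 1 to 10", {grammar});
const parsed = grammar.parse(res); // a typed object matching the schema
console.log(parsed.title, parsed.rating);
```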
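
And a sketch of on-demand function calling, using the documented `defineChatSessionFunction` helper; the function name, its stubbed handler, and the prompt are assumptions for illustration:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession, defineChatSessionFunction} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // Hypothetical model file, for illustration only
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// A hypothetical function the model may choose to call while answering
const functions = {
    getCurrentTemperature: defineChatSessionFunction({
        description: "Get the current temperature in a city, in Celsius",
        params: {
            type: "object",
            properties: {
                city: {type: "string"}
            }
        },
        handler({city}) {
            // Stubbed value; a real handler could query a weather API
            return {city, temperature: 21};
        }
    })
};

const answer = await session.prompt("Is it warm in Paris right now?", {functions});
console.log(answer);
```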
## [Documentation](https://node-llama-cpp.withcat.ai)
* [Getting started guide](https://node-llama-cpp.withcat.ai/guide/)
* [API reference](https://node-llama-cpp.withcat.ai/api/functions/getLlama)
* [CLI help](https://node-llama-cpp.withcat.ai/cli/)
* [Blog](https://node-llama-cpp.withcat.ai/blog/)
* [Changelog](https://github.com/withcatai/node-llama-cpp/releases)
* [Roadmap](https://github.com/orgs/withcatai/projects/1)

## Try It Without Installing
Chat with a model in your terminal using [a single command](https://node-llama-cpp.withcat.ai/cli/chat):
```bash
npx -y node-llama-cpp chat
```

## Installation
```bash
npm install node-llama-cpp
```

[This package comes with pre-built binaries](https://node-llama-cpp.withcat.ai/guide/building-from-source) for macOS, Linux and Windows.
If binaries are not available for your platform, it will fall back to downloading a release of `llama.cpp` and building it from source with `cmake`.
To disable this behavior, set the environment variable `NODE_LLAMA_CPP_SKIP_DOWNLOAD` to `true`.

## Usage
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// Load the library with the best compute backend available for the hardware
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});


const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);


const q2 = "Summarize what you said";
console.log("User: " + q2);

const a2 = await session.prompt(q2);
console.log("AI: " + a2);
```

> For more examples, see the [getting started guide](https://node-llama-cpp.withcat.ai/guide/)
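
For the embedding support listed in the features above, here is a minimal sketch using the documented `createEmbeddingContext` and `getEmbeddingFor` APIs; the model file is an illustrative assumption (an embedding-oriented GGUF model would be a better fit in practice):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // Hypothetical model file, for illustration only
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});

// Embedding contexts are separate from chat contexts
const embeddingContext = await model.createEmbeddingContext();

const embedding = await embeddingContext.getEmbeddingFor("Hello world");
console.log(embedding.vector.length); // dimensionality of the embedding vector
```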
## Contributing
To contribute to `node-llama-cpp`, read the [contribution guide](https://node-llama-cpp.withcat.ai/guide/contributing).

## Acknowledgements
* llama.cpp: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)
If you like this repo, star it ✨