llama.cpp 🦙 LLM inference in TypeScript
https://github.com/developer239/llama.cpp-ts
ggml gguf llama llama3 llm llms meta-ai node-addon-api nodejs typescript
- Host: GitHub
- URL: https://github.com/developer239/llama.cpp-ts
- Owner: developer239
- License: mit
- Created: 2024-07-27T15:33:37.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2024-09-26T23:58:11.000Z (10 months ago)
- Last Synced: 2024-12-19T04:11:59.994Z (7 months ago)
- Topics: ggml, gguf, llama, llama3, llm, llms, meta-ai, node-addon-api, nodejs, typescript
- Language: C++
- Homepage:
- Size: 101 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# llama.cpp-ts 🦙
[View this project on npm](https://www.npmjs.com/package/llama.cpp-ts)
LlamaCPP-ts is a Node.js binding for the [LlamaCPP](https://github.com/developer239/llama-wrapped-cmake) library, which wraps the [llama.cpp](https://github.com/ggerganov/llama.cpp) framework. It provides an easy-to-use interface for running language models in Node.js applications, supporting asynchronous streaming responses.
**Supported Systems:**
- MacOS
- Windows (not tested yet)
- Linux (not tested yet)

## Models
You can find compatible GGUF models [here](https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main).
The examples below use [Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf](https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf).
## Installation
Ensure that you have [CMake](https://cmake.org) installed on your system:
- On MacOS: `brew install cmake`
- On Windows: `choco install cmake`
- On Linux: `sudo apt-get install cmake`

Then, install the package:
```bash
npm install llama.cpp-ts
# or
yarn add llama.cpp-ts
```

## Usage
### Basic Usage
```javascript
const { Llama } = require('llama.cpp-ts');

async function main() {
  const llama = new Llama();

  const modelPath = "./path/to/your/model.gguf";
  const modelParams = { nGpuLayers: 32 };
  const contextParams = { nContext: 2048 };

  if (!llama.initialize(modelPath, modelParams, contextParams)) {
    console.error("Failed to initialize the model");
    return;
  }

  llama.setSystemPrompt("You are a helpful assistant. Always provide clear, concise, and accurate answers.");

  const question = "What is the capital of France?";
  const tokenStream = llama.prompt(question);

  console.log("Question:", question);
  console.log("Answer: ");

  while (true) {
    const token = await tokenStream.read();
    if (token === null) break;
    process.stdout.write(token);
  }
}

main().catch(console.error);
```

## API Reference
### Llama Class
The `Llama` class provides methods to interact with language models loaded through llama.cpp.
#### Public Methods
- `constructor()`: Creates a new Llama instance.
- `initialize(modelPath: string, modelParams?: object, contextParams?: object): boolean`: Initializes the model with the specified path and parameters.
- `setSystemPrompt(systemPrompt: string): void`: Sets the system prompt for the conversation.
- `prompt(userMessage: string): TokenStream`: Streams the response to the given prompt, returning a `TokenStream` object.
- `resetConversation(): void`: Resets the conversation history.

### TokenStream Class
The `TokenStream` class represents a stream of tokens generated by the language model.
#### Public Methods
- `read(): Promise<string | null>`: Reads the next token from the stream. Resolves to `null` when the stream is finished.
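
`resetConversation()` is listed above but not used in the examples below. Here is a minimal sketch of how it fits with `prompt()`, assuming an already-initialized `llama` instance; the `readAnswer` helper is hypothetical and relies only on the `read()` contract documented above:

```typescript
import { Llama } from 'llama.cpp-ts';

// Hypothetical helper: drain a token stream into a single string.
// Works with anything exposing the documented read() method (null = end of stream).
async function readAnswer(stream: { read(): Promise<string | null> }): Promise<string> {
  let answer = '';
  while (true) {
    const token = await stream.read();
    if (token === null) break;
    answer += token;
  }
  return answer;
}

async function demo(llama: Llama): Promise<void> {
  // Follow-up prompts share the same conversation history...
  console.log(await readAnswer(llama.prompt('What is the capital of France?')));
  console.log(await readAnswer(llama.prompt("What's its population?")));

  // ...until the history is cleared; the next prompt starts a fresh conversation.
  llama.resetConversation();
  console.log(await readAnswer(llama.prompt('What is the capital of Japan?')));
}
```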
## Example
Here's a more comprehensive example demonstrating the usage of the library:
```javascript
const { Llama } = require('llama.cpp-ts');

async function main() {
  const llama = new Llama();

  const modelPath = __dirname + "/models/Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf";
  const modelParams = { nGpuLayers: 32 };
  const contextParams = { nContext: 2048 };

  if (!llama.initialize(modelPath, modelParams, contextParams)) {
    console.error("Failed to initialize the model");
    return;
  }

  llama.setSystemPrompt("You are a helpful assistant. Always provide clear, concise, and accurate answers.");

  const questions = [
    "What is the capital of France?",
    "What's the population of that city?",
    "What country is the city in?"
  ];

  for (const question of questions) {
    const tokenStream = llama.prompt(question);

    console.log("Question:", question);
    console.log("Answer: ");

    while (true) {
      const token = await tokenStream.read();
      if (token === null) break;
      process.stdout.write(token);
    }

    console.log("\n");
  }
}

main().catch(console.error);
```
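
The manual `while (true)` loops above can also be hidden behind an async generator so tokens are consumed with `for await...of`. This wrapper is not part of the library's API; it is a sketch built only on the `read()` contract documented above:

```typescript
import { Llama } from 'llama.cpp-ts';

// Sketch: adapt a token stream (anything with the documented read() method)
// into an async iterable of tokens.
async function* tokens(stream: { read(): Promise<string | null> }): AsyncGenerator<string> {
  while (true) {
    const token = await stream.read();
    if (token === null) return; // null marks the end of the stream
    yield token;
  }
}

async function main(): Promise<void> {
  const llama = new Llama();
  if (!llama.initialize('./path/to/your/model.gguf', { nGpuLayers: 32 }, { nContext: 2048 })) {
    throw new Error('Failed to initialize the model');
  }

  for await (const token of tokens(llama.prompt('What is the capital of France?'))) {
    process.stdout.write(token);
  }
}

main().catch(console.error);
```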