https://github.com/cactus-compute/cactus-react-native
Cactus React Native package: Run AI locally in your React Native apps
- Host: GitHub
- URL: https://github.com/cactus-compute/cactus-react-native
- Owner: cactus-compute
- License: other
- Created: 2025-09-01T19:51:34.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2026-04-19T03:05:37.000Z (2 months ago)
- Last Synced: 2026-04-19T04:34:02.672Z (2 months ago)
- Topics: ai, apps, cactus, llamacpp, llm, llm-inference, llms, react, react-native
- Language: C++
- Homepage: https://cactuscompute.com
- Size: 102 MB
- Stars: 156
- Watchers: 4
- Forks: 21
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

## Resources
[Cactus](https://github.com/cactus-compute/cactus) | [Models on Hugging Face](https://huggingface.co/Cactus-Compute/models?sort=downloads) | [Discord](https://discord.gg/bNurx3AXTJ) | [Documentation](https://cactuscompute.com/docs/react-native)
## Installation
```bash
npm install cactus-react-native react-native-nitro-modules
```
## Quick Start
Get started with Cactus in just a few lines of code:
```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';
// Create a new instance
const cactusLM = new CactusLM();
// Download the model
await cactusLM.download({
onProgress: (progress) => console.log(`Download: ${Math.round(progress * 100)}%`)
});
// Generate a completion
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What is the capital of France?' }
];
const result = await cactusLM.complete({ messages });
console.log(result.response); // "The capital of France is Paris."
// Clean up resources
await cactusLM.destroy();
```
**Using the React Hook:**
```tsx
import { useEffect } from 'react';
import { Button, Text } from 'react-native';
import { useCactusLM } from 'cactus-react-native';

const App = () => {
  const cactusLM = useCactusLM();

  useEffect(() => {
    // Download the model if not already available
    if (!cactusLM.isDownloaded) {
      cactusLM.download();
    }
  }, []);

  const handleGenerate = () => {
    // Generate a completion
    cactusLM.complete({
      messages: [{ role: 'user', content: 'Hello!' }],
    });
  };

  if (cactusLM.isDownloading) {
    return (
      <Text>
        Downloading model: {Math.round(cactusLM.downloadProgress * 100)}%
      </Text>
    );
  }

  return (
    <>
      <Button onPress={handleGenerate} title="Generate" />
      <Text>{cactusLM.completion}</Text>
    </>
  );
};
```
## Language Model
### Model Options
Choose the model quantization and, with Pro models, enable NPU acceleration.
```typescript
import { CactusLM } from 'cactus-react-native';
// Use int8 for better accuracy (default)
const cactusLM = new CactusLM({
model: 'lfm2-vl-450m',
options: {
quantization: 'int8', // 'int4' or 'int8'
pro: false
}
});
// Use pro models for NPU acceleration
const cactusPro = new CactusLM({
model: 'lfm2-vl-450m',
options: {
quantization: 'int8',
pro: true
}
});
```
### Completion
Generate text responses from the model by providing a conversation history.
#### Class
```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';
const cactusLM = new CactusLM();
const messages: CactusLMMessage[] = [{ role: 'user', content: 'Hello, World!' }];
const onToken = (token: string) => { console.log('Token:', token) };
const result = await cactusLM.complete({ messages, onToken });
console.log('Completion result:', result);
```
#### Hook
```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM();
const handleComplete = async () => {
const messages: CactusLMMessage[] = [{ role: 'user', content: 'Hello, World!' }];
const result = await cactusLM.complete({ messages });
console.log('Completion result:', result);
};
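// Button and Text are imported from 'react-native'
return (
  <>
    <Button title="Complete" onPress={handleComplete} />
    <Text>{cactusLM.completion}</Text>
  </>
);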
};
```
### Vision
Vision allows you to pass images along with text prompts, enabling the model to analyze and understand visual content.
#### Class
```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';
// Vision-capable model
const cactusLM = new CactusLM({ model: 'lfm2-vl-450m' });
const messages: CactusLMMessage[] = [
{
role: 'user',
content: "What's in the image?",
images: ['path/to/your/image'],
},
];
const result = await cactusLM.complete({ messages });
console.log('Response:', result.response);
```
#### Hook
```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';
const App = () => {
// Vision-capable model
const cactusLM = useCactusLM({ model: 'lfm2-vl-450m' });
const handleAnalyze = async () => {
const messages: CactusLMMessage[] = [
{
role: 'user',
content: "What's in the image?",
images: ['path/to/your/image'],
},
];
await cactusLM.complete({ messages });
};
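// Button and Text are imported from 'react-native'
return (
  <>
    <Button title="Analyze" onPress={handleAnalyze} />
    <Text>{cactusLM.completion}</Text>
  </>
);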
};
```
### Tool Calling
Enable the model to generate function calls by defining available tools and their parameters.
#### Class
```typescript
import { CactusLM, type CactusLMMessage, type CactusLMTool } from 'cactus-react-native';
const tools: CactusLMTool[] = [
{
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name',
},
},
required: ['location'],
},
},
];
const cactusLM = new CactusLM();
const messages: CactusLMMessage[] = [
{ role: 'user', content: "What's the weather in San Francisco?" },
];
const result = await cactusLM.complete({ messages, tools });
console.log('Response:', result.response);
console.log('Function calls:', result.functionCalls);
```
#### Hook
```tsx
import { useCactusLM, type CactusLMMessage, type CactusLMTool } from 'cactus-react-native';
const tools: CactusLMTool[] = [
{
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name',
},
},
required: ['location'],
},
},
];
const App = () => {
const cactusLM = useCactusLM();
const handleComplete = async () => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: "What's the weather in San Francisco?" },
];
const result = await cactusLM.complete({ messages, tools });
console.log('Response:', result.response);
console.log('Function calls:', result.functionCalls);
};
// Button is imported from 'react-native'
return <Button title="Complete" onPress={handleComplete} />;
};
```
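The examples above only log `result.functionCalls`. As a minimal sketch of what could happen next, the calls can be routed to local handlers. The `FunctionCall` shape (a `name` plus an `arguments` object), the `handlers` map, and the canned reply are all assumptions made for illustration; check the library's types before relying on them.

```typescript
// Assumed shape of one entry in `result.functionCalls` (not confirmed by this README).
type FunctionCall = { name: string; arguments: Record<string, unknown> };
type ToolHandler = (args: Record<string, unknown>) => string;

// Hypothetical local implementations, keyed by tool name.
const handlers = new Map<string, ToolHandler>([
  ['get_weather', (args) => `Weather lookup requested for ${String(args.location)}`],
]);

// Run each call through its handler; report unknown tools instead of throwing.
function dispatchFunctionCalls(calls: FunctionCall[]): string[] {
  return calls.map((call) => {
    const handler = handlers.get(call.name);
    return handler ? handler(call.arguments) : `Unknown tool: ${call.name}`;
  });
}
```

The returned strings could then be appended to the conversation as follow-up messages, depending on how your app wants to present tool results.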
### Audio Completion
Pass raw PCM audio alongside text prompts for multimodal completion with audio-capable models like Gemma 4.
#### Class
```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';
const cactusLM = new CactusLM({ model: 'gemma-4-e2b-it' });
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What do you hear in this audio?' },
];
// Pass raw 16-bit PCM samples as a byte array
const pcmAudio: number[] = [/* raw PCM bytes */];
const result = await cactusLM.complete({ messages, audio: pcmAudio });
console.log(result.response);
```
#### Hook
```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM({ model: 'gemma-4-e2b-it' });
const handleAudioComplete = async (pcmAudio: number[]) => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'Describe this audio.' },
];
await cactusLM.complete({ messages, audio: pcmAudio });
};
// Text is imported from 'react-native'
return <Text>{cactusLM.completion}</Text>;
};
```
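Recorders often deliver 16-bit PCM as an `Int16Array`, while the `audio` parameter expects a byte array. The helper below is a sketch of that conversion. The name `pcm16ToBytes` is hypothetical, and little-endian byte order is an assumption, since the README does not state which order the library expects.

```typescript
// Convert signed 16-bit PCM samples into a flat array of bytes (0-255).
// Assumption: little-endian order (low byte first).
function pcm16ToBytes(samples: Int16Array): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < samples.length; i++) {
    const sample = samples[i];
    bytes.push(sample & 0xff);        // low byte
    bytes.push((sample >> 8) & 0xff); // high byte
  }
  return bytes;
}
```

The result has two entries per sample and could be passed as `audio` to `complete`.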
### RAG (Retrieval Augmented Generation)
RAG allows you to provide a corpus of documents that the model can reference during generation, enabling it to answer questions based on your data.
#### Class
```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';
const cactusLM = new CactusLM({
corpusDir: 'path/to/your/corpus', // Directory containing .txt files
});
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What information is in the documents?' },
];
const result = await cactusLM.complete({ messages });
console.log(result.response);
// Or query the RAG index directly
const ragResult = await cactusLM.ragQuery({ query: 'search terms', topK: 5 });
console.log('Chunks:', ragResult.chunks);
// [{ score: 0.85, source: 'doc.txt', content: '...' }, ...]
```
#### Hook
```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM({
corpusDir: 'path/to/your/corpus', // Directory containing .txt files
});
const handleAsk = async () => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What information is in the documents?' },
];
await cactusLM.complete({ messages });
};
const handleRagQuery = async () => {
const result = await cactusLM.ragQuery({ query: 'search terms', topK: 3 });
console.log('Chunks:', result.chunks);
};
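// Button and Text are imported from 'react-native'
return (
  <>
    <Button title="Ask" onPress={handleAsk} />
    <Button title="Query RAG Index" onPress={handleRagQuery} />
    <Text>{cactusLM.completion}</Text>
  </>
);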
};
```
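The chunks returned by `ragQuery` carry a `score`, a `source`, and `content`, as shown in the comment in the class example. A small sketch of filtering and formatting them for display follows. `formatChunks` and the `minScore` cutoff are hypothetical, not part of the library.

```typescript
// Chunk shape taken from the example comment above.
type RagChunk = { score: number; source: string; content: string };

// Keep chunks at or above minScore, best first, and label each with its source file.
function formatChunks(chunks: RagChunk[], minScore: number): string {
  return chunks
    .filter((chunk) => chunk.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((chunk) => `[${chunk.source}] ${chunk.content}`)
    .join('\n');
}
```

Because `filter` returns a new array, the original `chunks` array is left in its returned order.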
### Tokenization
Convert text into tokens using the model's tokenizer.
#### Class
```typescript
import { CactusLM } from 'cactus-react-native';
const cactusLM = new CactusLM();
const result = await cactusLM.tokenize({ text: 'Hello, World!' });
console.log('Token IDs:', result.tokens);
```
#### Hook
```tsx
import { useCactusLM } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM();
const handleTokenize = async () => {
const result = await cactusLM.tokenize({ text: 'Hello, World!' });
console.log('Token IDs:', result.tokens);
};
// Button is imported from 'react-native'
return <Button title="Tokenize" onPress={handleTokenize} />;
};
```
### Score Window
Score a window of tokens within a sequence. The API reference describes the result as a log-probability score, from which perplexity can be derived.
#### Class
```typescript
import { CactusLM } from 'cactus-react-native';
const cactusLM = new CactusLM();
const tokens = [123, 456, 789, 101, 112];
const result = await cactusLM.scoreWindow({
tokens,
start: 1,
end: 3,
context: 2
});
console.log('Score:', result.score);
```
#### Hook
```tsx
import { useCactusLM } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM();
const handleScoreWindow = async () => {
const tokens = [123, 456, 789, 101, 112];
const result = await cactusLM.scoreWindow({
tokens,
start: 1,
end: 3,
context: 2
});
console.log('Score:', result.score);
};
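// Button is imported from 'react-native'
return <Button title="Score Window" onPress={handleScoreWindow} />;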
};
```
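If `result.score` is a sum of natural-log token probabilities (an assumption: the README does not say whether the score is summed or averaged, nor which log base is used), perplexity follows from the standard formula `exp(-sum / N)`. The helper name `perplexityFromLogProb` is hypothetical.

```typescript
// Perplexity from a summed natural-log probability over `tokenCount` tokens.
// Assumption: `logProbSum` is a sum of natural logs; verify against the library docs.
function perplexityFromLogProb(logProbSum: number, tokenCount: number): number {
  if (tokenCount <= 0) {
    throw new RangeError('tokenCount must be positive');
  }
  return Math.exp(-logProbSum / tokenCount);
}
```

The token count is left to the caller because the README does not state whether `end` is inclusive.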
### Embedding
Convert text and images into numerical vector representations that capture semantic meaning, useful for similarity search and semantic understanding.
#### Text Embedding
##### Class
```typescript
import { CactusLM } from 'cactus-react-native';
const cactusLM = new CactusLM();
const result = await cactusLM.embed({ text: 'Hello, World!' });
console.log('Embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```
##### Hook
```tsx
import { useCactusLM } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM();
const handleEmbed = async () => {
const result = await cactusLM.embed({ text: 'Hello, World!' });
console.log('Embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};
// Button is imported from 'react-native'
return <Button title="Embed" onPress={handleEmbed} />;
};
```
#### Image Embedding
##### Class
```typescript
import { CactusLM } from 'cactus-react-native';
const cactusLM = new CactusLM({ model: 'lfm2-vl-450m' });
const result = await cactusLM.imageEmbed({ imagePath: 'path/to/your/image.jpg' });
console.log('Image embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```
##### Hook
```tsx
import { useCactusLM } from 'cactus-react-native';
const App = () => {
const cactusLM = useCactusLM({ model: 'lfm2-vl-450m' });
const handleImageEmbed = async () => {
const result = await cactusLM.imageEmbed({ imagePath: 'path/to/your/image.jpg' });
console.log('Image embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};
// Button is imported from 'react-native'
return <Button title="Embed Image" onPress={handleImageEmbed} />;
};
```
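Embeddings are typically compared with cosine similarity. The function below is a plain TypeScript sketch that works on two `result.embedding` arrays. It is not part of the library, and `cosineSimilarity` is a name chosen for illustration.

```typescript
// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}
```

Values near `1` indicate similar inputs, values near `0` unrelated ones. Only compare vectors produced by the same model, since dimensions and vector spaces differ between models.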
## Speech-to-Text (STT)
The `CactusSTT` class provides audio transcription and audio embedding capabilities using speech-to-text models such as Whisper and Moonshine.
### Transcription
Transcribe audio to text with streaming support. Accepts either a file path or raw PCM audio samples.
#### Class
```typescript
import { CactusSTT } from 'cactus-react-native';
const cactusSTT = new CactusSTT({ model: 'whisper-small' });
// Transcribe from file path
const result = await cactusSTT.transcribe({
audio: 'path/to/audio.wav',
onToken: (token) => console.log('Token:', token)
});
console.log('Transcription:', result.response);
// Or transcribe from raw PCM samples
const pcmSamples: number[] = [/* ... */];
const result2 = await cactusSTT.transcribe({
audio: pcmSamples,
onToken: (token) => console.log('Token:', token)
});
console.log('Transcription:', result2.response);
```
#### Hook
```tsx
import { useCactusSTT } from 'cactus-react-native';
const App = () => {
const cactusSTT = useCactusSTT({ model: 'whisper-small' });
const handleTranscribe = async () => {
// Transcribe from file path
const result = await cactusSTT.transcribe({
audio: 'path/to/audio.wav',
});
console.log('Transcription:', result.response);
const pcmSamples: number[] = [/* ... */];
const result2 = await cactusSTT.transcribe({
audio: pcmSamples,
});
console.log('Transcription:', result2.response);
};
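// Button and Text are imported from 'react-native'
return (
  <>
    <Button title="Transcribe" onPress={handleTranscribe} />
    <Text>{cactusSTT.transcription}</Text>
  </>
);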
};
```
### Streaming Transcription
Transcribe audio in real-time with incremental results. Each call to `streamTranscribeProcess` feeds an audio chunk and returns the currently confirmed and pending text.
#### Class
```typescript
import { CactusSTT } from 'cactus-react-native';
const cactusSTT = new CactusSTT({ model: 'whisper-small' });
await cactusSTT.streamTranscribeStart({
confirmationThreshold: 0.99, // fuzzy match ratio required to confirm text
minChunkSize: 32000, // minimum samples before processing
});
const audioChunk: number[] = [/* PCM samples as bytes */];
const result = await cactusSTT.streamTranscribeProcess({ audio: audioChunk });
console.log('Confirmed:', result.confirmed);
console.log('Pending:', result.pending);
const final = await cactusSTT.streamTranscribeStop();
console.log('Final confirmed:', final.confirmed);
```
#### Hook
```tsx
import { useCactusSTT } from 'cactus-react-native';
const App = () => {
const cactusSTT = useCactusSTT({ model: 'whisper-small' });
const handleStart = async () => {
await cactusSTT.streamTranscribeStart({ confirmationThreshold: 0.99 });
};
const handleChunk = async (audioChunk: number[]) => {
const result = await cactusSTT.streamTranscribeProcess({ audio: audioChunk });
console.log('Confirmed:', result.confirmed);
console.log('Pending:', result.pending);
};
const handleStop = async () => {
const final = await cactusSTT.streamTranscribeStop();
console.log('Final:', final.confirmed);
};
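// Button and Text are imported from 'react-native'
return (
  <>
    <Button title="Start" onPress={handleStart} />
    <Button title="Stop" onPress={handleStop} />
    <Text>{cactusSTT.streamTranscribeConfirmed}</Text>
    <Text>{cactusSTT.streamTranscribePending}</Text>
  </>
);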
};
```
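A streaming session is fed one chunk at a time. When the audio is already in memory, it first has to be split into chunks. The sketch below shows one way to do that. `chunkSamples` is a hypothetical helper, and the right chunk size depends on your capture pipeline.

```typescript
// Split a buffer into consecutive chunks of at most `chunkSize` entries.
function chunkSamples(samples: number[], chunkSize: number): number[][] {
  if (chunkSize <= 0 || Math.floor(chunkSize) !== chunkSize) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const chunks: number[][] = [];
  for (let i = 0; i < samples.length; i += chunkSize) {
    chunks.push(samples.slice(i, i + chunkSize));
  }
  return chunks;
}
```

Each chunk could then be passed to `streamTranscribeProcess` in order, awaiting each call before sending the next.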
### Audio Embedding
Generate embeddings from audio files for audio understanding.
#### Class
```typescript
import { CactusSTT } from 'cactus-react-native';
const cactusSTT = new CactusSTT();
const result = await cactusSTT.audioEmbed({
audioPath: 'path/to/audio.wav'
});
console.log('Audio embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```
#### Hook
```tsx
import { useCactusSTT } from 'cactus-react-native';
const App = () => {
const cactusSTT = useCactusSTT();
const handleAudioEmbed = async () => {
const result = await cactusSTT.audioEmbed({
audioPath: 'path/to/audio.wav'
});
console.log('Audio embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};
// Button is imported from 'react-native'
return <Button title="Embed Audio" onPress={handleAudioEmbed} />;
};
```
### Language Detection
Detect the spoken language in an audio file. Only available on the class, not the hook.
```typescript
import { CactusSTT } from 'cactus-react-native';
const cactusSTT = new CactusSTT({ model: 'whisper-small' });
const result = await cactusSTT.detectLanguage({
audio: 'path/to/audio.wav',
options: { useVad: true },
});
console.log('Language:', result.language); // e.g. 'en'
console.log('Confidence:', result.confidence);
```
## Audio Processing
The `CactusAudio` class provides voice activity detection (VAD), speaker diarization, and speaker embedding extraction.
### Voice Activity Detection
```typescript
import { CactusAudio } from 'cactus-react-native';
const cactusAudio = new CactusAudio({ model: 'silero-vad' });
const result = await cactusAudio.vad({
audio: 'path/to/audio.wav',
options: {
threshold: 0.5,
minSpeechDurationMs: 250,
minSilenceDurationMs: 100,
}
});
console.log('Speech segments:', result.segments);
// [{ start: 0, end: 16000 }, { start: 32000, end: 48000 }, ...]
console.log('Total time (ms):', result.totalTime);
```
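The segment values in the example comment (`16000`, `32000`, `48000`) look like sample offsets rather than milliseconds. Assuming that is the case, and assuming a 16 kHz sample rate (neither is stated in this README), segments can be converted to seconds as sketched below. `segmentsToSeconds` is a hypothetical helper.

```typescript
type SpeechSegment = { start: number; end: number };

// Convert sample offsets to seconds.
// Assumption: offsets are sample indices; the 16000 default is a guess at the sample rate.
function segmentsToSeconds(segments: SpeechSegment[], sampleRate: number = 16000) {
  return segments.map((segment) => ({
    startSec: segment.start / sampleRate,
    endSec: segment.end / sampleRate,
    durationSec: (segment.end - segment.start) / sampleRate,
  }));
}
```

Pass the actual sample rate of your audio as the second argument if it differs.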
### Speaker Diarization
```typescript
import { CactusAudio } from 'cactus-react-native';
const cactusAudio = new CactusAudio({ model: 'silero-vad' });
const result = await cactusAudio.diarize({
audio: 'path/to/audio.wav',
options: {
numSpeakers: 2,
minSpeakers: 1,
maxSpeakers: 4,
}
});
console.log('Number of speakers:', result.numSpeakers);
console.log('Scores:', result.scores);
```
### Speaker Embedding
Extract a speaker embedding vector from audio, optionally with mask weights for speaker-specific segments from diarization.
```typescript
import { CactusAudio } from 'cactus-react-native';
const cactusAudio = new CactusAudio({ model: 'silero-vad' });
const result = await cactusAudio.embedSpeaker({
audio: 'path/to/audio.wav',
});
console.log('Speaker embedding:', result.embedding);
// With mask weights from diarization
const maskedResult = await cactusAudio.embedSpeaker({
audio: 'path/to/audio.wav',
options: {
maskWeights: [1.0, 1.0, 0.0, 0.0, 1.0], // per-frame weights
maskNumFrames: 5,
},
});
```
### Hook
```tsx
import { useCactusAudio } from 'cactus-react-native';
const App = () => {
const cactusAudio = useCactusAudio({ model: 'silero-vad' });
const handleVAD = async () => {
const result = await cactusAudio.vad({
audio: 'path/to/audio.wav',
});
console.log('Speech segments:', result.segments);
};
const handleDiarize = async () => {
const result = await cactusAudio.diarize({
audio: 'path/to/audio.wav',
});
console.log('Speakers:', result.numSpeakers);
};
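// Button is imported from 'react-native'
return (
  <>
    <Button title="Detect Speech" onPress={handleVAD} />
    <Button title="Diarize" onPress={handleDiarize} />
  </>
);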
};
```
## Vector Index
The `CactusIndex` class provides a vector database for storing and querying embeddings with metadata, enabling similarity search and retrieval.
### Creating and Initializing an Index
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleInit = async () => {
await cactusIndex.init();
};
// Button is imported from 'react-native'
return <Button title="Initialize" onPress={handleInit} />;
};
```
### Adding Documents
Add documents with their embeddings and metadata to the index.
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
await cactusIndex.add({
ids: [1, 2, 3],
documents: ['First document', 'Second document', 'Third document'],
embeddings: [
[0.1, 0.2 /* ... */],
[0.3, 0.4 /* ... */],
[0.5, 0.6 /* ... */]
],
metadatas: ['metadata1', 'metadata2', 'metadata3']
});
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleAdd = async () => {
await cactusIndex.add({
ids: [1, 2, 3],
documents: ['First document', 'Second document', 'Third document'],
embeddings: [
[0.1, 0.2 /* ... */],
[0.3, 0.4 /* ... */],
[0.5, 0.6 /* ... */]
],
metadatas: ['metadata1', 'metadata2', 'metadata3']
});
};
// Button is imported from 'react-native'
return <Button title="Add Documents" onPress={handleAdd} />;
};
```
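Since the index is created with a fixed `embeddingDim`, it can help to check a batch before calling `add`. The following is a sketch of such a check. `validateBatch` is hypothetical, the `AddBatch` type mirrors the fields used in the examples above, and treating `metadatas` as optional is an assumption. The library may perform its own validation.

```typescript
type AddBatch = {
  ids: number[];
  documents: string[];
  embeddings: number[][];
  metadatas?: string[];
};

// Return a list of human-readable problems; an empty list means the batch looks consistent.
function validateBatch(batch: AddBatch, embeddingDim: number): string[] {
  const problems: string[] = [];
  const count = batch.ids.length;
  if (batch.documents.length !== count) {
    problems.push('documents length does not match ids');
  }
  if (batch.embeddings.length !== count) {
    problems.push('embeddings length does not match ids');
  }
  if (batch.metadatas && batch.metadatas.length !== count) {
    problems.push('metadatas length does not match ids');
  }
  batch.embeddings.forEach((vector, i) => {
    if (vector.length !== embeddingDim) {
      problems.push(`embedding ${i} has ${vector.length} values, expected ${embeddingDim}`);
    }
  });
  return problems;
}
```

Calling `validateBatch(batch, 1024)` before `cactusIndex.add(batch)` surfaces mismatches on the JavaScript side instead of in native code.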
### Querying the Index
Search for similar documents using embedding vectors.
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
const result = await cactusIndex.query({
embeddings: [[0.1, 0.2 /* ... */]],
options: {
topK: 5,
scoreThreshold: 0.7
}
});
console.log('IDs:', result.ids);
console.log('Scores:', result.scores);
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleQuery = async () => {
const result = await cactusIndex.query({
embeddings: [[0.1, 0.2 /* ... */]],
options: {
topK: 5,
scoreThreshold: 0.7
}
});
console.log('IDs:', result.ids);
console.log('Scores:', result.scores);
};
// Button is imported from 'react-native'
return <Button title="Query" onPress={handleQuery} />;
};
```
### Retrieving Documents
Get documents by their IDs.
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
const result = await cactusIndex.get({ ids: [1, 2, 3] });
console.log('Documents:', result.documents);
console.log('Metadatas:', result.metadatas);
console.log('Embeddings:', result.embeddings);
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleGet = async () => {
const result = await cactusIndex.get({ ids: [1, 2, 3] });
console.log('Documents:', result.documents);
console.log('Metadatas:', result.metadatas);
console.log('Embeddings:', result.embeddings);
};
// Button is imported from 'react-native'
return <Button title="Get Documents" onPress={handleGet} />;
};
```
### Deleting Documents
Mark documents as deleted by their IDs.
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
await cactusIndex.delete({ ids: [1, 2, 3] });
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleDelete = async () => {
await cactusIndex.delete({ ids: [1, 2, 3] });
};
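// Button is imported from 'react-native'
return <Button title="Delete Documents" onPress={handleDelete} />;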
};
```
### Compacting the Index
Optimize the index by removing deleted documents and reorganizing data.
#### Class
```typescript
import { CactusIndex } from 'cactus-react-native';
const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
await cactusIndex.compact();
```
#### Hook
```tsx
import { useCactusIndex } from 'cactus-react-native';
const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});
const handleCompact = async () => {
await cactusIndex.compact();
};
// Button is imported from 'react-native'
return <Button title="Compact" onPress={handleCompact} />;
};
```
## API Reference
### CactusLM Class
#### Constructor
**`new CactusLM(params?: CactusLMParams)`**
**Parameters:**
- `model` - Model slug or absolute path to a model file (default: `'qwen3-0.6b'`).
- `corpusDir` - Directory containing text files for RAG (default: `undefined`).
- `cacheIndex` - Whether to cache the RAG corpus index on disk (default: `false`).
- `options` - Model options for quantization and NPU acceleration:
  - `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
  - `pro` - Enable NPU-accelerated models (default: `false`).
#### Methods
**`download(params?: CactusLMDownloadParams): Promise`**
Downloads the model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.
**Parameters:**
- `onProgress` - Callback for download progress (0-1).
**`init(): Promise`**
Initializes the model and prepares it for inference. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.
**`complete(params: CactusLMCompleteParams): Promise`**
Performs text completion with optional streaming and tool support. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.
**Parameters:**
- `messages` - Array of `CactusLMMessage` objects.
- `options` - Generation options:
  - `temperature` - Sampling temperature.
  - `topP` - Nucleus sampling threshold.
  - `topK` - Top-K sampling limit.
  - `maxTokens` - Maximum number of tokens to generate (default: `512`).
  - `stopSequences` - Array of strings to stop generation.
  - `forceTools` - Force the model to call one of the provided tools (default: `false`).
  - `telemetryEnabled` - Enable telemetry for this request (default: `true`).
  - `confidenceThreshold` - Confidence threshold below which cloud handoff is triggered (default: `0.7`).
  - `toolRagTopK` - Number of tools to select via RAG when tool list is large (default: `2`).
  - `includeStopSequences` - Whether to include stop sequences in the response (default: `false`).
  - `useVad` - Whether to use VAD preprocessing (default: `true`).
  - `enableThinking` - Whether to enable thinking/reasoning output if supported by the model (default: unset).
- `tools` - Array of `CactusLMTool` objects for function calling.
- `onToken` - Callback for streaming tokens.
- `audio` - Optional raw 16-bit PCM audio samples as a byte array for multimodal audio completion (e.g., Gemma 4).
**`prefill(params: CactusLMPrefillParams): Promise`**
Runs prompt prefill without generating any output tokens. Useful for measuring prefill performance or warming up the model's KV cache. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.
**Parameters:**
- `messages` - Array of `CactusLMMessage` objects.
- `options` - Same options as `complete`.
- `tools` - Array of `CactusLMTool` objects.
- `audio` - Optional raw 16-bit PCM audio samples as a byte array for multimodal audio prefill.
**`tokenize(params: CactusLMTokenizeParams): Promise`**
Converts text into tokens using the model's tokenizer.
**Parameters:**
- `text` - Text to tokenize.
**`scoreWindow(params: CactusLMScoreWindowParams): Promise`**
Calculates the log-probability score for a window of tokens within a sequence.
**Parameters:**
- `tokens` - Array of token IDs.
- `start` - Start index of the window.
- `end` - End index of the window.
- `context` - Number of context tokens before the window.
**`embed(params: CactusLMEmbedParams): Promise`**
Generates embeddings for the given text. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.
**Parameters:**
- `text` - Text to embed.
- `normalize` - Whether to normalize the embedding vector (default: `false`).
**`imageEmbed(params: CactusLMImageEmbedParams): Promise`**
Generates embeddings for the given image. Requires a vision-capable model. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.
**Parameters:**
- `imagePath` - Path to the image file.
**`ragQuery(params: CactusLMRagQueryParams): Promise`**
Queries the RAG corpus index directly, returning the top matching document chunks with scores. Requires the model to be initialized with a `corpusDir`. Automatically calls `init()` if not already initialized.
**Parameters:**
- `query` - Search query string.
- `topK` - Number of top results to return (default: `5`).
**`stop(): Promise`**
Stops ongoing generation.
**`reset(): Promise`**
Resets the model's internal state, clearing any cached context. Automatically calls `stop()` first.
**`destroy(): Promise`**
Releases all resources associated with the model. Automatically calls `stop()` first. Safe to call even if the model is not initialized.
**`getModels(): Promise`**
Returns available models.
**`getModelName(): string`**
Returns the computed model identifier including quantization and pro suffix (e.g., `'qwen3-0.6b-int8'`, `'lfm2-vl-450m-int4-pro'`).
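The two examples suggest the pattern `<model>-<quantization>` with an optional `-pro` suffix. The sketch below reproduces that pattern. It is inferred from the examples only, `buildModelName` is a hypothetical name, and the library's actual logic (for instance, when `model` is an absolute path) may differ.

```typescript
type Quantization = 'int4' | 'int8';

// Compose an identifier following the pattern seen in the examples above.
// Defaults mirror the documented constructor defaults (int8, pro disabled).
function buildModelName(
  model: string,
  quantization: Quantization = 'int8',
  pro: boolean = false
): string {
  return `${model}-${quantization}${pro ? '-pro' : ''}`;
}
```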
### useCactusLM Hook
The `useCactusLM` hook manages a `CactusLM` instance with reactive state. When model parameters (`model`, `corpusDir`, `cacheIndex`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.
#### State
- `completion: string` - Current generated text. Automatically accumulated during streaming. Cleared before each new completion and when calling `reset()` or `destroy()`.
- `isGenerating: boolean` - Whether the model is currently running an operation. Shared by `complete`, `tokenize`, `scoreWindow`, `embed`, and `imageEmbed`.
- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.
#### Methods
- `download(params?: CactusLMDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model for inference. Sets `isInitializing` to `true` during initialization.
- `complete(params: CactusLMCompleteParams): Promise` - Generates text completions. Automatically accumulates tokens in the `completion` state during streaming. Sets `isGenerating` to `true` while generating. Clears `completion` before starting.
- `tokenize(params: CactusLMTokenizeParams): Promise` - Converts text into tokens. Sets `isGenerating` to `true` during operation.
- `scoreWindow(params: CactusLMScoreWindowParams): Promise` - Calculates log-probability scores for a window of tokens. Sets `isGenerating` to `true` during operation.
- `embed(params: CactusLMEmbedParams): Promise` - Generates embeddings for the given text. Sets `isGenerating` to `true` during operation.
- `imageEmbed(params: CactusLMImageEmbedParams): Promise` - Generates embeddings for the given image. Sets `isGenerating` to `true` while generating.
- `ragQuery(params: CactusLMRagQueryParams): Promise` - Queries the RAG corpus index directly. Sets `isGenerating` to `true` during operation.
- `stop(): Promise` - Stops ongoing generation. Clears any errors.
- `reset(): Promise` - Resets the model's internal state, clearing cached context. Also clears the `completion` state.
- `destroy(): Promise` - Releases all resources associated with the model. Clears the `completion` state. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available models.
### CactusSTT Class
#### Constructor
**`new CactusSTT(params?: CactusSTTParams)`**
**Parameters:**
- `model` - Model slug or absolute path to a model file (default: `'whisper-small'`).
- `options` - Model options for quantization and NPU acceleration:
  - `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
  - `pro` - Enable NPU-accelerated models (default: `false`).
#### Methods
**`download(params?: CactusSTTDownloadParams): Promise`**
Downloads the model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.
**Parameters:**
- `onProgress` - Callback for download progress (0-1).
**`init(): Promise`**
Initializes the model and prepares it for inference. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.
**`transcribe(params: CactusSTTTranscribeParams): Promise`**
Transcribes audio to text with optional streaming support. Accepts either a file path or raw PCM audio samples. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.
**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `prompt` - Optional prompt to guide transcription (default: `'<|startoftranscript|><|en|><|transcribe|><|notimestamps|>'`).
- `options` - Transcription options:
  - `temperature` - Sampling temperature.
  - `topP` - Nucleus sampling threshold.
  - `topK` - Top-K sampling limit.
  - `maxTokens` - Maximum number of tokens to generate (default: `384`).
  - `stopSequences` - Array of strings to stop generation.
  - `useVad` - Whether to apply VAD to strip silence before transcription (default: `true`).
  - `telemetryEnabled` - Enable telemetry for this request (default: `true`).
  - `confidenceThreshold` - Confidence threshold for quality assessment (default: `0.7`).
  - `cloudHandoffThreshold` - Max entropy threshold above which cloud handoff is triggered.
  - `includeStopSequences` - Whether to include stop sequences in the response (default: `false`).
- `onToken` - Callback for streaming tokens.
**`streamTranscribeStart(options?: CactusSTTStreamTranscribeStartOptions): Promise`**
Starts a streaming transcription session. Automatically calls `init()` if not already initialized. If a session is already active, returns immediately.
**Parameters:**
- `confirmationThreshold` - Fuzzy match ratio required to confirm a transcription segment (default: `0.99`).
- `minChunkSize` - Minimum number of audio samples before processing (default: `32000`).
- `telemetryEnabled` - Enable telemetry for this session (default: `true`).
- `language` - Language code for transcription (e.g., `'en'`, `'es'`, `'fr'`). If not set, language is auto-detected.
**`streamTranscribeProcess(params: CactusSTTStreamTranscribeProcessParams): Promise<CactusSTTStreamTranscribeProcessResult>`**
Feeds audio samples into the streaming session and returns the current transcription state. Throws an error if no session is active.
**Parameters:**
- `audio` - PCM audio samples as a byte array.
**`streamTranscribeStop(): Promise<CactusSTTStreamTranscribeStopResult>`**
Stops the streaming session and returns the final confirmed transcription text. Throws an error if no session is active.
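The three streaming methods are meant to be called in sequence: start, process per chunk, then stop. The sketch below shows one way to drive them. It is written against a hypothetical structural type, `StreamingSTT`, that mirrors the documented signatures so it can be read without the native module; `StreamingSTT` and `transcribeChunks` are illustrative names, and the `Promise<void>` return type of `streamTranscribeStart` is an assumption. Confirmed text is appended per chunk, mirroring how the hook is described as accumulating `streamTranscribeConfirmed`.

```typescript
// Hypothetical structural type mirroring the documented streaming methods.
// Assumption: streamTranscribeStart resolves with no value.
interface StreamingSTT {
  streamTranscribeStart(options?: { language?: string }): Promise<void>;
  streamTranscribeProcess(params: {
    audio: number[];
  }): Promise<{ confirmed: string; pending: string }>;
  streamTranscribeStop(): Promise<{ confirmed: string }>;
}

// Feeds PCM chunks into one streaming session and returns the accumulated
// confirmed text. The session is always stopped, even if a chunk fails.
async function transcribeChunks(
  stt: StreamingSTT,
  chunks: number[][],
  onUpdate?: (confirmed: string, pending: string) => void
): Promise<string> {
  await stt.streamTranscribeStart({ language: 'en' });
  let confirmed = '';
  try {
    for (const audio of chunks) {
      const step = await stt.streamTranscribeProcess({ audio });
      confirmed += step.confirmed;
      onUpdate?.(confirmed, step.pending);
    }
  } finally {
    // Stop returns the final confirmed text for the session.
    const final = await stt.streamTranscribeStop();
    confirmed += final.confirmed;
  }
  return confirmed;
}
```

A `CactusSTT` instance should be usable wherever `StreamingSTT` is expected, provided its real signatures match the ones documented above.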
**`detectLanguage(params: CactusSTTDetectLanguageParams): Promise<CactusSTTDetectLanguageResult>`**
Detects the spoken language in the given audio. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.
**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options`:
  - `useVad` - Whether to apply VAD before detection (default: `true`).
**`audioEmbed(params: CactusSTTAudioEmbedParams): Promise<CactusSTTAudioEmbedResult>`**
Generates embeddings for the given audio file. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.
**Parameters:**
- `audioPath` - Path to the audio file.
**`stop(): Promise`**
Stops ongoing transcription or embedding generation.
**`reset(): Promise`**
Resets the model's internal state. Automatically calls `stop()` first.
**`destroy(): Promise`**
Releases all resources associated with the model. Stops any active streaming session. Automatically calls `stop()` first. Safe to call even if the model is not initialized.
**`getModels(): Promise`**
Returns available speech-to-text models.
**`getModelName(): string`**
Returns the computed model identifier including quantization and pro suffix (e.g., `'whisper-small-int8'`).
### useCactusSTT Hook
The `useCactusSTT` hook manages a `CactusSTT` instance with reactive state. When model parameters (`model`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.
#### State
- `transcription: string` - Current transcription text. Automatically accumulated during streaming. Cleared before each new transcription and when calling `reset()` or `destroy()`.
- `streamTranscribeConfirmed: string` - Accumulated confirmed text from the active streaming session. Updated after each successful `streamTranscribeProcess` call and finalized by `streamTranscribeStop`.
- `streamTranscribePending: string` - Uncommitted (in-progress) text from the current audio chunk. Cleared when the session stops.
- `isGenerating: boolean` - Whether the model is currently transcribing or embedding. Both operations share this flag.
- `isStreamTranscribing: boolean` - Whether a streaming transcription session is currently active.
- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.
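Several of these flags can be true or false in combination, so a UI usually needs to pick one to display. The helper below is a minimal sketch of one possible priority order; `STTStateLike` and `describeStatus` are illustrative names, not part of the package, and the labels are placeholders.

```typescript
// Local copy of the state fields documented above (illustrative type name).
interface STTStateLike {
  isDownloaded: boolean;
  isDownloading: boolean;
  downloadProgress: number;
  isInitializing: boolean;
  isGenerating: boolean;
  isStreamTranscribing: boolean;
  error: string | null;
}

// Maps the hook state to a single status label, checking the most
// important conditions first.
function describeStatus(s: STTStateLike): string {
  if (s.error) return `Error: ${s.error}`;
  if (s.isDownloading) return `Downloading ${Math.round(s.downloadProgress * 100)}%`;
  if (!s.isDownloaded) return 'Model not downloaded';
  if (s.isInitializing) return 'Initializing';
  if (s.isStreamTranscribing) return 'Streaming';
  if (s.isGenerating) return 'Working'; // transcribing or embedding
  return 'Ready';
}
```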
#### Methods
- `download(params?: CactusSTTDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model for inference. Sets `isInitializing` to `true` during initialization.
- `transcribe(params: CactusSTTTranscribeParams): Promise<CactusSTTTranscribeResult>` - Transcribes audio to text. Automatically accumulates tokens in the `transcription` state during streaming. Sets `isGenerating` to `true` while generating. Clears `transcription` before starting.
- `audioEmbed(params: CactusSTTAudioEmbedParams): Promise<CactusSTTAudioEmbedResult>` - Generates embeddings for the given audio. Sets `isGenerating` to `true` during operation.
- `streamTranscribeStart(options?: CactusSTTStreamTranscribeStartOptions): Promise` - Starts a streaming transcription session. If a session is already active, returns immediately. Clears `streamTranscribeConfirmed` and `streamTranscribePending` before starting. Sets `isStreamTranscribing` to `true`.
- `streamTranscribeProcess(params: CactusSTTStreamTranscribeProcessParams): Promise<CactusSTTStreamTranscribeProcessResult>` - Feeds audio and returns incremental results. Appends confirmed text to `streamTranscribeConfirmed` and updates `streamTranscribePending`.
- `streamTranscribeStop(): Promise<CactusSTTStreamTranscribeStopResult>` - Stops the session and returns the final result. Sets `isStreamTranscribing` to `false`. Appends final confirmed text to `streamTranscribeConfirmed` and clears `streamTranscribePending`.
- `stop(): Promise` - Stops ongoing generation. Clears any errors.
- `reset(): Promise` - Resets the model's internal state. Also clears the `transcription` state.
- `destroy(): Promise` - Releases all resources associated with the model. Clears the `transcription`, `streamTranscribeConfirmed`, and `streamTranscribePending` state. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available speech-to-text models.
### CactusAudio Class
#### Constructor
**`new CactusAudio(params?: CactusAudioParams)`**
**Parameters:**
- `model` - Model slug or absolute path to an audio model file (default: `'silero-vad'`).
- `options` - Model options:
  - `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
  - `pro` - Enable NPU-accelerated models (default: `false`).
#### Methods
**`download(params?: CactusAudioDownloadParams): Promise`**
Downloads the audio model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.
**Parameters:**
- `onProgress` - Callback for download progress (0-1).
**`init(): Promise`**
Initializes the audio model. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.
**`vad(params: CactusAudioVADParams): Promise<CactusAudioVADResult>`**
Runs voice activity detection on the given audio. Automatically calls `init()` if not already initialized.
**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - VAD options:
  - `threshold` - Speech probability threshold (default: model default).
  - `negThreshold` - Silence probability threshold.
  - `minSpeechDurationMs` - Minimum speech segment duration in ms.
  - `maxSpeechDurationS` - Maximum speech segment duration in seconds.
  - `minSilenceDurationMs` - Minimum silence duration before ending a segment.
  - `speechPadMs` - Padding added to each speech segment in ms.
  - `windowSizeSamples` - Processing window size in samples.
  - `samplingRate` - Audio sampling rate.
  - `minSilenceAtMaxSpeech` - Minimum silence at max speech duration.
  - `useMaxPossSilAtMaxSpeech` - Whether to use maximum possible silence at max speech.
**`diarize(params: CactusAudioDiarizeParams): Promise<CactusAudioDiarizeResult>`**
Runs speaker diarization on the given audio. Automatically calls `init()` if not already initialized.
**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - Diarize options:
  - `stepMs` - Step size in milliseconds.
  - `threshold` - Diarization threshold.
  - `numSpeakers` - Expected number of speakers.
  - `minSpeakers` - Minimum number of speakers.
  - `maxSpeakers` - Maximum number of speakers.
**`embedSpeaker(params: CactusAudioEmbedSpeakerParams): Promise<CactusAudioEmbedSpeakerResult>`**
Extracts a speaker embedding vector from the given audio. Automatically calls `init()` if not already initialized.
**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - Optional speaker embedding options:
  - `stepMs` - Step size in milliseconds.
  - `threshold` - Embedding threshold.
  - `maskWeights` - Per-frame mask weights for speaker-specific embedding extraction (from diarization).
  - `maskNumFrames` - Number of frames for the mask weights.
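A common use for speaker embeddings is checking whether two recordings come from the same voice. The helper below is a plain cosine-similarity sketch over two `embedding` arrays such as those in `CactusAudioEmbedSpeakerResult`. `cosineSimilarity` is an illustrative name, not a package export, and the package does not document a decision threshold, so any cut-off is yours to tune.

```typescript
// Cosine similarity between two equal-length vectors, in the range [-1, 1].
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embeddings must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Values closer to 1 indicate more similar embeddings.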
**`destroy(): Promise`**
Releases all resources associated with the model. Safe to call even if the model is not initialized.
**`getModels(): Promise`**
Returns available audio models.
**`getModelName(): string`**
Returns the computed model identifier including quantization and pro suffix (e.g., `'silero-vad-int8'`).
### useCactusAudio Hook
The `useCactusAudio` hook manages a `CactusAudio` instance with reactive state. When model parameters (`model`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.
#### State
- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message, or `null`.
#### Methods
- `download(params?: CactusAudioDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model.
- `vad(params: CactusAudioVADParams): Promise<CactusAudioVADResult>` - Runs voice activity detection.
- `diarize(params: CactusAudioDiarizeParams): Promise<CactusAudioDiarizeResult>` - Runs speaker diarization.
- `embedSpeaker(params: CactusAudioEmbedSpeakerParams): Promise<CactusAudioEmbedSpeakerResult>` - Extracts a speaker embedding.
- `destroy(): Promise` - Releases all resources. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available audio models.
### CactusIndex Class
#### Constructor
**`new CactusIndex(name: string, embeddingDim: number)`**
**Parameters:**
- `name` - Name of the index.
- `embeddingDim` - Dimension of the embedding vectors.
#### Methods
**`init(): Promise`**
Initializes the index and prepares it for operations. Must be called before using any other methods.
**`add(params: CactusIndexAddParams): Promise`**
Adds documents with their embeddings and metadata to the index.
**Parameters:**
- `ids` - Array of document IDs.
- `documents` - Array of document texts.
- `embeddings` - Array of embedding vectors (each vector must match `embeddingDim`).
- `metadatas` - Optional array of metadata strings.
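Because every vector must match `embeddingDim`, it can help to validate a batch before calling `add()`. The sketch below assumes that `ids`, `documents`, `embeddings`, and `metadatas` are parallel arrays of equal length, which the documentation implies but does not state. `AddParamsLike` and `validateAddParams` are illustrative names, not part of the package.

```typescript
// Local copy of the documented add() parameter shape (illustrative name).
interface AddParamsLike {
  ids: number[];
  documents: string[];
  embeddings: number[][];
  metadatas?: string[];
}

// Returns a list of human-readable problems; an empty list means the batch
// looks consistent. Assumption: all arrays are parallel, one entry per document.
function validateAddParams(params: AddParamsLike, embeddingDim: number): string[] {
  const problems: string[] = [];
  const count = params.ids.length;
  if (params.documents.length !== count) {
    problems.push(`documents has ${params.documents.length} entries, expected ${count}`);
  }
  if (params.embeddings.length !== count) {
    problems.push(`embeddings has ${params.embeddings.length} entries, expected ${count}`);
  }
  if (params.metadatas && params.metadatas.length !== count) {
    problems.push(`metadatas has ${params.metadatas.length} entries, expected ${count}`);
  }
  params.embeddings.forEach((vector, i) => {
    if (vector.length !== embeddingDim) {
      problems.push(`embedding ${i} has length ${vector.length}, expected ${embeddingDim}`);
    }
  });
  return problems;
}
```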
**`query(params: CactusIndexQueryParams): Promise<CactusIndexQueryResult>`**
Searches for similar documents using embedding vectors.
**Parameters:**
- `embeddings` - Array of query embedding vectors.
- `options` - Query options:
  - `topK` - Number of top results to return (default: 10).
  - `scoreThreshold` - Minimum similarity score threshold (default: -1.0).
**`get(params: CactusIndexGetParams): Promise<CactusIndexGetResult>`**
Retrieves documents by their IDs.
**Parameters:**
- `ids` - Array of document IDs to retrieve.
**`delete(params: CactusIndexDeleteParams): Promise`**
Deletes documents from the index by their IDs.
**Parameters:**
- `ids` - Array of document IDs to delete.
**`compact(): Promise`**
Optimizes the index by removing deleted documents and reorganizing data for better performance. Call after a series of deletions.
**`destroy(): Promise`**
Releases all resources associated with the index from memory.
### useCactusIndex Hook
The `useCactusIndex` hook manages a `CactusIndex` instance with reactive state. When index parameters (`name` or `embeddingDim`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.
#### State
- `isInitializing: boolean` - Whether the index is initializing.
- `isProcessing: boolean` - Whether the index is processing an operation (add, query, get, delete, or compact).
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.
#### Methods
- `init(): Promise` - Initializes the index. Sets `isInitializing` to `true` during initialization.
- `add(params: CactusIndexAddParams): Promise` - Adds documents to the index. Sets `isProcessing` to `true` during operation.
- `query(params: CactusIndexQueryParams): Promise<CactusIndexQueryResult>` - Searches for similar documents. Sets `isProcessing` to `true` during operation.
- `get(params: CactusIndexGetParams): Promise<CactusIndexGetResult>` - Retrieves documents by IDs. Sets `isProcessing` to `true` during operation.
- `delete(params: CactusIndexDeleteParams): Promise` - Deletes documents. Sets `isProcessing` to `true` during operation.
- `compact(): Promise` - Optimizes the index. Sets `isProcessing` to `true` during operation.
- `destroy(): Promise` - Releases all resources. Automatically called when the component unmounts.
### getRegistry
**`getRegistry(): Promise<{ [key: string]: CactusModel }>`**
Returns all available models from Hugging Face, keyed by model slug. The result is cached across calls.
```typescript
import { getRegistry } from 'cactus-react-native';
const registry = await getRegistry();
const model = registry['qwen3-0.6b'];
console.log(model);
```
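Since the registry is a plain object keyed by slug, it can be filtered like any other map. The sketch below picks the smallest model that advertises a given capability. `ModelLike` and `smallestModelWith` are illustrative names, and the capability strings used with it are placeholders: check the `capabilities` values in the real registry before relying on any of them.

```typescript
// Subset of the documented CactusModel shape (illustrative type name).
interface ModelLike {
  slug: string;
  capabilities: string[];
  quantization: {
    int4: { sizeMb: number; url: string };
    int8: { sizeMb: number; url: string };
  };
}

// Returns the model with the smallest download size for the chosen
// quantization among those listing `capability`, or undefined if none match.
function smallestModelWith(
  registry: { [key: string]: ModelLike },
  capability: string,
  quantization: 'int4' | 'int8' = 'int8'
): ModelLike | undefined {
  return Object.values(registry)
    .filter((model) => model.capabilities.includes(capability))
    .sort(
      (a, b) => a.quantization[quantization].sizeMb - b.quantization[quantization].sizeMb
    )[0];
}
```

With the real registry this might be called as `smallestModelWith(await getRegistry(), capability, 'int4')`, where `capability` is whichever value the registry actually uses.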
## Type Definitions
### CactusLMParams
```typescript
interface CactusLMParams {
model?: string;
corpusDir?: string;
cacheIndex?: boolean;
options?: CactusModelOptions;
}
```
### CactusLMDownloadParams
```typescript
interface CactusLMDownloadParams {
onProgress?: (progress: number) => void;
}
```
### CactusLMMessage
```typescript
interface CactusLMMessage {
role: 'user' | 'assistant' | 'system';
content?: string;
images?: string[];
}
```
### CactusLMCompleteOptions
```typescript
interface CactusLMCompleteOptions {
temperature?: number;
topP?: number;
topK?: number;
maxTokens?: number;
stopSequences?: string[];
forceTools?: boolean;
telemetryEnabled?: boolean;
confidenceThreshold?: number;
toolRagTopK?: number;
includeStopSequences?: boolean;
useVad?: boolean;
enableThinking?: boolean;
}
```
### CactusLMTool
```typescript
interface CactusLMTool {
  name: string;
  description: string;
  parameters: {
    type: 'object';
    properties: {
      [key: string]: {
        type: string;
        description: string;
      };
    };
    required: string[];
  };
}
```
### CactusLMCompleteParams
```typescript
interface CactusLMCompleteParams {
messages: CactusLMMessage[];
options?: CactusLMCompleteOptions;
tools?: CactusLMTool[];
onToken?: (token: string) => void;
audio?: number[];
}
```
### CactusLMPrefillParams
```typescript
interface CactusLMPrefillParams {
messages: CactusLMMessage[];
options?: CactusLMCompleteOptions;
tools?: CactusLMTool[];
audio?: number[];
}
```
### CactusLMRagQueryParams
```typescript
interface CactusLMRagQueryParams {
query: string;
topK?: number;
}
```
### CactusLMRagQueryChunk
```typescript
interface CactusLMRagQueryChunk {
score: number;
source: string;
content: string;
}
```
### CactusLMRagQueryResult
```typescript
interface CactusLMRagQueryResult {
chunks: CactusLMRagQueryChunk[];
error?: string;
}
```
### CactusLMPrefillResult
```typescript
interface CactusLMPrefillResult {
success: boolean;
error: string | null;
prefillTokens: number;
prefillTps: number;
totalTimeMs: number;
ramUsageMb: number;
}
```
### CactusLMCompleteResult
```typescript
interface CactusLMCompleteResult {
  success: boolean;
  response: string;
  thinking?: string;
  functionCalls?: {
    name: string;
    arguments: { [key: string]: any };
  }[];
  cloudHandoff?: boolean;
  confidence?: number;
  timeToFirstTokenMs: number;
  totalTimeMs: number;
  prefillTokens: number;
  prefillTps: number;
  decodeTokens: number;
  decodeTps: number;
  totalTokens: number;
  ramUsageMb?: number;
}
```
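When tools are supplied, `functionCalls` carries the calls the model wants made; executing them is left to the app. The sketch below routes each call to a handler by name. `dispatchFunctionCalls`, `Handler`, and `DispatchResult` are illustrative names, not package exports, and reporting unknown tool names as unhandled (rather than throwing) is just one reasonable policy.

```typescript
// Shapes taken from the functionCalls field above; names are illustrative.
type FunctionCall = { name: string; arguments: { [key: string]: any } };
type Handler = (args: { [key: string]: any }) => unknown;
type DispatchResult = { name: string; handled: boolean; output: unknown };

// Runs the matching handler for each requested call. Calls without a
// registered handler are returned with handled: false.
function dispatchFunctionCalls(
  calls: FunctionCall[] | undefined,
  handlers: { [name: string]: Handler }
): DispatchResult[] {
  return (calls ?? []).map((call) => {
    const handler = handlers[call.name];
    return handler
      ? { name: call.name, handled: true, output: handler(call.arguments) }
      : { name: call.name, handled: false, output: undefined };
  });
}
```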
### CactusLMTokenizeParams
```typescript
interface CactusLMTokenizeParams {
text: string;
}
```
### CactusLMTokenizeResult
```typescript
interface CactusLMTokenizeResult {
tokens: number[];
}
```
### CactusLMScoreWindowParams
```typescript
interface CactusLMScoreWindowParams {
tokens: number[];
start: number;
end: number;
context: number;
}
```
### CactusLMScoreWindowResult
```typescript
interface CactusLMScoreWindowResult {
score: number;
}
```
### CactusLMEmbedParams
```typescript
interface CactusLMEmbedParams {
text: string;
normalize?: boolean;
}
```
### CactusLMEmbedResult
```typescript
interface CactusLMEmbedResult {
embedding: number[];
}
```
### CactusLMImageEmbedParams
```typescript
interface CactusLMImageEmbedParams {
imagePath: string;
}
```
### CactusLMImageEmbedResult
```typescript
interface CactusLMImageEmbedResult {
embedding: number[];
}
```
### CactusModel
```typescript
interface CactusModel {
  slug: string;
  capabilities: string[];
  quantization: {
    int4: {
      sizeMb: number;
      url: string;
      pro?: {
        apple: string;
      };
    };
    int8: {
      sizeMb: number;
      url: string;
      pro?: {
        apple: string;
      };
    };
  };
}
```
### CactusModelOptions
```typescript
interface CactusModelOptions {
quantization?: 'int4' | 'int8';
pro?: boolean;
}
```
### CactusSTTParams
```typescript
interface CactusSTTParams {
model?: string;
options?: CactusModelOptions;
}
```
### CactusSTTDownloadParams
```typescript
interface CactusSTTDownloadParams {
onProgress?: (progress: number) => void;
}
```
### CactusSTTTranscribeOptions
```typescript
interface CactusSTTTranscribeOptions {
temperature?: number;
topP?: number;
topK?: number;
maxTokens?: number;
stopSequences?: string[];
useVad?: boolean;
telemetryEnabled?: boolean;
confidenceThreshold?: number;
cloudHandoffThreshold?: number;
includeStopSequences?: boolean;
}
```
### CactusSTTTranscribeParams
```typescript
interface CactusSTTTranscribeParams {
audio: string | number[];
prompt?: string;
options?: CactusSTTTranscribeOptions;
onToken?: (token: string) => void;
}
```
### CactusSTTTranscribeResult
```typescript
interface CactusSTTTranscribeResult {
success: boolean;
response: string;
cloudHandoff?: boolean;
confidence?: number;
timeToFirstTokenMs: number;
totalTimeMs: number;
prefillTokens: number;
prefillTps: number;
decodeTokens: number;
decodeTps: number;
totalTokens: number;
ramUsageMb?: number;
}
```
### CactusSTTAudioEmbedParams
```typescript
interface CactusSTTAudioEmbedParams {
audioPath: string;
}
```
### CactusSTTAudioEmbedResult
```typescript
interface CactusSTTAudioEmbedResult {
embedding: number[];
}
```
### CactusSTTStreamTranscribeStartOptions
```typescript
interface CactusSTTStreamTranscribeStartOptions {
confirmationThreshold?: number;
minChunkSize?: number;
telemetryEnabled?: boolean;
language?: string;
}
```
### CactusSTTStreamTranscribeProcessParams
```typescript
interface CactusSTTStreamTranscribeProcessParams {
audio: number[];
}
```
### CactusSTTStreamTranscribeProcessResult
```typescript
interface CactusSTTStreamTranscribeProcessResult {
success: boolean;
confirmed: string;
pending: string;
bufferDurationMs?: number;
confidence?: number;
cloudHandoff?: boolean;
cloudResult?: string;
cloudJobId?: number;
cloudResultJobId?: number;
timeToFirstTokenMs?: number;
totalTimeMs?: number;
prefillTokens?: number;
prefillTps?: number;
decodeTokens?: number;
decodeTps?: number;
totalTokens?: number;
ramUsageMb?: number;
}
```
### CactusSTTStreamTranscribeStopResult
```typescript
interface CactusSTTStreamTranscribeStopResult {
success: boolean;
confirmed: string;
}
```
### CactusSTTDetectLanguageOptions
```typescript
interface CactusSTTDetectLanguageOptions {
useVad?: boolean;
}
```
### CactusSTTDetectLanguageParams
```typescript
interface CactusSTTDetectLanguageParams {
audio: string | number[];
options?: CactusSTTDetectLanguageOptions;
}
```
### CactusSTTDetectLanguageResult
```typescript
interface CactusSTTDetectLanguageResult {
language: string;
confidence?: number;
}
```
### CactusAudioParams
```typescript
interface CactusAudioParams {
model?: string;
options?: CactusModelOptions;
}
```
### CactusAudioDownloadParams
```typescript
interface CactusAudioDownloadParams {
onProgress?: (progress: number) => void;
}
```
### CactusAudioVADOptions
```typescript
interface CactusAudioVADOptions {
threshold?: number;
negThreshold?: number;
minSpeechDurationMs?: number;
maxSpeechDurationS?: number;
minSilenceDurationMs?: number;
speechPadMs?: number;
windowSizeSamples?: number;
samplingRate?: number;
minSilenceAtMaxSpeech?: number;
useMaxPossSilAtMaxSpeech?: boolean;
}
```
### CactusAudioVADSegment
```typescript
interface CactusAudioVADSegment {
start: number;
end: number;
}
```
### CactusAudioVADResult
```typescript
interface CactusAudioVADResult {
segments: CactusAudioVADSegment[];
totalTime: number;
ramUsage: number;
}
```
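A VAD result is often reduced to a single "how much speech" figure. The sketch below sums the segment lengths. The unit of `start` and `end` is not documented in this section, so the total is in whatever unit the segments use; `SegmentLike` and `totalSpeech` are illustrative names.

```typescript
// Same shape as CactusAudioVADSegment (illustrative type name).
interface SegmentLike {
  start: number;
  end: number;
}

// Sums (end - start) over all segments, ignoring any malformed segment
// whose end precedes its start. The result uses the segments' own unit.
function totalSpeech(segments: SegmentLike[]): number {
  return segments.reduce(
    (sum, segment) => sum + Math.max(0, segment.end - segment.start),
    0
  );
}
```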
### CactusAudioVADParams
```typescript
interface CactusAudioVADParams {
audio: string | number[];
options?: CactusAudioVADOptions;
}
```
### CactusAudioDiarizeOptions
```typescript
interface CactusAudioDiarizeOptions {
stepMs?: number;
threshold?: number;
numSpeakers?: number;
minSpeakers?: number;
maxSpeakers?: number;
}
```
### CactusAudioDiarizeParams
```typescript
interface CactusAudioDiarizeParams {
audio: string | number[];
options?: CactusAudioDiarizeOptions;
}
```
### CactusAudioDiarizeResult
```typescript
interface CactusAudioDiarizeResult {
success: boolean;
error: string | null;
numSpeakers: number;
scores: number[];
totalTimeMs: number;
ramUsageMb: number;
}
```
### CactusAudioEmbedSpeakerOptions
```typescript
interface CactusAudioEmbedSpeakerOptions {
stepMs?: number;
threshold?: number;
maskWeights?: number[];
maskNumFrames?: number;
}
```
### CactusAudioEmbedSpeakerParams
```typescript
interface CactusAudioEmbedSpeakerParams {
audio: string | number[];
options?: CactusAudioEmbedSpeakerOptions;
}
```
### CactusAudioEmbedSpeakerResult
```typescript
interface CactusAudioEmbedSpeakerResult {
success: boolean;
error: string | null;
embedding: number[];
totalTimeMs: number;
ramUsageMb: number;
}
```
### CactusIndexParams
```typescript
interface CactusIndexParams {
name: string;
embeddingDim: number;
}
```
### CactusIndexAddParams
```typescript
interface CactusIndexAddParams {
ids: number[];
documents: string[];
embeddings: number[][];
metadatas?: string[];
}
```
### CactusIndexGetParams
```typescript
interface CactusIndexGetParams {
ids: number[];
}
```
### CactusIndexGetResult
```typescript
interface CactusIndexGetResult {
documents: string[];
metadatas: string[];
embeddings: number[][];
}
```
### CactusIndexQueryOptions
```typescript
interface CactusIndexQueryOptions {
topK?: number;
scoreThreshold?: number;
}
```
### CactusIndexQueryParams
```typescript
interface CactusIndexQueryParams {
embeddings: number[][];
options?: CactusIndexQueryOptions;
}
```
### CactusIndexQueryResult
```typescript
interface CactusIndexQueryResult {
ids: number[][];
scores: number[][];
}
```
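`ids` and `scores` are both two-dimensional. The sketch below assumes each has one row per query embedding and that the rows are parallel (`ids[q][i]` belongs with `scores[q][i]`), which the types suggest but this section does not state. `QueryResultLike` and `pairResults` are illustrative names.

```typescript
// Same shape as CactusIndexQueryResult (illustrative type name).
interface QueryResultLike {
  ids: number[][];
  scores: number[][];
}

// Zips the ids and scores for one query into { id, score } pairs.
// Assumption: rows are parallel and ordered as returned by the index.
function pairResults(
  result: QueryResultLike,
  queryIndex = 0
): { id: number; score: number }[] {
  const ids = result.ids[queryIndex] ?? [];
  const scores = result.scores[queryIndex] ?? [];
  return ids.map((id, i) => ({ id, score: scores[i] ?? Number.NaN }));
}
```

The resulting ids can then be passed to `get()` to fetch the matching documents.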
### CactusIndexDeleteParams
```typescript
interface CactusIndexDeleteParams {
ids: number[];
}
```
## Performance Tips
- **Model Selection** - Choose smaller models for faster inference on mobile devices.
- **Memory Management** - Always call `destroy()` when you're done with models to free up resources.
- **VAD** - Use `useVad: true` (the default) when transcribing audio with silence, to strip non-speech regions and speed up transcription.
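One way to make sure `destroy()` always runs is to wrap model usage in a `try`/`finally` helper. The sketch below is written against a hypothetical `Destroyable` type so it works with any of the classes above; `withModel` is an illustrative name, and the `Promise<void>` return type of `destroy()` is an assumption.

```typescript
// Anything with an async destroy() method (illustrative type name).
interface Destroyable {
  destroy(): Promise<void>;
}

// Runs `use` with the model and always calls destroy() afterwards,
// whether `use` resolves or throws.
async function withModel<M extends Destroyable, R>(
  model: M,
  use: (model: M) => Promise<R>
): Promise<R> {
  try {
    return await use(model);
  } finally {
    await model.destroy();
  }
}
```

With the hooks this is unnecessary, since they call `destroy()` automatically on unmount.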
## Example App
Check out [our example app](/example) for a complete React Native implementation.