An open API service indexing awesome lists of open source software.

https://github.com/cactus-compute/cactus-react-native

Cactus React Native package: Run AI locally in your React Native apps
https://github.com/cactus-compute/cactus-react-native

ai apps cactus llamacpp llm llm-inference llms react react-native

Last synced: about 2 months ago
JSON representation

Cactus React Native package: Run AI locally in your React Native apps

Awesome Lists containing this project

README

          

![Cactus Logo](assets/logo.png)

## Resources

[![cactus](https://img.shields.io/badge/cactus-000000?logo=github&logoColor=white)](https://github.com/cactus-compute/cactus) [![HuggingFace](https://img.shields.io/badge/HuggingFace-FFD21E?logo=huggingface&logoColor=black)](https://huggingface.co/Cactus-Compute/models?sort=downloads) [![Discord](https://img.shields.io/badge/Discord-5865F2?logo=discord&logoColor=white)](https://discord.gg/bNurx3AXTJ) [![Documentation](https://img.shields.io/badge/Documentation-4285F4?logo=googledocs&logoColor=white)](https://cactuscompute.com/docs/react-native)

## Installation

```bash
npm install cactus-react-native react-native-nitro-modules
```

## Quick Start

Get started with Cactus in just a few lines of code:

```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';

// Create a new instance
const cactusLM = new CactusLM();

// Download the model
await cactusLM.download({
onProgress: (progress) => console.log(`Download: ${Math.round(progress * 100)}%`)
});

// Generate a completion
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What is the capital of France?' }
];

const result = await cactusLM.complete({ messages });
console.log(result.response); // "The capital of France is Paris."

// Clean up resources
await cactusLM.destroy();
```

**Using the React Hook:**

```tsx
import { useCactusLM } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM();

useEffect(() => {
// Download the model if not already available
if (!cactusLM.isDownloaded) {
cactusLM.download();
}
}, []);

const handleGenerate = () => {
// Generate a completion
cactusLM.complete({
messages: [{ role: 'user', content: 'Hello!' }],
});
};

if (cactusLM.isDownloading) {
return (

Downloading model: {Math.round(cactusLM.downloadProgress * 100)}%

);
}

return (
<>

{cactusLM.completion}
>
);
};
```

## Language Model

### Model Options

Choose model quantization and NPU acceleration with Pro models.

```typescript
import { CactusLM } from 'cactus-react-native';

// Use int8 for better accuracy (default)
const cactusLM = new CactusLM({
model: 'lfm2-vl-450m',
options: {
quantization: 'int8', // 'int4' or 'int8'
pro: false
}
});

// Use pro models for NPU acceleration
const cactusPro = new CactusLM({
model: 'lfm2-vl-450m',
options: {
quantization: 'int8',
pro: true
}
});
```

### Completion

Generate text responses from the model by providing a conversation history.

#### Class

```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';

const cactusLM = new CactusLM();

const messages: CactusLMMessage[] = [{ role: 'user', content: 'Hello, World!' }];
const onToken = (token: string) => { console.log('Token:', token) };

const result = await cactusLM.complete({ messages, onToken });
console.log('Completion result:', result);
```

#### Hook

```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM();

const handleComplete = async () => {
const messages: CactusLMMessage[] = [{ role: 'user', content: 'Hello, World!' }];

const result = await cactusLM.complete({ messages });
console.log('Completion result:', result);
};

return (
<>

{cactusLM.completion}
>
);
};
```

### Vision

Vision allows you to pass images along with text prompts, enabling the model to analyze and understand visual content.

#### Class

```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';

// Vision-capable model
const cactusLM = new CactusLM({ model: 'lfm2-vl-450m' });

const messages: CactusLMMessage[] = [
{
role: 'user',
content: "What's in the image?",
images: ['path/to/your/image'],
},
];

const result = await cactusLM.complete({ messages });
console.log('Response:', result.response);
```

#### Hook

```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';

const App = () => {
// Vision-capable model
const cactusLM = useCactusLM({ model: 'lfm2-vl-450m' });

const handleAnalyze = async () => {
const messages: CactusLMMessage[] = [
{
role: 'user',
content: "What's in the image?",
images: ['path/to/your/image'],
},
];

await cactusLM.complete({ messages });
};

return (
<>

{cactusLM.completion}
>
);
};
```

### Tool Calling

Enable the model to generate function calls by defining available tools and their parameters.

#### Class

```typescript
import { CactusLM, type CactusLMMessage, type CactusLMTool } from 'cactus-react-native';

const tools: CactusLMTool[] = [
{
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name',
},
},
required: ['location'],
},
},
];

const cactusLM = new CactusLM();

const messages: CactusLMMessage[] = [
{ role: 'user', content: "What's the weather in San Francisco?" },
];

const result = await cactusLM.complete({ messages, tools });
console.log('Response:', result.response);
console.log('Function calls:', result.functionCalls);
```

#### Hook

```tsx
import { useCactusLM, type CactusLMMessage, type CactusLMTool } from 'cactus-react-native';

const tools: CactusLMTool[] = [
{
name: 'get_weather',
description: 'Get current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name',
},
},
required: ['location'],
},
},
];

const App = () => {
const cactusLM = useCactusLM();

const handleComplete = async () => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: "What's the weather in San Francisco?" },
];

const result = await cactusLM.complete({ messages, tools });
console.log('Response:', result.response);
console.log('Function calls:', result.functionCalls);
};

return ;
};
```

### Audio Completion

Pass raw PCM audio alongside text prompts for multimodal completion with audio-capable models like Gemma 4.

#### Class

```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';

const cactusLM = new CactusLM({ model: 'gemma-4-e2b-it' });

const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What do you hear in this audio?' },
];

// Pass raw 16-bit PCM samples as a byte array
const pcmAudio: number[] = [/* raw PCM bytes */];
const result = await cactusLM.complete({ messages, audio: pcmAudio });
console.log(result.response);
```

#### Hook

```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM({ model: 'gemma-4-e2b-it' });

const handleAudioComplete = async (pcmAudio: number[]) => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'Describe this audio.' },
];

await cactusLM.complete({ messages, audio: pcmAudio });
};

return {cactusLM.completion};
};
```

### RAG (Retrieval Augmented Generation)

RAG allows you to provide a corpus of documents that the model can reference during generation, enabling it to answer questions based on your data.

#### Class

```typescript
import { CactusLM, type CactusLMMessage } from 'cactus-react-native';

const cactusLM = new CactusLM({
corpusDir: 'path/to/your/corpus', // Directory containing .txt files
});

const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What information is in the documents?' },
];

const result = await cactusLM.complete({ messages });
console.log(result.response);

// Or query the RAG index directly
const ragResult = await cactusLM.ragQuery({ query: 'search terms', topK: 5 });
console.log('Chunks:', ragResult.chunks);
// [{ score: 0.85, source: 'doc.txt', content: '...' }, ...]
```

#### Hook

```tsx
import { useCactusLM, type CactusLMMessage } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM({
corpusDir: 'path/to/your/corpus', // Directory containing .txt files
});

const handleAsk = async () => {
const messages: CactusLMMessage[] = [
{ role: 'user', content: 'What information is in the documents?' },
];

await cactusLM.complete({ messages });
};

const handleRagQuery = async () => {
const result = await cactusLM.ragQuery({ query: 'search terms', topK: 3 });
console.log('Chunks:', result.chunks);
};

return (
<>


{cactusLM.completion}
>
);
};
```

### Tokenization

Convert text into tokens using the model's tokenizer.

#### Class

```typescript
import { CactusLM } from 'cactus-react-native';

const cactusLM = new CactusLM();

const result = await cactusLM.tokenize({ text: 'Hello, World!' });
console.log('Token IDs:', result.tokens);
```

#### Hook

```tsx
import { useCactusLM } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM();

const handleTokenize = async () => {
const result = await cactusLM.tokenize({ text: 'Hello, World!' });
console.log('Token IDs:', result.tokens);
};

return ;
};
```

### Score Window

Calculate perplexity scores for a window of tokens within a sequence.

#### Class

```typescript
import { CactusLM } from 'cactus-react-native';

const cactusLM = new CactusLM();

const tokens = [123, 456, 789, 101, 112];
const result = await cactusLM.scoreWindow({
tokens,
start: 1,
end: 3,
context: 2
});
console.log('Score:', result.score);
```

#### Hook

```tsx
import { useCactusLM } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM();

const handleScoreWindow = async () => {
const tokens = [123, 456, 789, 101, 112];
const result = await cactusLM.scoreWindow({
tokens,
start: 1,
end: 3,
context: 2
});
console.log('Score:', result.score);
};

return ;
};
```

### Embedding

Convert text and images into numerical vector representations that capture semantic meaning, useful for similarity search and semantic understanding.

#### Text Embedding

##### Class

```typescript
import { CactusLM } from 'cactus-react-native';

const cactusLM = new CactusLM();

const result = await cactusLM.embed({ text: 'Hello, World!' });
console.log('Embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```

##### Hook

```tsx
import { useCactusLM } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM();

const handleEmbed = async () => {
const result = await cactusLM.embed({ text: 'Hello, World!' });
console.log('Embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};

return ;
};
```

#### Image Embedding

##### Class

```typescript
import { CactusLM } from 'cactus-react-native';

const cactusLM = new CactusLM({ model: 'lfm2-vl-450m' });

const result = await cactusLM.imageEmbed({ imagePath: 'path/to/your/image.jpg' });
console.log('Image embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```

##### Hook

```tsx
import { useCactusLM } from 'cactus-react-native';

const App = () => {
const cactusLM = useCactusLM({ model: 'lfm2-vl-450m' });

const handleImageEmbed = async () => {
const result = await cactusLM.imageEmbed({ imagePath: 'path/to/your/image.jpg' });
console.log('Image embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};

return ;
};
```

## Speech-to-Text (STT)

The `CactusSTT` class provides audio transcription and audio embedding capabilities using speech-to-text models such as Whisper and Moonshine.

### Transcription

Transcribe audio to text with streaming support. Accepts either a file path or raw PCM audio samples.

#### Class

```typescript
import { CactusSTT } from 'cactus-react-native';

const cactusSTT = new CactusSTT({ model: 'whisper-small' });

// Transcribe from file path
const result = await cactusSTT.transcribe({
audio: 'path/to/audio.wav',
onToken: (token) => console.log('Token:', token)
});

console.log('Transcription:', result.response);

// Or transcribe from raw PCM samples
const pcmSamples: number[] = [/* ... */];
const result2 = await cactusSTT.transcribe({
audio: pcmSamples,
onToken: (token) => console.log('Token:', token)
});

console.log('Transcription:', result2.response);
```

#### Hook

```tsx
import { useCactusSTT } from 'cactus-react-native';

const App = () => {
const cactusSTT = useCactusSTT({ model: 'whisper-small' });

const handleTranscribe = async () => {
// Transcribe from file path
const result = await cactusSTT.transcribe({
audio: 'path/to/audio.wav',
});
console.log('Transcription:', result.response);

const pcmSamples: number[] = [/* ... */];
const result2 = await cactusSTT.transcribe({
audio: pcmSamples,
});
console.log('Transcription:', result2.response);
};

return (
<>

{cactusSTT.transcription}
>
);
};
```

### Streaming Transcription

Transcribe audio in real-time with incremental results. Each call to `streamTranscribeProcess` feeds an audio chunk and returns the currently confirmed and pending text.

#### Class

```typescript
import { CactusSTT } from 'cactus-react-native';

const cactusSTT = new CactusSTT({ model: 'whisper-small' });

await cactusSTT.streamTranscribeStart({
confirmationThreshold: 0.99, // confidence required to confirm text
minChunkSize: 32000, // minimum samples before processing
});

const audioChunk: number[] = [/* PCM samples as bytes */];
const result = await cactusSTT.streamTranscribeProcess({ audio: audioChunk });

console.log('Confirmed:', result.confirmed);
console.log('Pending:', result.pending);

const final = await cactusSTT.streamTranscribeStop();
console.log('Final confirmed:', final.confirmed);
```

#### Hook

```tsx
import { useCactusSTT } from 'cactus-react-native';

const App = () => {
const cactusSTT = useCactusSTT({ model: 'whisper-small' });

const handleStart = async () => {
await cactusSTT.streamTranscribeStart({ confirmationThreshold: 0.99 });
};

const handleChunk = async (audioChunk: number[]) => {
const result = await cactusSTT.streamTranscribeProcess({ audio: audioChunk });
console.log('Confirmed:', result.confirmed);
console.log('Pending:', result.pending);
};

const handleStop = async () => {
const final = await cactusSTT.streamTranscribeStop();
console.log('Final:', final.confirmed);
};

return (
<>


{cactusSTT.streamTranscribeConfirmed}
{cactusSTT.streamTranscribePending}
>
);
};
```

### Audio Embedding

Generate embeddings from audio files for audio understanding.

#### Class

```typescript
import { CactusSTT } from 'cactus-react-native';

const cactusSTT = new CactusSTT();

const result = await cactusSTT.audioEmbed({
audioPath: 'path/to/audio.wav'
});

console.log('Audio embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
```

#### Hook

```tsx
import { useCactusSTT } from 'cactus-react-native';

const App = () => {
const cactusSTT = useCactusSTT();

const handleAudioEmbed = async () => {
const result = await cactusSTT.audioEmbed({
audioPath: 'path/to/audio.wav'
});
console.log('Audio embedding vector:', result.embedding);
console.log('Embedding vector length:', result.embedding.length);
};

return ;
};
```

### Language Detection

Detect the spoken language in an audio file. Only available on the class, not the hook.

```typescript
import { CactusSTT } from 'cactus-react-native';

const cactusSTT = new CactusSTT({ model: 'whisper-small' });

const result = await cactusSTT.detectLanguage({
audio: 'path/to/audio.wav',
options: { useVad: true },
});

console.log('Language:', result.language); // e.g. 'en'
console.log('Confidence:', result.confidence);
```

## Audio Processing

The `CactusAudio` class provides voice activity detection (VAD), speaker diarization, and speaker embedding extraction.

### Voice Activity Detection

```typescript
import { CactusAudio } from 'cactus-react-native';

const cactusAudio = new CactusAudio({ model: 'silero-vad' });

const result = await cactusAudio.vad({
audio: 'path/to/audio.wav',
options: {
threshold: 0.5,
minSpeechDurationMs: 250,
minSilenceDurationMs: 100,
}
});

console.log('Speech segments:', result.segments);
// [{ start: 0, end: 16000 }, { start: 32000, end: 48000 }, ...]
console.log('Total time (ms):', result.totalTime);
```

### Speaker Diarization

```typescript
import { CactusAudio } from 'cactus-react-native';

const cactusAudio = new CactusAudio({ model: 'silero-vad' });

const result = await cactusAudio.diarize({
audio: 'path/to/audio.wav',
options: {
numSpeakers: 2,
minSpeakers: 1,
maxSpeakers: 4,
}
});

console.log('Number of speakers:', result.numSpeakers);
console.log('Scores:', result.scores);
```

### Speaker Embedding

Extract a speaker embedding vector from audio, optionally with mask weights for speaker-specific segments from diarization.

```typescript
import { CactusAudio } from 'cactus-react-native';

const cactusAudio = new CactusAudio({ model: 'silero-vad' });

const result = await cactusAudio.embedSpeaker({
audio: 'path/to/audio.wav',
});

console.log('Speaker embedding:', result.embedding);

// With mask weights from diarization
const maskedResult = await cactusAudio.embedSpeaker({
audio: 'path/to/audio.wav',
options: {
maskWeights: [1.0, 1.0, 0.0, 0.0, 1.0], // per-frame weights
maskNumFrames: 5,
},
});
```

### Hook

```tsx
import { useCactusAudio } from 'cactus-react-native';

const App = () => {
const cactusAudio = useCactusAudio({ model: 'silero-vad' });

const handleVAD = async () => {
const result = await cactusAudio.vad({
audio: 'path/to/audio.wav',
});
console.log('Speech segments:', result.segments);
};

const handleDiarize = async () => {
const result = await cactusAudio.diarize({
audio: 'path/to/audio.wav',
});
console.log('Speakers:', result.numSpeakers);
};

return (
<>


>
);
};
```

## Vector Index

The `CactusIndex` class provides a vector database for storing and querying embeddings with metadata. Enabling similarity search and retrieval.

### Creating and Initializing an Index

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleInit = async () => {
await cactusIndex.init();
};

return
};
```

### Adding Documents

Add documents with their embeddings and metadata to the index.

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();

await cactusIndex.add({
ids: [1, 2, 3],
documents: ['First document', 'Second document', 'Third document'],
embeddings: [
[0.1, 0.2, ...],
[0.3, 0.4, ...],
[0.5, 0.6, ...]
],
metadatas: ['metadata1', 'metadata2', 'metadata3']
});
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleAdd = async () => {
await cactusIndex.add({
ids: [1, 2, 3],
documents: ['First document', 'Second document', 'Third document'],
embeddings: [
[0.1, 0.2, ...],
[0.3, 0.4, ...],
[0.5, 0.6, ...]
],
metadatas: ['metadata1', 'metadata2', 'metadata3']
});
};

return ;
};
```

### Querying the Index

Search for similar documents using embedding vectors.

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();

const result = await cactusIndex.query({
embeddings: [[0.1, 0.2, ...]],
options: {
topK: 5,
scoreThreshold: 0.7
}
});

console.log('IDs:', result.ids);
console.log('Scores:', result.scores);
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleQuery = async () => {
const result = await cactusIndex.query({
embeddings: [[0.1, 0.2, ...]],
options: {
topK: 5,
scoreThreshold: 0.7
}
});
console.log('IDs:', result.ids);
console.log('Scores:', result.scores);
};

return ;
};
```

### Retrieving Documents

Get documents by their IDs.

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();

const result = await cactusIndex.get({ ids: [1, 2, 3] });
console.log('Documents:', result.documents);
console.log('Metadatas:', result.metadatas);
console.log('Embeddings:', result.embeddings);
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleGet = async () => {
const result = await cactusIndex.get({ ids: [1, 2, 3] });
console.log('Documents:', result.documents);
console.log('Metadatas:', result.metadatas);
console.log('Embeddings:', result.embeddings);
};

return ;
};
```

### Deleting Documents

Mark documents as deleted by their IDs.

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();

await cactusIndex.delete({ ids: [1, 2, 3] });
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleDelete = async () => {
await cactusIndex.delete({ ids: [1, 2, 3] });
};

return ;
};
```

### Compacting the Index

Optimize the index by removing deleted documents and reorganizing data.

#### Class

```typescript
import { CactusIndex } from 'cactus-react-native';

const cactusIndex = new CactusIndex('my-index', 1024);
await cactusIndex.init();

await cactusIndex.compact();
```

#### Hook

```tsx
import { useCactusIndex } from 'cactus-react-native';

const App = () => {
const cactusIndex = useCactusIndex({
name: 'my-index',
embeddingDim: 1024
});

const handleCompact = async () => {
await cactusIndex.compact();
};

return ;
};
```

## API Reference

### CactusLM Class

#### Constructor

**`new CactusLM(params?: CactusLMParams)`**

**Parameters:**
- `model` - Model slug or absolute path to a model file (default: `'qwen3-0.6b'`).
- `corpusDir` - Directory containing text files for RAG (default: `undefined`).
- `cacheIndex` - Whether to cache the RAG corpus index on disk (default: `false`).
- `options` - Model options for quantization and NPU acceleration:
- `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
- `pro` - Enable NPU-accelerated models (default: `false`).

#### Methods

**`download(params?: CactusLMDownloadParams): Promise`**

Downloads the model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.

**Parameters:**
- `onProgress` - Callback for download progress (0-1).

**`init(): Promise`**

Initializes the model and prepares it for inference. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.

**`complete(params: CactusLMCompleteParams): Promise`**

Performs text completion with optional streaming and tool support. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.

**Parameters:**
- `messages` - Array of `CactusLMMessage` objects.
- `options` - Generation options:
- `temperature` - Sampling temperature.
- `topP` - Nucleus sampling threshold.
- `topK` - Top-K sampling limit.
- `maxTokens` - Maximum number of tokens to generate (default: `512`).
- `stopSequences` - Array of strings to stop generation.
- `forceTools` - Force the model to call one of the provided tools (default: `false`).
- `telemetryEnabled` - Enable telemetry for this request (default: `true`).
- `confidenceThreshold` - Confidence threshold below which cloud handoff is triggered (default: `0.7`).
- `toolRagTopK` - Number of tools to select via RAG when tool list is large (default: `2`).
- `includeStopSequences` - Whether to include stop sequences in the response (default: `false`).
- `useVad` - Whether to use VAD preprocessing (default: `true`).
- `enableThinking` - Whether to enable thinking/reasoning output if supported by the model (default: unset).
- `tools` - Array of `CactusLMTool` objects for function calling.
- `onToken` - Callback for streaming tokens.
- `audio` - Optional raw 16-bit PCM audio samples as a byte array for multimodal audio completion (e.g., Gemma 4).

**`prefill(params: CactusLMPrefillParams): Promise`**

Runs prompt prefill without generating any output tokens. Useful for measuring prefill performance or warming up the model's KV cache. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.

**Parameters:**
- `messages` - Array of `CactusLMMessage` objects.
- `options` - Same options as `complete`.
- `tools` - Array of `CactusLMTool` objects.
- `audio` - Optional raw 16-bit PCM audio samples as a byte array for multimodal audio prefill.

**`tokenize(params: CactusLMTokenizeParams): Promise`**

Converts text into tokens using the model's tokenizer.

**Parameters:**
- `text` - Text to tokenize.

**`scoreWindow(params: CactusLMScoreWindowParams): Promise`**

Calculates the log-probability score for a window of tokens within a sequence.

**Parameters:**
- `tokens` - Array of token IDs.
- `start` - Start index of the window.
- `end` - End index of the window.
- `context` - Number of context tokens before the window.

**`embed(params: CactusLMEmbedParams): Promise`**

Generates embeddings for the given text. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.

**Parameters:**
- `text` - Text to embed.
- `normalize` - Whether to normalize the embedding vector (default: `false`).

**`imageEmbed(params: CactusLMImageEmbedParams): Promise`**

Generates embeddings for the given image. Requires a vision-capable model. Automatically calls `init()` if not already initialized. Throws an error if a generation (completion or embedding) is already in progress.

**Parameters:**
- `imagePath` - Path to the image file.

**`ragQuery(params: CactusLMRagQueryParams): Promise`**

Queries the RAG corpus index directly, returning the top matching document chunks with scores. Requires the model to be initialized with a `corpusDir`. Automatically calls `init()` if not already initialized.

**Parameters:**
- `query` - Search query string.
- `topK` - Number of top results to return (default: `5`).

**`stop(): Promise`**

Stops ongoing generation.

**`reset(): Promise`**

Resets the model's internal state, clearing any cached context. Automatically calls `stop()` first.

**`destroy(): Promise`**

Releases all resources associated with the model. Automatically calls `stop()` first. Safe to call even if the model is not initialized.

**`getModels(): Promise`**

Returns available models.

**`getModelName(): string`**

Returns the computed model identifier including quantization and pro suffix (e.g., `'qwen3-0.6b-int8'`, `'lfm2-vl-450m-int4-pro'`).

### useCactusLM Hook

The `useCactusLM` hook manages a `CactusLM` instance with reactive state. When model parameters (`model`, `corpusDir`, `cacheIndex`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.

#### State

- `completion: string` - Current generated text. Automatically accumulated during streaming. Cleared before each new completion and when calling `reset()` or `destroy()`.
- `isGenerating: boolean` - Whether the model is currently running an operation. Shared by `complete`, `tokenize`, `scoreWindow`, `embed`, and `imageEmbed`.
- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.

#### Methods

- `download(params?: CactusLMDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model for inference. Sets `isInitializing` to `true` during initialization.
- `complete(params: CactusLMCompleteParams): Promise` - Generates text completions. Automatically accumulates tokens in the `completion` state during streaming. Sets `isGenerating` to `true` while generating. Clears `completion` before starting.
- `tokenize(params: CactusLMTokenizeParams): Promise` - Converts text into tokens. Sets `isGenerating` to `true` during operation.
- `scoreWindow(params: CactusLMScoreWindowParams): Promise` - Calculates log-probability scores for a window of tokens. Sets `isGenerating` to `true` during operation.
- `embed(params: CactusLMEmbedParams): Promise` - Generates embeddings for the given text. Sets `isGenerating` to `true` during operation.
- `imageEmbed(params: CactusLMImageEmbedParams): Promise` - Generates embeddings for the given image. Sets `isGenerating` to `true` while generating.
- `ragQuery(params: CactusLMRagQueryParams): Promise` - Queries the RAG corpus index directly. Sets `isGenerating` to `true` during operation.
- `stop(): Promise` - Stops ongoing generation. Clears any errors.
- `reset(): Promise` - Resets the model's internal state, clearing cached context. Also clears the `completion` state.
- `destroy(): Promise` - Releases all resources associated with the model. Clears the `completion` state. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available models.

### CactusSTT Class

#### Constructor

**`new CactusSTT(params?: CactusSTTParams)`**

**Parameters:**
- `model` - Model slug or absolute path to a model file (default: `'whisper-small'`).
- `options` - Model options for quantization and NPU acceleration:
- `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
- `pro` - Enable NPU-accelerated models (default: `false`).

#### Methods

**`download(params?: CactusSTTDownloadParams): Promise`**

Downloads the model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.

**Parameters:**
- `onProgress` - Callback for download progress (0-1).

**`init(): Promise`**

Initializes the model and prepares it for inference. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.

**`transcribe(params: CactusSTTTranscribeParams): Promise`**

Transcribes audio to text with optional streaming support. Accepts either a file path or raw PCM audio samples. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.

**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `prompt` - Optional prompt to guide transcription (default: `'<|startoftranscript|><|en|><|transcribe|><|notimestamps|>'`).
- `options` - Transcription options:
- `temperature` - Sampling temperature.
- `topP` - Nucleus sampling threshold.
- `topK` - Top-K sampling limit.
- `maxTokens` - Maximum number of tokens to generate (default: `384`).
- `stopSequences` - Array of strings to stop generation.
- `useVad` - Whether to apply VAD to strip silence before transcription (default: `true`).
- `telemetryEnabled` - Enable telemetry for this request (default: `true`).
- `confidenceThreshold` - Confidence threshold for quality assessment (default: `0.7`).
- `cloudHandoffThreshold` - Max entropy threshold above which cloud handoff is triggered.
- `includeStopSequences` - Whether to include stop sequences in the response (default: `false`).
- `onToken` - Callback for streaming tokens.

**`streamTranscribeStart(options?: CactusSTTStreamTranscribeStartOptions): Promise`**

Starts a streaming transcription session. Automatically calls `init()` if not already initialized. If a session is already active, returns immediately.

**Parameters:**
- `confirmationThreshold` - Fuzzy match ratio required to confirm a transcription segment (default: `0.99`).
- `minChunkSize` - Minimum number of audio samples before processing (default: `32000`).
- `telemetryEnabled` - Enable telemetry for this session (default: `true`).
- `language` - Language code for transcription (e.g., `'en'`, `'es'`, `'fr'`). If not set, language is auto-detected.

**`streamTranscribeProcess(params: CactusSTTStreamTranscribeProcessParams): Promise`**

Feeds audio samples into the streaming session and returns the current transcription state. Throws an error if no session is active.

**Parameters:**
- `audio` - PCM audio samples as a byte array.

**`streamTranscribeStop(): Promise`**

Stops the streaming session and returns the final confirmed transcription text. Throws an error if no session is active.

**`detectLanguage(params: CactusSTTDetectLanguageParams): Promise`**

Detects the spoken language in the given audio. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.

**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options`:
- `useVad` - Whether to apply VAD before detection (default: `true`).

**`audioEmbed(params: CactusSTTAudioEmbedParams): Promise`**

Generates embeddings for the given audio file. Automatically calls `init()` if not already initialized. Throws an error if a generation is already in progress.

**Parameters:**
- `audioPath` - Path to the audio file.

**`stop(): Promise`**

Stops ongoing transcription or embedding generation.

**`reset(): Promise`**

Resets the model's internal state. Automatically calls `stop()` first.

**`destroy(): Promise`**

Releases all resources associated with the model. Stops any active streaming session. Automatically calls `stop()` first. Safe to call even if the model is not initialized.

**`getModels(): Promise`**

Returns available speech-to-text models.

**`getModelName(): string`**

Returns the computed model identifier including quantization and pro suffix (e.g., `'whisper-small-int8'`).

### useCactusSTT Hook

The `useCactusSTT` hook manages a `CactusSTT` instance with reactive state. When model parameters (`model`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.

#### State

- `transcription: string` - Current transcription text. Automatically accumulated during streaming. Cleared before each new transcription and when calling `reset()` or `destroy()`.
- `streamTranscribeConfirmed: string` - Accumulated confirmed text from the active streaming session. Updated after each successful `streamTranscribeProcess` call and finalized by `streamTranscribeStop`.
- `streamTranscribePending: string` - Uncommitted (in-progress) text from the current audio chunk. Cleared when the session stops.
- `isGenerating: boolean` - Whether the model is currently transcribing or embedding. Both operations share this flag.
- `isStreamTranscribing: boolean` - Whether a streaming transcription session is currently active.
- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.

#### Methods

- `download(params?: CactusSTTDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model for inference. Sets `isInitializing` to `true` during initialization.
- `transcribe(params: CactusSTTTranscribeParams): Promise` - Transcribes audio to text. Automatically accumulates tokens in the `transcription` state during streaming. Sets `isGenerating` to `true` while generating. Clears `transcription` before starting.
- `audioEmbed(params: CactusSTTAudioEmbedParams): Promise` - Generates embeddings for the given audio. Sets `isGenerating` to `true` during operation.
- `streamTranscribeStart(options?: CactusSTTStreamTranscribeStartOptions): Promise` - Starts a streaming transcription session. If a session is already active, returns immediately. Clears `streamTranscribeConfirmed` and `streamTranscribePending` before starting. Sets `isStreamTranscribing` to `true`.
- `streamTranscribeProcess(params: CactusSTTStreamTranscribeProcessParams): Promise` - Feeds audio and returns incremental results. Appends confirmed text to `streamTranscribeConfirmed` and updates `streamTranscribePending`.
- `streamTranscribeStop(): Promise` - Stops the session and returns the final result. Sets `isStreamTranscribing` to `false`. Appends final confirmed text to `streamTranscribeConfirmed` and clears `streamTranscribePending`.
- `stop(): Promise` - Stops ongoing generation. Clears any errors.
- `reset(): Promise` - Resets the model's internal state. Also clears the `transcription` state.
- `destroy(): Promise` - Releases all resources associated with the model. Clears the `transcription`, `streamTranscribeConfirmed`, and `streamTranscribePending` state. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available speech-to-text models.

### CactusAudio Class

#### Constructor

**`new CactusAudio(params?: CactusAudioParams)`**

**Parameters:**
- `model` - Model slug or absolute path to an audio model file (default: `'silero-vad'`).
- `options` - Model options:
- `quantization` - Quantization type: `'int4'` | `'int8'` (default: `'int8'`).
- `pro` - Enable NPU-accelerated models (default: `false`).

#### Methods

**`download(params?: CactusAudioDownloadParams): Promise`**

Downloads the audio model. If the model is already downloaded, returns immediately with progress `1`. Throws an error if a download is already in progress.

**Parameters:**
- `onProgress` - Callback for download progress (0-1).

**`init(): Promise`**

Initializes the audio model. Safe to call multiple times (idempotent). Throws an error if the model is not downloaded yet.

**`vad(params: CactusAudioVADParams): Promise`**

Runs voice activity detection on the given audio. Automatically calls `init()` if not already initialized.

**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - VAD options:
- `threshold` - Speech probability threshold (default: model default).
- `negThreshold` - Silence probability threshold.
- `minSpeechDurationMs` - Minimum speech segment duration in ms.
- `maxSpeechDurationS` - Maximum speech segment duration in seconds.
- `minSilenceDurationMs` - Minimum silence duration before ending a segment.
- `speechPadMs` - Padding added to each speech segment in ms.
- `windowSizeSamples` - Processing window size in samples.
- `samplingRate` - Audio sampling rate.
- `minSilenceAtMaxSpeech` - Minimum silence at max speech duration.
- `useMaxPossSilAtMaxSpeech` - Whether to use maximum possible silence at max speech.

**`diarize(params: CactusAudioDiarizeParams): Promise`**

Runs speaker diarization on the given audio. Automatically calls `init()` if not already initialized.

**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - Diarize options:
- `stepMs` - Step size in milliseconds.
- `threshold` - Diarization threshold.
- `numSpeakers` - Expected number of speakers.
- `minSpeakers` - Minimum number of speakers.
- `maxSpeakers` - Maximum number of speakers.

**`embedSpeaker(params: CactusAudioEmbedSpeakerParams): Promise`**

Extracts a speaker embedding vector from the given audio. Automatically calls `init()` if not already initialized.

**Parameters:**
- `audio` - Path to the audio file or raw PCM samples as a byte array.
- `options` - Optional speaker embedding options:
- `stepMs` - Step size in milliseconds.
- `threshold` - Embedding threshold.
- `maskWeights` - Per-frame mask weights for speaker-specific embedding extraction (from diarization).
- `maskNumFrames` - Number of frames for the mask weights.

**`destroy(): Promise`**

Releases all resources associated with the model. Safe to call even if the model is not initialized.

**`getModels(): Promise`**

Returns available audio models.

**`getModelName(): string`**

Returns the computed model identifier including quantization and pro suffix (e.g., `'silero-vad-int8'`).

### useCactusAudio Hook

The `useCactusAudio` hook manages a `CactusAudio` instance with reactive state. When model parameters (`model`, `options`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.

#### State

- `isInitializing: boolean` - Whether the model is initializing.
- `isDownloaded: boolean` - Whether the model is downloaded locally. Automatically checked when the hook mounts or model changes.
- `isDownloading: boolean` - Whether the model is being downloaded.
- `downloadProgress: number` - Download progress (0-1). Reset to `0` after download completes.
- `error: string | null` - Last error message, or `null`.

#### Methods

- `download(params?: CactusAudioDownloadParams): Promise` - Downloads the model. Updates `isDownloading` and `downloadProgress` state during download. Sets `isDownloaded` to `true` on success.
- `init(): Promise` - Initializes the model.
- `vad(params: CactusAudioVADParams): Promise` - Runs voice activity detection.
- `diarize(params: CactusAudioDiarizeParams): Promise` - Runs speaker diarization.
- `embedSpeaker(params: CactusAudioEmbedSpeakerParams): Promise` - Extracts a speaker embedding.
- `destroy(): Promise` - Releases all resources. Automatically called when the component unmounts.
- `getModels(): Promise` - Returns available audio models.

### CactusIndex Class

#### Constructor

**`new CactusIndex(name: string, embeddingDim: number)`**

**Parameters:**
- `name` - Name of the index.
- `embeddingDim` - Dimension of the embedding vectors.

#### Methods

**`init(): Promise`**

Initializes the index and prepares it for operations. Must be called before using any other methods.

**`add(params: CactusIndexAddParams): Promise`**

Adds documents with their embeddings and metadata to the index.

**Parameters:**
- `ids` - Array of document IDs.
- `documents` - Array of document texts.
- `embeddings` - Array of embedding vectors (each vector must match `embeddingDim`).
- `metadatas` - Optional array of metadata strings.

**`query(params: CactusIndexQueryParams): Promise`**

Searches for similar documents using embedding vectors.

**Parameters:**
- `embeddings` - Array of query embedding vectors.
- `options` - Query options:
- `topK` - Number of top results to return (default: 10).
- `scoreThreshold` - Minimum similarity score threshold (default: -1.0).

**`get(params: CactusIndexGetParams): Promise`**

Retrieves documents by their IDs.

**Parameters:**
- `ids` - Array of document IDs to retrieve.

**`delete(params: CactusIndexDeleteParams): Promise`**

Deletes documents from the index by their IDs.

**Parameters:**
- `ids` - Array of document IDs to delete.

**`compact(): Promise`**

Optimizes the index by removing deleted documents and reorganizing data for better performance. Call after a series of deletions.

**`destroy(): Promise`**

Releases all resources associated with the index from memory.

### useCactusIndex Hook

The `useCactusIndex` hook manages a `CactusIndex` instance with reactive state. When index parameters (`name` or `embeddingDim`) change, the hook creates a new instance and resets all state. The hook automatically cleans up resources when the component unmounts.

#### State

- `isInitializing: boolean` - Whether the index is initializing.
- `isProcessing: boolean` - Whether the index is processing an operation (add, query, get, delete, or compact).
- `error: string | null` - Last error message from any operation, or `null` if there is no error. Cleared before starting new operations.

#### Methods

- `init(): Promise` - Initializes the index. Sets `isInitializing` to `true` during initialization.
- `add(params: CactusIndexAddParams): Promise` - Adds documents to the index. Sets `isProcessing` to `true` during operation.
- `query(params: CactusIndexQueryParams): Promise` - Searches for similar documents. Sets `isProcessing` to `true` during operation.
- `get(params: CactusIndexGetParams): Promise` - Retrieves documents by IDs. Sets `isProcessing` to `true` during operation.
- `delete(params: CactusIndexDeleteParams): Promise` - Deletes documents. Sets `isProcessing` to `true` during operation.
- `compact(): Promise` - Optimizes the index. Sets `isProcessing` to `true` during operation.
- `destroy(): Promise` - Releases all resources. Automatically called when the component unmounts.

### getRegistry

**`getRegistry(): Promise<{ [key: string]: CactusModel }>`**

Returns all available models from HuggingFace, keyed by model slug. Result is cached across calls.

```typescript
import { getRegistry } from 'cactus-react-native';

const registry = await getRegistry();
const model = registry['qwen3-0.6b'];
console.log(model);
```

## Type Definitions

### CactusLMParams

```typescript
interface CactusLMParams {
model?: string;
corpusDir?: string;
cacheIndex?: boolean;
options?: CactusModelOptions;
}
```

### CactusLMDownloadParams

```typescript
interface CactusLMDownloadParams {
onProgress?: (progress: number) => void;
}
```

### CactusLMMessage

```typescript
interface CactusLMMessage {
role: 'user' | 'assistant' | 'system';
content?: string;
images?: string[];
}
```

### CactusLMCompleteOptions

```typescript
interface CactusLMCompleteOptions {
temperature?: number;
topP?: number;
topK?: number;
maxTokens?: number;
stopSequences?: string[];
forceTools?: boolean;
telemetryEnabled?: boolean;
confidenceThreshold?: number;
toolRagTopK?: number;
includeStopSequences?: boolean;
useVad?: boolean;
enableThinking?: boolean;
}
```

### CactusLMTool

```typescript
interface CactusLMTool {
name: string;
description: string;
parameters: {
type: 'object';
properties: {
[key: string]: {
type: string;
description: string;
};
};
required: string[];
};
}
```

### CactusLMCompleteParams

```typescript
interface CactusLMCompleteParams {
messages: CactusLMMessage[];
options?: CactusLMCompleteOptions;
tools?: CactusLMTool[];
onToken?: (token: string) => void;
audio?: number[];
}
```

### CactusLMPrefillParams

```typescript
interface CactusLMPrefillParams {
messages: CactusLMMessage[];
options?: CactusLMCompleteOptions;
tools?: CactusLMTool[];
audio?: number[];
}
```

### CactusLMRagQueryParams

```typescript
interface CactusLMRagQueryParams {
query: string;
topK?: number;
}
```

### CactusLMRagQueryChunk

```typescript
interface CactusLMRagQueryChunk {
score: number;
source: string;
content: string;
}
```

### CactusLMRagQueryResult

```typescript
interface CactusLMRagQueryResult {
chunks: CactusLMRagQueryChunk[];
error?: string;
}
```

### CactusLMPrefillResult

```typescript
interface CactusLMPrefillResult {
success: boolean;
error: string | null;
prefillTokens: number;
prefillTps: number;
totalTimeMs: number;
ramUsageMb: number;
}
```

### CactusLMCompleteResult

```typescript
interface CactusLMCompleteResult {
success: boolean;
response: string;
thinking?: string;
functionCalls?: {
name: string;
arguments: { [key: string]: any };
}[];
cloudHandoff?: boolean;
confidence?: number;
timeToFirstTokenMs: number;
totalTimeMs: number;
prefillTokens: number;
prefillTps: number;
decodeTokens: number;
decodeTps: number;
totalTokens: number;
ramUsageMb?: number;
}
```

### CactusLMTokenizeParams

```typescript
interface CactusLMTokenizeParams {
text: string;
}
```

### CactusLMTokenizeResult

```typescript
interface CactusLMTokenizeResult {
tokens: number[];
}
```

### CactusLMScoreWindowParams

```typescript
interface CactusLMScoreWindowParams {
tokens: number[];
start: number;
end: number;
context: number;
}
```

### CactusLMScoreWindowResult

```typescript
interface CactusLMScoreWindowResult {
score: number;
}
```

### CactusLMEmbedParams

```typescript
interface CactusLMEmbedParams {
text: string;
normalize?: boolean;
}
```

### CactusLMEmbedResult

```typescript
interface CactusLMEmbedResult {
embedding: number[];
}
```

### CactusLMImageEmbedParams

```typescript
interface CactusLMImageEmbedParams {
imagePath: string;
}
```

### CactusLMImageEmbedResult

```typescript
interface CactusLMImageEmbedResult {
embedding: number[];
}
```

### CactusModel

```typescript
interface CactusModel {
slug: string;
capabilities: string[];
quantization: {
int4: {
sizeMb: number;
url: string;
pro?: {
apple: string;
};
};
int8: {
sizeMb: number;
url: string;
pro?: {
apple: string;
};
};
};
}
```

### CactusModelOptions

```typescript
interface CactusModelOptions {
quantization?: 'int4' | 'int8';
pro?: boolean;
}
```

### CactusSTTParams

```typescript
interface CactusSTTParams {
model?: string;
options?: CactusModelOptions;
}
```

### CactusSTTDownloadParams

```typescript
interface CactusSTTDownloadParams {
onProgress?: (progress: number) => void;
}
```

### CactusSTTTranscribeOptions

```typescript
interface CactusSTTTranscribeOptions {
temperature?: number;
topP?: number;
topK?: number;
maxTokens?: number;
stopSequences?: string[];
useVad?: boolean;
telemetryEnabled?: boolean;
confidenceThreshold?: number;
cloudHandoffThreshold?: number;
includeStopSequences?: boolean;
}
```

### CactusSTTTranscribeParams

```typescript
interface CactusSTTTranscribeParams {
audio: string | number[];
prompt?: string;
options?: CactusSTTTranscribeOptions;
onToken?: (token: string) => void;
}
```

### CactusSTTTranscribeResult

```typescript
interface CactusSTTTranscribeResult {
success: boolean;
response: string;
cloudHandoff?: boolean;
confidence?: number;
timeToFirstTokenMs: number;
totalTimeMs: number;
prefillTokens: number;
prefillTps: number;
decodeTokens: number;
decodeTps: number;
totalTokens: number;
ramUsageMb?: number;
}
```

### CactusSTTAudioEmbedParams

```typescript
interface CactusSTTAudioEmbedParams {
audioPath: string;
}
```

### CactusSTTAudioEmbedResult

```typescript
interface CactusSTTAudioEmbedResult {
embedding: number[];
}
```

### CactusSTTStreamTranscribeStartOptions

```typescript
interface CactusSTTStreamTranscribeStartOptions {
confirmationThreshold?: number;
minChunkSize?: number;
telemetryEnabled?: boolean;
language?: string;
}
```

### CactusSTTStreamTranscribeProcessParams

```typescript
interface CactusSTTStreamTranscribeProcessParams {
audio: number[];
}
```

### CactusSTTStreamTranscribeProcessResult

```typescript
interface CactusSTTStreamTranscribeProcessResult {
success: boolean;
confirmed: string;
pending: string;
bufferDurationMs?: number;
confidence?: number;
cloudHandoff?: boolean;
cloudResult?: string;
cloudJobId?: number;
cloudResultJobId?: number;
timeToFirstTokenMs?: number;
totalTimeMs?: number;
prefillTokens?: number;
prefillTps?: number;
decodeTokens?: number;
decodeTps?: number;
totalTokens?: number;
ramUsageMb?: number;
}
```

### CactusSTTStreamTranscribeStopResult

```typescript
interface CactusSTTStreamTranscribeStopResult {
success: boolean;
confirmed: string;
}
```

### CactusSTTDetectLanguageOptions

```typescript
interface CactusSTTDetectLanguageOptions {
useVad?: boolean;
}
```

### CactusSTTDetectLanguageParams

```typescript
interface CactusSTTDetectLanguageParams {
audio: string | number[];
options?: CactusSTTDetectLanguageOptions;
}
```

### CactusSTTDetectLanguageResult

```typescript
interface CactusSTTDetectLanguageResult {
language: string;
confidence?: number;
}
```

### CactusAudioParams

```typescript
interface CactusAudioParams {
model?: string;
options?: CactusModelOptions;
}
```

### CactusAudioDownloadParams

```typescript
interface CactusAudioDownloadParams {
onProgress?: (progress: number) => void;
}
```

### CactusAudioVADOptions

```typescript
interface CactusAudioVADOptions {
threshold?: number;
negThreshold?: number;
minSpeechDurationMs?: number;
maxSpeechDurationS?: number;
minSilenceDurationMs?: number;
speechPadMs?: number;
windowSizeSamples?: number;
samplingRate?: number;
minSilenceAtMaxSpeech?: number;
useMaxPossSilAtMaxSpeech?: boolean;
}
```

### CactusAudioVADSegment

```typescript
interface CactusAudioVADSegment {
start: number;
end: number;
}
```

### CactusAudioVADResult

```typescript
interface CactusAudioVADResult {
segments: CactusAudioVADSegment[];
totalTime: number;
ramUsage: number;
}
```

### CactusAudioVADParams

```typescript
interface CactusAudioVADParams {
audio: string | number[];
options?: CactusAudioVADOptions;
}
```

### CactusAudioDiarizeOptions

```typescript
interface CactusAudioDiarizeOptions {
stepMs?: number;
threshold?: number;
numSpeakers?: number;
minSpeakers?: number;
maxSpeakers?: number;
}
```

### CactusAudioDiarizeParams

```typescript
interface CactusAudioDiarizeParams {
audio: string | number[];
options?: CactusAudioDiarizeOptions;
}
```

### CactusAudioDiarizeResult

```typescript
interface CactusAudioDiarizeResult {
success: boolean;
error: string | null;
numSpeakers: number;
scores: number[];
totalTimeMs: number;
ramUsageMb: number;
}
```

### CactusAudioEmbedSpeakerOptions

```typescript
interface CactusAudioEmbedSpeakerOptions {
stepMs?: number;
threshold?: number;
maskWeights?: number[];
maskNumFrames?: number;
}
```

### CactusAudioEmbedSpeakerParams

```typescript
interface CactusAudioEmbedSpeakerParams {
audio: string | number[];
options?: CactusAudioEmbedSpeakerOptions;
}
```

### CactusAudioEmbedSpeakerResult

```typescript
interface CactusAudioEmbedSpeakerResult {
success: boolean;
error: string | null;
embedding: number[];
totalTimeMs: number;
ramUsageMb: number;
}
```

### CactusIndexParams

```typescript
interface CactusIndexParams {
name: string;
embeddingDim: number;
}
```

### CactusIndexAddParams

```typescript
interface CactusIndexAddParams {
ids: number[];
documents: string[];
embeddings: number[][];
metadatas?: string[];
}
```

### CactusIndexGetParams

```typescript
interface CactusIndexGetParams {
ids: number[];
}
```

### CactusIndexGetResult

```typescript
interface CactusIndexGetResult {
documents: string[];
metadatas: string[];
embeddings: number[][];
}
```

### CactusIndexQueryOptions

```typescript
interface CactusIndexQueryOptions {
topK?: number;
scoreThreshold?: number;
}
```

### CactusIndexQueryParams

```typescript
interface CactusIndexQueryParams {
embeddings: number[][];
options?: CactusIndexQueryOptions;
}
```

### CactusIndexQueryResult

```typescript
interface CactusIndexQueryResult {
ids: number[][];
scores: number[][];
}
```

### CactusIndexDeleteParams

```typescript
interface CactusIndexDeleteParams {
ids: number[];
}
```

## Performance Tips

- **Model Selection** - Choose smaller models for faster inference on mobile devices.
- **Memory Management** - Always call `destroy()` when you're done with models to free up resources.
- **VAD** - Use `useVad: true` (the default) when transcribing audio with silence, to strip non-speech regions and speed up transcription.

## Example App

Check out [our example app](/example) for a complete React Native implementation.