https://github.com/osteele/speech-provider

A unified TypeScript interface for browser speech synthesis and Eleven Labs TTS voices

Host: GitHub
URL: https://github.com/osteele/speech-provider
Owner: osteele
License: mit
Created: 2025-03-27T19:46:10.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-11-08T07:22:59.000Z (8 months ago)
Last Synced: 2025-11-08T08:25:26.554Z (8 months ago)
Topics: browser, eleven-labs, speech-synthesis, text-to-speech, tts, typescript, voice
Language: TypeScript
Homepage: https://osteele.github.io/speech-provider/
Size: 95.7 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# speech-provider

A unified interface for browser speech synthesis and Eleven Labs voices.

## Installation

```bash
# Using npm
npm install speech-provider

# Using yarn
yarn add speech-provider

# Using bun
bun add speech-provider
```

## Documentation

Full API documentation is available at [https://osteele.github.io/speech-provider/](https://osteele.github.io/speech-provider/).

## Usage

```typescript
import { getVoiceProvider } from 'speech-provider';

// Create a provider in ONE of the following ways.

// Use browser voices only
const provider = getVoiceProvider({});

// Use Eleven Labs voices if API key is available
// const provider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Use Eleven Labs with custom cache duration
// const provider = getVoiceProvider({
//   elevenLabsApiKey: 'your-api-key',
//   cacheMaxAge: 86400 // Cache for 1 day
// });

// Get voices for a specific language
const voices = await provider.getVoices({ lang: 'en-US', minVoices: 1 });

// Get default voice for a language
const defaultVoice = await provider.getDefaultVoice({ lang: 'en-US' });

// Create and play an utterance
if (defaultVoice) {
  const utterance = defaultVoice.createUtterance('Hello, world!');
  utterance.onstart = () => console.log('Started speaking');
  utterance.onend = () => console.log('Finished speaking');
  utterance.start();
}
```

## Features

- Unified interface for both browser speech synthesis and Eleven Labs voices
- Automatic fallback to browser voices when Eleven Labs API key is not provided
- Type-safe API with TypeScript support
- Simple voice selection by language
- Event listeners for speech start and end events
- Automatic caching of Eleven Labs API responses to reduce API calls
- Configurable cache duration for Eleven Labs responses

## Used In

This package is used in [Mandarin Sentence Practice](https://mandarin-sentence-practice.osteele.com), a web application for practicing Mandarin Chinese with listening and translation exercises. The app uses this package to provide high-quality text-to-speech for Mandarin sentences, with automatic fallback to browser voices when Eleven Labs is not available.

## API

### `getVoiceProvider(options)`

Creates a voice provider based on the available API keys. Falls back to browser speech synthesis if no API keys are provided.

```typescript
function getVoiceProvider(options: {
  elevenLabsApiKey?: string | null;
  cacheMaxAge?: number; // Cache duration in seconds (default: 1 hour)
}): VoiceProvider;
```
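
The fallback rule can be pictured as a single decision on the API key. The following is only a sketch of that rule, not the library's code: `chooseProviderKind` and `ProviderKind` are hypothetical names, and treating a blank string the same as a missing key is an assumption.

```typescript
// Hypothetical illustration of the fallback decision; not part of speech-provider.
type ProviderKind = 'eleven-labs' | 'browser';

function chooseProviderKind(options: {
  elevenLabsApiKey?: string | null;
}): ProviderKind {
  // Assumption: null, undefined, and blank keys all mean "no key available".
  const key = options.elevenLabsApiKey?.trim();
  return key ? 'eleven-labs' : 'browser';
}
```

In practice this means an application can pass whatever its configuration yields, including `null`, and still get a usable provider.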

### `createElevenLabsVoiceProvider(apiKey, options?)`

Creates an Eleven Labs voice provider with optional configuration.

```typescript
function createElevenLabsVoiceProvider(
  apiKey: string,
  options?: {
    validateResponses?: boolean;
    printVoiceProperties?: boolean;
    cacheMaxAge?: number; // Cache duration in seconds (default: 1 hour)
  }
): VoiceProvider;
```

### Caching

The library implements automatic caching for Eleven Labs API responses:

- Browser voices are cached automatically by the browser's speech synthesis engine
- Eleven Labs responses are cached using IndexedDB with a default duration of 1 hour
- Cache duration can be configured when creating the provider
- Cached responses are automatically invalidated after the specified duration
- Cache can be disabled by setting `cacheMaxAge: null` in the provider options

Examples of cache configuration:

```typescript
import { getVoiceProvider } from 'speech-provider';

// Use default 1-hour cache
const defaultCacheProvider = getVoiceProvider({ elevenLabsApiKey: 'your-api-key' });

// Cache for 1 day
const dailyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 86400 // 24 hours in seconds
});

// Cache for 1 week
const weeklyCacheProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 604800 // 7 days in seconds
});

// Disable caching (preferred approach)
const uncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: null
});

// Alternative way to disable caching
const alsoUncachedProvider = getVoiceProvider({
  elevenLabsApiKey: 'your-api-key',
  cacheMaxAge: 0
});
```
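
The expiry behaviour described in this section amounts to an age check against `cacheMaxAge`. The sketch below illustrates that rule only; it is not the library's IndexedDB code, and `isCacheEntryFresh` and `DEFAULT_MAX_AGE_SECONDS` are hypothetical names chosen for the example.

```typescript
// Hypothetical illustration of the expiry rule; not part of speech-provider.
const DEFAULT_MAX_AGE_SECONDS = 3600; // the documented 1-hour default

function isCacheEntryFresh(
  storedAtMs: number, // when the entry was written, in epoch milliseconds
  nowMs: number, // current time, in epoch milliseconds
  cacheMaxAge?: number | null // seconds; null or 0 disables caching
): boolean {
  // With caching disabled, no stored entry is ever reused.
  if (cacheMaxAge === null || cacheMaxAge === 0) return false;
  const maxAgeSeconds = cacheMaxAge ?? DEFAULT_MAX_AGE_SECONDS;
  return nowMs - storedAtMs < maxAgeSeconds * 1000;
}
```

Note the unit conversion: `cacheMaxAge` is given in seconds, while JavaScript timestamps such as `Date.now()` are in milliseconds.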

### `VoiceProvider` Interface

```typescript
interface VoiceProvider {
  name: string;
  getVoices({ lang, minVoices }: { lang: string; minVoices: number }): Promise<Voice[]>;
  getDefaultVoice({ lang }: { lang: string }): Promise<Voice | null>;
}
```

### `Voice` Interface

```typescript
interface Voice {
  name: string;
  id: string;
  lang: string;
  provider: VoiceProvider;
  description: string | null;
  createUtterance(text: string): Utterance;
}
```
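
As an illustration of how these fields might be used, the sketch below builds a display label for a voice picker. `VoiceSummary` and `formatVoiceLabel` are hypothetical names, not part of the library; `VoiceSummary` simply mirrors the subset of `Voice` fields that the label needs.

```typescript
// Hypothetical helper; mirrors only the Voice fields used for display.
interface VoiceSummary {
  name: string;
  lang: string;
  description: string | null;
  provider: { name: string };
}

function formatVoiceLabel(voice: VoiceSummary): string {
  const base = `${voice.name} (${voice.lang}, ${voice.provider.name})`;
  // description may be null, so append it only when present.
  return voice.description ? `${base}: ${voice.description}` : base;
}
```

Because `description` is typed as `string | null`, any code that displays it has to handle the missing case explicitly, as the helper does.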

### `Utterance` Interface

```typescript
interface Utterance {
  start(): void;
  stop(): void;
  set onstart(callback: () => void);
  set onend(callback: () => void);
}
```
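
Since an utterance reports completion through `onend`, it can be wrapped in a promise when an application needs to `await` the end of speech. The following is a minimal sketch under that assumption; `speak`, `UtteranceLike`, and `VoiceLike` are hypothetical names, with the two interfaces standing in for the `Utterance` and `Voice` members the helper touches.

```typescript
// Hypothetical helper; the *Like interfaces mirror only the members used here.
interface UtteranceLike {
  start(): void;
  stop(): void;
  onstart: () => void;
  onend: () => void;
}

interface VoiceLike {
  createUtterance(text: string): UtteranceLike;
}

// Starts speaking and resolves once the utterance reports that it has ended.
function speak(voice: VoiceLike, text: string): Promise<void> {
  return new Promise((resolve) => {
    const utterance = voice.createUtterance(text);
    utterance.onend = () => resolve();
    utterance.start();
  });
}
```

With this in place, sentences could be played one after another with `await speak(voice, first); await speak(voice, second);`. The sketch does not handle errors or an early `stop()`; whether `onend` fires in those cases is not stated here.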

## Browser Compatibility

### Browser Speech Synthesis

The browser speech synthesis provider (`BrowserVoiceProvider`) is supported in all modern browsers:

- **Chrome/Edge**: Full support (voices load asynchronously)
- **Firefox**: Full support
- **Safari**: Full support (iOS and macOS)
- **Opera**: Full support

**Note**: Voice availability and quality vary by browser and operating system. Chrome and Edge typically offer the best selection of voices.
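
The asynchronous voice loading noted for Chrome/Edge is a property of the underlying Web Speech API: `speechSynthesis.getVoices()` can return an empty list until the `voiceschanged` event fires. For code that calls the Web Speech API directly, a common pattern is sketched below. `waitForVoices` and `SynthLike` are hypothetical names, and this is not a description of how the library handles the issue internally.

```typescript
// Minimal stand-in for the parts of window.speechSynthesis used here.
interface SynthLike<V> {
  getVoices(): V[];
  addEventListener(type: 'voiceschanged', listener: () => void): void;
  removeEventListener(type: 'voiceschanged', listener: () => void): void;
}

// Resolves with the voice list, waiting for 'voiceschanged' if it is empty.
function waitForVoices<V>(synth: SynthLike<V>): Promise<V[]> {
  const available = synth.getVoices();
  if (available.length > 0) return Promise.resolve(available);
  return new Promise((resolve) => {
    const onChange = () => {
      synth.removeEventListener('voiceschanged', onChange);
      resolve(synth.getVoices());
    };
    synth.addEventListener('voiceschanged', onChange);
  });
}
```

In a browser this would be called as `waitForVoices(window.speechSynthesis)`. The promise never settles if the event never fires, so production code would likely add a timeout.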

### ElevenLabs Provider

The ElevenLabs provider (`ElevenLabsVoiceProvider`) requires:

- **IndexedDB**: For caching API responses (supported in all modern browsers)
- **Fetch API**: For making API requests (supported in all modern browsers)
- **Audio API**: For playing synthesized speech (supported in all modern browsers)

### Minimum Requirements

- Modern browser with ES2022 support
- IndexedDB support (for ElevenLabs caching)
- No Internet Explorer support

### Server-Side Rendering (SSR)

The library is designed for client-side use. When used in SSR environments:

- Browser voice provider gracefully handles the absence of `window.speechSynthesis`
- Returns empty arrays when browser APIs are unavailable
- Safe to import in SSR frameworks (Next.js, Nuxt, etc.) but should only be used client-side
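
Application code that renders on both server and client may still want its own check, for example to hide a "play" button when speech is unavailable. The sketch below shows one way to write such a feature test; `supportsBrowserSpeech` is a hypothetical helper, not part of the library.

```typescript
// Hypothetical feature check for app code; not part of speech-provider.
// `scope` defaults to the global object: `window` in a browser, and the
// Node.js global object during server-side rendering.
function supportsBrowserSpeech(scope: object = globalThis): boolean {
  return 'speechSynthesis' in scope && 'SpeechSynthesisUtterance' in scope;
}
```

Taking the scope as a parameter keeps the check free of direct `window` references, so the module can be imported on the server without throwing.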

## Contributing

Contributions are welcome! Please read the [CONTRIBUTING.md](CONTRIBUTING.md) guide for details on our code of conduct and the process for submitting pull requests.

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for a list of changes and version history.

## License

Copyright 2025 by Oliver Steele

Available under the MIT License
