https://github.com/ericc-ch/edge-tts

Use Microsoft Edge's online text-to-speech service from JS code directly!

reverse-engineering tts

Host: GitHub
URL: https://github.com/ericc-ch/edge-tts
Owner: ericc-ch
License: mpl-2.0
Created: 2024-08-23T09:04:41.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-07T14:00:01.000Z (about 1 year ago)
Last Synced: 2025-04-09T09:51:43.028Z (12 months ago)
Topics: reverse-engineering, tts
Language: TypeScript
Homepage: https://npm.im/@echristian/edge-tts
Size: 231 KB
Stars: 12
Watchers: 1
Forks: 4
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# Edge TTS

> A TypeScript library for generating speech using Microsoft Edge's text-to-speech API

Generate speech from text using Microsoft Edge's text-to-speech service. This library provides access to Edge's TTS capabilities with subtitle generation support and voice customization options.

## Installation

```bash
npm install @echristian/edge-tts
```

## CLI Usage

```bash
# List all available voices grouped by locale
npx @echristian/edge-tts voices

# Generate audio from text
npx @echristian/edge-tts synthesize "Hello world" --audio output.mp3 --voice en-US-AvaNeural

# Generate audio with subtitles
npx @echristian/edge-tts synthesize "Hello world" --audio output.mp3 --subtitle output.srt --voice en-US-AvaNeural
```

## API Usage

```typescript
import { synthesize, synthesizeStream, getVoices } from "@echristian/edge-tts";

// Get available voices
const voices = await getVoices();
console.log(voices); // Array of available voice options

// Basic usage with synthesize()
const { audio, subtitle } = await synthesize({
  text: "Hello, world!",
});

// Stream processing usage
const generator = synthesizeStream({ text: "Hello world" });
for await (const chunk of generator) {
  // chunk is a Uint8Array of raw audio data
  // Process or save each chunk as needed
}

// Collecting all streamed chunks
const chunks: Uint8Array[] = [];
for await (const chunk of synthesizeStream({ text: "Hello world" })) {
  chunks.push(chunk);
}
```
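Once the chunks are collected, they usually need to be merged into one contiguous buffer before being saved or played. The following is a minimal sketch of that step; `concatChunks` is a hypothetical helper written for illustration and is not part of the library.

```typescript
// Hypothetical helper (not exported by @echristian/edge-tts):
// merges an array of Uint8Array chunks into a single Uint8Array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // Compute the total size up front so only one allocation is needed.
  const totalLength = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const merged = new Uint8Array(totalLength);

  // Copy each chunk into place, advancing the write offset as we go.
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return merged;
}
```

With the `chunks` array from the example above, `concatChunks(chunks)` would give the full audio payload as one `Uint8Array`.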

## API

### `getVoices(): Promise<Array<Voice>>`

Returns an array of available voices with their properties.

#### Voice Object

| Property     | Type   | Description                    |
| ------------ | ------ | ------------------------------ |
| Name         | string | Full name of the voice         |
| ShortName    | string | Short identifier for the voice |
| Gender       | string | Voice gender (Male/Female)     |
| Locale       | string | Language code and region       |
| FriendlyName | string | Display name for the voice     |
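As an illustration of working with these properties, the sketch below groups voices by their `Locale`, similar in spirit to what the `voices` CLI command prints. The `Voice` interface simply mirrors the table above, and `groupVoicesByLocale` is a hypothetical helper; neither is confirmed to match the library's own exports.

```typescript
// Assumed shape, copied from the Voice Object table above.
interface Voice {
  Name: string;
  ShortName: string;
  Gender: string;
  Locale: string;
  FriendlyName: string;
}

// Hypothetical helper: buckets voices by their Locale value.
function groupVoicesByLocale(voices: Voice[]): Map<string, Voice[]> {
  const byLocale = new Map<string, Voice[]>();
  for (const voice of voices) {
    const group = byLocale.get(voice.Locale);
    if (group) {
      group.push(voice);
    } else {
      byLocale.set(voice.Locale, [voice]);
    }
  }
  return byLocale;
}
```

Passing the result of `await getVoices()` to `groupVoicesByLocale` should work if the returned objects have the shape documented above.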

### `synthesize(options): Promise<GenerateResult>`

Main function to generate speech from text.

### `synthesizeStream(options): AsyncGenerator<Uint8Array>`

Creates an async generator that yields chunks of processed audio data. Each chunk has metadata headers automatically removed.

Uses the same options as `synthesize()`, but without subtitle support:

| Option       | Type   | Default                           | Description               |
| ------------ | ------ | --------------------------------- | ------------------------- |
| text         | string | (required)                        | Text to convert to speech |
| voice        | string | "en-US-AvaNeural"                 | Voice ID to use           |
| language     | string | "en-US"                           | Language code             |
| outputFormat | string | "audio-24khz-96kbitrate-mono-mp3" | Audio format              |
| rate         | string | "default"                         | Speaking rate             |
| pitch        | string | "default"                         | Voice pitch               |
| volume       | string | "default"                         | Audio volume              |

For detailed configuration options, refer to Microsoft's documentation:

- [Available voices and language support](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts)
- [Audio output formats](https://learn.microsoft.com/en-us/dotnet/api/microsoft.cognitiveservices.speech.speechsynthesisoutputformat?view=azure-dotnet)
- [Pitch, rate, and volume](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice)

Note: Some options may be limited by Microsoft Edge's service capabilities.
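Because the stream yields plain `Uint8Array` chunks, audio can be written to disk as it arrives instead of being buffered in memory. The following is a minimal Node.js sketch of that idea; `writeChunksToFile` is a hypothetical helper, not part of the library, and it accepts any async iterable of byte chunks.

```typescript
import { open } from "node:fs/promises";

// Hypothetical helper: writes each chunk to `path` as it is produced
// and returns the total number of bytes written.
async function writeChunksToFile(
  chunks: AsyncIterable<Uint8Array>,
  path: string,
): Promise<number> {
  const file = await open(path, "w");
  let bytesWritten = 0;
  try {
    for await (const chunk of chunks) {
      await file.write(chunk);
      bytesWritten += chunk.byteLength;
    }
  } finally {
    // Always release the file handle, even if the stream throws.
    await file.close();
  }
  return bytesWritten;
}
```

Assuming the generator behaves as described above, a call such as `await writeChunksToFile(synthesizeStream({ text: "Hello world" }), "output.mp3")` would save the audio incrementally.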

#### GenerateOptions

| Option       | Type            | Default                              | Description               |
| ------------ | --------------- | ------------------------------------ | ------------------------- |
| text         | string          | (required)                           | Text to convert to speech |
| voice        | string          | "en-US-AvaNeural"                    | Voice ID to use           |
| language     | string          | "en-US"                              | Language code             |
| outputFormat | string          | "audio-24khz-96kbitrate-mono-mp3"    | Audio format              |
| rate         | string          | "default"                            | Speaking rate             |
| pitch        | string          | "default"                            | Voice pitch               |
| volume       | string          | "default"                            | Audio volume              |
| subtitle     | SubtitleOptions | { splitBy: "word", wordsPerCue: 10 } | Subtitle options          |

#### SubtitleOptions

| Option         | Type                 | Default | Description                          |
| -------------- | -------------------- | ------- | ------------------------------------ |
| splitBy        | "word" \| "duration" | "word"  | How to split subtitles               |
| wordsPerCue    | number               | 10      | Words per subtitle when using 'word' |
| durationPerCue | number               | 5000    | Duration (ms) when using 'duration'  |
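To make the two `splitBy` modes concrete, here is a rough sketch of how per-word timings could be grouped into cues. This is an illustration under assumptions, not the library's actual algorithm: the `WordTiming` input type, the `Cue` output type, and the `groupIntoCues` function are all hypothetical, and the real implementation may handle boundaries differently.

```typescript
// Hypothetical input: one entry per spoken word, times in milliseconds.
interface WordTiming {
  text: string;
  start: number;
  end: number;
}

// Hypothetical output, shaped like the documented subtitle entries.
interface Cue {
  text: string;
  start: number;
  end: number;
  duration: number;
}

type SplitOptions =
  | { splitBy: "word"; wordsPerCue: number }
  | { splitBy: "duration"; durationPerCue: number };

function groupIntoCues(words: WordTiming[], options: SplitOptions): Cue[] {
  const cues: Cue[] = [];
  let current: WordTiming[] = [];

  // Turn the pending words into one cue and start a new group.
  const flush = () => {
    if (current.length === 0) return;
    const start = current[0].start;
    const end = current[current.length - 1].end;
    const text = current.map((w) => w.text).join(" ");
    cues.push({ text, start, end, duration: end - start });
    current = [];
  };

  for (const word of words) {
    if (current.length > 0) {
      // "word": close the cue once it holds wordsPerCue words.
      // "duration": close it if adding this word would exceed durationPerCue.
      const full =
        options.splitBy === "word"
          ? current.length >= options.wordsPerCue
          : word.end - current[0].start > options.durationPerCue;
      if (full) flush();
    }
    current.push(word);
  }
  flush();
  return cues;
}
```

In this sketch a single word longer than `durationPerCue` still becomes its own cue, since a cue cannot be empty.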

#### GenerateResult

| Property | Type                    | Description          |
| -------- | ----------------------- | -------------------- |
| audio    | Blob                    | Generated audio data |
| subtitle | `Array<SubtitleResult>` | Generated subtitles  |

#### SubtitleResult

| Property | Type   | Description     |
| -------- | ------ | --------------- |
| text     | string | Subtitle text   |
| start    | number | Start time (ms) |
| end      | number | End time (ms)   |
| duration | number | Duration (ms)   |
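Since the times are plain millisecond numbers, the entries can be converted to other subtitle formats by hand. Below is a minimal sketch that renders them as SubRip (`.srt`) text. The `SubtitleResult` interface mirrors the table above; `formatSrtTime` and `toSrt` are hypothetical helpers, and the CLI's own `--subtitle` output may be produced differently.

```typescript
// Assumed shape, copied from the SubtitleResult table above.
interface SubtitleResult {
  text: string;
  start: number;
  end: number;
  duration: number;
}

// Hypothetical helper: formats milliseconds as an SRT timestamp (HH:MM:SS,mmm).
function formatSrtTime(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3_600_000);
  const minutes = Math.floor((total % 3_600_000) / 60_000);
  const seconds = Math.floor((total % 60_000) / 1000);
  const millis = total % 1000;
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Hypothetical helper: renders cues as numbered SRT blocks separated by blank lines.
function toSrt(cues: SubtitleResult[]): string {
  return cues
    .map(
      (cue, index) =>
        `${index + 1}\n${formatSrtTime(cue.start)} --> ${formatSrtTime(cue.end)}\n${cue.text}\n`,
    )
    .join("\n");
}
```

For example, a cue with `start: 0`, `end: 1500`, and `text: "Hello world"` is rendered as block `1` with the time range `00:00:00,000 --> 00:00:01,500`.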
