https://github.com/milosgajdos/playht_rs

PlayHT TTS Rust crate
https://github.com/milosgajdos/playht_rs
ai rust rust-lang speech-synthesis text-to-speech tts tts-api
Last synced: about 2 months ago
JSON representation
PlayHT TTS Rust crate
Host: GitHub
URL: https://github.com/milosgajdos/playht_rs
Owner: milosgajdos
License: apache-2.0
Created: 2024-04-02T14:12:33.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-04-14T07:47:32.000Z (about 1 year ago)
Last Synced: 2024-04-14T10:58:35.956Z (about 1 year ago)
Topics: ai, rust, rust-lang, speech-synthesis, text-to-speech, tts, tts-api
Language: Rust
Homepage:
Size: 65.4 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

        # playht_rs

[![Crates.io Version](https://img.shields.io/crates/v/playht_rs.svg)](https://crates.io/crates/playht_rs)

[![Build Status](https://github.com/milosgajdos/playht_rs/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/milosgajdos/playht_rs/actions?query=workflow%3ACI)

[![License: Apache-2.0](https://img.shields.io/badge/License-Apache--2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

An unofficial [play.ht](https://play.ht) Rust API client crate. Similar to the [Go module](https://github.com/milosgajdos/go-playht) implementation.

In order to use this create you must create an account on [play.ht](https://play.ht), generate an API secret and retrieve your User ID.

See the official docs [here](https://docs.play.ht/reference/api-authentication) for more info.

# Basics

There are two ways to create audio/speech from the text using the API:

- Job: audio generation is done in async; when you create a job you can monitor its progress via [SSE](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events)

- Stream: a real-time audio stream available immediately as soon as the stream has been created via the API

The API also allows you to clone a voice using a small sample of limited size. See the [docs](https://docs.play.ht/reference/api-create-instant-voice-clone).

# Get started

> [!IMPORTANT]

> Before you attempt to run any of the samples you must set a couple of environment variables.

> These are automatically read by the client when it gets created; you can override them in your own code.

- `PLAYHT_SECRET_KEY`: API secret key

- `PLAYHT_USER_ID`: Play.HT User ID

Check the crate:

```

cargo check

```

Build the crate:

```shell

cargo build

```

## Examples

There are quite a few examples available in the [examples](./examples) directory so please do have a look. They could give you some idea about how to use this crate. Below we list a few code samples:

### Clone Voice

Clone a new voice from a sample audio file.

> [!NOTE]

> You must pass the sample file and the mime type as cli arguments

```rust

//! `cargo run --example clone_voices`

use playht_rs::{

    api::{self, voice::CloneVoiceFileRequest, voice::DeleteClonedVoiceRequest},

    prelude::*,

};

use tokio;

#[tokio::main]

async fn main() -> Result<()> {

    let mut args = std::env::args().skip(1);

    let sample_file = args.next().unwrap();

    let mime_type = args.next().unwrap();

    let req = CloneVoiceFileRequest {

        sample_file,

        mime_type,

        voice_name: "foo-bar".to_owned(),

    };

    let client = api::Client::new();

    let voice = client.clone_voice_from_file(req).await?;

    println!("Got voice clone: {:?}", voice);

    let cloned_voices = client.get_cloned_voices().await?;

    println!("Got voice clones: {:?}", cloned_voices);

    let req = DeleteClonedVoiceRequest { voice_id: voice.id };

    let delete_resp = client.delete_cloned_voice(req).await?;

    println!("Got delete response: {:?}", delete_resp);

    Ok(())

}

```

### Create async TTS Jobs

Create an async TTS job and fetch its metadata.

> [!NOTE]

> The async TTS job progress can be monitored via the PlayHT API.

```rust

//! `cargo run --example tts_jobs`

use playht_rs::{

    api::{self, job::TTSJobReq, tts::Quality},

    prelude::*,

};

use tokio;

#[tokio::main]

async fn main() -> Result<()> {

    let client = api::Client::new();

    let voices = client.get_stock_voices().await?;

    if voices.is_empty() {

        return Err("No voices available".into());

    }

    let req = TTSJobReq {

        text: Some("What is life?".to_owned()),

        voice: Some(voices[0].id.clone()),

        quality: Some(Quality::Low),

        speed: Some(1.0),

        sample_rate: Some(24000),

        ..Default::default()

    };

    let tts_job = client.create_tts_job(req).await?;

    println!("TTS job created: {:?}", tts_job);

    let tts_job = client.get_tts_job(tts_job.id).await?;

    println!("Got TTS job: {:?}", tts_job);

    Ok(())

}

```

### Stream TTS Audio

Stream TTS audio in real-time into a file.

The file is provided via a cli argument but you can pass async writer implementation such as an audio device tokio wrapper, etc.

> [!NOTE]

> You must pass the output file path as cli argument.

```rust

//! `cargo run --example tts_write_audio_stream -- "foobar.mp3"`

use playht_rs::{

    api::{self, stream::TTSStreamReq, tts::Quality},

    prelude::*,

};

use tokio::{fs::File, io::BufWriter};

#[tokio::main]

async fn main() -> Result<()> {

    let mut args = std::env::args().skip(1);

    let file_path = args.next().unwrap();

    let client = api::Client::new();

    let voices = client.get_stock_voices().await?;

    if voices.is_empty() {

        return Err("No voices available".into());

    }

    let req = TTSStreamReq {

        text: Some("What is life?".to_owned()),

        voice: Some(voices[0].id.to_owned()),

        quality: Some(Quality::Low),

        speed: Some(1.0),

        sample_rate: Some(24000),

        ..Default::default()

    };

    let file = File::create(file_path.clone()).await?;

    let mut w = BufWriter::new(file);

    client.write_audio_stream(&mut w, req).await?;

    println!("Done streaming into {}", file_path);

    Ok(())

}

```

### Play the TTS audio from a file

```rust

//! `cargo run --example play_audio -- "/path/to/audio.mp3"`

use rodio::{Decoder, OutputStream, Sink};

use std::{fs::File, io::BufReader};

fn main() {

    let mut args = std::env::args().skip(1);

    let sound_file = args.next().unwrap();

    let (_stream, stream_handle) = OutputStream::try_default().unwrap();

    let file = BufReader::new(File::open(&sound_file).unwrap());

    let source = Decoder::new(file).unwrap();

    let sink = Sink::try_new(&stream_handle).unwrap();

    sink.append(source);

    sink.sleep_until_end();

}

```

### Play TTS audio stream data

> [!NOTE]

> This does NOT actually do streaming playback!

> It feteches all the data into a buffer and then sends it

> for the playback. If you need a real-time playback stream

> check the `tts_stream_audio` example below.

```rust

//! `cargo run --example tts_play_audio_stream`

use playht_rs::{

    api::{self, stream::TTSStreamReq, tts::Quality},

    prelude::*,

};

use rodio::{Decoder, OutputStream, Sink};

use std::io::Cursor;

#[tokio::main]

async fn main() -> Result<()> {

    let client = api::Client::new();

    let voices = client.get_stock_voices().await?;

    if voices.is_empty() {

        return Err("No voices available".into());

    }

    let req = TTSStreamReq {

        text: Some("What is life?".to_owned()),

        voice: Some(voices[0].id.to_owned()),

        quality: Some(Quality::Low),

        speed: Some(1.0),

        sample_rate: Some(24000),

        ..Default::default()

    };

    let (_stream, stream_handle) = OutputStream::try_default().unwrap();

    let sink = Sink::try_new(&stream_handle).unwrap();

    let mut buffer = Vec::new();

    client.write_audio_stream(&mut buffer, req).await?;

    let source = Decoder::new(Cursor::new(buffer)).unwrap();

    sink.append(source);

    sink.sleep_until_end();

    Ok(())

}

```

### Stream TTS audio in real-time

```rust

//! ` cargo run --example tts_stream_audio`

use bytes::BytesMut;

use playht_rs::{

    api::{self, stream::TTSStreamReq, tts::Quality},

    prelude::*,

};

use rodio::{Decoder, OutputStream, Sink};

use std::io::Cursor;

use tokio_stream::StreamExt;

// NOTE: this might need to be adjusted

const BUFFER_SIZE: usize = 1024 * 10;

#[tokio::main]

async fn main() -> Result<()> {

    let client = api::Client::new();

    let voices = client.get_stock_voices().await?;

    if voices.is_empty() {

        return Err("No voices available for playback".into());

    }

    let client = api::Client::new();

    let req = TTSStreamReq {

        text: Some("What is life?".to_owned()),

        voice: Some(voices[0].id.to_owned()),

        quality: Some(Quality::Low),

        speed: Some(1.0),

        sample_rate: Some(24000),

        ..Default::default()

    };

    let (_stream, stream_handle) = OutputStream::try_default().unwrap();

    let sink = Sink::try_new(&stream_handle).unwrap();

    let mut stream = client.stream_audio(req).await?;

    let mut accumulated = BytesMut::new();

    while let Some(res) = stream.next().await {

        match res {

            Ok(chunk) => {

                accumulated.extend_from_slice(&chunk);

                // Check if there's enough data to attempt decoding

                if accumulated.len() > BUFFER_SIZE {

                    let cursor = Cursor::new(accumulated.clone().freeze().to_vec());

                    match Decoder::new(cursor) {

                        Ok(source) => {

                            sink.append(source);

                            accumulated.clear(); // Clear the buffer on successful append

                        }

                        Err(e) => {

                            eprintln!("Failed to decode received audio: {}", e);

                        }

                    }

                }

            }

            Err(err) => return Err(format!("Playback error: {}", err).into()),

        }

    }

    // Flush any remaining data at the end

    if !accumulated.is_empty() {

        let cursor = Cursor::new(accumulated.to_vec());

        match Decoder::new(cursor) {

            Ok(source) => sink.append(source),

            Err(e) => println!("Remaining data could not be decoded: {}", e),

        }

    }

    sink.sleep_until_end();

    Ok(())

}

```

## Nix

There is a Nix flake vailable which lets you work on the Rust create in a nix shell.

Just run the following command and you are in the business:

```shell

nix develop

```

# TODO

- [ ] gRPC streaming

- [ ] clean up the messy code
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/milosgajdos/playht_rs

Awesome Lists containing this project

README