An open API service indexing awesome lists of open source software.

https://github.com/ave-sergeev/morph

Voice-to-Vector embedding service (Rust, Tonic) (Q2:2025)
https://github.com/ave-sergeev/morph

audio embeddings grpc rust vector-search voice-to-vector

Last synced: 7 months ago
JSON representation

Voice-to-Vector embedding service (Rust, Tonic) (Q2:2025)

Awesome Lists containing this project

README

          

## Morph

---

[Russian version](https://github.com/Ave-Sergeev/Morph/blob/main/README.ru.md)

### Description

Example of gRPC server implementation in Rust, designed to convert voice into vector representations (embeddings).
This service can be part of a search engine, vector analysis systems, etc.
It uses an offline model (preloaded), and performs computations on CPU (without using GPU).

P.S. If you want to enable support for GPU calculations, you can easily do it by making small changes to the project.

### Models

This service is implemented specifically for the input/output of this model [TitaNet-L](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/titanet_large).

First you need to download the model, and convert it to `ONNX` format.
Then it should be placed in the `/model` directory of the project.

### Configuration

In `config.yaml` you can set the value for the fields:

- `Server`
- `host` - host to start gRPC server.
- `port` - port to start gRPC server.
- `Logging`
- `log_level` - logging level of detail.
- `ModelSettings`
- `path` - model path.
- `sample_rate` - sampling frequency (Hz).
- `window_length` - size of the analysis window.
- `frame_length` - frame size.
- `frame_step` - step between frames.
- `fft_size` - FFT size (number of points).
- `n_mels` - number of mel filters.
- `ref_value` - reference value.
- `amin` - minimum amplitude value.

### Usage

When using your own audio file, make sure that it has the correct format - WAV PCM_S16LE mono (sample_rate 16000Hz).

Otherwise, you will get an error as the service checks the audio file for consistency.
If you have `ffmpeg` installed, you can use it, for conversion.

To send a request to the server, take `voice_embeddings.proto` (from the `./proto` directory), and use it in your client.
You can check if it works, for example, via `Postman`.

Query structure for rpc `EmbedText`:
```Json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"content": "audio in base64 format"
}
```

As a result of the recognition, the server will return JSON of the form:
```Json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"embeddings": [
-0.028818072751164436,
0.04423859715461731,
-0.004005502909421921,
"...",
-0.044924844056367874,
-0.02066687121987343
]
}
```

### Local startup

1) To install `Rust` on Unix-like systems (MacOS, Linux, ...) - run the command in the terminal.
After the download is complete, you will get the latest stable version of Rust for your platform, as well as the latest version of Cargo.

```shell
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

2) Run the following command in the terminal to verify.

```shell
cargo --version
```

3) Open the project and run the commands.

Check the code to see if it can be compiled (without running it).
```shell
cargo check
```

Build + run the project (in release mode with optimizations).
```shell
cargo run --release
```

UDP: If you have Windows, see [Instructions here](https://forge.rust-lang.org/infra/other-installation-methods.html).

### License

This project is licensed under the [MIT License](https://opensource.org/licenses/MIT) or [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0), your choice.