https://github.com/zsxkib/cog-hibiki

Cog wrapper for Hibiki: High-Fidelity Simultaneous Speech-To-Speech Translation

Host: GitHub
URL: https://github.com/zsxkib/cog-hibiki
Owner: zsxkib
License: MIT
Created: 2025-02-10T16:59:31.000Z (8 months ago)
Default Branch: main
Last Pushed: 2025-02-11T09:27:03.000Z (8 months ago)
Last Synced: 2025-02-11T10:31:27.179Z (8 months ago)
Language: Python
Size: 6.84 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Hibiki: Real-Time Speech Translation

[[Paper]][hibiki] | [[Samples]](https://huggingface.co/spaces/kyutai/hibiki-samples) | [[HuggingFace Models]](https://huggingface.co/collections/kyutai/hibiki-fr-en-67a48835a3d50ee55d37c2b5)

Hibiki is a state-of-the-art model for **real-time speech-to-speech translation** that preserves the speaker's voice characteristics while translating. It supports French-to-English translation and can run locally on consumer hardware.

## Quick Start

Run translation with a single command using Cog:

```bash
sudo cog predict -i audio_input=@sample_fr_hibiki_crepes.mp3
```

This translates the sample French audio file into English while preserving the speaker's voice characteristics. To translate your own audio, replace the sample filename with the path to your own `.mp3` file.
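If you want to call the same command from a script, for example to translate several files in a row, a small Python wrapper might look like the sketch below. The helper names `build_predict_command` and `translate` are hypothetical and not part of this repo; only the `cog predict -i audio_input=@<file>` invocation is taken from the Quick Start above. The `use_sudo` switch is an assumption to mirror the `sudo` in that command, which you may not need depending on your Docker setup.

```python
import shlex
import subprocess
from pathlib import Path


def build_predict_command(audio_path, use_sudo=False):
    """Return the argv list for `cog predict` on one audio file (hypothetical helper)."""
    # `audio_input` is the input name used in the Quick Start command.
    cmd = ["cog", "predict", "-i", f"audio_input=@{Path(audio_path)}"]
    # The Quick Start runs Cog under sudo; whether that is needed depends
    # on how Docker is configured on your machine.
    return ["sudo", *cmd] if use_sudo else cmd


def translate(audio_path, use_sudo=False):
    """Run the prediction; raises CalledProcessError if Cog exits with an error."""
    cmd = build_predict_command(audio_path, use_sudo=use_sudo)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `translate("sample_fr_hibiki_crepes.mp3", use_sudo=True)` from the repo directory should then be equivalent to the Quick Start command, assuming Cog and Docker are installed.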

## Key Features
- 🎙️ **Voice preservation** through classifier-free guidance
- ⏱️ **Real-time processing** at a 12.5 Hz frame rate
- 🔊 **Natural-sounding output** in target language
- 📜 **Simultaneous text transcription**
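
To put the 12.5 Hz figure in concrete terms, the arithmetic below converts a frame rate into a per-frame duration and a frame count for a clip. It is a rough sketch that assumes "framerate" means the number of audio frames the model processes per second; the function names are illustrative and not taken from the Hibiki code.

```python
import math

FRAME_RATE_HZ = 12.5  # frame rate quoted in the feature list above


def frame_duration_ms(frame_rate_hz=FRAME_RATE_HZ):
    """Milliseconds of audio covered by one frame."""
    return 1000.0 / frame_rate_hz


def frames_for_duration(seconds, frame_rate_hz=FRAME_RATE_HZ):
    """Number of frames needed to cover `seconds` of audio, rounded up."""
    return math.ceil(seconds * frame_rate_hz)
```

Under that assumption, each frame spans 80 ms of audio, and a 10-second clip corresponds to 125 frames.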

[hibiki]: https://arxiv.org/abs/2502.03382
