https://github.com/haoheliu/semanticodec-inference

Ultra-low bitrate neural audio codec (0.31-1.40 kbps) with better semantics in the latent space.

Host: GitHub
URL: https://github.com/haoheliu/semanticodec-inference
Owner: haoheliu
License: mit
Created: 2024-05-04T16:56:43.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-03-07T20:52:21.000Z (3 months ago)
Last Synced: 2025-04-09T19:16:27.707Z (about 2 months ago)
Language: Python
Homepage:
Size: 1.97 MB
Stars: 194
Watchers: 5
Forks: 15
Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE


[![arXiv](https://img.shields.io/badge/arXiv-2405.00233-brightgreen.svg?style=flat-square)](https://arxiv.org/abs/2405.00233) [![githubio](https://img.shields.io/badge/GitHub.io-Audio_Samples-blue?logo=Github&style=flat-square)](https://haoheliu.github.io/SemantiCodec/)

# SemantiCodec

Ultra-low bitrate neural audio codec with better semantics in the latent space.

**Highlights**

- Bitrate: 0.31 kbps to 1.40 kbps

- Token rate: 25, 50, or 100 tokens per second

- Supported devices: CPU, CUDA, and MPS

# Usage

## Installation

```bash
pip install git+https://github.com/haoheliu/SemantiCodec-inference.git
```

## Encoding and decoding

**Checkpoints are downloaded automatically when you initialize `SemantiCodec`, as in the following code.**

```python
import soundfile as sf
from semanticodec import SemantiCodec

# Initializing the codec triggers the automatic checkpoint download
semanticodec = SemantiCodec(token_rate=100, semantic_vocab_size=16384)

filepath = "test/test.wav"  # audio of arbitrary length
tokens = semanticodec.encode(filepath)
waveform = semanticodec.decode(tokens)

# Save the reconstructed audio at a 16 kHz sample rate
sf.write("output.wav", waveform[0, 0], 16000)
```
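To put the bitrate in perspective, the sketch below estimates the size of the token stream for a clip and compares it with uncompressed audio. It assumes a 16-bit mono PCM baseline at 16 kHz (this README only fixes the 16 kHz output rate) and uses the 1.35 kbps figure listed for `token_rate=100, semantic_vocab_size=16384`. The helper names are hypothetical and are not part of the `semanticodec` package.

```python
def compressed_size_bytes(duration_s: float, bitrate_kbps: float) -> float:
    """Approximate token-stream size in bytes, ignoring any container overhead."""
    return duration_s * bitrate_kbps * 1000 / 8


def pcm_size_bytes(duration_s: float, sample_rate: int = 16000,
                   bits_per_sample: int = 16) -> float:
    """Size of uncompressed mono PCM audio in bytes (assumed baseline)."""
    return duration_s * sample_rate * bits_per_sample / 8


if __name__ == "__main__":
    duration_s = 10.0
    codec_bytes = compressed_size_bytes(duration_s, bitrate_kbps=1.35)
    raw_bytes = pcm_size_bytes(duration_s)
    print(f"token stream: {codec_bytes:.1f} bytes")
    print(f"16-bit PCM:   {raw_bytes:.1f} bytes")
    print(f"ratio:        {raw_bytes / codec_bytes:.1f}x")
```

These figures describe only the raw payload; how tokens are actually serialized to disk is left to the user.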

## Other Settings

```python
import soundfile as sf
from semanticodec import SemantiCodec

# Choose ONE of the following configurations. All but the first are commented
# out so that only a single model is initialized.
semanticodec = SemantiCodec(token_rate=100, semantic_vocab_size=32768)  # 1.40 kbps
# semanticodec = SemantiCodec(token_rate=50, semantic_vocab_size=32768)  # 0.70 kbps
# semanticodec = SemantiCodec(token_rate=25, semantic_vocab_size=32768)  # 0.35 kbps
# semanticodec = SemantiCodec(token_rate=100, semantic_vocab_size=16384)  # 1.35 kbps
# semanticodec = SemantiCodec(token_rate=50, semantic_vocab_size=16384)  # 0.68 kbps
# semanticodec = SemantiCodec(token_rate=25, semantic_vocab_size=16384)  # 0.34 kbps
# semanticodec = SemantiCodec(token_rate=100, semantic_vocab_size=8192)  # 1.30 kbps
# semanticodec = SemantiCodec(token_rate=50, semantic_vocab_size=8192)  # 0.65 kbps
# semanticodec = SemantiCodec(token_rate=25, semantic_vocab_size=8192)  # 0.33 kbps
# semanticodec = SemantiCodec(token_rate=100, semantic_vocab_size=4096)  # 1.25 kbps
# semanticodec = SemantiCodec(token_rate=50, semantic_vocab_size=4096)  # 0.63 kbps
# semanticodec = SemantiCodec(token_rate=25, semantic_vocab_size=4096)  # 0.31 kbps

filepath = "test/test.wav"
tokens = semanticodec.encode(filepath)
waveform = semanticodec.decode(tokens)

# Save the reconstructed audio at a 16 kHz sample rate
sf.write("output.wav", waveform[0, 0], 16000)
```
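The bitrates in the comments above are consistent with a simple calculation, sketched here under assumptions this README does not state: tokens come in semantic/acoustic pairs (so there are `token_rate / 2` pairs per second), and the acoustic codebook has a fixed size of 8192 entries. The function `bitrate_bps` and the constant `ACOUSTIC_VOCAB_SIZE` are hypothetical names used for illustration only.

```python
import math

# Assumption (not stated in this README): fixed acoustic codebook size.
ACOUSTIC_VOCAB_SIZE = 8192


def bitrate_bps(token_rate: int, semantic_vocab_size: int,
                acoustic_vocab_size: int = ACOUSTIC_VOCAB_SIZE) -> float:
    """Bits per second, assuming half the tokens are semantic and half acoustic."""
    pairs_per_second = token_rate / 2
    bits_per_pair = math.log2(semantic_vocab_size) + math.log2(acoustic_vocab_size)
    return pairs_per_second * bits_per_pair


if __name__ == "__main__":
    for token_rate in (100, 50, 25):
        for vocab_size in (32768, 16384, 8192, 4096):
            kbps = bitrate_bps(token_rate, vocab_size) / 1000
            print(f"token_rate={token_rate:<3} semantic_vocab_size={vocab_size:<5} "
                  f"-> {kbps:.4f} kbps")
```

Under these assumptions the values agree with the listed bitrates once rounded to two decimals, for example 1400 bps for `token_rate=100, semantic_vocab_size=32768`.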

To reuse the evaluation pipeline and data from the paper, see this [Zenodo record](https://zenodo.org/records/11047204).

## Citation

If you find this repo helpful, please consider citing in the following format:

```bibtex
@ARTICLE{semanticodec2024,
  author={Liu, Haohe and Xu, Xuenan and Yuan, Yi and Wu, Mengyue and Wang, Wenwu and Plumbley, Mark D.},
  journal={IEEE Journal of Selected Topics in Signal Processing},
  title={SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound},
  year={2024},
  volume={18},
  number={8},
  pages={1448-1461},
  doi={10.1109/JSTSP.2024.3506286}
}
```

![result](result.png)
