https://github.com/magenta/magenta-realtime
Last synced: 12 months ago
- Host: GitHub
- URL: https://github.com/magenta/magenta-realtime
- Owner: magenta
- License: apache-2.0
- Created: 2025-06-12T17:09:21.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-20T21:00:14.000Z (12 months ago)
- Last Synced: 2025-06-20T22:18:17.730Z (12 months ago)
- Language: Python
- Size: 1.06 MB
- Stars: 23
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-rainmana - magenta/magenta-realtime - Magenta RealTime 2: An Open-Weights Live Music Model (Python)
- awesome-creative-agentic-coding - magenta-realtime - weights live-music model from the Lyria RealTime research line; performed live by Toro y Moi at Google I/O 2025. (Multi-Agent Creative Pipelines / Design, Drawing & Web Surfaces)
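- StarryDivineSky - magenta/magenta-realtime - realtime is an open-source project under Google's Magenta project that focuses on real-time music generation with machine learning models. The project is built on the TensorFlow framework and provides a Python library; its core features include real-time melody generation (Real-time Melody RNN) and real-time drum generation (Real-time Drum RNN), and it can output audio through the Web Audio API or a MIDI interface, making it suitable for live performance, interactive applications, and similar scenarios. It works by using trained neural network models to process user input (such as a keyboard or MIDI device) in real time and generate corresponding music segments, with support for custom rhythm, pitch, and style parameters. Project features include low-latency audio processing, a modular code structure, and compatibility with mainstream music software; users can quickly build real-time music generation systems on top of the pretrained models. In addition, magenta-realtime provides example code and tutorials to help developers understand the model training and deployment workflow, making it suitable for music technology researchers and developers. The project is continuously updated, supports a variety of music generation tasks, and advances real-time music AI technology through the open-source community. (Speech Synthesis / Resource Transfer and Download)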
README
# Magenta RT: Streaming music generation!
Magenta RealTime is a Python library for streaming music audio generation on
your local device. It is the open-source, on-device companion to
[MusicFX DJ Mode](https://labs.google/fx/tools/music-fx-dj) and the
[Lyria RealTime API](https://ai.google.dev/gemini-api/docs/music-generation).
This is a 👀 sneak preview of the Magenta RT project. We will have
[more to share](#coming-soon) in the coming weeks including a technical report
and additional features!
See our [blog post](https://g.co/magenta/rt) and
[model card](https://github.com/magenta/magenta-realtime/blob/main/MODEL.md) for
more info.

## Getting started
The fastest way to get started with Magenta RT is to try our official
[Colab Demo](https://colab.research.google.com/github/magenta/magenta-realtime/blob/main/notebooks/Magenta_RT_Demo.ipynb)
which runs in real time on freely available TPUs! Here is a quick
[video walkthrough](https://www.youtube.com/watch?v=SVTuEdeepVs).
If you have a machine with a TPU or GPU, you may also follow the installation
instructions below to run Magenta RT locally.
## Local installation
Install the latest version:
```sh
# With GPU support:
pip install 'git+https://github.com/magenta/magenta-realtime#egg=magenta_rt[gpu]'
# With TPU support:
pip install 'git+https://github.com/magenta/magenta-realtime#egg=magenta_rt[tpu]'
# CPU only
pip install 'git+https://github.com/magenta/magenta-realtime'
```
Or, clone and install for local editing:
```sh
git clone https://github.com/magenta/magenta-realtime.git && cd magenta-realtime
pip install -e .[gpu]
```
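After either install route, a quick sanity check can confirm that the package is visible to your Python environment. The sketch below uses only the standard library; the module name `magenta_rt` is taken from the import statements in the examples in this README, and the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints False if the install step did not succeed in this environment.
    print("magenta_rt importable:", is_installed("magenta_rt"))
```

This only checks that the module can be found; it does not load model weights or verify that a GPU or TPU is usable.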
## Examples
### Generating audio with Magenta RT
Magenta RT generates audio in short chunks (2s) given a finite amount of past
context (10s). We use crossfading to mitigate boundary artifacts between chunks.
More details on our model are coming soon in a technical report!
```py
from magenta_rt import audio, system
from IPython.display import display, Audio
num_seconds = 10
mrt = system.MagentaRT()
style = system.embed_style('funk')
chunks = []
state = None
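for i in range(round(num_seconds / mrt.config.chunk_length)):
  # Each call continues from the previous state and returns the next chunk.
  state, chunk = mrt.generate_chunk(state=state, style=style)
  chunks.append(chunk)
# Join the collected chunks, crossfading at each boundary.
generated = audio.concatenate(chunks, crossfade_time=mrt.crossfade_length)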
display(Audio(generated.samples.swapaxes(0, 1), rate=mrt.sample_rate))
```
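To illustrate the idea of crossfading between chunks, here is a minimal NumPy sketch of a linear crossfade over mono arrays. It is an illustration under stated assumptions, not the implementation behind `audio.concatenate`: the function name `crossfade_concat`, the linear fade shape, and the assumption that consecutive chunks overlap by `fade_len` samples are all hypothetical.

```python
import numpy as np


def crossfade_concat(chunks, fade_len):
    """Concatenate 1-D arrays, linearly crossfading `fade_len` samples at each boundary.

    Assumes the last `fade_len` samples of one chunk and the first `fade_len`
    samples of the next cover the same stretch of time.
    """
    if fade_len < 1:
        raise ValueError("fade_len must be at least 1")
    if any(len(c) < fade_len for c in chunks):
        raise ValueError("every chunk must be at least fade_len samples long")

    fade_in = np.linspace(0.0, 1.0, fade_len)
    fade_out = 1.0 - fade_in

    out = np.asarray(chunks[0], dtype=float).copy()
    for nxt in chunks[1:]:
        nxt = np.asarray(nxt, dtype=float)
        # Blend the tail of the running output with the head of the next chunk.
        overlap = out[-fade_len:] * fade_out + nxt[:fade_len] * fade_in
        out = np.concatenate([out[:-fade_len], overlap, nxt[fade_len:]])
    return out


# Three constant chunks of 100 samples with a 10-sample crossfade.
demo = crossfade_concat([np.ones(100), np.ones(100), np.ones(100)], fade_len=10)
```

Because the fade-in and fade-out gains sum to one at every sample, a constant signal passes through unchanged, and each boundary shortens the total length by `fade_len` samples.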
### Blending text and audio styles with MusicCoCa
MusicCoCa is a joint embedding model of text and audio styles. Magenta RT is
conditioned on MusicCoCa embeddings, allowing for seamless blending of styles
using any number of text and audio prompts.
```py
import numpy as np
from magenta_rt import audio, musiccoca
style_model = musiccoca.MusicCoCa()
my_audio = audio.Waveform.from_file('myjam.mp3')
weighted_styles = [
(2.0, my_audio),
(1.0, 'heavy metal'),
]
weights = np.array([w for w, _ in weighted_styles])
styles = style_model.embed([s for _, s in weighted_styles])
weights_norm = weights / weights.sum()
# The weights already sum to 1, so summing gives the weighted average.
blended = (weights_norm[:, np.newaxis] * styles).sum(axis=0)
```
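The blending arithmetic can be tried without loading any model. The sketch below uses made-up 4-dimensional vectors in place of real MusicCoCa embeddings (real embedding dimensions and values will differ), and the helper name `blend` is hypothetical. It sums the normalized weighted vectors, which yields a convex combination of the inputs.

```python
import numpy as np


def blend(embeddings, weights):
    """Return the weighted average of `embeddings` (one row per style)."""
    embeddings = np.asarray(embeddings, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Normalize so the weights sum to 1, then sum the scaled rows.
    weights_norm = weights / weights.sum()
    return (weights_norm[:, np.newaxis] * embeddings).sum(axis=0)


# Placeholder "style" vectors, not real MusicCoCa outputs.
style_a = np.array([1.0, 0.0, 0.0, 0.0])
style_b = np.array([0.0, 1.0, 0.0, 0.0])

# Weight style_a twice as heavily as style_b.
blended_demo = blend([style_a, style_b], [2.0, 1.0])
```

With weights of 2.0 and 1.0, the result sits two thirds of the way toward `style_a` and one third toward `style_b`.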
### Tokenizing audio with SpectroStream
SpectroStream is a discrete audio codec model operating on high-fidelity music
audio (stereo, 48kHz). Under the hood, Magenta RT models SpectroStream audio
tokens using a language model.
```py
from magenta_rt import audio, spectrostream
codec = spectrostream.SpectroStream()
my_audio = audio.Waveform.from_file('jam.mp3')
my_tokens = codec.encode(my_audio)
my_audio_reconstruction = codec.decode(my_tokens)
```
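For a sense of scale, the figures quoted in this README (stereo, 48kHz audio; 2s chunks; 10s of context) translate into raw sample counts as sketched below. The constant and function names are hypothetical; only the numeric values come from the text above, and the token counts produced by SpectroStream are not covered here.

```python
SAMPLE_RATE_HZ = 48_000   # "48kHz"
NUM_CHANNELS = 2          # "stereo"
CHUNK_SECONDS = 2.0       # chunk length quoted above
CONTEXT_SECONDS = 10.0    # past context quoted above


def num_samples(seconds, sample_rate=SAMPLE_RATE_HZ, channels=NUM_CHANNELS):
    """Return (frames, total sample values) for a clip of `seconds` duration."""
    frames = round(seconds * sample_rate)
    return frames, frames * channels


chunk_frames, chunk_values = num_samples(CHUNK_SECONDS)
context_frames, context_values = num_samples(CONTEXT_SECONDS)

print(f"chunk:   {chunk_frames} frames, {chunk_values} values")
print(f"context: {context_frames} frames, {context_values} values")
```

Each 2s chunk therefore corresponds to 96,000 frames per channel, and the 10s context to 480,000 frames per channel.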
## Running tests
Unit tests:
```sh
pip install -e .[test]
pytest .
```
Integration tests:
```sh
python test/musiccoca_end2end_test.py
python test/spectrostream_end2end_test.py
python test/magenta_rt_end2end_test.py
```
## Coming soon!
The following is a list of features we have planned for the near future (subject
to change). Please open an issue if there are features you would like to see, or
open a pull request if you would like to contribute!
- Technical report
- Colab for fine tuning
- Colab for conditioning on real-time audio input
## Citing this work
A technical report is coming soon. For now, please cite our blog post:
```
@article{magenta_rt,
title={Magenta RealTime},
url={https://g.co/magenta/rt},
publisher={Google DeepMind},
author={Lyria Team},
year={2025}
}
```
## License and disclaimer
Magenta RealTime is offered under a combination of licenses: the codebase is
licensed under
[Apache 2.0](https://github.com/magenta/magenta-realtime/blob/main/LICENSE),
and the model weights under
[Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/legalcode).
In addition, we specify the following usage terms:
Copyright 2025 Google LLC
Use these materials responsibly and do not generate content, including outputs,
that infringe or violate the rights of others, including rights in copyrighted
content.
Google claims no rights in outputs you generate using Magenta RealTime. You and
your users are solely responsible for outputs and their subsequent uses.
Unless required by applicable law or agreed to in writing, all software and
materials distributed here under the Apache 2.0 or CC-BY licenses are
distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
either express or implied. See the licenses for the specific language governing
permissions and limitations under those licenses. You are solely responsible for
determining the appropriateness of using, reproducing, modifying, performing,
displaying or distributing the software and materials, and any outputs, and
assume any and all risks associated with your use or distribution of any of the
software and materials, and any outputs, and your exercise of rights and
permissions under the licenses.