https://github.com/kyegomez/audioflamingo

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"
ai artificial-intelligence attention-is-all-you-need attention-mechanism attention-model audio audio-ai deeplearning llm machine-learning ml transformer


[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# AudioFlamingo
An implementation of the Audio Flamingo model from the paper "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities" ([paper](https://arxiv.org/pdf/2402.01831.pdf)).

## Install
`pip3 install audio-flamingo`

## Usage
```python
import torch
from audio_flamingo.model import AudioFlamingo

# Dummy inputs: random token IDs of shape (1, 1024) and a random audio tensor of shape (1, 16000)
text = torch.randint(0, 256, (1, 1024))
audio = torch.randn(1, 16000)

# Initialize AudioFlamingo model
model = AudioFlamingo(
    dim=512,
    num_tokens=256,
    max_seq_len=1024,
    heads=8,
    depth=6,
    dim_head=64,
    dropout=0.1,
    context_dim=512,
)

# Pass the text tokens and the audio through the model
output = model(text, audio)  # (1, 1024, 256)

# Print the output shape
print(output.shape)
```
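
The output shape `(1, 1024, 256)` in the example above reads as one score per vocabulary entry (`num_tokens=256`) at each of the 1024 positions. The README does not say whether these scores are unnormalized logits, so the following is only a sketch under that assumption. It uses plain Python and a hypothetical 4-token vocabulary to show how one position's scores could be turned into probabilities and a greedy token choice; the helper names `softmax` and `greedy_token` are illustrative and not part of the `audio_flamingo` package.

```python
import math


def softmax(scores):
    """Convert a list of unnormalized scores into probabilities."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_token(scores):
    """Return the index of the highest-scoring token."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical scores for a single position over a toy 4-token vocabulary.
# Under the assumption above, the model would give 256 scores per position.
position_scores = [0.5, 2.0, -1.0, 0.1]

probs = softmax(position_scores)
token_id = greedy_token(position_scores)

print(token_id)
print([round(p, 3) for p in probs])
```

On the actual tensor, the greedy choice for every position at once would be `output.argmax(dim=-1)`, which yields a tensor of shape `(1, 1024)`.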

## License
MIT