https://github.com/kyegomez/audioflamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/kyegomez/audioflamingo
- Owner: kyegomez
- License: mit
- Created: 2024-02-06T21:37:32.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-27T16:43:48.000Z (8 months ago)
- Last Synced: 2025-04-15T12:49:51.362Z (6 months ago)
- Topics: ai, artificial-intelligence, attention-is-all-you-need, attention-mechanism, attention-model, audio, audio-ai, deeplearning, llm, machine-learning, ml, transformer
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.17 MB
- Stars: 40
- Watchers: 5
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# AudioFlamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities". [PAPER LINK](https://arxiv.org/pdf/2402.01831.pdf)

## Install
`pip3 install audio-flamingo`

## Usage
```python
import torch
from audio_flamingo.model import AudioFlamingo

# Generate a random input sequence
text = torch.randint(0, 256, (1, 1024))
audio = torch.randn(1, 16000)

# Initialize AudioFlamingo model
model = AudioFlamingo(
    dim=512,
    num_tokens=256,
    max_seq_len=1024,
    heads=8,
    depth=6,
    dim_head=64,
    dropout=0.1,
    context_dim=512,
)

# Pass the input sequence through the model
output = model(text, audio)  # (1, 1024, 256)

# Print the output shape
print(output.shape)
```
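
The shape comment in the usage example, `(1, 1024, 256)`, matches `(batch, max_seq_len, num_tokens)`. The sketch below shows one way such an output could be turned into token ids with greedy decoding. It is an illustration only: it assumes the last axis holds unnormalized scores (logits) over the vocabulary, which this README does not state, and it uses a random stand-in tensor named `logits` (a hypothetical name) so that it runs without the `audio_flamingo` package.

```python
import torch

# Stand-in for the model output. Random values replace real scores so the
# snippet runs on its own; the shape follows the usage example's comment.
batch, seq_len, num_tokens = 1, 1024, 256
logits = torch.randn(batch, seq_len, num_tokens)

# Assumption: the last axis holds unnormalized scores over the vocabulary,
# so a softmax over that axis yields per-position probabilities.
probs = torch.softmax(logits, dim=-1)

# Greedy decoding: pick the highest-probability token id at every position.
token_ids = probs.argmax(dim=-1)

print(token_ids.shape)
```

If that assumption holds for the real model, replacing `logits` with `output` from the usage example should give one token id per input position.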
# License
MIT