https://github.com/kyegomez/audioflamingo

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities"

Host: GitHub
URL: https://github.com/kyegomez/audioflamingo
Owner: kyegomez
License: mit
Created: 2024-02-06T21:37:32.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-01-27T16:43:48.000Z (over 1 year ago)
Last Synced: 2025-04-15T12:49:51.362Z (about 1 year ago)
Topics: ai, artificial-intelligence, attention-is-all-you-need, attention-mechanism, attention-model, audio, audio-ai, deeplearning, llm, machine-learning, ml, transformer
Language: Python
Homepage: https://discord.gg/GYbXvDGevY
Size: 2.17 MB
Stars: 40
Watchers: 5
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

README

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# AudioFlamingo

Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities". [PAPER LINK](https://arxiv.org/pdf/2402.01831.pdf)

## Install

`pip3 install audio-flamingo`

## Usage

```python
import torch

from audio_flamingo.model import AudioFlamingo

# Generate a random input sequence
text = torch.randint(0, 256, (1, 1024))
audio = torch.randn(1, 16000)

# Initialize AudioFlamingo model
model = AudioFlamingo(
    dim=512,
    num_tokens=256,
    max_seq_len=1024,
    heads=8,
    depth=6,
    dim_head=64,
    dropout=0.1,
    context_dim=512,
)

# Pass the input sequence through the model
output = model(text, audio)  # (1, 1024, 256)

# Print the output shape
print(output.shape)

# Path: audio_flamingo/model.py
```
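
The usage example stops at printing the output shape. As a minimal sketch of one possible next step, the snippet below assumes the last axis of that `(1, 1024, 256)` output holds unnormalized scores over the `num_tokens=256` vocabulary; the README does not state this, so treat it as an assumption. A random stand-in tensor named `logits` (a hypothetical name) is used in place of the real model output so the snippet runs without the package installed.

```python
import torch

# Stand-in for the model output; the usage example comments its shape as (1, 1024, 256).
# Assumption: the last axis holds unnormalized scores over the 256-token vocabulary.
torch.manual_seed(0)
logits = torch.randn(1, 1024, 256)

# Convert the scores to per-position probabilities over the vocabulary.
probs = torch.softmax(logits, dim=-1)  # (1, 1024, 256)

# Greedy choice: pick the highest-scoring token id at each position.
token_ids = logits.argmax(dim=-1)  # (1, 1024)

print(probs.shape, token_ids.shape)
```

If the assumption holds, replacing the stand-in with `output` from the usage example should give tensors of the same shapes.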

## License

MIT
