https://github.com/archinetai/audio-encoders-pytorch

A collection of audio autoencoders, in PyTorch.

artificial-intelligence audio deep-learning

Host: GitHub
URL: https://github.com/archinetai/audio-encoders-pytorch
Owner: archinetai
License: mit
Created: 2022-10-26T12:18:29.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-03-07T00:08:24.000Z (over 2 years ago)
Last Synced: 2025-03-21T03:34:23.974Z (4 months ago)
Topics: artificial-intelligence, audio, deep-learning
Language: Python
Homepage:
Size: 79.1 KB
Stars: 40
Watchers: 4
Forks: 6
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

## Audio Encoders - PyTorch

A collection of audio autoencoders, in PyTorch. Pretrained models can be found at [`archisound`](https://github.com/archinetai/archisound).

## Install

```bash
pip install audio-encoders-pytorch
```

[![PyPI - Python Version](https://img.shields.io/pypi/v/audio-encoders-pytorch?style=flat&colorA=black&colorB=black)](https://pypi.org/project/audio-encoders-pytorch/)

## Usage

### AutoEncoder1d

```py
import torch
from audio_encoders_pytorch import AutoEncoder1d

autoencoder = AutoEncoder1d(
    in_channels=2,              # Number of input channels
    channels=32,                # Number of base channels
    multipliers=[1, 1, 2, 2],   # Channel multiplier between layers (i.e. channels * multipliers[i] -> channels * multipliers[i+1])
    factors=[4, 4, 4],          # Downsampling/upsampling factor per layer
    num_blocks=[2, 2, 2]        # Number of resnet blocks per layer
)

x = torch.randn(1, 2, 2**18)    # [1, 2, 262144]
x_recon = autoencoder(x)        # [1, 2, 262144]
```
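
The configuration comments above imply some simple arithmetic, which the following minimal sketch works through in plain Python. It does not call the library: `describe_autoencoder` is a hypothetical helper, and it assumes that each layer divides the time axis by its factor, so that the factors compose multiplicatively. The actual latent shape produced by `AutoEncoder1d` may differ, so treat the numbers as an estimate.

```python
from math import prod


def describe_autoencoder(channels, multipliers, factors, length):
    """Estimate per-layer channel widths and latent length for a config.

    Hypothetical helper, not part of audio_encoders_pytorch.
    Assumption: each layer divides the time axis by its factor.
    """
    # Layer i maps channels * multipliers[i] -> channels * multipliers[i + 1].
    widths = [
        (channels * m_in, channels * m_out)
        for m_in, m_out in zip(multipliers[:-1], multipliers[1:])
    ]
    total_factor = prod(factors)
    if length % total_factor != 0:
        raise ValueError(f"length {length} is not divisible by {total_factor}")
    return widths, total_factor, length // total_factor


widths, total_factor, latent_length = describe_autoencoder(
    channels=32, multipliers=[1, 1, 2, 2], factors=[4, 4, 4], length=2**18
)
print(widths)         # [(32, 32), (32, 64), (64, 64)]
print(total_factor)   # 64
print(latent_length)  # 4096
```

Under these assumptions, choosing an input length that is a multiple of the product of `factors` (here 64) keeps every intermediate length an integer.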

### Discriminator1d

```py
import torch
from audio_encoders_pytorch import Discriminator1d

discriminator = Discriminator1d(
    in_channels=2,                  # Number of input channels
    channels=32,                    # Number of base channels
    multipliers=[1, 1, 2, 2],       # Channel multiplier between layers (i.e. channels * multipliers[i] -> channels * multipliers[i+1])
    factors=[8, 8, 8],              # Downsampling factor per layer
    num_blocks=[2, 2, 2],           # Number of resnet blocks per layer
    use_loss=[True, True, True]     # Whether to use this layer as GAN loss
)

wave_true = torch.randn(1, 2, 2**18)
wave_fake = torch.randn(1, 2, 2**18)

loss_generator, loss_discriminator = discriminator(wave_true, wave_fake)
# Example output (values vary): tensor(0.613949, grad_fn=...) tensor(0.097330, grad_fn=...)
```
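
In both examples the per-layer lists follow the same pattern: `multipliers` has one more entry than `factors`, while `num_blocks` and `use_loss` have the same number of entries as `factors`. The sketch below checks a configuration against that pattern before a model is built. `check_layer_config` is a hypothetical helper, and the length rules are inferred from the examples in this README rather than confirmed against the library.

```python
def check_layer_config(multipliers, factors, num_blocks, use_loss=None):
    """Return a list of problems found in a per-layer configuration.

    Hypothetical helper, not part of audio_encoders_pytorch. The expected
    lengths are an assumption inferred from the README examples.
    """
    num_layers = len(factors)
    problems = []
    if len(multipliers) != num_layers + 1:
        problems.append(
            f"multipliers has {len(multipliers)} entries, expected {num_layers + 1}"
        )
    if len(num_blocks) != num_layers:
        problems.append(
            f"num_blocks has {len(num_blocks)} entries, expected {num_layers}"
        )
    if use_loss is not None and len(use_loss) != num_layers:
        problems.append(
            f"use_loss has {len(use_loss)} entries, expected {num_layers}"
        )
    return problems


# The discriminator configuration from the example above passes.
ok = check_layer_config([1, 1, 2, 2], [8, 8, 8], [2, 2, 2], [True, True, True])

# A configuration with a short multipliers list and a short use_loss list.
bad = check_layer_config([1, 1, 2], [8, 8, 8], [2, 2, 2], [True, True])

print(ok)
for problem in bad:
    print(problem)
```

Returning a list of messages rather than raising on the first mismatch lets a caller report every inconsistency in one pass.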
