Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/andimarafioti/audiocontextencoder
A context encoder for audio inpainting
- Host: GitHub
- URL: https://github.com/andimarafioti/audiocontextencoder
- Owner: andimarafioti
- Created: 2018-02-06T15:03:10.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2022-11-21T21:11:39.000Z (about 2 years ago)
- Last Synced: 2023-03-04T05:18:18.177Z (almost 2 years ago)
- Topics: context-encoder, machine-learning, paper
- Language: Jupyter Notebook
- Homepage:
- Size: 6.82 MB
- Stars: 23
- Watchers: 8
- Forks: 2
- Open Issues: 6
Metadata Files:
- Readme: README.md
README
# Audio inpainting with a context encoder
This project accompanies research on the audio inpainting of small gaps carried out at the Acoustics Research Institute in Vienna in collaboration with the Swiss Data Science Center. The paper was [published in IEEE TASLP](https://ieeexplore.ieee.org/document/8867915).
# Installation
Install the requirements with `pip install -r requirements.txt`. For Windows users, the numpy version should be 1.14.0+mkl (find it [here](https://www.lfd.uci.edu/~gohlke/pythonlibs/)). For the FMA dataset, librosa requires ffmpeg as an mp3 backend.
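As a quick sanity check (a minimal sketch, not part of the repository), you can verify from Python that ffmpeg is on the PATH before building the FMA dataset:

```python
# Minimal sketch: verify ffmpeg is available before librosa tries to
# decode mp3 files from the FMA dataset.
import shutil

if shutil.which("ffmpeg") is None:
    raise RuntimeError("ffmpeg not found on PATH; librosa needs it to read mp3 files")
```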
# Instructions
The paper uses both Google's NSynth dataset and the FMA dataset. To recreate the datasets used, run either `python make_nsynthdataset.py` or `python make_fmadataset.py` from the parent folder. Each script outputs three `tfrecord` files, for training, validating, and testing the model; a quick way to sanity-check them is sketched below.
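A minimal sketch for checking that the files were written, assuming TensorFlow is installed; the file names below are placeholders, so substitute the names the scripts actually emit:

```python
# Sketch: count the examples in each generated tfrecord file.
# File names are placeholders; use the names the scripts emit.
import tensorflow as tf

for path in ("train.tfrecord", "valid.tfrecord", "test.tfrecord"):
    n = sum(1 for _ in tf.data.TFRecordDataset(path))
    print(f"{path}: {n} records")
```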
The default network parameters come pickled in the files `magnitude_network_parameters.pkl` and `complex_network_parameters.pkl`. To define other architectures, use [saveParameters.py](utils/saveParameters.py).
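Before writing your own parameter file, it can help to unpickle a default one and inspect it; a sketch, since the structure of the object is whatever `saveParameters.py` stored:

```python
# Sketch: inspect the pickled default parameters before defining a new
# architecture. The structure depends on what saveParameters.py stored.
import pickle

with open("magnitude_network_parameters.pkl", "rb") as f:
    params = pickle.load(f)

print(type(params))
print(params)
```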
To train the network, run `python trainMagnitudeNetwork.py` or `python trainComplexNetwork.py` from the parent folder. This trains the network for 600k steps with a learning rate of 1e-3. You can select which tfrecords to train on; by default the scripts assume you have created the NSynth dataset.
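For intuition, the training task can be pictured as follows: cut a short gap out of an audio window, give the network the surrounding context, and ask it to predict the missing part. A conceptual sketch in NumPy, where the sample rate and gap length are arbitrary example values rather than the paper's exact settings:

```python
# Conceptual sketch of the inpainting task (not the repository's code):
# the model receives the context around a gap and predicts the gap itself.
import numpy as np

fs = 16000                                            # example sample rate
window = np.random.randn(fs // 4).astype(np.float32)  # 250 ms excerpt
gap = int(0.064 * fs)                                 # example: a 64 ms gap
start = (len(window) - gap) // 2

target = window[start:start + gap].copy()   # what the network should output
context = window.copy()
context[start:start + gap] = 0.0            # network input: gap zeroed out
```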
## Sound examples
- To hear examples, please go to the [accompanying website](https://andimarafioti.github.io/audioContextEncoder/).