https://github.com/mdeff/dlaudio_report

Master thesis: Structured Auto-Encoder with application to Music Genre Recognition (report)

auto-encoders deep-learning music-information-retrieval


Host: GitHub
URL: https://github.com/mdeff/dlaudio_report
Owner: mdeff
License: cc-by-4.0
Created: 2015-10-28T11:26:21.000Z (over 10 years ago)
Default Branch: master
Last Pushed: 2020-04-18T15:06:26.000Z (about 6 years ago)
Last Synced: 2025-01-22T14:08:39.723Z (over 1 year ago)
Topics: auto-encoders, deep-learning, music-information-retrieval
Language: TeX
Homepage: https://infoscience.epfl.ch/record/218019
Size: 303 KB
Stars: 6
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt


# Master thesis: Structured Auto-Encoder with application to Music Genre Recognition

[Michaël Defferrard](https://deff.ch).
Supervised by [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson),
[Johan Paratte](https://www.linkedin.com/in/johan-paratte-a2070039),
[Pierre Vandergheynst](https://people.epfl.ch/pierre.vandergheynst).

> In this work, we present a technique that learns discriminative audio
> features for Music Information Retrieval (MIR). The novelty of the proposed
> technique is to design auto-encoders that make use of data structures to
> learn enhanced sparse data representations. The data structure is borrowed
> from the Manifold Learning field: data are assumed to be sampled
> from smooth manifolds, which are here represented by graphs of proximities of
> the input data. As a consequence, the proposed auto-encoders find sparse
> data representations that are quite robust w.r.t. perturbations. The model is
> formulated as a non-convex optimization problem. However, it can be
> decomposed into iterative sub-optimization problems that are convex and for
> which well-posed iterative schemes are provided in the context of the Fast
> Iterative Shrinkage-Thresholding (FISTA) framework. Our numerical experiments
> show two main results. Firstly, our graph-based auto-encoders improve the
> classification accuracy by 2% over the auto-encoders without graph structure
> for the popular GTZAN music dataset. Secondly, our model is significantly
> more robust as it is 8% more accurate than the standard model in the presence
> of 10% of perturbations.
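
The abstract refers to convex sub-problems solved with FISTA. As a rough illustration only, the sketch below applies FISTA to the simplest sparse-coding sub-problem, `min_z 0.5 * ||x - D z||_2^2 + lam * ||z||_1`. It is not the model from the thesis, which also involves a graph-based term and an encoder. The function names, the dictionary `D`, and the parameter `lam` are assumptions introduced for this example.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: element-wise shrinkage towards zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def fista_sparse_code(D, x, lam, n_iter=200):
    """Approximately minimise 0.5 * ||x - D z||_2^2 + lam * ||z||_1 over z.

    D is a (hypothetical) dictionary of shape (n_features, n_atoms),
    x a signal of shape (n_features,), lam the sparsity weight.
    """
    # Lipschitz constant of the gradient of the smooth term:
    # the squared spectral norm of D.
    lipschitz = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    y = z.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = D.T @ (D @ y - x)  # gradient of the smooth term at y
        z_next = soft_threshold(y - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Extrapolation step: this is what distinguishes FISTA from plain ISTA.
        y = z_next + ((t - 1.0) / t_next) * (z_next - z)
        z, t = z_next, t_next
    return z
```

With an identity dictionary the minimiser is simply the soft-thresholded signal, which gives a quick sanity check for the routine.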

## Content

This repository contains the LaTeX sources of my master's thesis report.

Related resources:
* Report:
* Slides:
* Code:
* Experimental results:
* LaTeX sources of the report:

## Compilation

```
make
```

PDF available at .
