An open API service indexing awesome lists of open source software.

https://github.com/oscarknagg/voicemap

Identifying people from small audio fragments
https://github.com/oscarknagg/voicemap

convolutional-neural-networks machine-learning speaker-identification speaker-recognition

Last synced: about 1 month ago
JSON representation

Identifying people from small audio fragments

Awesome Lists containing this project

README

        

# voicemap

This repository contains code to build deep learning models to identify
different speakers based on audio samples containg their voice.

The eventual aim is for this repository to become a pip-installable
python package for quickly and easily performing speaker identification
related tasks.

**This tensorflow/Keras/python2.7 branch is discontinued. Work is
continuing on the pytorch-python-3.6 branch which will become the
master branch.**

## Instructions
#### Requirements
Make a new virtualenv and install requirements from `requirements.txt`
with the following command.
```
pip install -r requirements.txt
```
This project was written in Python 2.7.12 so I cannot guarantee it works
on any other version.

#### Data
Get training data here: http://www.openslr.org/12
- train-clean-100.tar.gz
- train-clean-360.tar.gz
- dev-clean.tar.gz

Place the unzipped training data into the `data/` folder so the file
structure is as follows:
```
data/
LibriSpeech/
dev-clean/
train-clean-100/
train-clean-360/
SPEAKERS.TXT
```

Please use the `SPEAKERS.TXT` supplied in the repo as I've made a few
corrections to the one found at openslr.org.

#### Run tests

This requires the LibriSpeech data.
```
python -m unittest tests.tests
```

## Contents
### voicemap
This package contains re-usable code for defining network architectures,
interacting with datasets and many utility functions.

### experiments
This package contains experiments in the form of python scripts.

### notebooks
This folder contains Jupyter notebooks used for interactive
visualisation and analysis.