https://github.com/myurasov/kaggle-tf-speech

Code for TensorFlow Speech Recognition Challenge: https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
deep-learning kaggle keras speech-recognition tensorflow

Host: GitHub
URL: https://github.com/myurasov/kaggle-tf-speech
Owner: myurasov
Created: 2017-12-08T07:35:13.000Z
Default Branch: master
Last Pushed: 2018-01-25T02:45:28.000Z
Last Synced: 2025-04-06T16:45:34.924Z
Topics: deep-learning, kaggle, keras, speech-recognition, tensorflow
Language: Jupyter Notebook
Homepage:
Size: 66.3 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Kaggle: TensorFlow Speech Recognition Challenge
https://www.kaggle.com/c/tensorflow-speech-recognition-challenge/

![Level 1 models training graphs](docs/l1-folds.png)

## V1 Flow

### Generating training data

1. Sample one of the valid labels (plus `unknown` and `silence`)
1. Pick one of the clips for that label or...
1. ...if `silence` was picked, generate a silence clip from the provided background noise
1. Randomly mix the sample with the provided background noise and transform pitch/speed/volume
1. Compute the mel-scaled spectrogram
1. Standardize (mean, std dev) with a pre-fit scaler
1. ...
1. profit!
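
The steps above could be sketched roughly as follows. This is a NumPy-only illustration, not the code from this repository: the 16 kHz sample rate, the FFT/hop/mel sizes, the gain ranges, and every function name (`sample_label`, `make_clip`, `mix_with_noise`, `log_mel_spectrogram`, `fit_scaler`, `standardize`) are assumptions made for the example. Pitch and speed transforms are left out, and the "pre-fit scaler" is approximated by a single global mean and standard deviation.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumption: 16 kHz mono clips, about one second long


def sample_label(labels, rng):
    """Draw a target label uniformly at random."""
    return labels[rng.integers(len(labels))]


def random_crop(noise, length, rng):
    """Cut a random window out of a longer background-noise recording."""
    start = rng.integers(0, len(noise) - length + 1)
    return noise[start:start + length]


def make_clip(label, clips_by_label, noise, rng, length=SAMPLE_RATE):
    """Pick a recorded clip, or build a 'silence' clip from background noise."""
    if label == "silence":
        return rng.uniform(0.0, 1.0) * random_crop(noise, length, rng)
    candidates = clips_by_label[label]
    return candidates[rng.integers(len(candidates))]


def mix_with_noise(clip, noise, rng, max_noise_gain=0.3, volume_range=(0.7, 1.3)):
    """Add scaled background noise, then apply a random volume change."""
    gain = rng.uniform(0.0, max_noise_gain)
    noisy = clip + gain * random_crop(noise, len(clip), rng)
    return np.clip(noisy * rng.uniform(*volume_range), -1.0, 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz = to_hz(np.linspace(to_mel(0.0), to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * hz / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):   # rising edge
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):  # falling edge
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb


def log_mel_spectrogram(signal, sr=SAMPLE_RATE, n_fft=512, hop=160, n_mels=40):
    """Frame the signal, take the power spectrum, project onto mel filters."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)


def fit_scaler(spectrograms):
    """Fit a global mean/std once, ahead of training."""
    values = np.concatenate([s.ravel() for s in spectrograms])
    return values.mean(), values.std() + 1e-8


def standardize(spectrogram, mean, std):
    """Apply the pre-fit statistics to one spectrogram."""
    return (spectrogram - mean) / std
```

With these defaults a one-second clip yields a spectrogram of shape `(97, 40)` (frames by mel bands). In practice an audio library would normally replace the hand-written filterbank.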

### Inference

1. Output model activations (after softmax) to CSV for multiple training runs/model variations
1. Generate submission with voting/averaging strategy
1. Predict the same file many times with different transformations and average/vote the results (?, if performance allows)
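
A minimal sketch of the two combination strategies, assuming the per-run softmax outputs have already been loaded from CSV into one array of shape `(n_runs, n_files, n_classes)`. The function names and array layout are illustrative assumptions, not taken from this repository. The same functions apply to test-time augmentation if each "run" is a prediction on a differently transformed copy of the file.

```python
import numpy as np


def average_predictions(probs):
    """Average softmax outputs across runs, then take the best class per file."""
    return probs.mean(axis=0).argmax(axis=1)


def vote_predictions(probs):
    """Let each run vote for its top class; ties go to the lowest class index."""
    n_runs, n_files, n_classes = probs.shape
    votes = probs.argmax(axis=2)  # shape (n_runs, n_files)
    counts = np.zeros((n_files, n_classes), dtype=int)
    for run_votes in votes:
        counts[np.arange(n_files), run_votes] += 1
    return counts.argmax(axis=1)
```

The two strategies can disagree: averaging lets one very confident run outweigh two lukewarm ones, while voting counts every run equally.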

## Ideas

- Record more noise

## V2 Flow

1. Generate holdout set
1. Generate 10 folds from filenames
1. Generate training set excl. holdout set
1. Train 10 L1 models, predict on test and holdout sets
1. Train L2 model from predictions on holdout set
1. Predict using L1 test predictions as inputs
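
The stacking flow above might look something like the sketch below. Everything here is an assumption for illustration: the helper names, the random split by filename, and above all the L2 model, which is a plain least-squares fit on one-hot targets standing in for whatever L2 model was actually trained. The L1 models themselves are not shown; their predictions are expected as an array of shape `(n_models, n_files, n_classes)`.

```python
import numpy as np


def split_holdout(filenames, holdout_fraction, rng):
    """Set aside a random share of the files as the holdout set."""
    names = np.array(sorted(filenames))
    perm = rng.permutation(len(names))
    n_holdout = int(round(holdout_fraction * len(names)))
    return names[perm[n_holdout:]], names[perm[:n_holdout]]


def make_folds(filenames, n_folds, rng):
    """Partition the remaining filenames into n_folds disjoint folds."""
    names = np.array(filenames)
    perm = rng.permutation(len(names))
    return [names[idx] for idx in np.array_split(perm, n_folds)]


def stack_features(l1_probs):
    """Concatenate every L1 model's class probabilities per file."""
    n_models, n_files, n_classes = l1_probs.shape
    return l1_probs.transpose(1, 0, 2).reshape(n_files, n_models * n_classes)


def fit_l2(features, labels, n_classes):
    """Stand-in L2 model: least-squares regression onto one-hot labels."""
    X = np.hstack([features, np.ones((len(features), 1))])  # add bias column
    Y = np.eye(n_classes)[labels]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def predict_l2(W, features):
    """Score stacked L1 predictions and return the best class per file."""
    X = np.hstack([features, np.ones((len(features), 1))])
    return (X @ W).argmax(axis=1)
```

In use, `fit_l2` would be called on the stacked L1 predictions for the holdout set (whose labels are known), and `predict_l2` on the stacked L1 predictions for the test set.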
