https://github.com/dachosen1/common-voice
Audio Classification with machine learning
audio-processing machine-learning streaming-audio voice
- Host: GitHub
- URL: https://github.com/dachosen1/common-voice
- Owner: dachosen1
- License: mit
- Created: 2020-02-19T02:14:20.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-03-06T22:03:49.000Z (over 3 years ago)
- Last Synced: 2025-04-20T02:55:38.070Z (about 1 year ago)
- Topics: audio-processing, machine-learning, streaming-audio, voice
- Language: Python
- Homepage:
- Size: 67.5 MB
- Stars: 18
- Watchers: 1
- Forks: 5
- Open Issues: 14
Metadata Files:
- Readme: README.md
- License: LICENSE
## Common Voice
**Live Deployment**: [Commvoice](https://commvoice.me/)
### Data
The data for this project is sourced from [Common Voice](https://commonvoice.mozilla.org/en), a crowdsourcing project started by Mozilla to create a free database for speech recognition software. Volunteers record sample sentences with a microphone and review other users' recordings. The transcribed sentences are collected in a voice database released under the CC0 public domain dedication, so developers can use it for voice-to-text applications without restrictions or costs. Common Voice emerged as a response to the voice assistants of large companies, such as Amazon Echo, Siri, and Google Assistant.
# Overview
The goal of this project is to create an end-to-end machine learning application that records and processes audio in real time and streams predictions via a socket API. There is a 1-second delay between the audio recording and the output prediction.

The application generates predictions in three categories: Gender, Age, and Country of Origin.
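
The flow described above (buffer incoming audio, predict once per second, emit a message per prediction) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the 16 kHz sample rate, the function names, the placeholder model, and the JSON message shape are all assumptions made for the example.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

SAMPLE_RATE = 16_000  # assumed sample rate in Hz; not taken from the repository
WINDOW_SECONDS = 1    # mirrors the 1-second delay described in the overview


def one_second_windows(chunks: Iterable[List[float]],
                       sample_rate: int = SAMPLE_RATE) -> Iterator[List[float]]:
    """Buffer incoming audio chunks and yield one full window at a time."""
    window_size = sample_rate * WINDOW_SECONDS
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every complete window; leftover samples wait for the next chunk.
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            buffer = buffer[window_size:]


def placeholder_model(window: List[float]) -> Dict[str, str]:
    """Hypothetical stand-in for the trained models; always returns 'unknown'."""
    return {"gender": "unknown", "age": "unknown", "country": "unknown"}


def stream_predictions(chunks: Iterable[List[float]],
                       model: Callable[[List[float]], Dict[str, str]] = placeholder_model,
                       sample_rate: int = SAMPLE_RATE) -> Iterator[str]:
    """Yield one JSON message per window, ready to be sent over a socket."""
    for window in one_second_windows(chunks, sample_rate):
        yield json.dumps(model(window))
```

Each yielded string could then be handed to whatever socket layer the application uses. Samples that do not fill a complete window stay in the buffer, which is where the one-second lag between recording and prediction comes from in this sketch.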
## Todo:
Train and implement models for Country and Age
# Getting Started:
## Train Model
Modify the `DataDirectory` class so that the paths point to your own directories:
```
class DataDirectory:
    DATA_DIR = r"C:\Users\ander\Documents\common-voice-data"
    DEV_DIR = r"C:\Users\ander\Documents\common-voice-dev"
    CLIPS_DIR = r"C:\Users\ander\Documents\common-voice-data\clips"
```
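
As an alternative to editing hard-coded paths, the same class could read its locations from environment variables. The sketch below is only a suggestion: the variable names `COMMON_VOICE_DATA_DIR` and `COMMON_VOICE_DEV_DIR` and the home-directory defaults are hypothetical and do not exist in the repository.

```python
import os
from pathlib import Path


class DataDirectory:
    # Hypothetical environment variables; fall back to folders in the home directory.
    DATA_DIR = Path(os.environ.get("COMMON_VOICE_DATA_DIR",
                                   Path.home() / "common-voice-data"))
    DEV_DIR = Path(os.environ.get("COMMON_VOICE_DEV_DIR",
                                  Path.home() / "common-voice-dev"))
    # The clips folder sits inside the data directory, as in the original layout.
    CLIPS_DIR = DATA_DIR / "clips"
```

Using `pathlib.Path` keeps the paths portable between Windows and Linux, which matters here because the model is trained on one machine and deployed on another.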
Run the machine learning pipeline:
```
python run_pipeline.py
```
## Deploy Model
Windows Machine
```
pip3 install -r requirements.txt
python3 run_app.py
```
Linux
Install the following on the server before running the Docker images.
**Step 1:** Install Docker and Docker Compose
```
# Install Docker compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.28.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
# Install Docker
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
```
**Step 2:** Install Nginx
```
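# Install NGINX together with Certbot and its NGINX plugin
sudo apt install nginx certbot python3-certbot-nginx
# Edit the site configuration and set server_name to your domain(s)
sudo nano /etc/nginx/sites-available/commvoice.me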
...
server_name WEBSITE_NAME WEBSITE_NAME;
...
```
**Step 3:** Set up Certbot
```
# Request and install a TLS certificate with Certbot
sudo certbot --nginx -d commvoice.me -d www.commvoice.me
# Generate Diffie-Hellman parameters
sudo openssl dhparam -out /etc/nginx/dhparam.pem 2048
# Check that the Certbot auto-renewal timer is active
systemctl status certbot.timer
```
**Step 4:** Install an audio driver
Enable the snd-aloop module:
```
sudo modprobe snd-aloop
```
The following devices should now appear in your /dev/snd directory.
```
ls /dev/snd/
- pcmC0D0c
- pcmC0D0p
- pcmC0D1c
```
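
The same check can be scripted before starting the containers. The helper below is a hypothetical sketch, not part of the repository: it only looks for the three device names from the listing above, and it assumes the loopback device was registered as card 0, which may not hold on machines with other sound cards.

```python
from pathlib import Path
from typing import Iterable, List

# Device names copied from the listing above; the card index may differ per machine.
EXPECTED_DEVICES = ("pcmC0D0c", "pcmC0D0p", "pcmC0D1c")


def missing_sound_devices(snd_dir: str = "/dev/snd",
                          expected: Iterable[str] = EXPECTED_DEVICES) -> List[str]:
    """Return the expected device names that are not present in snd_dir."""
    directory = Path(snd_dir)
    # A missing directory is treated the same as an empty one.
    present = {entry.name for entry in directory.iterdir()} if directory.is_dir() else set()
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    missing = missing_sound_devices()
    if missing:
        print("Missing devices:", ", ".join(missing))
    else:
        print("All expected loopback devices are present.")
```

An empty result suggests the module loaded as expected. A non-empty result means the listed names were not found, either because `snd-aloop` is not loaded or because the devices were given a different card number.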
**Step 5:** Run Docker Compose
```
docker-compose up --build
```