Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chris-santiago/emonet
CNN-LSTM model for audio emotion detection in children with adverse childhood events.
https://github.com/chris-santiago/emonet
adverse-childhood-events audio-classification cnn-lstm emotion-detection emotion-recognition melspectrogram pytorch pytorch-lightning torchaudio torchvision
Last synced: about 1 month ago
JSON representation
CNN-LSTM model for audio emotion detection in children with adverse childhood events.
- Host: GitHub
- URL: https://github.com/chris-santiago/emonet
- Owner: chris-santiago
- License: mit
- Created: 2022-07-19T17:15:55.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2022-07-20T14:55:12.000Z (over 2 years ago)
- Last Synced: 2024-12-07T19:36:10.264Z (about 1 month ago)
- Topics: adverse-childhood-events, audio-classification, cnn-lstm, emotion-detection, emotion-recognition, melspectrogram, pytorch, pytorch-lightning, torchaudio, torchvision
- Language: Jupyter Notebook
- Homepage: https://chris-santiago.github.io/emonet/
- Size: 28.8 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Citation: CITATION.cff
Awesome Lists containing this project
README
# emonet
[![DOI](https://zenodo.org/badge/515677040.svg)](https://zenodo.org/badge/latestdoi/515677040)
A package to model negative emotion severity in patients with adverse childhood events (ACE).
**Contributors**: [@chris-santiago](https://github.com/chris-santiago), [@gonzalezeric](https://github.com/gonzalezeric), [@costargc](https://github.com/costargc)
## Installation & Environment Setup
### Using Docker
1. Open Terminal on Mac or PowerShell on Windows from the **root directory**.
2. When the application is open, in the command line, build the image using the following: `docker build -t cjsantiago/emonet-model -f docker/model/Dockerfile .`
3. Run container `docker run -it --name emonet -p 8888:8888 -v "${HOME}"/emonet-data:/home/jovyan/emonet-data cjsantiago/emonet-model`
- **Important**: Training the model or running batch inference presumes that you have a `emonet-data` directory within your home folder, containing the original `voice_labeling_report` directory. This will allow you to replicate all batch preprocessing done prior to model training.
- You can score file(s) or signal(s), either on their own or with your own custom DataLoader, without the data directory (described above).
- See `docker/model/README.md` for more.
4. Once the container has been created, you may access the files using one of the URLs generated in the CLI.### Using Conda
1. Clone this repo, then `cd emonet`
2. Create a virtual environment
- For training and notebooks, use `conda env create -f env-base-plus.yml`
- For scoring, only, use `conda env create -f env-base-yml`
3. Install `emonet` package, `pip install -e .`**NOTE**: We're installing in editable mode (`-e` flag) as we expect to run training and/or scoring
from this cloned repo. Editable mode will symlink source code from the cloned repo directory to the
appropriate Python interpreter, enabling source code edits and easy-access to our saved models under
the `saved_models` directory.#### Installing ffmpeg
`ffmpeg` is required to convert `.m4a` to `.wav`. On Mac this can be installed via [Homebrew](https://formulae.brew.sh/formula/ffmpeg). *Skip this if you're running via Docker.*
## Data Setup
To use our original datset splits, we recommend downloading directly from our S3 bucket. This also
removes the need to complete some time-consuming preprocessing steps.### Download from S3
*Assumes that you have the AWS CLI tool installed on your machine (and that you have our credentials :grinning:)*.
Within your home folder, create a directory called `emonet-data`. You could also use our directory
setup script `python emonet/dir_setup.py`.From the `emonet-data` directory, run this command to sync (copy) the required directories and files
directly from our S3 bucket.*Note that this assumes our credentials are located within `~/.aws/credentials`*
```bash
aws s3 sync s3://gatech-emonet/eval/ .
```### From Scratch
Once you've setup the environment and installed the `emonet` package:
Run `python emonet/data_setup.py`
**Note:** you can pass an optional number of max_workers to this command; the default is 8 (threads).
`python emonet/data_setup.py 16`
This script will run and perform the following:
1. dir_setup.py: Set up a data directory within the home folder
2. m4a_to_wav.py: Convert any `.m4a` files to `.wav`
3. batch_resample.py: Resample files to 16,000Hz
4. batch_vad.py: Run voice activity detection (VAD)
5. data_prep.py: Create train/valid/test splits and respective manifests
6. wav_splitter.py: Split `.wav` files into 8-second chunks, the create new train/valid/test manifests that use the chunked `.wav` files## Training
Now that files have all been converted to WAV, preprocessed with VAD and split training, validation and
testing sets, and chunked into 8-second segments:### Command Line Tool
The easiest way to run the model is via the CLI:Run
```bash
python emonet/train.py
```and pass in the desired number of `workers`, `epochs` and `emotion`.
Example:
```bash
python train.py 12 300 anger
```### Python
You can also train the model in Python:Open a .py file or notebook and run
```python
from emonet import DATA_DIR
from emonet.train import mainmain(workers=12, epochs=300, emotion="sadness", use_wandb=False, data_dir=DATA_DIR)
```and pass in the desired number of `workers`, `epochs` and `emotion`; you can log runs to Weights &
Biases by setting `use_wandb=True`, and change the default data directory using the `data_dir` parameter.## Pretrained Models
**No pretrained models are included in this public-facing repo.**
## Scoring
### Command Line Tool
The easiest way to score (using our pretrained models) is via our CLI tool, `emonet`. The syntax for
this tool is:```bash
emonet
```Example:
```bash
emonet anger /Users/christophersantiago/emonet-data/wavs/6529_53113_1602547200.wav
```**Note**: This CLI tool will run VAD on the `.wav` file, and can accept arbitrary length-- despite
model being trained on 8-second chunks. Therefore, you should use an original `.wav` of the sample
you wish to score, **not** a `.wav` that's been preprocessed with VAD.### Python
You can also score via Python:
```python
from emonet import DATA_DIR, ROOT
from emonet.model import EmotionRegressordef get_saved(emotion: str):
return ROOT.joinpath('saved_models', f'{emotion}.ckpt')emotion = 'fear'
model = EmotionRegressor.load_from_checkpoint(get_saved(emotion))file = 'path-to-my-file'
model.score_file(file=file, sample_rate=16000, vad=True)
```See `inference.py` for an example of how we scored our testing sets.