https://github.com/rezachu/emotion_recognition_cnn
- Host: GitHub
- URL: https://github.com/rezachu/emotion_recognition_cnn
- Owner: rezachu
- License: mit
- Created: 2019-05-24T09:44:17.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-21T22:30:10.000Z (almost 2 years ago)
- Last Synced: 2024-08-02T17:36:27.348Z (3 months ago)
- Language: Jupyter Notebook
- Size: 1.38 MB
- Stars: 90
- Watchers: 4
- Forks: 30
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
# Speech Emotion Recognition with Convolution Neural Network
#### *Author @rezachu*
## I. Introduction
- This is a CNN Speech Emotion Recognition Model I found on [GitHub](https://github.com/MITESHPUTHRANNEU/Speech-Emotion-Analyzer).
- According to the repository author, there is no paper to reference.
## II. Package Required
- Please refer to the requirements file: [requirements.txt](./requirements.txt)
## III. To Run
- Please run the notebook named: **CNN_emotion_recognition.ipynb**
- Please create a `./data/` folder and put all of the data inside.
- Please create a `./model/` folder and set it as the model weight saving directory.
## IV. Preparation: Understanding the Data from Repo
Data Set: [The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)](https://zenodo.org/record/1188976#.XN0fwnUzZhE)
- 12 actors and 12 actresses each recorded a speech version and a song version.
- Actor no. 18 does not have song version data.
- The emotions `Disgust` and `Surprised` are not included in the song version data.

Total Class:
| Emotion | Speech Sample Count | Song Sample Count | Summed Count |
| ---- | ---- | ---- | ---- |
| Neutral | 96 | 92 | 188 |
| Calm | 192 | 184 | 376 |
| Happy | 192 | 184 | 376 |
| Sad | 192 | 184 | 376 |
| Angry | 192 | 184 | 376 |
| Fearful | 192 | 184 | 376 |
| Disgust | 192 | 0 | 192 |
| Surprised | 192 | 0 | 192 |
| Total | 1440 | 1012 | 2452 |
### Sample Distribution:
- Originally, there are 16 target classes in total (8 emotions, each split into male and female) across the 1440 speech-only samples. The author removed `disgust`, `surprised` and `neutral` for both genders, which reduced the number of target classes to 10.
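
The class reduction described above can be sketched in a few lines. This is only an illustration, not the notebook's code: it assumes the standard RAVDESS file-naming convention (seven dash-separated fields, with the emotion code in the third field and the actor number in the seventh, odd actor numbers being male). The helper `synthetic_speech_filenames` is hypothetical and only generates file names so that the counts can be checked without downloading the data.

```python
from collections import Counter
from itertools import product

# RAVDESS emotion codes (third field of the file name), assumed from the
# dataset's published naming convention.
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
# Emotions dropped from the target classes, as described above.
EXCLUDED = {"neutral", "disgust", "surprised"}


def label_from_filename(name):
    """Return a 'gender_emotion' label, or None if the emotion is excluded."""
    fields = name.rsplit(".", 1)[0].split("-")
    emotion = EMOTIONS[fields[2]]
    if emotion in EXCLUDED:
        return None
    # Odd actor numbers are male, even actor numbers are female.
    gender = "male" if int(fields[6]) % 2 == 1 else "female"
    return f"{gender}_{emotion}"


def synthetic_speech_filenames():
    """Hypothetical helper: build the speech file names implied by the scheme."""
    names = []
    for actor, emotion, intensity, statement, repetition in product(
        range(1, 25), range(1, 9), (1, 2), (1, 2), (1, 2)
    ):
        if emotion == 1 and intensity == 2:
            continue  # neutral is recorded at normal intensity only
        names.append(
            f"03-01-{emotion:02d}-{intensity:02d}-"
            f"{statement:02d}-{repetition:02d}-{actor:02d}.wav"
        )
    return names


names = synthetic_speech_filenames()
labels = [lab for lab in map(label_from_filename, names) if lab is not None]
counts = Counter(labels)
print(len(names), len(labels), len(counts))  # 1440 960 10
```

Under these assumptions, the 1440 speech samples shrink to 960 after the three emotions are dropped, spread evenly over the 10 remaining gender and emotion classes.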
## V. Preparation: Understanding the Model
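The model listed in this section stacks eight `Conv1D` layers with `padding='same'` and two `MaxPooling1D` layers with a pool size of 8. As a rough guide to reading it, the sketch below traces how the sequence length changes and how many weights the convolution and dense layers hold. It is plain Python rather than Keras, and the input length of 216 is only an assumed example value, not a figure taken from the notebook.

```python
# (in_channels, filters) for the eight Conv1D layers, all with kernel size 8.
CONV_LAYERS = [(1, 256), (256, 256), (256, 128), (128, 128),
               (128, 128), (128, 128), (128, 64), (64, 64)]
KERNEL_SIZE = 8
POOL_SIZE = 8


def conv1d_params(in_channels, filters, kernel_size=KERNEL_SIZE):
    """Weights plus biases of one Conv1D layer."""
    return kernel_size * in_channels * filters + filters


def flattened_size(input_length):
    """Size of the Flatten output for a given input length."""
    # Convolutions with padding='same' keep the length unchanged; each
    # pooling layer (default stride = pool size) floor-divides it by 8.
    length = input_length // POOL_SIZE   # after the first MaxPooling1D
    length = length // POOL_SIZE         # after the second MaxPooling1D
    return length * CONV_LAYERS[-1][1]   # 64 filters in the last Conv1D


def dense_params(input_length, target_class):
    """Weights plus biases of the final Dense layer."""
    return flattened_size(input_length) * target_class + target_class


total_conv = sum(conv1d_params(i, f) for i, f in CONV_LAYERS)
print(flattened_size(216), dense_params(216, 10), total_conv)
# 192 1930 1281152
```

Because the two pooling layers shrink the sequence by a factor of 64 overall, almost all of the weights sit in the convolution layers rather than in the final dense layer.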
**Model Architecture:**
```
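# Imports needed by the model definition below
import keras
from keras.models import Sequential
from keras.layers import (Activation, BatchNormalization, Conv1D, Dense,
                          Dropout, Flatten, MaxPooling1D)

# Model
# X_train (training features) and target_class (number of target classes)
# must already be defined before this code runs.
model = Sequential()
# Block 1: two 256-filter convolutions, then batch norm, dropout and pooling
model.add(Conv1D(256, 8, padding='same', input_shape=(X_train.shape[1], 1)))  #1
model.add(Activation('relu'))
model.add(Conv1D(256, 8, padding='same'))  #2
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(MaxPooling1D(pool_size=8))
# Block 2: four 128-filter convolutions, then batch norm, dropout and pooling
model.add(Conv1D(128, 8, padding='same'))  #3
model.add(Activation('relu'))
model.add(Conv1D(128, 8, padding='same'))  #4
model.add(Activation('relu'))
model.add(Conv1D(128, 8, padding='same'))  #5
model.add(Activation('relu'))
model.add(Conv1D(128, 8, padding='same'))  #6
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(MaxPooling1D(pool_size=8))
# Block 3: two 64-filter convolutions
model.add(Conv1D(64, 8, padding='same'))  #7
model.add(Activation('relu'))
model.add(Conv1D(64, 8, padding='same'))  #8
model.add(Activation('relu'))
# Classifier head: flatten, then one softmax output per target class
model.add(Flatten())
model.add(Dense(target_class))  #9
model.add(Activation('softmax'))
# SGD optimizer with a small learning rate and no momentum.
# Note: newer Keras releases name the first argument `learning_rate` instead of `lr`.
opt = keras.optimizers.SGD(lr=0.0001, momentum=0.0, decay=0.0, nesterov=False)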
```
## VI. Project Summary
- Please refer to my blog post on Medium: [Speech Emotion Recognition with Convolution Neural Network](https://medium.com/@rezachu/speech-emotion-recognition-with-convolution-neural-network-1e6bb7130ce3)