https://github.com/najdbinrabah/deep-learning-with-tensorflow-and-keras

This project centers on recognizing emotions from audio data, utilizing feature extraction and assessing LSTM and 1D CNN effectiveness.
1d-convolutional-neural-network chroma-features convolutional-neural-networks data-science deep-learning feature-extraction keras librosa long-short-term-memory-network machine-learning mel-frequency-cepstral-coefficients mel-spectrogram python recurrent-neural-networks tensorflow

Deep Learning with TensorFlow & Keras: Emotion Detection in Audio

This personal project uses Deep Learning to classify emotions in audio data, employing TensorFlow, Keras, and Librosa.

---

The project treats emotion detection as an audio classification problem. Librosa converts each audio clip into numerical features, and TensorFlow and Keras are used to build and compare two Deep Learning models, an LSTM and a 1D CNN, to see how each handles those features.

Libraries Used

- **TensorFlow & Keras:** For building, training, and evaluating Deep Learning models.
- **Librosa:** For audio analysis, particularly feature extraction, which converts audio files into numerical representations that the model can interpret.

You can view the detailed Notebook for this project [here](https://github.com/NajdBinrabah/Deep-Learning-with-TensorFlow-and-Keras/blob/main/Deep_Learning_with_TensorFlow_%26_Keras_Emotion_Detection_in_Audio.ipynb).

Models and Techniques Explored

**1. Feature Extraction with Librosa**
- Mel-frequency cepstral coefficients (MFCCs)
- Chroma features
- Mel spectrogram

**2. Long Short-Term Memory (LSTM) and 1D Convolutional Neural Network (1D CNN) Models**

Two Deep Learning models were evaluated:
- **LSTM (Long Short-Term Memory):** LSTM networks were designed to mitigate the vanishing gradient problem that limits traditional RNNs. Because they capture long-term dependencies, LSTMs suit sequential data. However, they may be more complex than short, simple audio clips require, which can lead to overfitting on a dataset like this one.
- **1D CNN (1D Convolutional Neural Network):** A simpler architecture adapted for sequential data such as audio. 1D CNNs slide convolution filters along a single axis, typically time or sequence position, so each filter responds to local patterns. This makes them well suited to tasks like time-series analysis or audio classification, where short-term dependencies are key. In this project, the 1D CNN fit the dataset better than the LSTM, converging more smoothly and reaching higher accuracy.
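
To make the comparison concrete, here is a hedged sketch of what the two model families might look like in Keras. The layer sizes, dropout rate, input length of 180 features, and the count of 8 emotion classes are illustrative assumptions, not values taken from the notebook. Random arrays replace real features, so the training step only checks that both models run end to end.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES = 180  # assumed length of the per-clip feature vector
N_CLASSES = 8     # assumed number of emotion labels


def build_lstm(n_features=N_FEATURES, n_classes=N_CLASSES):
    """Hypothetical LSTM classifier over a (n_features, 1) sequence."""
    return keras.Sequential([
        keras.Input(shape=(n_features, 1)),
        layers.LSTM(64),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])


def build_cnn1d(n_features=N_FEATURES, n_classes=N_CLASSES):
    """Hypothetical 1D CNN: filters slide along the single sequence axis."""
    return keras.Sequential([
        keras.Input(shape=(n_features, 1)),
        layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=5, padding="same", activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])


# Random placeholder data with the assumed shapes (not real audio features).
rng = np.random.default_rng(0)
X = rng.normal(size=(16, N_FEATURES, 1)).astype("float32")
labels = rng.integers(0, N_CLASSES, size=16)

predictions = {}
for name, build in (("lstm", build_lstm), ("cnn1d", build_cnn1d)):
    model = build()
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    model.fit(X, labels, epochs=1, batch_size=8, verbose=0)
    predictions[name] = model.predict(X, verbose=0)  # (16, N_CLASSES)
```

Both models take the same input shape, so they can be trained on identical features and compared on validation accuracy and on how smoothly their loss curves converge.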