https://github.com/axlerquiza/mental-state-recognizer
A web-based/gui-based mental state recognizer that analyzes audio recordings and predicts the user's mental state using ML models trained on speech features like MFCC.
https://github.com/axlerquiza/mental-state-recognizer
audio-processing deep-learning machine-learning mental-health python speech-analysis
Last synced: about 1 year ago
JSON representation
A web-based/gui-based mental state recognizer that analyzes audio recordings and predicts the user's mental state using ML models trained on speech features like MFCC.
- Host: GitHub
- URL: https://github.com/axlerquiza/mental-state-recognizer
- Owner: axlerquiza
- Created: 2025-02-05T09:12:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-07T07:54:04.000Z (over 1 year ago)
- Last Synced: 2025-02-07T08:26:58.667Z (over 1 year ago)
- Topics: audio-processing, deep-learning, machine-learning, mental-health, python, speech-analysis
- Language: Python
- Homepage:
- Size: 611 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Mental State Recognizer
This project predicts a person's mental state based on their speech using a deep learning model. It utilizes a hybrid model combining **BiLSTM** and **CNN** to analyze features extracted from audio files. You can read more about the published paper [here](https://ieeexplore.ieee.org/document/10690712).
## Technologies Used
- **Python**: Primary language for development
- **Flask**: For backend and serving the model
- **TensorFlow**: For model building and inference
- **librosa**: For audio processing (MFCC extraction)
- **NumPy, SciPy**: For mathematical operations
- **scikit-learn**: For evaluation and metrics
## Installation
1. **Clone the repository**:
```bash
git clone https://github.com/axlerquiza/mental-state-recognizer.git
cd mental-state-recognizer
```
2. **Set up a virtual environment** (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate # For Windows, use `venv\Scripts\activate`
```
3. **Install the required dependencies**:
```bash
pip install -r requirements.txt
```
4. **Run the Flask server**:
```bash
python server.py
```
5. Open your browser and navigate to `http://127.0.0.1:5000/` to start using the web interface.
## Project Structure
The project follows this directory structure:
```
/mental-state-recognizer
├── /assets
│ ├── /scripts # JavaScript functionalities
│ │ ├── app.js
│ ├── /styles
│ │ ├── style.css # CSS styling
├── /models # Folder containing the trained models
├── /mfcc_images # Folder for storing MFCC images (generated from audio)
├── /preprocessed_audio # Folder with sample preprocessed audio files
│ ├── 308_AUDIO_processed.wav # Level 4 example
│ ├── 346_AUDIO_processed.wav # Level 1 example
├── /uploads # Folder for storing uploaded audio files
├── index.html # Main HTML file for the web interface
├── server.py # Flask backend server script
└── requirements.txt # List of dependencies
```
## Usage
1. **Upload an audio file** (preferably a 10-second speech recording) via the web interface (`index.html`).
2. The backend processes the audio file, extracting MFCC features and passing them through the trained model.
3. The model predicts the mental state, and the result is displayed on the webpage.
## Example
1. Click the "Upload Audio" button on the web interface.
2. Choose a 10-second audio file (preferably a speech recording).
3. Wait for the model to process the file and display the predicted mental state.
### Sample Audio Files
The `/preprocessed_audio` directory contains sample audio files:
- `308_AUDIO_processed.wav`: A **Level 4** example (indicating a specific mental state).
- `346_AUDIO_processed.wav`: A **Level 1** example (indicating a different mental state).
These files can be used for testing the system or understanding different levels of mental states.