https://github.com/niranjanrao07/adhd-ml-project
This project used machine learning to classify ADHD based on EEG data. We preprocessed the EEG signals, extracted various features, and used LDA for dimensionality reduction. A voting ensemble of classifiers achieved 72% accuracy in distinguishing between ADHD and control groups.
adhd ensemble feature-engineering machine-learning preprocessing
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/niranjanrao07/adhd-ml-project
- Owner: NiranjanRao07
- Created: 2025-04-28T04:07:37.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-30T02:13:21.000Z (about 1 year ago)
- Last Synced: 2025-04-30T03:23:54.411Z (about 1 year ago)
- Topics: adhd, ensemble, feature-engineering, machine-learning, preprocessing
- Language: Jupyter Notebook
- Homepage:
- Size: 1.35 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ADHD Detection from EEG Data
This repository contains all code, data, and artifacts for our project on automated ADHD detection using 19-channel EEG recordings and classical machine learning techniques.
## Overview
1. **Data Exploration**
- MATLAB `.mat` EEG files (61 ADHD, 60 control), 19 channels, 128 Hz sampling
- Scripts compute channel-wise stats: mean, variance, skewness, kurtosis, ptp
2. **Preprocessing**
- Butterworth band-pass filter (0.5–45 Hz) removes drift & noise
- Z-score normalization per channel (mean = 0, std = 1)
- Amplitude-based artifact removal (±100 µV threshold)
3. **Feature Extraction**
- **Time-domain:** mean, std, skewness, kurtosis, RMS, zero crossings, peak-to-peak
- **Frequency-domain:** Welch PSD, delta/theta/alpha/beta/gamma band powers, spectral entropy, SEF, PSD slope
- **Non-linear:** approximate entropy, Higuchi fractal dimension, Hjorth mobility/complexity, Hurst exponent
4. **Dimensionality Reduction**
- PCA (95 % variance) tested, but suboptimal
- **Final:** supervised LDA fit only on the training split, giving a single discriminant axis
5. **Modeling & Evaluation**
- Classifiers on LDA output: SVM, Decision Tree, Random Forest, KNN, Logistic Regression
- Hyperparameter tuning: grid search + 5-fold CV
- Held-out 20 % test split for final metrics
- Voting ensemble of all five models
6. **Results**
- **Ensemble Test Performance:**
- Accuracy: 72.0 %
- Precision: 68.8 %
- Recall: 84.6 %
- F1-Score: 75.9 %
- ROC AUC: 78.8 %
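As a sanity check, the reported accuracy, precision, recall, and F1 are mutually consistent. The sketch below uses confusion-matrix counts that are inferred, not taken from the repository: they assume a 25-recording test set (20 % of 121, rounded up) with ADHD as the positive class.

```python
# Hypothetical confusion-matrix counts, inferred from the reported percentages
# under the assumption of a 25-recording test set (13 ADHD, 12 control).
tp, fn, fp, tn = 11, 2, 5, 7

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

With these counts the four values agree with the figures above to within rounding. ROC AUC depends on the predicted scores and cannot be recovered from the counts alone.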
Explore raw data: run `analyze_files.py`
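A minimal sketch of the channel-wise statistics from the exploration step, run on synthetic data. The array layout `(n_samples, 19)` and the function name `channel_stats` are assumptions for illustration; the actual `analyze_files.py` and the variable names inside the `.mat` files may differ.

```python
import numpy as np
from scipy.stats import kurtosis, skew


def channel_stats(eeg):
    """Per-channel summary statistics for an array of shape (n_samples, n_channels)."""
    return {
        "mean": eeg.mean(axis=0),
        "variance": eeg.var(axis=0),
        "skewness": skew(eeg, axis=0),
        "kurtosis": kurtosis(eeg, axis=0),  # excess kurtosis (Fisher definition)
        "ptp": np.ptp(eeg, axis=0),         # peak-to-peak amplitude
    }


# Synthetic stand-in for one recording: 10 s at 128 Hz, 19 channels.
rng = np.random.default_rng(0)
demo = rng.normal(size=(128 * 10, 19))
stats = channel_stats(demo)
for name, values in stats.items():
    print(name, values.shape)
```

For a real recording, the array would first be loaded from the `.mat` file (for example with `scipy.io.loadmat`) and, if needed, transposed so that time runs along the first axis.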
Preprocess & extract features: open and execute `data_preprocess.ipynb` and `feature_extraction.ipynb`
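The preprocessing steps and a few of the listed features can be sketched as below. This is an illustration under stated assumptions, not the notebooks' actual code: the filter order, the 2-second epoch length, the band edges, and all function names are hypothetical. The amplitude threshold is applied before z-scoring here, since a limit in µV is only meaningful on un-normalized data; the notebooks may order the steps differently.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, welch

FS = 128  # sampling rate in Hz, from the overview
# Conventional band edges in Hz (assumption: the README does not give exact limits).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}


def bandpass(eeg, low=0.5, high=45.0, order=4):
    """Zero-phase Butterworth band-pass along time; eeg is (n_samples, n_channels)."""
    sos = butter(order, [low, high], btype="bandpass", fs=FS, output="sos")
    return sosfiltfilt(sos, eeg, axis=0)


def reject_epochs(eeg, epoch_len=2 * FS, threshold_uv=100.0):
    """Cut into fixed-length epochs and drop those exceeding the amplitude threshold."""
    n_epochs = eeg.shape[0] // epoch_len
    epochs = eeg[: n_epochs * epoch_len].reshape(n_epochs, epoch_len, -1)
    keep = np.abs(epochs).max(axis=(1, 2)) <= threshold_uv
    return epochs[keep]


def zscore(epochs):
    """Normalize each channel to mean 0 and std 1 across all kept epochs."""
    flat = epochs.reshape(-1, epochs.shape[-1])
    return (epochs - flat.mean(axis=0)) / flat.std(axis=0)


def band_powers(epoch):
    """Welch PSD per channel, summed within each band (rectangle-rule integral)."""
    freqs, psd = welch(epoch, fs=FS, nperseg=FS, axis=0)
    df = freqs[1] - freqs[0]
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum(axis=0) * df
            for name, (lo, hi) in BANDS.items()}


def hjorth(epoch):
    """Hjorth mobility and complexity per channel."""
    d1 = np.diff(epoch, axis=0)
    d2 = np.diff(d1, axis=0)
    mobility = np.sqrt(d1.var(axis=0) / epoch.var(axis=0))
    complexity = np.sqrt(d2.var(axis=0) / d1.var(axis=0)) / mobility
    return mobility, complexity


# Synthetic recording: 20 s of a 10 Hz rhythm plus noise, with one injected artifact.
rng = np.random.default_rng(0)
t = np.arange(20 * FS) / FS
raw = 20 * np.sin(2 * np.pi * 10 * t)[:, None] + rng.normal(scale=5, size=(t.size, 19))
raw[900:964, 0] += 500  # hypothetical 0.5 s artifact on one channel

clean = reject_epochs(bandpass(raw))
normed = zscore(clean)
powers = band_powers(normed[0])
mobility, complexity = hjorth(normed[0])
print(f"kept {clean.shape[0]} of {raw.shape[0] // (2 * FS)} epochs")
```

Because the synthetic signal is dominated by a 10 Hz rhythm, the alpha band carries most of the power in every channel, which makes the band-power function easy to check by eye.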
Train & evaluate models: fit LDA on the training split only, train the five classifiers on the LDA output, and combine them in a voting ensemble
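A hedged sketch of this last step with scikit-learn, using synthetic features in place of the extracted feature table. Placing LDA inside a `Pipeline` means it is refit on the training portion of every cross-validation fold and never sees the held-out test split. Soft voting, the single tuned parameter, and the step names are assumptions; the README does not state the voting mode or the search grid.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the feature table: 121 recordings, 30 hypothetical features.
X, y = make_classification(n_samples=121, n_features=30, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Five base models; soft voting averages their predicted probabilities (assumption).
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("forest", RandomForestClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("logreg", LogisticRegression()),
    ],
    voting="soft",
)

# With two classes, LDA yields at most one discriminant axis.
pipeline = Pipeline([
    ("lda", LinearDiscriminantAnalysis(n_components=1)),
    ("vote", ensemble),
])

# Small illustrative grid with 5-fold CV; a real search would cover more parameters.
search = GridSearchCV(pipeline, {"vote__svm__C": [0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)

proba = search.predict_proba(X_test)
accuracy = accuracy_score(y_test, search.predict(X_test))
auc = roc_auc_score(y_test, proba[:, 1])
print(f"test accuracy={accuracy:.3f}, ROC AUC={auc:.3f}")
```

The numbers printed here describe the synthetic data only and are not comparable to the results reported above.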
## Collaboration & Version Control
Progress and deliverables were tracked through regular team syncs and shared document updates.