https://github.com/archismwanchatterjee/parkinson_detection

audio-processing bagging-classifier classification-algorithm ensemble-learning feature-extraction feature-selection librosa mrmr networkx-graph streamlit

Last synced: about 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/archismwanchatterjee/parkinson_detection
Owner: ArchismwanChatterjee
License: mit
Created: 2024-05-09T18:05:36.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-05-28T18:00:46.000Z (about 2 years ago)
Last Synced: 2025-07-26T04:57:40.619Z (11 months ago)
Topics: audio-processing, bagging-classifier, classification-algorithm, ensemble-learning, feature-extraction, feature-selection, librosa, mrmr, networkx-graph, streamlit
Language: Jupyter Notebook
Homepage:
Size: 9.94 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          ## File Description :

1. Parkinsson Disease.csv : Dataset 1 with 22 features.

2. pd_speech_features.csv : Dataset 2 with 753 features.

3. Model_1st_Dataset.ipynb : Contains the unique bagging model trained on dataset 1 [Note: Used MRMR as feature selection ]

4. Model_2nd_Dataset.ipynb : Contains the unique bagging model trained on dataset 2 [Note: Used MRMR as feature selection ]

5. model_testing.ipynb : Contains the metrics of the classifier models, without FS, that are used in the bagging classifier. 

6. pearson_selection_resampling.ipynb : Contains the code of the new model [ New feature selection + sample resampling + new bagging model where the bags are based on the samples ]

7. FS_classif.ipynb : Contains the metrics of the classifier models, with FS, that are used in the bagging classifier.

8. audio_extractor.py : A Streamlit application which basically process the audio file, to extract the features, you upload. [Check it out here](https://audio-extractor.streamlit.app/)

## Few points to note:

1. Files 3 and 4 are for understanding the classifier model. Note that here the classification model includes feature resampling and mrmr as feature selection method. 

2. Currently the files 5,6,7 contains "pd_speech_features.csv" as the dataset.

3. To use "Parkinson Disease.csv" simply replace the dataset and replace 'id' with 'name', 'class' with 'status'.

```python

import numpy as np

import pandas as pd

df = pd.read_csv("Parkinsson disease.csv")

# Data Cleaning

df.drop('name', axis=1, inplace=True)

# Data preprocessing

X= df.drop('status', axis=1)

Y= df['status']

```

```python

import numpy as np

import pandas as pd

df = pd.read_csv('pd_speech_features.csv')

# Data cleaning

df = df.drop('id', axis=1)  # Remove the 'name' column

# Data preprocessing

X = df.drop('class', axis=1) 

Y = df['class']  

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/archismwanchatterjee/parkinson_detection

Awesome Lists containing this project

README