https://github.com/ha7890846/mental_fitness_tracker

The Mental Health Fitness Tracker project focuses on analyzing and predicting mental fitness levels of individuals from various countries with different mental disorders. It utilizes regression techniques to provide insights into mental health and make predictions based on the available data.
https://github.com/ha7890846/mental_fitness_tracker
artificial-intelligence data-science forest graph jupiter-notebook matplotlib plot prediction-model python regression-algorithms seaborn
Last synced: 19 days ago
JSON representation
Host: GitHub
URL: https://github.com/ha7890846/mental_fitness_tracker
Owner: ha7890846
Created: 2023-07-21T02:16:41.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2023-07-21T02:25:26.000Z (almost 2 years ago)
Last Synced: 2025-02-15T05:24:56.316Z (2 months ago)
Topics: artificial-intelligence, data-science, forest, graph, jupiter-notebook, matplotlib, plot, prediction-model, python, regression-algorithms, seaborn
Language: Jupyter Notebook
Homepage:
Size: 2.81 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project

README

        # mental_fitness_tracker

Mental Fitness Tracker ( AI )

# Mental Health Fitness Tracker

The Mental Health Fitness Tracker project focuses on analyzing and predicting mental fitness levels of individuals from various countries with different mental disorders. It utilizes regression techniques to provide insights into mental health and make predictions based on the available data.

## INSTALLATION

To use the code and run the examples, follow these steps:

1. Ensure that you have Python 3.x installed on your system.

2. Install the required libraries by running the following command:

```bash

pip install pandas numpy matplotlib seaborn scikit-learn xgboost

```

    

3. Download the project files and navigate to the project directory.

   

## USAGE

1. IMPORT THE NECESSARY LIBRARIES

```bash

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.linear_model import Ridge, Lasso, ElasticNet, LinearRegressi from sklearn.svm import SVR

from sklearn.tree import DecisionTreeRegressor

from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegre from sklearn.preprocessing import PolynomialFeatures

from sklearn.metrics import mean_squared_error, r2_score

from xgboost import XGBRegressor

from sklearn.neighbors import KNeighborsRegressor

from sklearn.neural_network import MLPRegressor

```

2. READ THE DATA FROM THE CSV FILES

```bash

df1 = pd.read_csv('mental-and-substance-use-as-share-of-disease.csv')

df2 = pd.read_csv('prevalence-by-mental-and-substance-use-disorder.csv')

```

3. FILL MISSING VALUES IN NUMERIC COLUMNS OF DATAFRAMES df1 AND df2 WITH THE MEAN OF THEIR RESPECTIVE COLUMNS

```bash

numeric_columns = df1.select_dtypes(include=[np.number]).columns

df1[numeric_columns] = df1[numeric_columns].fillna(df1[numeric_columns].mean())

numeric_columns = df2.select_dtypes(include=[np.number]).columns

df2[numeric_columns] = df2[numeric_columns].fillna(df2[numeric_columns].mean())

```

4. CONVERT DATA TYPES

```bash

df1['DALYs (Disability-Adjusted Life Years) - Mental disorders - Sex: Both - Age: All Ages (Percent)'] = df1['DALYs (Disability-Adjusted Life Years) - Mental disorders - Sex: Both - Age: All Ages (Percent)'].astype(float)

df2['Schizophrenia disorders (share of population) - Sex: Both - Age: Age-standardized'] = df2['Schizophrenia disorders (share of population) - Sex: Both - Age: Age-standardized'].astype(float)

df2['Bipolar disorders (share of population) - Sex: Both - Age: Age-standardized'] = df2['Bipolar disorders (share of population) - Sex: Both - Age: Age-standardized'].astype(float)

df2['Eating disorders (share of population) - Sex: Both - Age: Age-standardized'] = df2['Eating disorders (share of population) - Sex: Both - Age: Age-standardized'].astype(float)

df2['Anxiety disorders (share of population) - Sex: Both - Age: Age-standardized'] = df2['Anxiety disorders (share of population) - Sex: Both - Age: Age-standardized'].astype(float)

df2['Prevalence - Drug use disorders - Sex: Both - Age: Age-standardized (Percent)'] = df2['Prevalence - Drug use disorders - Sex: Both - Age: Age-standardized (Percent)'].astype(float)

df2['Depressive disorders (share of population) - Sex: Both - Age: Age-standardized'] = df2['Depressive disorders (share of population) - Sex: Both - Age: Age-standardized'].astype(float)

df2['Prevalence - Alcohol use disorders - Sex: Both - Age: Age-standardized (Percent)'] = df2['Prevalence - Alcohol use disorders - Sex: Both - Age: Age-standardized (Percent)'].astype(float)

```

5. MERGE THE TWO DATAFRAMES ON A COMMON COLUMN

```bash

merged_df = pd.merge(df1, df2, on=['Entity', 'Code', 'Year'])

```

6. FEATURE THE MATRIX X AND THE VARIABLE y

```bash

X = merged_df[['Schizophrenia disorders (share of population) - Sex: Both - Age: Age-standardized',

               'Bipolar disorders (share of population) - Sex: Both - Age: Age-standardized',

               'Eating disorders (share of population) - Sex: Both - Age: Age-standardized',

               'Anxiety disorders (share of population) - Sex: Both - Age: Age-standardized',

               'Prevalence - Drug use disorders - Sex: Both - Age: Age-standardized (Percent)',

               'Depressive disorders (share of population) - Sex: Both - Age: Age-standardized',

               'Prevalence - Alcohol use disorders - Sex: Both - Age: Age-standardized (Percent)']]

y = merged_df['DALYs (Disability-Adjusted Life Years) - Mental disorders - Sex: Both - Age: All Ages (Percent)']

```

7. SPLIT THE DATA INTO TRAINING AND TESTING SETS

```bash

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

```

8. VISUALISING THE CORRELATION HEATMAP OF DISEASES AND MENTAL FITNESS

```bash

# Compute the correlation matrix

corr_matrix = merged_df[['Schizophrenia disorders (share of population) - Sex: Both - Age: Age-standardized',

                         'Bipolar disorders (share of population) - Sex: Both - Age: Age-standardized',

                         'Eating disorders (share of population) - Sex: Both - Age: Age-standardized',

                         'Anxiety disorders (share of population) - Sex: Both - Age: Age-standardized',

                         'Prevalence - Drug use disorders - Sex: Both - Age: Age-standardized (Percent)',

                         'Depressive disorders (share of population) - Sex: Both - Age: Age-standardized',

                         'Prevalence - Alcohol use disorders - Sex: Both - Age: Age-standardized (Percent)',

                         'DALYs (Disability-Adjusted Life Years) - Mental disorders - Sex: Both - Age: All Ages (Percent)'

                        ]].corr()

# Create the heatmap

plt.figure(figsize=(12, 8))

sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')

plt.title('Correlation Heatmap - Diseases and Mental Fitness')

plt.xticks(rotation=45)

plt.yticks(rotation=0)

plt.show()

```

9. FIT THE LINEAR REGRESSION MODEL

    

```bash

model = LinearRegression()

model.fit(X_train, y_train)

```

10. MAKE A PREDICTION USING TRAINED MODEL

```bash

y_pred = model.predict(X_test)

```

11. PRINTING MODEL PERFOMANCE METRICS

```bash

# Create a dictionary to store the model performance

model_performance = {}

# Ridge Regression

ridge_model = Ridge(alpha=0.5)

ridge_model.fit(X_train, y_train)

ridge_y_pred = ridge_model.predict(X_test)

ridge_mse = mean_squared_error(y_test, ridge_y_pred)

ridge_r2 = r2_score(y_test, ridge_y_pred)

model_performance['1. Ridge Regression'] = {'MSE': ridge_mse, 'R-squared': ridge_r2}

# Lasso Regression

lasso_model = Lasso(alpha=0.5)

lasso_model.fit(X_train, y_train)

lasso_y_pred = lasso_model.predict(X_test)

lasso_mse = mean_squared_error(y_test, lasso_y_pred)

lasso_r2 = r2_score(y_test, lasso_y_pred)

model_performance['2. Lasso Regression'] = {'MSE': lasso_mse, 'R-squared': lasso_r2}

# Elastic Net Regression

elastic_net_model = ElasticNet(alpha=0.5, l1_ratio=0.5)

elastic_net_model.fit(X_train, y_train)

elastic_net_y_pred = elastic_net_model.predict(X_test)

elastic_net_mse = mean_squared_error(y_test, elastic_net_y_pred)

elastic_net_r2 = r2_score(y_test, elastic_net_y_pred)

model_performance['3. Elastic Net Regression'] = {'MSE': elastic_net_mse, 'R-squared': elastic_net_r2}

# Polynomial Regression

poly_features = PolynomialFeatures(degree=2)

X_poly = poly_features.fit_transform(X_train)

poly_model = LinearRegression()

poly_model.fit(X_poly, y_train)

X_test_poly = poly_features.transform(X_test)

poly_y_pred = poly_model.predict(X_test_poly)

poly_mse = mean_squared_error(y_test, poly_y_pred)

poly_r2 = r2_score(y_test, poly_y_pred)

model_performance['4. Polynomial Regression'] = {'MSE': poly_mse, 'R-squared': poly_r2}

# Decision Tree Regression

tree_model = DecisionTreeRegressor()

tree_model.fit(X_train, y_train)

tree_y_pred = tree_model.predict(X_test)

tree_mse = mean_squared_error(y_test, tree_y_pred)

tree_r2 = r2_score(y_test, tree_y_pred)

model_performance['5. Decision Tree Regression'] = {'MSE': tree_mse, 'R-squared': tree_r2}

# Random Forest Regression

forest_model = RandomForestRegressor()

forest_model.fit(X_train, y_train)

forest_y_pred = forest_model.predict(X_test)

forest_mse = mean_squared_error(y_test, forest_y_pred)

forest_r2 = r2_score(y_test, forest_y_pred)

model_performance['6. Random Forest Regression'] = {'MSE': forest_mse, 'R-squared': forest_r2}

# SVR (Support Vector Regression)

svr_model = SVR()

svr_model.fit(X_train, y_train)

svr_y_pred = svr_model.predict(X_test)

svr_mse = mean_squared_error(y_test, svr_y_pred)

svr_r2 = r2_score(y_test, svr_y_pred)

model_performance['7. Support Vector Regression'] = {'MSE': svr_mse, 'R-squared': svr_r2}

# XGBoost Regression

xgb_model = XGBRegressor()

xgb_model.fit(X_train, y_train)

xgb_y_pred = xgb_model.predict(X_test)

xgb_mse = mean_squared_error(y_test, xgb_y_pred)

xgb_r2 = r2_score(y_test, xgb_y_pred)

model_performance['8. XGBoost Regression'] = {'MSE': xgb_mse, 'R-squared': xgb_r2}

# K-Nearest Neighbors Regression

knn_model = KNeighborsRegressor()

knn_model.fit(X_train, y_train)

knn_y_pred = knn_model.predict(X_test)

knn_mse = mean_squared_error(y_test, knn_y_pred)

knn_r2 = r2_score(y_test, knn_y_pred)

model_performance['9. K-Nearest Neighbors Regression'] = {'MSE': knn_mse, 'R-squared': knn_r2}

# Bayesian Regression

bayesian_model = BayesianRidge()

bayesian_model.fit(X_train, y_train)

bayesian_y_pred = bayesian_model.predict(X_test)

bayesian_mse = mean_squared_error(y_test, bayesian_y_pred)

bayesian_r2 = r2_score(y_test, bayesian_y_pred)

model_performance['10. Bayesian Regression'] = {'MSE': bayesian_mse, 'R-squared': bayesian_r2}

# Neural Network Regression

nn_model = MLPRegressor(max_iter=1000)

nn_model.fit(X_train, y_train)

nn_y_pred = nn_model.predict(X_test)

nn_mse = mean_squared_error(y_test, nn_y_pred)

nn_r2 = r2_score(y_test, nn_y_pred)

model_performance['11. Neural Network Regression'] = {'MSE': nn_mse, 'R-squared': nn_r2}

# Gradient Boosting Regression

gb_model = GradientBoostingRegressor()

gb_model.fit(X_train, y_train)

gb_y_pred = gb_model.predict(X_test)

gb_mse = mean_squared_error(y_test, gb_y_pred)

gb_r2 = r2_score(y_test, gb_y_pred)

model_performance['12. Gradient Boosting Regression'] = {'MSE': gb_mse, 'R-squared': gb_r2}

# Print model performance

for model, performance in model_performance.items():

    print(f"Model: {model}")

    print("   Mean Squared Error (MSE):", performance['MSE'])

    print("   R-squared Score:", performance['R-squared'])

    print()

```

12. PLOTTING PREDECTED vs ACTUAL VALUES GRAPH

```bash

# Create a dictionary to store the model performance

model_performance = {

    'Ridge Regression': {'Predicted': ridge_y_pred, 'Actual': y_test},

    'Lasso Regression': {'Predicted': lasso_y_pred, 'Actual': y_test},

    'Elastic Net Regression': {'Predicted': elastic_net_y_pred, 'Actual': y_test},

    'Polynomial Regression': {'Predicted': poly_y_pred, 'Actual': y_test},

    'Decision Tree Regression': {'Predicted': tree_y_pred, 'Actual': y_test},

    'Random Forest Regression': {'Predicted': forest_y_pred, 'Actual': y_test},

    'Support Vector Regression': {'Predicted': svr_y_pred, 'Actual': y_test},

    'XGBoost Regression': {'Predicted': xgb_y_pred, 'Actual': y_test},

    'K-Nearest Neighbors Regression': {'Predicted': knn_y_pred, 'Actual': y_test},

    'Bayesian Regression': {'Predicted': bayesian_y_pred, 'Actual': y_test},

    'Neural Network Regression': {'Predicted': nn_y_pred, 'Actual': y_test},

    'Gradient Boosting Regression': {'Predicted': gb_y_pred, 'Actual': y_test}

}

# Set up figure and axes

num_models = len(model_performance)

num_rows = (num_models // 3) + (1 if num_models % 3 != 0 else 0)

fig, axes = plt.subplots(num_rows, 3, figsize=(15, num_rows * 5))

# Define color palette

color_palette = plt.cm.Set1(range(num_models))

# Iterate over the models and plot the predicted vs actual values

for i, (model, performance) in enumerate(model_performance.items()):

    row = i // 3

    col = i % 3

    ax = axes[row, col] if num_rows > 1 else axes[col]

    # Get the predicted and actual values

    y_pred = performance['Predicted']

    y_actual = performance['Actual']

    # Scatter plot of predicted vs actual values

    ax.scatter(y_actual, y_pred, color=color_palette[i], alpha=0.5, marker='o')

    # Add a diagonal line for reference

    ax.plot([y_actual.min(), y_actual.max()], [y_actual.min(), y_actual.max()], color='r')

    # Set the title and labels

    ax.set_title(model)

    ax.set_xlabel('Actual')

    ax.set_ylabel('Predicted')

    # Add gridlines

    ax.grid(True)

# Adjust spacing between subplots

fig.tight_layout()

# Create a legend

plt.legend(model_performance.keys(), loc='upper right')

# Show the plot

plt.show()

```

13. IT PRINTS REGRESSION MODEL IN ORDER OF PRECISION AND A FINAL RESULT TELLING WHICH REGRESSION MODEL HAS THE MOST PRECISE VALUE AND WHICH REGRESSION MODEL HAS LEAST PRECISE VALUE

```bash

# Store the regression models and their scores in a dictionary

regression_scores = {

    "Ridge Regression": (ridge_mse, ridge_r2),

    "Elastic Net Regression": (elastic_net_mse, elastic_net_r2),

    "Polynomial Regression": (poly_mse, poly_r2),

    "Random Forest Regression": (forest_mse, forest_r2),

    "Gradient Boosting Regression": (gb_mse, gb_r2),

    "Decision Tree Regression": (tree_mse, tree_r2),

    "Lasso Regression": (lasso_mse, lasso_r2),

    "Support Vector Regression": (svr_mse, svr_r2),

    "XGBoost Regression": (xgb_mse, xgb_r2),

    "K-Nearest Neighbors Regression": (knn_mse, knn_r2),

    "Bayesian Regression": (bayesian_mse, bayesian_r2),

    "Neural Network Regression": (nn_mse, nn_r2),

}

# Sort the regression models based on MSE in ascending order and R-squared score in descending order

sorted_models = sorted(regression_scores.items(), key=lambda x: (x[1][0], -x[1][1]))

print("Regression Models in Order of Precision:")

for i, (model, scores) in enumerate(sorted_models, start=1):

    print(f"{i}. {model}")

    print("   Mean Squared Error (MSE):", scores[0])

    print("   R-squared Score:", scores[1])

    print()

most_precise_model = sorted_models[0][0]

least_precise_model = sorted_models[-1][0]

print(f"The most precise model is: {most_precise_model}")

print(f"The least precise model is: {least_precise_model}")

```

## SUMMARY

- Developed a fully functional Mental Health Fitness Tracker ML model using Python and scikit-learn.

- Utilized 12 types of regression algorithms to achieve precise results in analyzing and predicting mental fitness levels from over 150 countries.

- Cleaned, preprocessed, and engineered features to enhance the model's predictive capabilities.

- Optimized the model's performance by fine-tuning hyperparameters and implementing ensemble methods, resulting in an impressive accuracy percentage of 98.50%.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/ha7890846/mental_fitness_tracker

Awesome Lists containing this project

README