{"id":27183104,"url":"https://github.com/realchaula/studentdepression","last_synced_at":"2026-04-28T08:05:59.534Z","repository":{"id":283178658,"uuid":"930245467","full_name":"RealChAuLa/StudentDepression","owner":"RealChAuLa","description":"Machine Learning Algorithms Comparison to Predict  Student Depression with Highest Accuracy","archived":false,"fork":false,"pushed_at":"2025-03-25T22:36:22.000Z","size":8643,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-09T15:51:41.997Z","etag":null,"topics":["decision-tree","k-nearest-neighbours","knn","logistic-regression","machine-learning","neural-network","python","random-forest","streamlit","support-vector-machines","svm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RealChAuLa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-10T10:16:31.000Z","updated_at":"2025-03-25T22:36:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"8579c9ef-0412-497c-94ee-ce1c88a1e4f1","html_url":"https://github.com/RealChAuLa/StudentDepression","commit_stats":null,"previous_names":["realchaula/studentdepression"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RealChAuLa/StudentDepression","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RealChAuLa%2FStudentDepression","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RealChAuLa%2FStudentDepression/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RealChAuLa%2FStudentDepression/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RealChAuLa%2FStudentDepression/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RealChAuLa","download_url":"https://codeload.github.com/RealChAuLa/StudentDepression/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RealChAuLa%2FStudentDepression/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32371727,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T20:07:02.737Z","status":"online","status_checked_at":"2026-04-28T02:00:07.250Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decision-tree","k-nearest-neighbours","knn","logistic-regression","machine-learning","neural-network","python","random-forest","streamlit","support-vector-machines","svm"],"created_at":"2025-04-09T15:41:17.701Z","updated_at":"2026-04-28T08:05:59.519Z","avatar_url":"https://github.com/RealChAuLa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Student Depression Prediction System\n\n## Table of Contents\n- [Overview \u0026 Demo](#overview)\n- [Features](#features)\n- [Dataset](#dataset)\n- [Project Structure](#project-structure)\n- [Installation](#installation)\n- [Usage](#usage)\n- [Project 
Chapters](#project-chapters)\n  - [Data Exploration](#data-exploration)\n  - [Data Preprocessing](#data-preprocessing)\n  - [Feature Engineering](#feature-engineering)\n  - [Model Selections \u0026 Training](#model-selections--training)\n  - [Hyperparameter Tuning](#hyperparameter-tuning)\n  - [Model Deployment](#model-deployment)\n- [Technologies Used](#technologies-used)\n- [Future Improvements](#future-improvements)\n- [Contributors](#contributors)\n\n## Overview\nThis project aims to predict depression risk in students using machine learning techniques. By analyzing various academic, lifestyle, and personal factors, the system identifies students who may be at risk of depression, enabling timely intervention and support.\n\nThe project implements a full machine learning pipeline from data preprocessing to model deployment as an interactive web application using Streamlit, making it accessible to users without technical expertise.\n\nhttps://github.com/user-attachments/assets/1b6be219-738c-45a3-b91e-cc978b451010\n\n## Features\n- **Data Analysis \u0026 Preprocessing**: Comprehensive data cleaning, categorical encoding, and feature engineering\n- **Model Training**: Logistic Regression, Decision Tree, SVM, KNN, Random Forest, and Neural Network models for binary classification of depression risk\n- **Hyperparameter Tuning**: Implementation of multiple tuning techniques (RandomizedSearchCV, GridSearchCV, and Bayesian Optimization)\n- **Interactive Web Interface**: User-friendly Streamlit application for real-time predictions\n\n## Dataset\nThe dataset contains information about students, including:\n- Demographic information (age, gender, city)\n- Academic factors (CGPA, academic pressure, study satisfaction)\n- Lifestyle factors (sleep duration, dietary habits)\n- Mental health indicators (suicidal thoughts, family history)\n\nThe target variable is the presence or absence of depression, making this a binary classification problem.\n\nYou can find the dataset on 
Kaggle: [Student Depression Dataset](https://www.kaggle.com/datasets/)\n\n### Sample Data \n\n```csv\nid,Gender,Age,City,Profession,Academic Pressure,Work Pressure,CGPA,Study Satisfaction,Job Satisfaction,Sleep Duration,Dietary Habits,Degree,Have you ever had suicidal thoughts ?,Work/Study Hours,Financial Stress,Family History of Mental Illness,Depression\n2,Male,33.0,Visakhapatnam,Student,5.0,0.0,8.97,2.0,0.0,5-6 hours,Healthy,B.Pharm,Yes,3.0,1.0,No,1\n8,Female,24.0,Bangalore,Student,2.0,0.0,5.9,5.0,0.0,5-6 hours,Moderate,BSc,No,3.0,2.0,Yes,0\n26,Male,31.0,Srinagar,Student,3.0,0.0,7.03,5.0,0.0,Less than 5 hours,Healthy,BA,No,9.0,1.0,Yes,0\n30,Female,28.0,Varanasi,Student,3.0,0.0,5.59,2.0,0.0,7-8 hours,Moderate,BCA,Yes,4.0,5.0,Yes,1\n\n```\n## Project Structure\n```\nStudentDepression/\n├── assets/                      # Images and static files\n├── data/                        # Data files\n│   ├── Student_Depression_Dataset.csv              # Original dataset\n│   ├── PreprocessedData.csv              # Preprocessed dataset\n│   └── FeatureEngineeredData.csv # Feature Engineered dataset\n├── models/                      # Saved machine learning models\n├── chapters/                    # Project modules\n│   ├── data_preprocessing.py   \n│   ├── data_exploration.py   \n│   ├── overview.py  \n│   ├── feature_engineering.py\n│   ├── model_selections_and_training.py\n│   ├── hyperparameter_tuning.py\n│   └── deployment.py\n├── app.py                       # Main Streamlit application\n├── requirements.txt             # Dependencies\n└── README.md                    # This documentation\n```\n\n## Installation\n\n1. Clone the repository:\n    ```sh\n    git clone https://github.com/RealChAuLa/StudentDepression.git\n    cd StudentDepression\n    ```\n\n2. Create a virtual environment and activate it:\n    ```sh\n    python -m venv venv\n    source venv/bin/activate  # On Windows use `venv\\Scripts\\activate`\n    ```\n\n3. 
Install the required packages:\n    ```sh\n    pip install -r requirements.txt\n    ```\n\n## Usage\n\n1. Run the Streamlit application:\n    ```sh\n    streamlit run app.py\n    ```\n\n2. Open your web browser and navigate to `http://localhost:8501` to interact with the application.\n\n## Project Chapters\n\n### Data Exploration\n\n![Data Exploration](assets/screenshots/data_exploration.png)\n\nVarious visualizations are used to explore the dataset and understand the distribution and relationships between variables. Key visualizations include:\n- Gender Distribution\n- Depression Status\n- Age Distribution\n- Sleep Duration Distribution\n- Academic Pressure\n- Study Satisfaction Level\n- Mental Illness In Family History\n- Suicidal Thoughts Distribution\n- Work/Study Hours Distribution\n- CGPA Distribution\n- Profession Distribution\n- Degree Distribution\n- Job Satisfaction Distribution\n\n### Data Preprocessing\nThis chapter focuses on cleaning the dataset and performing exploratory data analysis to understand the distribution and relationships between variables.\n\n![Data Preprocessing](assets/screenshots/data_preprocessing.png)\n\nKey preprocessing steps:\n- Removal of duplicates and handling missing values\n- Filtering to focus on the student population\n- Removing outliers (students above 35 years)\n- Eliminating uncommon categories with very few records\n\n### Feature Engineering\nThis chapter transforms raw data into meaningful features for the machine learning model.\n\n![Feature Engineering](assets/screenshots/feature_engineering.png)\n\nFeature engineering techniques:\n- Encoding categorical variables (Gender, City, Sleep Duration, etc.)\n- Normalizing numerical features\n- Creating derived features where applicable\n\nEncoding scheme examples:\n- Gender: Male → 0, Female → 1\n- Sleep Duration: Less than 5 hours → 0, 5-6 hours → 1, 7-8 hours → 2, More than 8 hours → 3\n- Plus comprehensive encodings for cities, degrees, and other categorical 
variables\n\n### Model Selections \u0026 Training\nThis chapter evaluates different machine learning algorithms and selects Logistic Regression as the most suitable for this binary classification task.\n\n![Model Training](assets/screenshots/training.png)\n\nModel evaluation metrics:\n- Accuracy\n- Precision\n- Recall\n- F1 Score\n\nThe initial model provides a baseline performance before tuning.\n\n### Hyperparameter Tuning\nThis chapter implements three different hyperparameter tuning approaches to optimize the model performance.\n\n![Hyperparameter Tuning](assets/screenshots/tuning.png)\n\nTuning methods:\n1. **RandomizedSearchCV**: Efficiently explores the hyperparameter space through random sampling\n2. **GridSearchCV**: Exhaustively searches through a specified parameter grid\n3. **Bayesian Optimization**: Uses probabilistic models to guide the search for optimal parameters\n\nThe performance of each method is compared to select the best configuration.\n\n### Model Deployment\nThis chapter creates an interactive Streamlit web application that allows users to input their information and receive depression risk predictions and personalized recommendations.\n\n![Model Deployment](assets/screenshots/deployment.png)\n\nDeployment features:\n- User-friendly form interface\n- Real-time predictions\n- Risk visualization\n\n## Technologies Used\n- **Python**: Primary programming language\n- **Pandas \u0026 NumPy**: Data manipulation and analysis\n- **Scikit-learn**: Machine learning models and evaluation\n- **Streamlit**: Web application deployment\n- **Plotly**: Interactive data visualization\n- **Joblib**: Model serialization\n- **Scikit-optimize**: Bayesian optimization for hyperparameter tuning\n\n## Future Improvements\n- Expand the dataset to include more diverse student populations\n- Implement more advanced models (e.g., ensemble methods)\n- Add time-series analysis to track changes in depression risk over time\n- Develop a notification system for high-risk 
individuals\n- Integrate with university counseling services\n\n## Contributors\n- [Chalana Devinda](https://github.com/RealChAuLa)\n- [Sasith Hansaka](https://github.com/sasithhansaka) \n\n---\n\n**Disclaimer**: This tool provides an estimate based on statistical patterns and should not be used as a substitute for professional medical advice, diagnosis, or treatment. If you're experiencing mental health concerns, please consult with a qualified healthcare provider.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frealchaula%2Fstudentdepression","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frealchaula%2Fstudentdepression","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frealchaula%2Fstudentdepression/lists"}