https://github.com/zobayerakib/student-result-data-analysis__data-analysis-project

linear-regression machine-learning mathplotlib numpy pandas predictive-analytics random-forest-regression seaborn student-result-analysis


Host: GitHub
URL: https://github.com/zobayerakib/student-result-data-analysis__data-analysis-project
Owner: ZobayerAkib
License: mit
Created: 2024-06-12T08:12:33.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-06-12T08:17:06.000Z (about 2 years ago)
Last Synced: 2025-12-31T15:06:31.812Z (7 months ago)
Topics: linear-regression, machine-learning, mathplotlib, numpy, pandas, predictive-analytics, random-forest-regression, seaborn, student-result-analysis
Language: Jupyter Notebook
Homepage:
Size: 572 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: License

README

## Student Result Analysis using Regression Models

### Overview

This project analyzes student results using machine learning regression models. We split the dataset 80:20 into training and test sets. The goal was to predict the scores in seven subjects ('math_score', 'history_score', 'physics_score', 'chemistry_score', 'biology_score', 'english_score', 'geography_score') from a single feature: the number of weekly self-study hours.
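The 80:20 split described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's `train_test_split`, uses a synthetic stand-in for the Kaggle data, and the feature column name `weekly_self_study_hours` and the helper `make_synthetic_students` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Target columns as named in the overview.
SUBJECTS = ["math_score", "history_score", "physics_score", "chemistry_score",
            "biology_score", "english_score", "geography_score"]


def make_synthetic_students(n=500, seed=0):
    """Hypothetical stand-in for the real dataset: random hours, noisy scores."""
    rng = np.random.default_rng(seed)
    hours = rng.integers(0, 41, size=n)  # 0 to 40 hours per week
    data = {"weekly_self_study_hours": hours}  # assumed column name
    for name in SUBJECTS:
        noisy = 60 + 0.4 * hours + rng.normal(0, 12, size=n)
        data[name] = np.clip(noisy, 0, 100)  # keep scores in a 0-100 range
    return pd.DataFrame(data)


df = make_synthetic_students()
X = df[["weekly_self_study_hours"]]  # single feature, kept 2-D
y = df[SUBJECTS]                     # seven target columns

# test_size=0.2 gives the 80:20 split; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```

Fixing `random_state` keeps the split reproducible, so both models are later compared on the same test rows.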

### Model Training and Evaluation

We trained two regression models:

- **Linear Regression (LR)**

- **Random Forest Regression (RDF)**
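Training the two models listed above might look like the sketch below. It assumes scikit-learn's `LinearRegression` and `RandomForestRegressor`, both of which accept a 2-D target and so can predict all seven scores at once. The data is synthetic, and the hyperparameters (`n_estimators=100`, `random_state=42`) are illustrative assumptions, not values taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: one feature (hours), seven score columns.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 40, size=(400, 1))
scores = 60 + 0.4 * hours + rng.normal(0, 12, size=(400, 7))

X_train, X_test, y_train, y_test = train_test_split(
    hours, scores, test_size=0.2, random_state=42
)

# Keys reuse the abbreviations from the list above.
models = {
    "LR": LinearRegression(),
    "RDF": RandomForestRegressor(n_estimators=100, random_state=42),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)                # fit on the 80% training split
    predictions[name] = model.predict(X_test)  # predict the held-out 20%
    print(name, predictions[name].shape)
```

Keeping both models in one dictionary makes it easy to run the same evaluation over each of them afterwards.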

After training the models, we evaluated their performance using the following metrics:

- **Mean Absolute Error (MAE)**

- **Mean Squared Error (MSE)**

- **Root Mean Squared Error (RMSE)**

- **R-squared (R²)**
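For reference, the four metrics above can be written out directly with NumPy. This sketch uses the standard definitions for a single target column; the function name `regression_metrics` and the example numbers are hypothetical, and the project may instead have used library helpers, whose averaging over several target columns can differ.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for one target column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.mean(np.abs(errors))  # average absolute error
    mse = np.mean(errors ** 2)     # average squared error
    rmse = np.sqrt(mse)            # same units as the scores

    ss_res = np.sum(errors ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # share of variance explained

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up scores and predictions, for illustration only.
metrics = regression_metrics([70, 80, 90, 60], [72, 78, 85, 65])
print(metrics)
```

MAE and RMSE are expressed in score points, while R² is unitless: a value near 0 means the model does little better than always predicting the mean score.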

### Results

| Model                  | MAE     | MSE      | RMSE    | R²     |
|------------------------|---------|----------|---------|--------|
| Linear Regression (LR) | 10.5696 | 157.1996 | 12.5121 | 0.0699 |
| Random Forest (RDF)    | 10.3420 | 155.1183 | 12.4352 | 0.0810 |
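As a quick arithmetic check on the table, the snippet below compares each reported RMSE with the square root of the reported MSE and prints R² as a percentage. The reported RMSE values are slightly below the square roots. One possible explanation, which the README does not confirm, is that RMSE was computed per subject and then averaged: an average of per-subject square roots can never exceed the square root of the averaged MSE.

```python
import math

# Values copied from the results table above.
results = {
    "LR": {"MSE": 157.1996, "RMSE": 12.5121, "R2": 0.0699},
    "RDF": {"MSE": 155.1183, "RMSE": 12.4352, "R2": 0.0810},
}

gaps = {}
for name, row in results.items():
    sqrt_mse = math.sqrt(row["MSE"])
    gaps[name] = sqrt_mse - row["RMSE"]  # positive if reported RMSE is lower
    print(
        f"{name}: sqrt(MSE)={sqrt_mse:.4f}, "
        f"reported RMSE={row['RMSE']:.4f}, "
        f"gap={gaps[name]:.4f}, "
        f"R2 as share of variance={row['R2']:.1%}"
    )
```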

### Conclusion

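Both models yielded similar results, with Random Forest Regression slightly outperforming Linear Regression on all four metrics. However, the R² values of roughly 0.07 to 0.08 mean that weekly self-study hours account for only about 7 to 8% of the variance in subject scores, and the MAE of about 10 points is correspondingly large. This suggests that the relationship between weekly self-study hours and subject scores is more complex than a single feature can capture, and that further investigation or feature engineering may be needed.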

### Data Source

The dataset used for this analysis is sourced from Kaggle.
