https://github.com/kakri787/alcoholism-and-grade-analysis
A mini-project for a university data science module in which we analyzed the relationship between students' alcohol consumption and their academic performance, using exploratory data analysis and machine learning techniques to see whether we can predict students' grades.
- Host: GitHub
- URL: https://github.com/kakri787/alcoholism-and-grade-analysis
- Owner: kakri787
- Created: 2024-04-18T13:15:50.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-23T07:41:08.000Z (about 1 year ago)
- Last Synced: 2025-04-12T22:56:01.262Z (21 days ago)
- Topics: data-analysis, data-science, data-vizualisation, lasso-regression, machine-learning, neural-network
- Language: Jupyter Notebook
- Homepage:
- Size: 16.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Alcohol Consumption and Academic Performance Repository
## About
This is our mini-project for SC1015, in which we analyze the relationship between students' alcohol consumption and their academic performance. A detailed walkthrough of the project is available in our source code notebook [here](https://github.com/kakri787/alcoholism-and-grade-analysis/blob/c4ef8e35e4255674938fbc6ec907f8aafa9a685e/Mini%20Project%20Source%20Code.ipynb).
## Problem Definition
* Can we predict a student's academic performance, measured by final grade, from the frequency of their alcohol consumption?
* If not, are there other factors in the dataset that are better suited to predicting academic performance?
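
As a rough illustration of the first question, the strength of the relationship between an alcohol-consumption rating and a final grade can be summarised with a Pearson correlation coefficient. The sketch below is not the notebook's code: the helper `pearson_r` is hypothetical, the values are made up, and the scales (a 1 to 5 consumption rating and a 0 to 20 grade) are assumed from the description of the Kaggle dataset listed in the references.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values, NOT taken from the project's dataset.
weekend_alcohol = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]        # assumed scale: 1 (very low) to 5 (very high)
final_grade = [12, 14, 10, 13, 11, 13, 10, 14, 11, 12]  # assumed scale: 0 to 20

r = pearson_r(weekend_alcohol, final_grade)
print(f"r = {r:.3f}")
```

For these made-up values the coefficient is -0.25, a weak negative relationship. A value this close to zero means the consumption rating alone explains little of the variation in grades.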
## Contributors
* @karki787 (Kar Kit) - Data Preparation and Cleanup, Lasso Regression/Neural Network Embedding models
* @kyoniea (Nicholas) - Worked on EDA and problem formulation, Presentation Slides
* @showtimezxc (Aliff) - Worked on EDA and research topic, Presentation Slides
## Models Used
* Lasso Regression
* Neural Network Embedding
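
To show why Lasso regression suits a variable-selection question like ours, here is a minimal pure-Python sketch of Lasso fitted by cyclic coordinate descent. It is an illustration under stated assumptions, not the notebook's implementation: the function names are hypothetical, the data is made up, and the features and target are assumed to be centred so that no intercept is needed. In practice a library implementation such as scikit-learn's `Lasso` would normally be used instead.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.

    X is a list of rows. Columns of X and the target y are assumed to be
    centred already, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iters):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0:
                continue  # a constant (all-zero) column carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(v * (r + w[j] * v) for v, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, alpha) / z
            # Keep the residual in sync with the updated coefficient.
            delta = new_wj - w[j]
            residual = [r - delta * v for r, v in zip(residual, col)]
            w[j] = new_wj
    return w


# Made-up, centred data: feature 0 drives the target, feature 1 barely matters.
X = [[-2, 1], [-1, -2], [0, 0], [1, 2], [2, -1]]
y = [-5.9, -3.1, 0.0, 3.1, 5.9]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)
```

With this made-up data the coefficient of the weak feature is driven to exactly zero, while the strong feature keeps a shrunken, non-zero coefficient. That built-in selection is what makes Lasso useful for checking which variables carry predictive signal.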
## Conclusion
* Alcohol consumption has a very weak negative correlation with academic performance
* Alcohol consumption cannot be used as a reliable predictor of students' grades
* Several other factors that contribute to a student's learning environment and background also have a low correlation with academic performance
* Even the best combination of variables in our dataset cannot predict students' grades accurately
* A student's decision to pursue higher education has the greatest correlation with their grades
* The Neural Network Embedding model predicted grades with a mean absolute error of around 20%, which is still considerably high
* Every student is unique and there is no standard baseline for a student's academic ability, so it is difficult to predict a student's grade
## What did we learn?
* Handling skewed data
* Lasso Regression
* Neural Network Embedding, Keras, TensorFlow
## References
* https://en.wikipedia.org/wiki/Lasso_(statistics)
* https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
* https://www.kaggle.com/datasets/uciml/student-alcohol-consumption
* https://www.kaggle.com/datasets/whenamancodes/alcohol-effects-on-study