https://github.com/juzershakir/student-intervention

Predicting the likelihood that a given student will pass or not!
adaboost classification f1-score gridsearchcv logistic-regression machine-learning python2 random-forest scikit-learn student-intervention supervised-learning

# Creating a Student Intervention System
## Supervised Learning, Classification

-----

### Table Of Contents:
- [Description](#description)

- [About the Project](#about-the-project)

- [What needs to be done](#what-needs-to-be-done)

- [Data](#data)

- [Files](#files)

- [Dataset file](#dataset-file)

- [Loading Project](#loading-project)

- [Requirements](#requirements)

- [Execution](#execution)

----

### Description

#### About the Project
As education has grown to rely more on technology, vast amounts of data have become available for examination and prediction. Logs of student activities, grades, interactions with teachers and fellow students, and more are now captured in real time through learning management systems like **Canvas** and **Edmodo**. This is especially true for online classrooms, which are becoming popular even at the primary and secondary school level. At all levels of education, there is a push to increase the likelihood of student success without watering down the education or engaging in behaviors that fail to address the underlying issues. Graduation rates are often the criterion of choice, and educators seek new ways to predict the success and failure of students early enough to stage effective interventions.

#### What needs to be done
A local school district has a goal to reach a 95% graduation rate by the end of the decade by identifying students who need intervention before they drop out of school. As a software engineer contacted by the school district, my task is to model the factors that predict how likely a student is to pass their high school final exam, by constructing an intervention system that leverages supervised learning techniques. To save on the budget, the board of supervisors has asked that I find the most effective model with the lowest computation cost. I will need to analyze the dataset on students' performance and develop a model that predicts the likelihood that a given student will pass, quantifying whether an intervention is necessary.
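
One way to weigh effectiveness against computation cost is to score each candidate model and time its predictions. The sketch below is dependency-free and purely illustrative. It assumes the F1 score on the "yes" (passed) class as the effectiveness measure (F1 appears in this project's topics, but the notebook's exact scoring code is not reproduced here), and `always_yes` is a hypothetical baseline predictor, not one of the project's models.

```python
import time


def f1_score(y_true, y_pred, pos_label="yes"):
    """F1 is the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    if tp == 0:
        return 0.0  # without true positives, F1 is reported as 0
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    return 2 * precision * recall / (precision + recall)


def score_and_time(predict, rows, y_true):
    """Return (F1 score, seconds spent predicting) for a predict(rows) callable."""
    start = time.time()
    y_pred = predict(rows)
    elapsed = time.time() - start
    return f1_score(y_true, y_pred), elapsed


def always_yes(rows):
    """Hypothetical baseline: predict that every student passes."""
    return ["yes"] * len(rows)


# Made-up labels for illustration only; not taken from data.csv.
labels = ["yes", "yes", "no", "yes", "no"]
rows = [None] * len(labels)  # the baseline ignores the feature rows
score, seconds = score_and_time(always_yes, rows, labels)
print("baseline F1: %.3f (%.6f s)" % (score, seconds))
```

A trained model is only worth its extra computation if it clearly beats a trivial baseline like this one on held-out data.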

-----

### Data

#### Files
- `report.ipynb` : This is the main file where I perform my work on the project.
- `data.csv` : The dataset required for the project. I load this dataset in the notebook.
- `export/` : Folder containing HTML and PDF versions of the notebook.

Template code is provided in the `report.ipynb` notebook file. I am required to use the `data.csv` dataset file to complete this work. While some code has already been implemented to get started, I have to implement additional functionality where requested to complete the project successfully.

#### Dataset file
The dataset used in this project is included as `data.csv`. This dataset has the following attributes:

- `school` : student's school (binary: "GP" or "MS")
- `sex` : student's sex (binary: "F" - female or "M" - male)
- `age` : student's age (numeric: from 15 to 22)
- `address` : student's home address type (binary: "U" - urban or "R" - rural)
- `famsize` : family size (binary: "LE3" - less than or equal to 3 or "GT3" - greater than 3)
- `Pstatus` : parent's cohabitation status (binary: "T" - living together or "A" - apart)
- `Medu` : mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
- `Fedu` : father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
- `Mjob` : mother's job (nominal: "teacher", "health" care related, civil "services" (e.g. administrative or police), "at_home" or "other")
- `Fjob` : father's job (nominal: "teacher", "health" care related, civil "services" (e.g. administrative or police), "at_home" or "other")
- `reason` : reason to choose this school (nominal: close to "home", school "reputation", "course" preference or "other")
- `guardian` : student's guardian (nominal: "mother", "father" or "other")
- `traveltime` : home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 - >1 hour)
- `studytime` : weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
- `failures` : number of past class failures (numeric: n if 1<=n<3, else 4)
- `schoolsup` : extra educational support (binary: yes or no)
- `famsup` : family educational support (binary: yes or no)
- `paid` : extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
- `activities` : extra-curricular activities (binary: yes or no)
- `nursery` : attended nursery school (binary: yes or no)
- `higher` : wants to take higher education (binary: yes or no)
- `internet` : Internet access at home (binary: yes or no)
- `romantic` : in a romantic relationship (binary: yes or no)
- `famrel` : quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
- `freetime` : free time after school (numeric: from 1 - very low to 5 - very high)
- `goout` : going out with friends (numeric: from 1 - very low to 5 - very high)
- `Dalc` : workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
- `Walc` : weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
- `health` : current health status (numeric: from 1 - very bad to 5 - very good)
- `absences` : number of school absences (numeric: from 0 to 93)
- `passed` : did the student pass the final exam (binary: yes or no)
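
Most of these attributes are stored as strings, so they have to be converted to numbers before a classifier can use them. The following is a minimal, dependency-free sketch of one common approach, not necessarily what the notebook does: yes/no values become 1/0, numeric columns are parsed as integers, and every other column is expanded into one indicator column per category. The two sample rows are made up and use only a subset of the columns; `encode_rows` and the constants are hypothetical names. In the notebook, `pandas.read_csv` and `pandas.get_dummies` could serve the same purpose.

```python
import csv
import io

# Two made-up rows with a subset of the columns, for illustration only.
SAMPLE = u"""school,sex,age,Mjob,internet,passed
GP,F,18,teacher,no,no
MS,M,16,other,yes,yes
"""

BINARY_YES_NO = {"yes": 1, "no": 0}
NUMERIC_COLUMNS = {"age"}  # assumption: extend with the other numeric attributes
TARGET = "passed"


def encode_rows(rows):
    """Turn string-valued rows into numeric feature dicts plus 0/1 labels."""
    # Collect the categories seen in each column that is neither numeric nor yes/no.
    categories = {}
    for row in rows:
        for col, value in row.items():
            if col == TARGET or col in NUMERIC_COLUMNS or value in BINARY_YES_NO:
                continue
            categories.setdefault(col, set()).add(value)

    features, labels = [], []
    for row in rows:
        encoded = {}
        for col, value in row.items():
            if col == TARGET:
                continue
            if col in NUMERIC_COLUMNS:
                encoded[col] = int(value)
            elif value in BINARY_YES_NO:
                encoded[col] = BINARY_YES_NO[value]
            else:
                # One indicator column per category, e.g. Mjob_teacher.
                for cat in sorted(categories[col]):
                    encoded["%s_%s" % (col, cat)] = int(value == cat)
        features.append(encoded)
        labels.append(BINARY_YES_NO[row[TARGET]])
    return features, labels


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
X, y = encode_rows(rows)
print(sorted(X[0].items()))
print(y)
```

In this sketch, two-valued columns that are not yes/no (such as `school` or `sex`) also get one indicator per value, which is redundant but harmless for illustration.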

-----

### Loading Project

#### Requirements

This project requires **Python 2.7.15** and the Python libraries listed below:

- [Python 2.7.15](https://www.python.org/downloads/release/python-2715/)
- [NumPy](http://www.numpy.org/)
- [Pandas](http://pandas.pydata.org)
- [scikit-learn 0.19.1](http://scikit-learn.org/stable/whats_new.html#version-0-19-1)

You will also need software that can run a [Jupyter Notebook](http://ipython.org/notebook.html).

I recommend installing [Anaconda](https://www.anaconda.com/download/), a pre-packaged Python distribution that contains all of the necessary libraries and software for this project.
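
Before opening the notebook, it can help to confirm which versions are actually installed. The snippet below is a small sketch for that purpose; `installed_versions` is a hypothetical helper, not part of this repository. It reports `None` for any module that cannot be imported, so it runs even when a library is missing.

```python
import importlib
import sys


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Some modules do not expose __version__; report that explicitly.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("Python %d.%d.%d" % tuple(sys.version_info[:3]))
# "sklearn" is the import name of scikit-learn.
report = installed_versions(["numpy", "pandas", "sklearn"])
for name, version in sorted(report.items()):
    print("%-8s %s" % (name, version if version is not None else "NOT INSTALLED"))
```

Compare the printed versions with the ones listed above; newer releases may behave differently from the versions this project was written against.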

#### Execution

In a terminal or command window, navigate to the top-level project directory `Student-Intervention/` (that contains this README) and run one of the following commands:

```bash
ipython notebook report.ipynb
```
or
```bash
jupyter notebook report.ipynb
```
or, if you have JupyterLab installed:
```bash
jupyter lab
```

This will open the Jupyter/IPython Notebook software and the project file in your browser.

-----