{"id":23825984,"url":"https://github.com/elachaabane/educational-performance-analysis","last_synced_at":"2026-05-12T11:30:17.029Z","repository":{"id":269704660,"uuid":"908206373","full_name":"ElaChaabane/Educational-Performance-Analysis","owner":"ElaChaabane","description":"analyse student exam performance using nonparametric statistical methods","archived":false,"fork":false,"pushed_at":"2024-12-25T12:59:44.000Z","size":157,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-22T01:42:58.740Z","etag":null,"topics":["dunn-s-test","kruskal-algorithm","nonparametric","r","spearman-correlation","statistical-analysis","student-performance"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ElaChaabane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-25T12:49:51.000Z","updated_at":"2024-12-25T12:59:47.000Z","dependencies_parsed_at":"2024-12-25T13:47:28.277Z","dependency_job_id":null,"html_url":"https://github.com/ElaChaabane/Educational-Performance-Analysis","commit_stats":null,"previous_names":["elachaabane/educational-performance-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ElaChaabane%2FEducational-Performance-Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ElaChaabane%2FEducational-Performance-Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/ElaChaabane%2FEducational-Performance-Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ElaChaabane%2FEducational-Performance-Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ElaChaabane","download_url":"https://codeload.github.com/ElaChaabane/Educational-Performance-Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240113943,"owners_count":19749829,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dunn-s-test","kruskal-algorithm","nonparametric","r","spearman-correlation","statistical-analysis","student-performance"],"created_at":"2025-01-02T12:12:23.994Z","updated_at":"2026-05-12T11:30:16.969Z","avatar_url":"https://github.com/ElaChaabane.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Educational Performance Analysis\n\n## Project Overview\nThis project analyzes factors influencing student exam performance using nonparametric statistical methods. 
Based on the research paper \"Simplifying Statistical Decision Making: A Research Scholar's Guide to Parametric and Non-Parametric Methods\" by Pirani (2024) doi:10.56815/IJMRR.V3I3.2024/184-192 , we implement a comprehensive analysis following the CRISP-DM methodology.\n\n## Table of Contents\n- [Business Understanding](#business-understanding)\n- [Data Understanding](#data-understanding)\n- [Data Preparation](#data-preparation)\n- [Statistical Method Selection](#statistical-method-selection)\n- [Modeling](#modeling)\n- [Evaluation](#evaluation)\n- [Dependencies](#dependencies)\n- [Installation](#installation)\n- [Usage](#usage)\n\n## Business Understanding\nThe goal is to identify key factors affecting student exam performance to help educators and administrators prioritize interventions and resource allocation.\n\n### Target Variable\n- `Exam_Score`: Final exam performance\n\n### Features\n#### Numeric Variables\n- `Hours_Studied`: Study hours allocated\n- `Attendance`: Class attendance percentage\n- `Sleep_Hours`: Average daily sleep\n- `Previous_Scores`: Prior assessment scores\n- `Tutoring_Sessions`: Extra tutoring count\n- `Physical_Activity`: Weekly physical activity hours\n\n#### Categorical Variables\n- Parental Involvement\n- Access to Resources\n- Motivation Level\n- Family Income\n- Extracurricular Activities\n- School Type\n- Gender\n- And others...\n\n## Data Preparation\n\n### Outlier Treatment\n- Used IQR method (1.5 × IQR rule)\n- Implemented boxplot visualization\n- Treated outliers by replacing with NA values\n\n### Missing Value Treatment\n- Utilized KNN imputation (k=5)\n- Visualized missing data patterns\n- Handled both numeric and categorical variables\n\n## Statistical Method Selection\n\n### Normality Testing\n1. Visual inspection:\n   - Histograms with density plots\n   - Q-Q plots\n\n2. 
Statistical testing using Jarque-Bera test:\n   - H0: Data follows normal distribution\n   - H1: Data does not follow normal distribution\n   - Results: p-values \u003c 0.05 for most variables:\n     - Hours_Studied: p = 0.003\n     - Attendance: p = 0.001\n     - Sleep_Hours: p = 0.002\n     - Previous_Scores: p = 0.004\n     - Exam_Score: p = 0.001\n     - Only Physical_Activity showed p \u003e 0.05 (p = 0.067)\n\n### Choice of Nonparametric Methods\nBased on normality test results (p \u003c 0.05), we rejected the null hypothesis of normality and selected nonparametric methods for analysis:\n1. Spearman correlation instead of Pearson\n2. Kruskal-Wallis test instead of ANOVA\n3. Dunn's test for post-hoc analysis\n\n## Modeling\nImplemented nonparametric methods:\n\n1. Correlation Analysis:\n   - Spearman correlation for numeric variables\n   - Correlation matrix visualization\n\n2. Categorical Analysis:\n   - Kruskal-Wallis tests for variables with \u003e2 groups\n   - Dunn's test with Bonferroni correction for pairwise comparisons\n   - Selection criteria: p \u003c 0.05 for statistical significance\n\n## Key Findings\n1. Strongest predictors of exam performance:\n   - Attendance (ρ = 0.69)\n   - Hours studied (ρ = 0.49)\n   - Parental involvement (Kruskal-Wallis p \u003c 0.001)\n\n2. Moderate impact factors:\n   - Previous scores (ρ = 0.19)\n   - Family income (Kruskal-Wallis p \u003c 0.001)\n   - Access to resources (Kruskal-Wallis p \u003c 0.001)\n\n3. Non-significant factors:\n   - Gender (Kruskal-Wallis p = 0.4548)\n   - School type (Kruskal-Wallis p = 0.3271)\n\n## Dependencies\n```R\nlibrary(ggplot2)      # Visualization\nlibrary(gridExtra)    # Multiple plot arrangement\nlibrary(dplyr)        # Data manipulation\nlibrary(tidyr)        # Data reshaping\nlibrary(rstatix)      # Statistical tests\nlibrary(corrplot)     # Correlation visualization\nlibrary(tseries)      # Jarque-Bera test\nlibrary(VIM)          # KNN imputation\n```\n\n## Installation\n1. 
Clone this repository\n2. Install R and RStudio\n3. Install required packages:\n```R\ninstall.packages(c(\"ggplot2\", \"gridExtra\", \"dplyr\", \"tidyr\",\n                  \"rstatix\", \"corrplot\", \"tseries\", \"VIM\"))\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felachaabane%2Feducational-performance-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felachaabane%2Feducational-performance-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felachaabane%2Feducational-performance-analysis/lists"}