{"id":25497445,"url":"https://github.com/linconavila/correlation-regression-analysis-research-project","last_synced_at":"2025-11-10T04:30:18.149Z","repository":{"id":274092954,"uuid":"916355190","full_name":"LinconAvila/Correlation-Regression-Analysis-Research-Project","owner":"LinconAvila","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-14T01:32:05.000Z","size":16,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-14T02:36:04.459Z","etag":null,"topics":["correlation-analysis","r","regression-analysis","statistics"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LinconAvila.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-13T23:49:20.000Z","updated_at":"2025-02-14T01:32:09.000Z","dependencies_parsed_at":null,"dependency_job_id":"bfc6d5de-3741-496c-bb86-6fcadfd32e71","html_url":"https://github.com/LinconAvila/Correlation-Regression-Analysis-Research-Project","commit_stats":null,"previous_names":["linconavila/correlation-regression-analysis-research-project"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinconAvila%2FCorrelation-Regression-Analysis-Research-Project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinconAvila%2FCorrelation-Regression-Analysis-Research-Project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinconAvila%2FCorrelation-Regression-Analysis-Research-Project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LinconAvila%2FCorrelation-Regression-Analysis-Research-Project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LinconAvila","download_url":"https://codeload.github.com/LinconAvila/Correlation-Regression-Analysis-Research-Project/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239576746,"owners_count":19662114,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["correlation-analysis","r","regression-analysis","statistics"],"created_at":"2025-02-19T01:19:56.219Z","updated_at":"2025-11-10T04:30:18.108Z","avatar_url":"https://github.com/LinconAvila.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Correlation and Regression Analysis Research Project\n\nThis repository contains the code and documentation for a research project analyzing the correlation and regression between the Human Development Index (HDI) and Life Expectancy. The analysis is performed using R and employs statistical techniques such as descriptive analysis, hypothesis testing, and regression modeling.\n\n## Project Overview\n\nThis project is divided into two main stages:\n\n1. **Exploratory Data Analysis (EDA)**:\n   - Analysis of HDI and Life Expectancy for the 50 highest-ranked countries globally.\n   - Univariate analysis: Includes measures like mean, median, standard deviation, and skewness for individual variables.\n   - Bivariate analysis: Examines relationships between HDI and Life Expectancy using scatterplots and correlation coefficients.\n\n2. **Regression Analysis**:\n   - Construction and evaluation of a linear regression model to describe the relationship between HDI (independent variable) and Life Expectancy (dependent variable).\n   - Assessment of model assumptions, including residual independence, normality, and homoscedasticity.\n   - Statistical tests to evaluate the significance of regression coefficients and overall model fit.\n\n## Repository Contents\n\n### Code\n- **`script.r`**: Contains code for data preprocessing, exploratory data analysis, and visualization.\n- **`script2.r`**: Implements linear regression modeling, statistical testing (ANOVA, Shapiro-Wilk, Ljung-Box), and model diagnostics.\n\n### Database\n- **`database.txt`**: Includes HDI and Life Expectancy data for the top 50 countries globally, sourced from:\n  - **Human Development Index (HDI)**: Extracted from the UNDP Human Development Report 2022, which evaluates development based on health, education, and living standards.\n  - **Life Expectancy**: Collected from the CIA World Factbook 2022, representing the average years a person is expected to live at birth.\n \n### Research Papers and Slides\n\n- Link to Folder: [Here](https://drive.google.com/drive/folders/1Fd6u5p-lweRe2Og5dW7qOepyhS0gsz8Y?usp=sharing)\n\nThis folder contains research papers and a slide presentation (in Portuguese) summarizing the key findings of the study.\n\n## Key Findings\n\n1. **Correlation Analysis**:\n   - Pearson's correlation coefficient `r = 0.668` indicates a moderate positive relationship between HDI and Life Expectancy.\n   - Hypothesis testing confirmed the statistical significance of this relationship (`p-value \u003c 0.05`).\n\n2. **Regression Model**:\n   - Linear regression equation: `Life Expectancy = 38.315 + 46.950 * HDI`.\n   - `R² = 0.4465`: About 44.65% of the variation in Life Expectancy is explained by HDI.\n   - Residual analysis showed no significant violations of model assumptions.\n\n3. **Model Limitations**:\n   - The model explains only 44.65% of the variance, suggesting other factors influence Life Expectancy.\n   - Future analyses could incorporate additional socioeconomic and environmental variables.\n\n## Requirements\n\n- **R version**: `\u003e= 4.0.0`\n- **R libraries**: `ggplot2`, `dplyr`, `car`, `lmtest`\n\n## Usage\n\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/LinconAvila/Correlation-Regression-Analysis-Research-Project.git\n   cd Correlation-Regression-Analysis-Research-Project\n   ```\n\n2. Open the scripts (`script.r` and `script2.r`) in RStudio.\n\n3. Run the scripts sequentially to reproduce the analyses and generate the visualizations.\n\n## References\n\n- CIA World Factbook (2022): Life Expectancy data. [https://www.cia.gov/the-world-factbook/about/archives/2022/field/life-expectancy-at-birth/country-comparison](https://www.cia.gov/the-world-factbook/about/archives/2022/field/life-expectancy-at-birth/country-comparison)\n- UNDP (2022): Human Development Index. [https://hdr.undp.org/system/files/documents/global-report-document/hdr2021-22overviewen.pdf](https://hdr.undp.org/system/files/documents/global-report-document/hdr2021-22overviewen.pdf)\n- R Documentation: [https://www.rdocumentation.org/](https://www.rdocumentation.org/)\n\n## Author\n\nLincon Avila de Souza  \nFundação Universidade Federal do Rio Grande (FURG)  \n2024-2025\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinconavila%2Fcorrelation-regression-analysis-research-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinconavila%2Fcorrelation-regression-analysis-research-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinconavila%2Fcorrelation-regression-analysis-research-project/lists"}