Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sahar-dev/multivariate_statistics
Explore multivariate statistics through hands-on university projects. Each project delves into real-world datasets, applying statistical techniques like ANOVA, two-factor analysis, and binary logistic regression. Understand data analysis, interpretation, and modeling with R.
anova-analysis linear-regression multivariate-statistics r
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/sahar-dev/multivariate_statistics
- Owner: Sahar-dev
- Created: 2023-12-24T14:52:12.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2023-12-24T15:04:14.000Z (12 months ago)
- Last Synced: 2023-12-24T16:19:40.838Z (12 months ago)
- Topics: anova-analysis, linear-regression, multivariate-statistics, r
- Language: Jupyter Notebook
- Homepage:
- Size: 472 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Multivariate Statistics University Projects
This repository contains projects related to multivariate statistics completed as part of university coursework. Each project explores the application of statistical techniques to real-world datasets, providing valuable insights into data analysis and interpretation. Below, you'll find summaries of each project along with instructions on how to explore them.
## TP1: Analysis of Variance (ANOVA) - One Factor
Explore the distribution of lead concentration across four substances using one-way Analysis of Variance (ANOVA). Key steps include:

- **Variable Description**: Understand the variable "Lead" in relation to the type of product.
- **Preliminary Tests**: Verify ANOVA assumptions through preliminary tests.
- **ANOVA Testing**: Examine if there's a significant difference in lead concentration among the four substances.
- **Residual Analysis**: Validate ANOVA results through a meticulous analysis of residuals.
- **Bonferroni Correction**: Identify the substance with the highest lead concentration using Bonferroni correction.

## TP2: Two-Factor Analysis of Variance
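Before adding a second factor, it may help to see the one-factor workflow from TP1 in code. The following is a minimal R sketch, not the repository's actual script. The data frame `lead_data`, its columns `product` and `lead`, and every value in it are simulated placeholders, because the TP1 dataset is not reproduced here. The choice of Bartlett's test and the Shapiro-Wilk test for the assumption checks is also an assumption, since the README does not name the preliminary tests.

```r
# Simulated stand-in data (hypothetical): four products, ten measurements each.
set.seed(42)
lead_data <- data.frame(
  product = factor(rep(c("A", "B", "C", "D"), each = 10)),
  lead    = c(rnorm(10, mean = 2.0, sd = 0.5),
              rnorm(10, mean = 2.2, sd = 0.5),
              rnorm(10, mean = 3.5, sd = 0.5),
              rnorm(10, mean = 2.1, sd = 0.5))
)

# Variable description: mean lead concentration per product.
group_means <- tapply(lead_data$lead, lead_data$product, mean)

# Preliminary test (assumed choice): homogeneity of variances across groups.
bartlett_res <- bartlett.test(lead ~ product, data = lead_data)

# One-way ANOVA: does mean lead concentration differ among products?
fit <- aov(lead ~ product, data = lead_data)
anova_table <- summary(fit)[[1]]

# Residual analysis (assumed choice): normality of the model residuals.
shapiro_res <- shapiro.test(residuals(fit))

# Pairwise t-tests with Bonferroni-adjusted p-values.
pairwise_res <- pairwise.t.test(lead_data$lead, lead_data$product,
                                p.adjust.method = "bonferroni")

print(group_means)
print(bartlett_res)
print(anova_table)
print(shapiro_res)
print(pairwise_res)
```

Reading the pairwise table next to `group_means` shows which product stands apart from the others. With real data, the residual checks should be inspected before the ANOVA table is trusted.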
### Exercise 1:
Evaluate responses to three treatments at two different doses each. Key objectives include:

- **Variable Analysis**: Describe the "Response" variable and assess its normality.
- **Dose-Response Relationship**: Investigate the relationship between "Response" and preparations at different doses.
- **Average Response**: Calculate average responses for each treatment and draw meaningful conclusions.
- **Model 1 Analysis**: Analyze the effect of the preparation on the response without considering the dose.
- **Model 2 Analysis**: Test the effects of treatment, dose, and their interaction.
- **Subgroup Analysis**: Conduct subgroup analysis based on the dose, determining the most effective treatment using pairwise t-tests with Bonferroni correction.
- **ANCOVA**: Perform an analysis of covariance (ANCOVA) considering patient age as a covariate.

## TP3: Binary Logistic Regression
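Before moving on to a binary outcome, the two-factor analysis from TP2 Exercise 1 can be sketched in R as follows. This is an illustration under stated assumptions, not the coursework solution. The data frame `trial` and its columns `treatment`, `dose`, `age`, and `response` are hypothetical names, and all values are simulated with no built-in effect, so the printed p-values carry no meaning.

```r
# Simulated stand-in data (hypothetical): 3 treatments x 2 doses x 8 patients.
set.seed(1)
trial <- expand.grid(
  treatment = factor(c("T1", "T2", "T3")),
  dose      = factor(c("low", "high"), levels = c("low", "high")),
  patient   = 1:8
)
trial$age      <- round(runif(nrow(trial), min = 20, max = 70))
trial$response <- rnorm(nrow(trial), mean = 10, sd = 2)

# Average response for each treatment/dose combination.
cell_means <- aggregate(response ~ treatment + dose, data = trial, FUN = mean)

# First analysis: effect of treatment alone, ignoring dose.
m_treatment <- aov(response ~ treatment, data = trial)

# Second analysis: treatment, dose, and their interaction.
m_two_factor <- aov(response ~ treatment * dose, data = trial)

# Subgroup analysis: within each dose, compare treatments pairwise
# using Bonferroni-adjusted t-tests.
by_dose <- lapply(split(trial, trial$dose), function(d) {
  pairwise.t.test(d$response, d$treatment, p.adjust.method = "bonferroni")
})

# ANCOVA: same two-factor model with patient age as a covariate.
m_ancova <- aov(response ~ age + treatment * dose, data = trial)

print(cell_means)
print(summary(m_treatment))
print(summary(m_two_factor))
print(by_dose)
print(summary(m_ancova))
```

The covariate is entered first in the ANCOVA formula because `aov` uses sequential sums of squares, so the treatment and dose effects are then assessed after adjusting for age.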
Apply binary logistic regression to the "birthwt" dataset, aiming to identify risk factors associated with low birth weight. Tasks include:

- **Data Preprocessing**: Transform variables for regression analysis.
- **Dataset Partitioning**: Split the dataset into training and testing sets.
- **Regression Modeling**: Apply logistic regression with model selection using AIC and stepwise methods.
- **Influential Values**: Identify influential values and determine significant variables affecting low birth weight.
- **Odds Ratios and Confidence Intervals**: Calculate odds ratios and their confidence intervals.
- **Performance Evaluation**: Assess predictive performance using the "blorr" package.

Feel free to explore each project for comprehensive analyses and insights into the world of multivariate statistical techniques. Dive into the code, methodologies, and results to enhance your understanding of statistical modeling.
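The TP3 tasks listed above can be sketched roughly as follows. This is a minimal illustration rather than the repository's notebook, and it makes several assumptions: `birthwt` is loaded from the `MASS` package, the split is 70/30, influential points are flagged where Cook's distance exceeds 4/n, and the classification cutoff is 0.5. The README mentions the `blorr` package for performance evaluation, but this sketch uses a base-R confusion matrix instead so that it runs without extra packages.

```r
library(MASS)  # ships the birthwt dataset

# Data preprocessing: recode categorical variables as factors and drop bwt,
# the raw birth weight from which the outcome `low` is derived.
bw <- birthwt
bw$race  <- factor(bw$race,  levels = 1:3, labels = c("white", "black", "other"))
bw$smoke <- factor(bw$smoke, levels = 0:1, labels = c("no", "yes"))
bw$ht    <- factor(bw$ht,    levels = 0:1, labels = c("no", "yes"))
bw$ui    <- factor(bw$ui,    levels = 0:1, labels = c("no", "yes"))
bw$bwt   <- NULL

# Dataset partitioning: 70% training, 30% testing (assumed proportions).
set.seed(123)
train_idx <- sample(nrow(bw), size = floor(0.7 * nrow(bw)))
train <- bw[train_idx, ]
test  <- bw[-train_idx, ]

# Regression modeling: full model, then AIC-based stepwise selection.
full_model <- glm(low ~ ., data = train, family = binomial)
selected   <- step(full_model, direction = "both", trace = 0)

# Influential values: Cook's distance above 4/n (a common rule of thumb).
cooks_d     <- cooks.distance(selected)
influential <- which(cooks_d > 4 / nrow(train))

# Odds ratios with 95% Wald confidence intervals.
or_table <- exp(cbind(OR = coef(selected), confint.default(selected)))

# Performance evaluation on the held-out set with a 0.5 cutoff.
prob      <- predict(selected, newdata = test, type = "response")
predicted <- factor(as.integer(prob > 0.5), levels = 0:1)
observed  <- factor(test$low, levels = 0:1)
conf_mat  <- table(predicted = predicted, observed = observed)
accuracy  <- sum(diag(conf_mat)) / sum(conf_mat)

print(summary(selected))
print(influential)
print(or_table)
print(conf_mat)
print(accuracy)
```

An odds ratio whose confidence interval excludes 1 points to a variable associated with low birth weight in the training data. The test set here is small, so the accuracy estimate is noisy.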
## How to Explore
1. Navigate to each project folder (`TP1`, `TP2`, `TP3`) for detailed documentation and code.
2. Review the provided README files for specific instructions and objectives.
3. Dive into the R scripts and notebooks to understand the methodology and analysis steps.
4. Explore the datasets associated with each project.
5. Gain insights into statistical techniques and their application through real-world examples.

Feel free to use, modify, and learn from the projects. If you have any questions or suggestions, please don't hesitate to reach out. Happy coding!