https://github.com/jabulente/tukeys-honest-significant-difference
This project explores Tukey’s Honest Significant Difference test as a robust statistical method for comparing group means after conducting ANOVA. In real-world data analysis, we often need to determine not just whether groups are different, but which specific groups differ
- Host: GitHub
- URL: https://github.com/jabulente/tukeys-honest-significant-difference
- Owner: Jabulente
- License: mit
- Created: 2025-08-25T14:03:08.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-09-22T07:02:32.000Z (8 months ago)
- Last Synced: 2025-09-22T09:21:47.978Z (8 months ago)
- Topics: ai, anova-test, exploratory-data-analysis, ml, post-hoc-analysis, python, scipy, statsmodels, tukey-hsd
- Language: Jupyter Notebook
- Homepage:
- Size: 1.9 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Tukey’s Honest Significant Difference
This project explores **Tukey’s Honest Significant Difference (HSD) test**, a post-hoc method for comparing group means after a significant **ANOVA**. In real-world data analysis, we often need to determine not just whether groups differ, but **which specific groups differ**. This project implements the HSD procedure in **Python** to run pairwise post-hoc comparisons across multiple groups. Because the procedure controls the family-wise error rate across all pairwise comparisons, it helps researchers, students, and professionals draw statistically sound conclusions while limiting false positives from multiple testing.
The workflow was carefully designed to ensure replicability, clarity, and flexibility. The project demonstrates not only the implementation of statistical methods but also best practices in **data analysis workflow, reproducibility, visualization, and interpretation**.
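As background for the pairwise comparisons described above, the test statistic can be written out explicitly. The notation here is an assumption introduced for illustration and does not come from the project: $k$ groups, $N$ observations in total, group sizes $n_i$, group means $\bar{y}_i$, and the within-group mean square $MS_W$ from the ANOVA table.

$$
q_{ij} = \frac{\lvert \bar{y}_i - \bar{y}_j \rvert}{\sqrt{\dfrac{MS_W}{2}\left(\dfrac{1}{n_i} + \dfrac{1}{n_j}\right)}}
$$

Groups $i$ and $j$ are declared different at level $\alpha$ when $q_{ij}$ exceeds $q_{\alpha;\,k,\,N-k}$, the upper-$\alpha$ critical value of the studentized range distribution. With equal group sizes $n_i = n_j = n$, this reduces to comparing $\lvert \bar{y}_i - \bar{y}_j \rvert$ against a single threshold:

$$
\mathrm{HSD} = q_{\alpha;\,k,\,N-k}\,\sqrt{\frac{MS_W}{n}}
$$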
---
## 1. ⚙️ Workflow / Methodology
1. **Data Preparation:** Import, clean, and preprocess data; check that the ANOVA assumptions (independent observations, approximately normal residuals, similar group variances) are reasonable.
2. **Exploratory Data Analysis (EDA):** Generate summary statistics, descriptive plots, and check distributions and variances.
3. **ANOVA:** Test for overall group differences and assess significance.
4. **Post-hoc Analysis (Tukey HSD):** Compare pairs of groups with adjustments for multiple testing.
5. **Results Interpretation:** Present significant differences in tables and visualizations.
6. **Documentation & Reporting:** Deliver structured, clear textual and graphical summaries of findings.
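
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data are synthetic, the group names `A`, `B`, `C` are hypothetical, and it assumes SciPy 1.8 or later, where `scipy.stats.tukey_hsd` is available.

```python
import numpy as np
from scipy import stats

# Synthetic data for three hypothetical groups (NOT the project's dataset).
rng = np.random.default_rng(42)
groups = {
    "A": rng.normal(loc=10.0, scale=1.0, size=30),
    "B": rng.normal(loc=10.2, scale=1.0, size=30),
    "C": rng.normal(loc=13.0, scale=1.0, size=30),
}
samples = list(groups.values())

# Steps 1-2: quick assumption checks.
# Levene's test looks at equality of variances; Shapiro-Wilk at normality per group.
levene_stat, levene_p = stats.levene(*samples)
shapiro_p = {name: stats.shapiro(x).pvalue for name, x in groups.items()}

# Step 3: one-way ANOVA for an overall difference among group means.
f_stat, anova_p = stats.f_oneway(*samples)

# Step 4: Tukey's HSD for all pairwise comparisons (SciPy >= 1.8).
tukey = stats.tukey_hsd(*samples)

print(f"Levene p-value: {levene_p:.3f}")
print("Shapiro-Wilk p-values:", {k: round(v, 3) for k, v in shapiro_p.items()})
print(f"ANOVA: F = {f_stat:.2f}, p = {anova_p:.3g}")
print(tukey)  # table of pairwise differences, adjusted p-values, and CIs
```

`tukey.pvalue` is a square matrix indexed in the order the samples were passed, so `tukey.pvalue[0, 2]` is the adjusted p-value for the first group versus the third.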
## 2. 🛠️ Tools and Technologies Used
This project was built in **Python** and demonstrates proficiency with the following libraries:
* **pandas** → Data manipulation and analysis
* **numpy** → Numerical computations
* **scipy** → Statistical testing
* **statsmodels** → ANOVA and post-hoc tests (including Tukey’s HSD)
* **matplotlib** & **seaborn** → Data visualization and results plotting
* **Jupyter Notebook** → Interactive workflow, documentation, and reporting
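
As one way these libraries might fit together, the sketch below runs Tukey's HSD through statsmodels on a long-format pandas DataFrame. The column names `group` and `score`, the group labels, and the data are all hypothetical; the project's notebook may be organized differently.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical long-format data: one row per observation.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["control", "low", "high"], 25),
    "score": np.concatenate([
        rng.normal(50, 5, 25),  # control
        rng.normal(51, 5, 25),  # low
        rng.normal(60, 5, 25),  # high
    ]),
})

# Pairwise comparisons with a 5% family-wise error rate.
result = pairwise_tukeyhsd(endog=df["score"], groups=df["group"], alpha=0.05)
print(result.summary())

# Convert the summary table into a DataFrame for filtering or export.
summary_data = result.summary().data
table = pd.DataFrame(summary_data[1:], columns=summary_data[0])
print(table[table["reject"]])  # pairs flagged as significantly different
```

statsmodels sorts the group labels, so the rows appear in the order control/high, control/low, high/low rather than in the order the data were entered.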
## 3. Results
* ANOVA confirmed overall group differences.
* Tukey’s HSD identified specific group pairs with significant differences.
* Visualizations (confidence intervals and mean difference plots) supported findings.
* Clear group separation provides actionable insights for research or decision-making.
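
A confidence-interval plot of the kind mentioned above could be produced along these lines. This is a hedged sketch on synthetic data with hypothetical labels, using `scipy.stats.tukey_hsd` (SciPy 1.8 or later) and matplotlib; it is not the plotting code from the notebook, and the output filename `tukey_ci.png` is an arbitrary choice.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Synthetic data for three hypothetical groups.
rng = np.random.default_rng(7)
labels = ["A", "B", "C"]
samples = [rng.normal(mean, 1.0, 30) for mean in (10.0, 10.2, 13.0)]

res = stats.tukey_hsd(*samples)
ci = res.confidence_interval(confidence_level=0.95)

# One row per unordered pair: (label, mean difference, CI low, CI high).
rows = []
for i, j in itertools.combinations(range(len(labels)), 2):
    rows.append((f"{labels[i]} - {labels[j]}",
                 res.statistic[i, j], ci.low[i, j], ci.high[i, j]))

fig, ax = plt.subplots(figsize=(6, 3))
for y, (name, diff, low, high) in enumerate(rows):
    significant = low > 0 or high < 0  # interval excludes zero
    ax.errorbar(diff, y, xerr=[[diff - low], [high - diff]], fmt="o",
                capsize=4, color="tab:red" if significant else "tab:gray")
ax.axvline(0, linestyle="--", color="black", linewidth=1)
ax.set_yticks(range(len(rows)))
ax.set_yticklabels([row[0] for row in rows])
ax.set_xlabel("Mean difference (95% family-wise CI)")
fig.tight_layout()
fig.savefig("tukey_ci.png", dpi=150)
```

An interval that does not cross the dashed zero line corresponds to a pair that Tukey's HSD flags as different at the 5% family-wise level.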
## 4. 🤝 Contribution
Contributions are welcome! If you’d like to:
* Improve the methodology
* Add datasets for testing
* Enhance visualization and reporting
Please fork the repository and create a pull request.
## 5. 📜 License
This project is licensed under the **MIT License**: you’re free to use, modify, and distribute it, provided the copyright and license notice are retained.