{"id":27111884,"url":"https://github.com/jabulente/compact-letters-display","last_synced_at":"2025-04-09T22:04:24.523Z","repository":{"id":286466892,"uuid":"961488819","full_name":"Jabulente/Compact-Letters-Display","owner":"Jabulente","description":"This repository shows how to create compact letter displays (CLDs) in Python after ANOVA and Tukey HSD tests, and how to generate publication-ready tables for summary statistics and statistical inferences from datasets.","archived":false,"fork":false,"pushed_at":"2025-04-07T11:32:03.000Z","size":214,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-09T22:04:24.396Z","etag":null,"topics":["anova","compact-letters-display","exploratory-data-analysis","inferential-statistics","python","research","statistics","tables","turkey-hsd"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Jabulente.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-06T16:19:52.000Z","updated_at":"2025-04-07T11:32:06.000Z","dependencies_parsed_at":"2025-04-06T17:39:25.119Z","dependency_job_id":"d67f7f84-1cf4-41c3-9c7a-daf5aa9c41f9","html_url":"https://github.com/Jabulente/Compact-Letters-Display","commit_stats":null,"previous_names":["jabulente/compact-letters-display"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jabulente%2FCompact-Letters-Display","tags_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/Jabulente%2FCompact-Letters-Display/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jabulente%2FCompact-Letters-Display/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jabulente%2FCompact-Letters-Display/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Jabulente","download_url":"https://codeload.github.com/Jabulente/Compact-Letters-Display/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248119296,"owners_count":21050755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anova","compact-letters-display","exploratory-data-analysis","inferential-statistics","python","research","statistics","tables","turkey-hsd"],"created_at":"2025-04-07T01:25:09.905Z","updated_at":"2025-04-09T22:04:24.503Z","avatar_url":"https://github.com/Jabulente.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Compact Letter Displays and Publication-Ready Tables in Python\n\nThis repository provides tools and examples for creating **Compact Letter Displays (CLDs)** in Python following **ANOVA** and **Tukey HSD** tests. 
It also includes utilities for generating **publication-ready tables** from datasets, summarizing statistics, visualizing distributions, and drawing statistical inference.\n\n---\n\n## Features\n\n- Perform one-way ANOVA tests on grouped data\n- Conduct **Tukey’s HSD** post-hoc analysis\n- Automatically generate **Compact Letter Displays (CLDs)**\n- Create **publication-ready summary tables** with means, standard errors, and group labels\n- Provide descriptive and inferential statistics for reporting\n\n---\n\n## Technologies Used\n\n- **Python 3.x**\n- **pandas** – Data manipulation\n- **numpy** – Numerical operations\n- **statsmodels** – ANOVA and statistical modeling\n- **scikit-posthocs** – Post-hoc tests and multiple comparisons\n- **matplotlib** \u0026 **seaborn** – Data visualization\n- **scipy** – Statistical functions\n\n---\n\n## Installation\n\nInstall the required Python libraries using pip:\n\n```bash\npip install pandas numpy statsmodels scikit-posthocs seaborn matplotlib scipy\n```\n\n---\n\n## 📂 Project Structure\n```\n📦 Compact Letters Display/\n│── 📂 Datasets/                      # Folder for raw and processed datasets  \n│   ├── Dataset.csv         # Cleaned and preprocessed data  \n│\n│── 📂 src/                       # Source code and core scripts  \n│   ├── __init__.py               # Makes this directory a Python package  \n│   ├── perform_tests.py          # Statistical test functions (e.g., ANOVA)  \n│   ├── cld_assignment.py         # Functions to assign compact letter displays  \n│   ├── visualization.py          # Plotting and visualization scripts  \n│\n│── 📂 Notebooks/                 # Jupyter Notebooks for exploratory analysis  \n│   ├── exploratory_analysis.ipynb # EDA and statistical exploration  \n│   ├── final_results.ipynb       # Notebook summarizing final results  \n│\n│── 📂 Figures/                   # Generated plots and charts  \n│   ├── cld_plot.png              # Example CLD visualization  \n│   ├── boxplot.png            
   # Boxplot with statistical comparisons  \n│   ├── barplot.png               # Barplot with compact letters  \n│\n│── 📂 Results/                   # Processed results, tables, and summary files  \n│   ├── anova_results.csv         # Results of ANOVA/statistical tests  \n│   ├── cld_results.csv           # Compact letter display assignments  \n│   ├── summary_table.csv         # Final structured results table  \n│\n│── 📂 docs/                      # Documentation and reports  \n│   ├── report.pdf                # Detailed project report (if applicable)  \n│\n│── 📂 tests/                     # Unit tests for functions  \n│   ├── test_perform_tests.py     # Tests for statistical functions  \n│   ├── test_visualization.py     # Tests for visualization functions  \n│\n│── .gitignore                    # Ignore unnecessary files  \n│── requirements.txt               # Required Python libraries  \n│── setup.py                       # Script for packaging (if needed)  \n│── main.py                        # Main script to execute the pipeline  \n├── README.md                 # Project overview, installation, and usage  \n```\n---\n\n## Example Illustration\n\nLet’s assume you’re working with a dataset comparing crop yields (`Yield`) across different treatments (`Treatment`).\n\n### Dataset Structure\n\n| Treatment | Yield |\n|-----------|-------|\n| A         | 2.3   |\n| A         | 2.4   |\n| B         | 2.0   |\n| B         | 2.1   |\n| C         | 1.8   |\n| C         | 1.9   |\n\n### 1. Descriptive Statistics\n\n| Treatment | Mean | Std. Dev | Std. Error |\n|-----------|------|----------|------------|\n| A         | 2.35 | 0.07     | 0.05       |\n| B         | 2.05 | 0.07     | 0.05       |\n| C         | 1.85 | 0.07     | 0.05       |\n\n### 2. Distribution Visualization\n\n```python\nimport seaborn as sns\nsns.boxplot(data=data, x='Treatment', y='Yield')\n```\n\nDisplays the spread and central tendency of yield by treatment.\n\n### 3. 
ANOVA Results\n\n```python\nimport statsmodels.api as sm\nfrom statsmodels.formula.api import ols\n\nmodel = ols('Yield ~ C(Treatment)', data=data).fit()\nanova_table = sm.stats.anova_lm(model, typ=2)\nprint(anova_table)\n```\n\n**Output:**\n\n| Source        | Sum Sq | df | F       | PR(\u003eF)  |\n|---------------|--------|----|---------|---------|\n| C(Treatment)  | 0.45   | 2  | 15.00   | 0.0032  |\n| Residual      | 0.09   | 6  |         |         |\n\nInterpretation: There is a statistically significant difference between treatments.\n\n### 4. Tukey HSD Post-Hoc Test\n\n```python\nfrom statsmodels.stats.multicomp import pairwise_tukeyhsd\n\ntukey = pairwise_tukeyhsd(endog=data['Yield'], groups=data['Treatment'], alpha=0.05)\nprint(tukey)\n```\n\n**Output:**\n\n```\nGroup1 Group2  Meandiff  p-adj   Lower   Upper  Reject\n-------------------------------------------------------\nA      B       -0.30     0.04    -0.58   -0.02   True\nA      C       -0.50     0.01    -0.78   -0.22   True\nB      C       -0.20     0.07    -0.48   0.08    False\n```\n\n### 5. Compact Letter Display (CLD)\n\n| Treatment | Mean Yield | Group |\n|-----------|------------|-------|\n| A         | 2.35       | a     |\n| B         | 2.05       | b     |\n| C         | 1.85       | b     |\n\nInterpretation: Treatments that share no letter are significantly different. Here A differs from both B and C, while B and C share the letter `b` because the Tukey test did not reject their equality.\n\n---\n\n## Usage Example\n\n```python\nfrom scripts.anova_tukey_cld import run_anova_and_cld\nfrom scripts.summary_tables import generate_summary_table\n\ncld_df = run_anova_and_cld(data, group_col='Treatment', value_col='Yield')\nsummary = generate_summary_table(data, group_col='Treatment', value_col='Yield', cld_df=cld_df)\nprint(summary)\n```\n\n---\n\n## Contributing\n\nContributions are welcome! 
You can:\n\n- Add support for more post-hoc tests (e.g., Games-Howell, Dunn's)\n- Improve visualization formatting\n- Extend to two-way ANOVA or repeated measures\n\n---\n\n## License\n\nThis project is licensed under the MIT License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjabulente%2Fcompact-letters-display","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjabulente%2Fcompact-letters-display","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjabulente%2Fcompact-letters-display/lists"}