{"id":25990241,"url":"https://github.com/poglolopez/nesarc_research","last_synced_at":"2026-04-14T23:34:34.728Z","repository":{"id":279332486,"uuid":"938459521","full_name":"PogloLopez/nesarc_research","owner":"PogloLopez","description":"Analyzing the relationship between Social Anxiety Disorder (SAD) and family history of behavioral problems using NESARC data. Includes statistical hypothesis testing (ANOVA, Chi-Square, Pearson Correlation, Moderation Analysis). Developed as part of the Data Analysis and Interpretation Specialization from Wesleyan University (Coursera).","archived":false,"fork":false,"pushed_at":"2025-02-25T03:04:22.000Z","size":437,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-05T13:42:28.918Z","etag":null,"topics":["anova","chi-square","coursera-assignment","data-analysis","hypothesis-testing","mental-health","moderation-analysis","nesarc","pandas","pearson-correlation","python","social-anxiety","statistical-analysis"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PogloLopez.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-25T01:47:58.000Z","updated_at":"2025-02-25T03:14:03.000Z","dependencies_parsed_at":"2025-02-25T03:23:06.753Z","dependency_job_id":"1e87779f-4804-4f93-a3e6-d5ad3ea47b39","html_url":"https://github.com/PogloLopez/nesarc_research","commit_stats":null,"previous_names":["poglolopez/nesarc_research"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/PogloLopez/nesarc_research","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PogloLopez%2Fnesarc_research","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PogloLopez%2Fnesarc_research/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PogloLopez%2Fnesarc_research/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PogloLopez%2Fnesarc_research/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PogloLopez","download_url":"https://codeload.github.com/PogloLopez/nesarc_research/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PogloLopez%2Fnesarc_research/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31819936,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"ssl_error","status_che
cked_at":"2026-04-14T18:05:01.765Z","response_time":153,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anova","chi-square","coursera-assignment","data-analysis","hypothesis-testing","mental-health","moderation-analysis","nesarc","pandas","pearson-correlation","python","social-anxiety","statistical-analysis"],"created_at":"2025-03-05T13:39:08.979Z","updated_at":"2026-04-14T23:34:34.698Z","avatar_url":"https://github.com/PogloLopez.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 NESARC Research: Social Anxiety Disorder (SAD) \u0026 Family Behavioral History  \r\n\r\n*A data-driven exploration of the relationship between SAD severity and family history of behavioral problems.*\r\n\r\n## 📌 Overview  \r\n\r\nThis project was developed as part of the **Data Analysis and Interpretation Specialization** from **Wesleyan University** on **Coursera**. It explores the relationship between **Social Anxiety Disorder (SAD)** and a family history of behavioral problems using data from the **National Epidemiologic Survey on Alcohol and Related Conditions (NESARC)**. 
The analysis applies **statistical hypothesis testing**, including **ANOVA, Chi-Square test, Pearson Correlation, and moderation analysis (Gender as a moderator)** to assess these relationships.\r\n\r\n## 📂 Dataset  \r\n- **Source**: [National Epidemiologic Survey on Alcohol and Related Conditions (NESARC)](https://www.nesarc.org)  \r\n- **File**: `source/NESARC Dataset.csv`  \r\n- **Size**: 252.61 MB *(tracked with [Git LFS](https://git-lfs.github.com/))*  \r\n- **Key Variables**:  \r\n\r\n| Variable                   | Type          | Description                                                                 |\r\n|---------------------------|---------------|-----------------------------------------------------------------------------|\r\n| `SAD_score`                | Numerical     | Composite score derived from SAD-related survey responses.                  |\r\n| `SAD_spectrum`             | Categorical   | **Low (≤2)**, **Medium (2-5)**, **High (\u003e5)** severity categories.         |\r\n| `behavior_problems_count`  | Numerical     | Number of relatives with behavioral problems.                              |\r\n| `relatives_with_problems`  | Binary (Y/N)  | Presence of ≥1 relative with behavioral problems.                          |\r\n\r\n## 🎯 Objectives  \r\n1. **Primary**: Determine if family history of behavioral problems correlates with higher SAD severity.  \r\n2. **Secondary**: Assess if **gender** moderates this relationship.  \r\n\r\n## 🔍 Methodology  \r\n\r\n### 1. Data Preprocessing  \r\n- **Cleaning**: Removed missing values and standardized variables.  \r\n- **Feature Engineering**:  \r\n  - Created `SAD_score` from symptom-related survey responses.  \r\n  - Binned `SAD_score` into **Low/Medium/High** categories.  \r\n  - Derived `relatives_with_problems` from `behavior_problems_count`.  \r\n\r\n### 2. Statistical Analysis  \r\n\r\n#### **ANOVA: SAD Spectrum vs. 
Behavior Problems Count**  \r\nA one-way ANOVA revealed significant differences in behavior problems count across SAD severity groups:  \r\n\r\n| Source          | Sum Sq   | df  | F      | p-value       |\r\n|-----------------|----------|-----|--------|---------------|\r\n| SAD_spectrum    | 125.78   | 2   | 22.24  | **2.48e-10**  |\r\n| Residual        | 10919.14 | 3862| –      | –             |\r\n\r\n**Post-hoc Tukey's HSD**: All group pairs showed significant differences (p \u003c 0.05).  \r\n\r\n#### **Chi-Square Test: Relatives With Problems vs. SAD Spectrum**  \r\n- **χ² = 34.56** *(p = 3.13e-08)*  \r\n- **Cramér's V = 0.095** (small effect size)  \r\n\r\n*Conclusion*: Significant association between family history and SAD severity.  \r\n\r\n#### **Pearson Correlation: SAD Score vs. Behavior Problems Count**  \r\n- **r = 0.08** *(p = 3.82e-07)*  \r\n- **r² = 0.0067** (0.67% variance explained)  \r\n\r\n*Conclusion*: Weak but statistically significant correlation.  \r\n\r\n#### **Moderation Analysis: Gender as a Moderator**  \r\nAn ANOVA with interaction terms tested whether gender moderates the SAD-behavior relationship.\r\n\r\n**Key Findings**:  \r\n- Interaction term **p = 0.187** → Gender does not significantly moderate the relationship.  \r\n- Main effects of `SAD_spectrum` remain significant.  \r\n\r\n## 📁 Repository Structure  \r\n\r\n```\r\nNESARC_research/\r\n├── source/                  # Raw data (tracked via Git LFS)\r\n│   └── NESARC Dataset.csv\r\n├── .gitattributes           # Git LFS configuration\r\n├── DMV.ipynb                # Jupyter Notebook (full analysis)\r\n├── LICENSE                  # MIT License\r\n├── README.md                # Project documentation\r\n└── requirements.txt         # Python dependencies\r\n```\r\n\r\n## 🛠️ Installation \u0026 Usage  \r\n\r\n### 1. Clone the Repository  \r\n```bash\r\ngit clone https://github.com/PogloLopez/nesarc_research.git\r\ncd nesarc_research\r\n```\r\n\r\n### 2. 
Install Git LFS \u0026 Download Data  \r\n```bash\r\ngit lfs install   # Set up Git LFS\r\ngit lfs pull      # Download dataset\r\n```\r\n\r\n### 3. Set Up a Virtual Environment  \r\n```bash\r\npython -m venv .venv\r\nsource .venv/bin/activate  # Mac/Linux\r\n.\\.venv\\Scripts\\activate   # Windows\r\n```\r\n\r\n### 4. Install Dependencies  \r\n```bash\r\npip install -r requirements.txt\r\n```\r\n\r\n### 5. Run the Analysis  \r\nLaunch Jupyter Notebook:  \r\n```bash\r\njupyter notebook DMV.ipynb\r\n```\r\n\r\n## 📜 License  \r\nThis project is licensed under the **MIT License**. See [LICENSE](LICENSE) for details.  \r\n\r\n---\r\n\r\n*💡 For questions or collaborations, contact [Pablo López](mailto:poglolopez@gmail.com) or connect on [LinkedIn](https://linkedin.com/in/pablo-a-lopez-s).*  \r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoglolopez%2Fnesarc_research","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoglolopez%2Fnesarc_research","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoglolopez%2Fnesarc_research/lists"}