{"id":32412374,"url":"https://github.com/filippob/introduction_to_gwas","last_synced_at":"2025-10-25T14:55:00.321Z","repository":{"id":71399723,"uuid":"237263399","full_name":"filippob/introduction_to_gwas","owner":"filippob","description":"https://filippob.github.io/introduction_to_gwas/","archived":false,"fork":false,"pushed_at":"2025-06-27T13:58:40.000Z","size":73649,"stargazers_count":20,"open_issues_count":0,"forks_count":15,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-27T14:37:52.319Z","etag":null,"topics":["gwas","imputation","linear-regression","pipeline"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/filippob.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-01-30T17:12:07.000Z","updated_at":"2025-06-27T13:58:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"7a57d47d-2f8c-450e-a48b-bea7b07b5861","html_url":"https://github.com/filippob/introduction_to_gwas","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/filippob/introduction_to_gwas","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filippob%2Fintroduction_to_gwas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filippob%2Fintroduction_to_gwas/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filippob%2Fintroduction_to_gwas/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/filippob%2Fintroduction_to_gwas/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/filippob","download_url":"https://codeload.github.com/filippob/introduction_to_gwas/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/filippob%2Fintroduction_to_gwas/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280971724,"owners_count":26422675,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-25T02:00:06.499Z","response_time":81,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gwas","imputation","linear-regression","pipeline"],"created_at":"2025-10-25T14:54:59.317Z","updated_at":"2025-10-25T14:55:00.315Z","avatar_url":"https://github.com/filippob.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# introduction_to_gwas\n\n**Material for the Course \"Introduction to genome-wide association studies (GWAS)\"**\n\nInstructors: *Filippo Biscarini, Oscar Gonzalez-Recio, Christian Werner*\n\nThis course will introduce students, researchers and professionals to the steps needed to build an analysis pipeline for Genome-Wide Association Studies (GWAS). 
The course will describe all the necessary steps involved in a typical GWAS study, which will then be used to build a reusable and reproducible bioinformatics pipeline.\n\nEach day the course will start at **14:00** and end at **20:00** (CET).\nAs a general rule, we'll have a longer break (30 minutes) at 16:00 and two shorter breaks (10-15 minutes) later on during the day (to be decided flexibly depending on the sessions).  \n\n\u003c!-- timetable: [here](https://docs.google.com/spreadsheets/d/1Cy8vBD6I_no8UPzYPU9bz7ASWyI3bc4Y9vcdr5S1TBw/edit#gid=0) --\u003e\n\n**Day 1**\n\n- Lecture 0\tGeneral Introduction / Overview of the Course [Filippo, Oscar, Christian]\n    - [General Introduction](slides/0_General_Introduction.pdf)\n    - [GWAS Workflow (short)](slides/GWAS_workflow_short.pdf)\n- Lecture 1\tGWAS Overview: Case Studies / Examples from Literature [Oscar]\n    - [GWAS Overview](slides/1_GWAS_overview.pdf)\n- Lecture 2\tIntroduction to GWAS: Linkage Disequilibrium and Linear Regression [Oscar]\n    - [Introduction to GWAS](slides/2_Introduction_to_GWAS.pdf)\n- Lab 1 (Demonstration) GWAS: Basic Models (Linear and Logistic Regression) [Oscar]\n    - [R code. Exercise on Simple Linear Regression](basic_model/1.Basis_of_linear_regression.R)\n    - [Rmarkdown Code. Exercise on Simple Logistic Regression](basic_model/2.exercise.Basis_of_logistic_regression.Rmd)\n- Lab 2 - Description of Datasets [Christian]\n    - [Description of Datasets](slides/Description_of_Data.pdf)\n - [Course Manual](slides/gwas_manual.pdf)\n - [GWAS Workflow](slides/GWAS_workflow.pdf)\n\n\n**Day 2**\n\n- Lecture 3 The Multiple Testing Issue [Oscar]\n    - [Multiple Testing](slides/3.MultipleTesting.pdf)\n    - [R code. 
Exercise on multiple testing correction](5.power_and_significance/MultipleTestingCorrection.R)\n- Lecture 4 Statistical Power, Population Stratification and Experimental Design [Oscar]\n    - [Statistical Power and Population Stratification](slides/4.GWAS_experimental_design_and_statistical_power.pdf)\n    - [R code. Exercise on statistical power](5.power_and_significance/StatisticalPower_exercise.R)\n- Lecture 5 Initial Data Analysis, Exploratory Data Analysis and Data Pre-Processing [Christian]\n    - [Brief Genotyping overview](\u003cslides/5.1 Genotyping_overview.pdf\u003e)\n    - [IDA, EDA \u0026 Data Pre-Processing](slides/5_2_Data_pre-processing.pdf)\n- Lab 3 GWAS: a first simple exercise for you! [Christian, Filippo]\n    - [GWAS demonstration in R - script](0.r_scripts/GWAS_demo.R)\n    - [GWAS demonstration in R - genotypes](example_data/genotypes_demo.csv)\n    - [GWAS demonstration in R - map](example_data/map_demo.csv)\n    - [GWAS demonstration in R - phenotypes](example_data/phenotypes_demo.csv)\n    - [GWAS exercise in R - genotypes](example_data/genotypes_fruit_sim.csv)\n    - [GWAS exercise in R - map](example_data/map_fruit_sim.csv)\n    - [GWAS exercise in R - phenotypes](example_data/phenotypes_fruit_sim.csv)\n\n**Day 3**\n\n- Lab 4 Data filtering and mean/median imputation in R [Filippo]\n    - [filter_genotype_data.R](0.r_scripts/filter_genotype_data.R)\n    - [mean_imputation.R](0.r_scripts/mean_imputation.R)\n    - [median_imputation.R](0.r_scripts/median_imputation.R)\n- Lab 5 GWAS: The Stand-Alone Script(s) for the Full Model [Filippo]\n    - [gwas_rrblup.R](4.gwas/gwas_rrblup.R)\n    - [gwas_statgengwas.R](4.gwas/gwas_statgengwas.R)\n    - [gwas_sommer.R](4.gwas/gwas_sommer.R)\n- Lecture 6 KNN Imputation \n    - [KNN Imputation](\u003cslides/6.KNN imputation.pdf\u003e)\n- Lab 6 (Demonstration) KNNI Imputation [Filippo] [OPTIONAL]\n    - [knni_illustration.Rmd](3.imputation/knni_illustration.Rmd)\n    - 
[data_for_KNNI_illustration]\u003c!--(model_extensions_data/GenRiz44.txt)--\u003e\n    - [knni_tidymodels.R](3.imputation/knni_tidymodels.R)\n    - [02_knni.sh]\u003c!--(3.imputation/02_knni.sh)--\u003e [support script]\n    - [hamming.R]\u003c!--(3.imputation/hamming.R)--\u003e [support script]\n    - [knni.R]\u003c!--(3.imputation/knni.R)--\u003e [support script]\n- Lecture 7 Working in the shell [Christian]\n    - [Linux and the Shell](\u003cslides/7.1.Linux and the Shell.pdf\u003e) [OPTIONAL]\n    - [Common Data Types and Formats](\u003cslides/7.2 Data_formats.pdf\u003e)\n- Lecture 8\tImputation of Missing Genotypes [Christian]\n    - [Imputation](slides/8_Imputation.pdf)\n- Lab 7 Imputation of Missing Genotypes using Beagle [Christian]\n\n\n**Day 4**\n- Lecture 9 Brief Intermission:\n    - [R code PCA \u0026 Population Structure](4.gwas/PCA_screeplots.R)\n    - [Imputed rice genotypes](4.gwas/rice_imputed.raw)\n- Lab 8 Revising the Steps involved in GWAS [Filippo]\n    - [slides](\u003cslides/9.Revising the steps.pdf\u003e)\n    - [1.get_data.sh](6.steps/1.get_data.sh)\n    - [2.step_filtering.sh](6.steps/2.step_filtering.sh)\n    - [3.step_imputation.sh](6.steps/3.step_imputation.sh)\n    - [4.gwas.sh](6.steps/4.gwas.sh)\n- Lab 9 Introducing the Exercise [Filippo]\n    - [Collaborative Exercise](\u003cslides/9.1.Collaborative exercise.pdf\u003e)\n- Collaborative Exercise: let's build our own GWAS workflow on new data. Pig (*Sus scrofa*) data. 
[Filippo, Oscar, Christian]\n    - Part 1: Individual/Group Break-Out Sessions to give it a try independently\n    - Part 2: Whole-Group Revision of the Exercise: step-by-step (1.get_data; 2.filter; 3.imputation; 4.GWAS)\n    - [exercise tips](8.collaborative_exercise/README)\n- Bonus exercise [Optional] (*Parus major* data) \n    \n**Day 5**\n\n- Lecture 10 A light Touch on Post-GWAS Analysis: Inferring Functionality [Oscar]\n    - [slides](\u003cslides/11.Exploring Functionality with FUMA \u0026 DAVID.pdf\u003e)\n    - [R code. Exercise on R, and FUMA](functional_analysis/getGenesFromSNP.R)\n- Lecture 11 GWAS Model Extensions and Applications: [Filippo, Christian, Oscar]\n    - [12.1 GWAS Model Extensions_Dominance_and_other_genotype_Codifications](\u003cslides/12.1 GWAS_model_extensions_genotype_codification.pdf\u003e)\n    - [12.2 GWAS Model Extensions_Polyploids](slides/12.2GWAS_model_extensions_polyploids.pdf)\n        - [R code GWASpoly (vignette)]\u003c!--(slides/GWASpoly_vignette.pdf)--\u003e\n    - [12.3 GWAS Model Extensions_Trait_Types: categorical, longitudinal](model_extensions/)\n        - [slides](slides/12.3.GWAS_model_extensions_trait_type.pdf)\n        - [R code GWAS for categorical Traits](model_extensions/1.categorical_gwas.Rmd)\n        - [R code GWAS for categorical Traits - Examples](model_extensions/2.categorical_gwas_example.Rmd)\n        - [R code GWAS for longitudinal Traits](model_extensions/3.longitudinal_gwas.Rmd)\n    - 12.4 GWAS Model Extensions Multi-Trait Multi-Locus models \u0026 software\n        - [slides](slides/12.4.GWAS_model_extensions_multi_trait_and_locus.pdf)\n        - [multiple-trait models](model_extensions/multi_trait)\n        - [multiple-loci models](model_extensions/multi_locus)\n    - [12.5 A bioinformatic pipeline for GWAS](7.pipeline/)\n        - [slides](\u003cslides/10.A bioinformatic pipeline for GWAS.pdf\u003e)\n    - 12.6 Additional software for GWAS\n        - [gemma](model_extensions/gemma)\n        - 
[regenie](model_extensions/regenie)\n    - [ROH-based and Resampling Methods as alternative approaches](\u003cslides/14.ROH-based and resampling methods as alternative approaches.pdf\u003e)\n    - [Applications of GWAS: Mendelian Randomization](\u003cslides/15.Applications of GWAS_ Mendelian Randomization.pdf\u003e)\n\n- Final Quiz on what we learned about GWAS! [Filippo, Oscar, Christian]\n- Conclusions and Wrap-Up Discussion on GWAS [Filippo, Oscar, Christian]\n\n## Organization of the Code for the practical Sessions\n\n0. the GWAS workflow in R\n1. preparatory_steps: download and prepare the data\n2. preprocessing: filter the data\n3. imputation: imputing missing genotypes\n4. gwas: run the GWAS models\n5. power_and_significance: designing GWAS experiments\n6. steps: identifying the individual steps involved in a GWAS study\n7. pipeline: assembling the individual steps into a bioinformatics pipeline for GWAS\n8. collaborative exercise: trying out what we learnt on new data\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilippob%2Fintroduction_to_gwas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffilippob%2Fintroduction_to_gwas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffilippob%2Fintroduction_to_gwas/lists"}