{"id":46940998,"url":"https://github.com/datacarpentry/rr-intro","last_synced_at":"2026-03-11T07:03:53.605Z","repository":{"id":24471413,"uuid":"27875119","full_name":"datacarpentry/rr-intro","owner":"datacarpentry","description":"Introduction materials for Reproducible Research Curriculum","archived":false,"fork":false,"pushed_at":"2020-06-18T06:03:42.000Z","size":7232,"stargazers_count":12,"open_issues_count":6,"forks_count":19,"subscribers_count":16,"default_branch":"gh-pages","last_synced_at":"2025-09-04T18:50:05.890Z","etag":null,"topics":["carpentries","data-carpentry","english","lesson","on-hold","project-organization","r","reproducibility"],"latest_commit_sha":null,"homepage":"http://www.datacarpentry.org/rr-intro/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datacarpentry.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION","codeowners":null,"security":null,"support":null}},"created_at":"2014-12-11T14:25:39.000Z","updated_at":"2025-06-27T11:41:02.000Z","dependencies_parsed_at":"2022-08-22T16:40:20.689Z","dependency_job_id":null,"html_url":"https://github.com/datacarpentry/rr-intro","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/datacarpentry/rr-intro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datacarpentry%2Frr-intro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datacarpentry%2Frr-intro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datacarpentry%2Frr-intro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datacarpentry%2Frr
-intro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datacarpentry","download_url":"https://codeload.github.com/datacarpentry/rr-intro/tar.gz/refs/heads/gh-pages","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datacarpentry%2Frr-intro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30373525,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T06:09:32.197Z","status":"ssl_error","status_checked_at":"2026-03-11T06:09:17.086Z","response_time":84,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["carpentries","data-carpentry","english","lesson","on-hold","project-organization","r","reproducibility"],"created_at":"2026-03-11T07:03:48.934Z","updated_at":"2026-03-11T07:03:53.598Z","avatar_url":"https://github.com/datacarpentry.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"rr-intro\n========\n\n## Lesson synopsis:\n\nIn this session we will start by reviewing case studies of (lack of) reproducibility gone wrong. 
Then participants will work on two reproducibility exercises: first a simple data manipulation and analysis exercise using any software they generally work with and then the same exercise (and extensions to it) using `RMarkdown` in `RStudio` as a better alternative, highlighting how this approach makes documentation, organization, automation, and dissemination easier.  \n\n## Syllabus:\n\n- Recognize the problems that reproducible research helps address\n- Identify pain points in getting your analysis to be reproducible.\n- The role of documentation, sharing, automation, and organization in making your research more reproducible.\n- Introducing some tools to solve these problems, specifically `R`/`RStudio`/`RMarkdown`.\n\n## Goals:\n\nAt the beginning of this session, participants should be able to\n\n- use a spreadsheet program to generate a plot\n- use a text editor (Word, Google Docs, etc.) to communicate\n\nAt the end of the session students will be able to\n\n- recognize the problems that reproducible research helps address\n- identify pain points in getting their analysis to be reproducible\n\nThe specific problems to be addressed in each session are as follows:\n\n- First half (01): motivating reproducibility\n- Second half (02): introduce R Markdown as a reproducible data analysis tool\n\nThe first half of the intro session is language agnostic. 
If a workshop uses a programming language other than R, only intro-02 will need to be modified.\n\n## Pre-workshop:\n\nParticipants install `R` + `RStudio`.\n\nSee [email template](https://github.com/Reproducible-Science-Curriculum/rr-intro/blob/master/preworkshop-email.md).\n\n## First half (01):\n\nSee instructor notes (`intro-01-instr-notes.Rmd`) for details.\n\n- Welcome + go over schedule\n- Motivating reproducibility slides\n- Group discussion about current tools people are using for documentation / reproducibility\n- Ex 1: Motivating reproducibility\n\n## Second half (02):\n\nSee instructor notes (`intro-02-instr-notes.Rmd`) for details.\n\n- Provide an `RMarkdown` approach to what's done in Session 1 (`intro-template.Rmd`)\n\n- Wrap up by pointing participants to the [reproducibility checklist](https://github.com/Reproducible-Science-Curriculum/rr-intro/blob/master/checklist.md).\n\n## Data attribution\n\n- [Gapminder data](http://www.gapminder.org/data/). [Gapminder data is licensed CC-BY 3.0](https://docs.google.com/document/pub?id=1POd-pBMc5vDXAmxrpGjPLaCSDSWuxX6FLQgq5DhlUhM#h.ul2gu2-uwathz).\n\n- Processed and subset (population size, life expectancy, GDP per capita; every 5 years starting in 1952; complete records only) [Gapminder data as `R` package](https://github.com/jennybc/gapminder). The [data-raw](https://github.com/jennybc/gapminder/tree/master/data-raw) sub-directory reveals the journey from Gapminder.org's Excel workbooks to increasingly clean and tidy data.\n    - The clean dataset can be located in R in the following way (after installing the package):\n\n        ~~~\n        pathToTsv \u003c- system.file(\"gapminder.tsv\", package = \"gapminder\")\n        ~~~\n        {: .r}\n\n## People and credits\n\nThis lesson was first created at the [1. Reproducible Science Curriculum Hackathon]. The corresponding author is **Mine Çetinkaya-Rundel** ([@mine-cetinkaya-rundel]). 
See the commit log for other contributors.\n\nPlease post feedback and issues with the lesson on the repository's issue tracker. For instructor questions about teaching this lesson, you can also contact the corresponding author directly.\n\n[@mine-cetinkaya-rundel]: https://github.com/mine-cetinkaya-rundel\n[1. Reproducible Science Curriculum Hackathon]: https://github.com/Reproducible-Science-Curriculum/Reproducible-Science-Hackathon-Dec-08-2014\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatacarpentry%2Frr-intro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatacarpentry%2Frr-intro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatacarpentry%2Frr-intro/lists"}