{"id":30201536,"url":"https://github.com/esipfed/earth-science-community-ml-tutorials","last_synced_at":"2025-08-13T10:50:48.250Z","repository":{"id":113492738,"uuid":"414624339","full_name":"ESIPFed/earth-science-community-ML-tutorials","owner":"ESIPFed","description":"ESIP Lab 2021 – Cloud-based Open Science Machine Learning Tutorials for Earth Science","archived":false,"fork":false,"pushed_at":"2021-10-07T17:52:56.000Z","size":12,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-01-25T18:35:16.392Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ESIPFed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-10-07T14:01:07.000Z","updated_at":"2023-11-25T14:35:59.000Z","dependencies_parsed_at":"2023-09-22T08:23:22.844Z","dependency_job_id":null,"html_url":"https://github.com/ESIPFed/earth-science-community-ML-tutorials","commit_stats":{"total_commits":6,"total_committers":1,"mean_commits":6.0,"dds":0.0,"last_synced_commit":"9890c46c61d1c677e698cdd736d0f052f6dd6fe8"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ESIPFed/earth-science-community-ML-tutorials","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ESIPFed%2Fearth-science-community-ML-tutorials","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ESIPFed%2Fearth-science-community-ML-tutorials/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ESIPFed%2Fearth-s
cience-community-ML-tutorials/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ESIPFed%2Fearth-science-community-ML-tutorials/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ESIPFed","download_url":"https://codeload.github.com/ESIPFed/earth-science-community-ML-tutorials/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ESIPFed%2Fearth-science-community-ML-tutorials/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270228430,"owners_count":24548818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-13T02:00:09.904Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-13T10:50:26.871Z","updated_at":"2025-08-13T10:50:48.199Z","avatar_url":"https://github.com/ESIPFed.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cloud-based Open Science Machine Learning Tutorials for Earth Science\n\nThis project is funded through ESIP Lab Spring 2021 Request for Proposal. 
The project is led by [Yuhan (Douglas) Rao](mailto:yrao5@ncsu.edu) at\n[North Carolina Institute for Climate Studies](https://ncics.org) in collaboration with [Chris Slocum](mailto:christopher.slocum@noaa.gov).\n\n## Project Description\n\nCloud computing is beginning to accelerate the science process by removing barriers associated with collecting and quality-controlling data. \nCloud computing also provides the means to improve scientific workflows and promote research sharing through open-source literate programming \ntools such as notebooks (e.g., Jupyter and R Markdown). However, this is a seismic shift in how researchers in the Earth science community do \ntheir work. Many researchers are resistant to adopting and taking advantage of cloud computing because of the hurdles associated with starting,\nthe lack of domain-specific examples, unclear cloud computing costs, and the plethora of cloud computing vendor Application Programming\nInterfaces (APIs).\n\nThere are some existing efforts to create such notebooks to promote the adoption of cloud computing and open-source Artificial Intelligence (AI)\ntools. However, these efforts are usually side products related to a specific research project and developed by the researchers themselves. The \nnotebook development process typically does not directly engage potential users, which may reduce the value and impact of the final notebooks. \nIn an effort to develop interactive machine learning tutorials supported by ESIP Funding Friday, we found that training materials would be more \nuseful and impactful when potential users were engaged in the development process. Additionally, many existing notebooks do not necessarily \nfollow the best practices in cloud computing and AI applications (e.g., provenance, reproducibility, and content accessibility).\n\nThe project proposes creating well-documented notebooks that show how to collect, distribute, process, and analyze geophysical datasets with \nopen-source AI tools. 
The development process will actively engage potential users to identify learning topics of high demand and seek user \nfeedback throughout the development process. Additionally, all notebooks will follow and highlight community best practices on cloud computing \nand AI applications. This project will build a workflow and infrastructure using the open science ecosystem (i.e., Jupyter, Python, R, Google\nColaboratory, Binder Project, and GitHub) that is scalable and can enable community contributions with notebook templates, contribution \nguidelines, and automated evaluation tools.\n\nTo demonstrate the diversity of cloud computing resources and public Earth science data, we will develop notebooks that use services and \ngeophysical data from several cloud computing vendor APIs (e.g., Amazon AWS, Google Cloud Storage/Earth Engine) and data sets from various\ngovernment agencies that have moved portions of their data holdings to the cloud (e.g., NOAA’s Big Data Project, NASA’s Earthdata Cloud \nEvolution, USGS’s Cloud Hosting Solutions). We will also leverage community-driven tools for open, reproducible, and scalable science, \nsuch as the Pangeo software ecosystem, in the notebook development process.\n\nTo create notebooks that are relevant to users with different levels of technical background, the project will follow the concept of a \n“learning journey,” which is a series of progressive notebooks that are suitable for users with different levels of technical knowledge.\nThe learning journey allows us to separate a complicated learning process into manageable pieces to facilitate more effective learning \nfor potential users. Users can start their own learning journey via different entry points of their choice. 
The main learning objective\nof the project team is to identify the best practices and tools to make interactive notebooks accessible to all users by incorporating the \n[Web Content Accessibility Guidelines (WCAG)](https://www.w3.org/TR/WCAG/) developed by the World Wide Web Consortium (W3C).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fesipfed%2Fearth-science-community-ml-tutorials","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fesipfed%2Fearth-science-community-ml-tutorials","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fesipfed%2Fearth-science-community-ml-tutorials/lists"}