{"id":15937250,"url":"https://github.com/weiji14/cryospheric-data-lakes","last_synced_at":"2025-09-13T00:32:31.831Z","repository":{"id":96183457,"uuid":"94717901","full_name":"weiji14/cryospheric-data-lakes","owner":"weiji14","description":"Big data tools to handle various cryospheric remote sensing datasets, mostly in python.","archived":false,"fork":false,"pushed_at":"2018-12-10T22:59:41.000Z","size":2869,"stargazers_count":3,"open_issues_count":3,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-31T02:21:32.909Z","etag":null,"topics":["cryosat","cryosphere","docker","icesat","jupyter-notebook","open-source","python","remote-sensing","reproducible-research","satellite"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/weiji14.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-06-18T23:01:27.000Z","updated_at":"2022-05-26T18:21:59.000Z","dependencies_parsed_at":"2023-09-24T16:57:04.752Z","dependency_job_id":null,"html_url":"https://github.com/weiji14/cryospheric-data-lakes","commit_stats":{"total_commits":77,"total_committers":2,"mean_commits":38.5,"dds":"0.012987012987012991","last_synced_commit":"54be7b78af9d899730f7c1c66d165713a98cd9f5"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weiji14%2Fcryospheric-data-lakes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weiji14%2Fcryospheric-data-lakes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/weiji14%2Fcryospheric-data-lakes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weiji14%2Fcryospheric-data-lakes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/weiji14","download_url":"https://codeload.github.com/weiji14/cryospheric-data-lakes/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232802581,"owners_count":18578685,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cryosat","cryosphere","docker","icesat","jupyter-notebook","open-source","python","remote-sensing","reproducible-research","satellite"],"created_at":"2024-10-07T05:01:51.347Z","updated_at":"2025-01-07T00:15:56.662Z","avatar_url":"https://github.com/weiji14.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cryospheric Data Lakes\n\n[![License: Open Data Commons Attribution](https://img.shields.io/badge/License-ODC_BY-brightgreen.svg)](https://opendatacommons.org/licenses/by/)\n[![License: LGPL v3](https://img.shields.io/badge/License-LGPL%20v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0)\n[![License: CC BY-SA 4.0](https://img.shields.io/badge/License-CC%20BY--SA%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-sa/4.0/)\n\nOpen-source big data tools to handle various cryospheric remote sensing datasets.\n\n\u003e ### [Data lake](https://en.wikipedia.org/wiki/Data_lake)\n\u003e ... 
a method of storing data within a system or repository, in its natural format, that facilitates the collocation of data in various schemata and structural forms, usually object blobs or files... ~Wikipedia\n\n## Contents\n\nFind the underlying [data here](/data) used in this project (or at least links to the sources since they might be too big).\n\nExamine the [code here](/code) which mingles with the data to give some (hopefully) nice scientifically meaningful outputs (whatever that means).\nYou may find some interesting dockerfiles and python3 code inside (if that clicks with you).\n\n## Getting Started\n\nThese instructions will get you a copy of the project up and running on your local machine for development and testing purposes.\n\n### Pre-requisites\n\nYou need some form of [git](https://git-scm.com/) installed for version control.\n**Ideally**, [docker](https://www.docker.com/) should be installed too to fully replicate this scientific development environment, unless you do not have root/admin privileges.\nFor [conda](https://conda.io) users, you may skip the docker install, but take note of the section below on setting up a conda environment.\n\nFor Debian/Ubuntu-based systems, you can try something like:\n\n    sudo apt install git docker-ce\n\nNote: You may need to set up the docker repository first to install docker-ce.\nSee instructions for [Debian](https://docs.docker.com/engine/installation/linux/docker-ce/debian/) and [Ubuntu](https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/).\n\nFor Windows, if you have [chocolatey](https://chocolatey.org/) (recommended!), it can be as easy as:\n\n    choco install git docker\n\nFor Mac OS X:\n\n    TODO??\n\n### Cloning the repository\n\nWith git installed, fire up your command prompt and do a git clone from this [repo-url](/../../):\n\n    git clone \u003crepo-url\u003e\n\nAlternatively, download the zip file from [here](/../../archive/master.zip), and unzip it.\n\nThe standard `clone` code above 
will skip over some [submodules](https://github.com/blog/2104-working-with-submodules), such as external tutorials I have cloned into the [tuts](/tuts) folder.\nTo get absolutely everything (beware beware!), you can do:\n\n    git clone --recursive \u003crepo-url\u003e\n\n\n### Set up conda environment (for Anaconda/Miniconda users)\n\nYou can replicate most of the libraries used in this repository by running:\n    \n    conda env create --file=environment.yml\n\n### Running the code\n\nTo try out the code (that downloads big data files, processes the data, etc.) you can use a [Jupyter](http://jupyter.org/) [lab](https://jupyterlab.readthedocs.io/en/latest/) or [notebook](https://jupyter-notebook.readthedocs.io/en/stable/) environment.\nDo so by running either of the following:\n\n    jupyter lab\n    jupyter notebook\n\nAlternatively, you can use the atom-hydrogen-beta docker container [here](/code/docker#atom) to ensure ease of reproducibility (aka mitigate dependency hell problems).\nYes, I like to do my code writing and execution inside that 'atom' docker container with interactive [Hydrogen](https://github.com/nteract/hydrogen#hydrogen-) functionality!!\n\n![atom-demo-10](https://user-images.githubusercontent.com/23487320/28195882-1c82e6dc-68a1-11e7-9da9-236918621d5d.gif)\n\nBut of course, you can install the libraries yourself.\n\n## Contributing\n\nFeel free to submit a pull request or issue (nice ways of saying hi!) if you'd like to see something in here that's not here yet.\n\n## License\n\n### Data\nAny raw [data](/data) (e.g. 
binary satellite files) used here is licensed as per the upstream source.\nDerived datasets are licensed under the [Open Data Commons Attribution license](https://opendatacommons.org/licenses/by/) unless otherwise stated.\n\n### Code\n\nSource [code](/code) used in the handling of the data is licensed under the [GNU Lesser General Public License v3.0](https://choosealicense.com/licenses/lgpl-3.0/).\n\n### Other\n\nOther forms of content (such as documentation) in this project repository that are not covered by the above two licenses are licensed under the [Creative Commons Attribution Share Alike 4.0 License](https://creativecommons.org/licenses/by-sa/4.0/). Linked submodules (e.g. in the [tuts](/tuts) folder) are subject to their respective upstream licenses.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweiji14%2Fcryospheric-data-lakes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fweiji14%2Fcryospheric-data-lakes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweiji14%2Fcryospheric-data-lakes/lists"}