{"id":13689214,"url":"https://github.com/fatiando/pooch","last_synced_at":"2025-05-13T18:07:15.441Z","repository":{"id":32754056,"uuid":"140982392","full_name":"fatiando/pooch","owner":"fatiando","description":"A friend to fetch your data files","archived":false,"fork":false,"pushed_at":"2025-02-25T00:09:20.000Z","size":14096,"stargazers_count":669,"open_issues_count":45,"forks_count":79,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-05-03T05:03:31.030Z","etag":null,"topics":["data","download-manager","fatiando-a-terra","ftp","http","python","python3","scipy","scipy-stack"],"latest_commit_sha":null,"homepage":"https://www.fatiando.org/pooch","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fatiando.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-07-14T22:35:02.000Z","updated_at":"2025-04-29T12:00:34.000Z","dependencies_parsed_at":"2024-03-12T13:50:04.487Z","dependency_job_id":"5db772b5-9057-4035-afe0-fd92e3fc7c20","html_url":"https://github.com/fatiando/pooch","commit_stats":{"total_commits":292,"total_committers":48,"mean_commits":6.083333333333333,"dds":0.452054794520548,"last_synced_commit":"14d14e46dba616decbec06c946d0abe430e0e865"},"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatiando%2Fpooch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatiando%2Fpooch/tags","releases_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatiando%2Fpooch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fatiando%2Fpooch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fatiando","download_url":"https://codeload.github.com/fatiando/pooch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000848,"owners_count":21997441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","download-manager","fatiando-a-terra","ftp","http","python","python3","scipy","scipy-stack"],"created_at":"2024-08-02T15:01:38.422Z","updated_at":"2025-05-13T18:07:15.405Z","avatar_url":"https://github.com/fatiando.png","language":"Python","readme":"\u003cimg src=\"https://github.com/fatiando/pooch/raw/main/doc/_static/readme-banner.png\" alt=\"Pooch: A friend to fetch your data files\"\u003e\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://www.fatiando.org/pooch\"\u003e\u003cstrong\u003eDocumentation\u003c/strong\u003e (latest)\u003c/a\u003e •\n\u003ca href=\"https://www.fatiando.org/pooch/dev\"\u003e\u003cstrong\u003eDocumentation\u003c/strong\u003e (main branch)\u003c/a\u003e •\n\u003ca href=\"https://github.com/fatiando/pooch/blob/main/CONTRIBUTING.md\"\u003e\u003cstrong\u003eContributing\u003c/strong\u003e\u003c/a\u003e •\n\u003ca href=\"https://www.fatiando.org/contact/\"\u003e\u003cstrong\u003eContact\u003c/strong\u003e\u003c/a\u003e •\n\u003ca href=\"https://github.com/orgs/fatiando/discussions\"\u003e\u003cstrong\u003eAsk a 
question\u003c/strong\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\nPart of the \u003ca href=\"https://www.fatiando.org\"\u003e\u003cstrong\u003eFatiando a Terra\u003c/strong\u003e\u003c/a\u003e project\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://pypi.python.org/pypi/pooch\"\u003e\u003cimg src=\"http://img.shields.io/pypi/v/pooch.svg?style=flat-square\" alt=\"Latest version on PyPI\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/conda-forge/pooch-feedstock\"\u003e\u003cimg src=\"https://img.shields.io/conda/vn/conda-forge/pooch.svg?style=flat-square\" alt=\"Latest version on conda-forge\"\u003e\u003c/a\u003e\n\u003ca href=\"https://codecov.io/gh/fatiando/pooch\"\u003e\u003cimg src=\"https://img.shields.io/codecov/c/github/fatiando/pooch/main.svg?style=flat-square\" alt=\"Test coverage status\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypi.python.org/pypi/pooch\"\u003e\u003cimg src=\"https://img.shields.io/pypi/pyversions/pooch.svg?style=flat-square\" alt=\"Compatible Python versions.\"\u003e\u003c/a\u003e\n\u003ca href=\"https://doi.org/10.21105/joss.01943\"\u003e\u003cimg src=\"https://img.shields.io/badge/doi-10.21105%2Fjoss.01943-blue?style=flat-square\" alt=\"DOI used to cite Pooch\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n## About\n\n\u003e Just want to download a file without messing with `requests` and `urllib`?\n\u003e Trying to add sample datasets to your Python package?\n\u003e **Pooch is here to help!**\n\n*Pooch* is a **Python library** that can manage data by **downloading files**\nfrom a server (only when needed) and storing them locally in a data **cache**\n(a folder on your computer).\n\n* Pure Python and minimal dependencies.\n* Download files over HTTP, FTP, and from data repositories like Zenodo and figshare.\n* Built-in post-processors to unzip/decompress the data after download.\n* Designed to be extended: create custom downloaders and post-processors.\n\nAre you a **scientist** 
or researcher? Pooch can help you too!\n\n* Host your data on a repository and download using the DOI.\n* Automatically download data using code instead of telling colleagues to do it themselves.\n* Make sure everyone running the code has the same version of the data files.\n\n## Projects using Pooch\n\n[SciPy](https://github.com/scipy/scipy),\n[scikit-image](https://github.com/scikit-image/scikit-image),\n[xarray](https://github.com/pydata/xarray),\n[Ensaio](https://github.com/fatiando/ensaio),\n[GemPy](https://github.com/cgre-aachen/gempy),\n[MetPy](https://github.com/Unidata/MetPy),\n[napari](https://github.com/napari/napari),\n[Satpy](https://github.com/pytroll/satpy),\n[yt](https://github.com/yt-project/yt),\n[PyVista](https://github.com/pyvista/pyvista),\n[icepack](https://github.com/icepack/icepack),\n[histolab](https://github.com/histolab/histolab),\n[seaborn-image](https://github.com/SarthakJariwala/seaborn-image),\n[Open AR-Sandbox](https://github.com/cgre-aachen/open_AR_Sandbox),\n[climlab](https://github.com/climlab/climlab),\n[mne-python](https://github.com/mne-tools/mne-python),\n[GemGIS](https://github.com/cgre-aachen/gemgis),\n[SHTOOLS](https://github.com/SHTOOLS/SHTOOLS),\n[MOABB](https://github.com/NeuroTechX/moabb),\n[GeoViews](https://github.com/holoviz/geoviews),\n[ScopeSim](https://github.com/AstarVienna/ScopeSim),\n[Brainrender](https://github.com/brainglobe/brainrender),\n[pyxem](https://github.com/pyxem/pyxem),\n[cellfinder](https://github.com/brainglobe/cellfinder),\n[PVGeo](https://github.com/OpenGeoVis/PVGeo),\n[geosnap](https://github.com/oturns/geosnap),\n[BioCypher](https://github.com/biocypher/biocypher),\n[cf-xarray](https://github.com/xarray-contrib/cf-xarray),\n[Scirpy](https://github.com/scverse/scirpy),\n[rembg](https://github.com/danielgatis/rembg),\n[DASCore](https://github.com/DASDAE/dascore),\n[scikit-mobility](https://github.com/scikit-mobility/scikit-mobility),\n[Py-ART](https://github.com/ARM-DOE/pyart),\n[HyperSpy](https:
//github.com/hyperspy/hyperspy),\n[RosettaSciIO](https://github.com/hyperspy/rosettasciio),\n[eXSpy](https://github.com/hyperspy/exspy),\n[SPLASH](https://github.com/Adam-Boesky/astro_SPLASH),\n[xclim](https://github.com/Ouranosinc/xclim),\n[CLISOPS](https://github.com/roocs/clisops)\n\n\n\u003e If you're using Pooch, **send us a pull request** adding your project to the list.\n\n## Example\n\nFor a **scientist downloading a data file** for analysis:\n\n```python\nimport pooch\nimport pandas as pd\n\n# Download a file and save it locally, returning the path to it.\n# Running this again will not cause a download. Pooch will check the hash\n# (checksum) of the downloaded file against the given value to make sure\n# it's the right file (not corrupted or outdated).\nfname_bathymetry = pooch.retrieve(\n    url=\"https://github.com/fatiando-data/caribbean-bathymetry/releases/download/v1/caribbean-bathymetry.csv.xz\",\n    known_hash=\"md5:a7332aa6e69c77d49d7fb54b764caa82\",\n)\n\n# Pooch can also download based on a DOI from certain providers.\nfname_gravity = pooch.retrieve(\n    url=\"doi:10.5281/zenodo.5882430/southern-africa-gravity.csv.xz\",\n    known_hash=\"md5:1dee324a14e647855366d6eb01a1ef35\",\n)\n\n# Load the data with Pandas\ndata_bathymetry = pd.read_csv(fname_bathymetry)\ndata_gravity = pd.read_csv(fname_gravity)\n```\n\nFor **package developers** including sample data in their projects:\n\n```python\n\"\"\"\nModule mypackage/datasets.py\n\"\"\"\nfrom importlib import resources\nimport pandas\nimport pooch\n\n# Get the version string from your project. You have one of these, right?\nfrom . import version\n\n# Create a new friend to manage your sample data storage\nGOODBOY = pooch.create(\n    # Folder where the data will be stored. For a sensible default, use the\n    # default cache folder for your OS.\n    path=pooch.os_cache(\"mypackage\"),\n    # Base URL of the remote data store. 
Will call .format on this string\n    # to insert the version (see below).\n    base_url=\"https://github.com/myproject/mypackage/raw/{version}/data/\",\n    # Pooches are versioned so that you can use multiple versions of a\n    # package simultaneously. Use a PEP440-compliant version number. The\n    # version will be appended to the path.\n    version=version,\n    # If a version has a \"+XX.XXXXX\" suffix, we'll assume that this is a dev\n    # version and replace the version with this string.\n    version_dev=\"main\",\n    # An environment variable that overwrites the path.\n    env=\"MYPACKAGE_DATA_DIR\",\n    # The cache file registry. A dictionary with all files managed by this\n    # pooch. Keys are the file names (relative to *base_url*) and values\n    # are their respective SHA256 hashes. Files will be downloaded\n    # automatically when needed (see fetch_gravity_data).\n    registry={\"gravity-data.csv\": \"89y10phsdwhs09whljwc09whcowsdhcwodcydw\"}\n)\n# You can also load the registry from a file. Each line contains a file\n# name and its SHA256 hash separated by a space. This makes it easier to\n# manage large numbers of data files. The registry file should be packaged\n# and distributed with your software.\nGOODBOY.load_registry(\n    resources.open_text(\"mypackage\", \"registry.txt\")\n)\n\n# Define functions that your users can call to get back the data in memory\ndef fetch_gravity_data():\n    \"\"\"\n    Load some sample gravity data to use in your docs.\n    \"\"\"\n    # Fetch the path to a file in the local storage. 
If it's not there,\n    # we'll download it.\n    fname = GOODBOY.fetch(\"gravity-data.csv\")\n    # Load it with numpy/pandas/etc\n    data = pandas.read_csv(fname)\n    return data\n```\n\n## Getting involved\n\n🗨️ **Contact us:**\nFind out more about how to reach us at\n[fatiando.org/contact](https://www.fatiando.org/contact/).\n\n👩🏾‍💻 **Contributing to project development:**\nPlease read our\n[Contributing Guide](https://github.com/fatiando/pooch/blob/main/CONTRIBUTING.md)\nto see how you can help and give feedback.\n\n🧑🏾‍🤝‍🧑🏼 **Code of conduct:**\nThis project is released with a\n[Code of Conduct](https://github.com/fatiando/community/blob/main/CODE_OF_CONDUCT.md).\nBy participating in this project you agree to abide by its terms.\n\n\u003e **Imposter syndrome disclaimer:**\n\u003e We want your help. **No, really.** There may be a little voice inside your\n\u003e head that is telling you that you're not ready, that you aren't skilled\n\u003e enough to contribute. We assure you that the little voice in your head is\n\u003e wrong. Most importantly, **there are many valuable ways to contribute besides\n\u003e writing code**.\n\u003e\n\u003e *This disclaimer was adapted from the*\n\u003e [MetPy project](https://github.com/Unidata/MetPy).\n\n## License\n\nThis is free software: you can redistribute it and/or modify it under the terms\nof the **BSD 3-clause License**. A copy of this license is provided in\n[`LICENSE.txt`](https://github.com/fatiando/pooch/blob/main/LICENSE.txt).\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffatiando%2Fpooch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffatiando%2Fpooch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffatiando%2Fpooch/lists"}