{"id":31947580,"url":"https://github.com/vizzuality/cog_worker","last_synced_at":"2025-10-14T11:46:13.750Z","repository":{"id":45883676,"uuid":"378947073","full_name":"Vizzuality/cog_worker","owner":"Vizzuality","description":"Scalable arbitrary analysis on COGs","archived":false,"fork":false,"pushed_at":"2024-07-22T17:28:14.000Z","size":34943,"stargazers_count":27,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-08-31T09:52:02.402Z","etag":null,"topics":["cog","dask","geotiff","gis","raster","rasterio","remote-sensing"],"latest_commit_sha":null,"homepage":"https://vizzuality.github.io/cog_worker","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Vizzuality.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-21T13:51:59.000Z","updated_at":"2025-02-01T16:34:28.000Z","dependencies_parsed_at":"2022-09-05T07:01:17.621Z","dependency_job_id":"91e5fa30-3f84-4957-b4d8-410e02f2a0ac","html_url":"https://github.com/Vizzuality/cog_worker","commit_stats":{"total_commits":39,"total_committers":2,"mean_commits":19.5,"dds":0.07692307692307687,"last_synced_commit":"4e97ff045d316f1eb796ca4bddaac5475f3a5eed"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Vizzuality/cog_worker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vizzuality%2Fcog_worker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vizzuality%2Fcog_worker/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vizzuality%2Fcog_worker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vizzuality%2Fcog_worker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Vizzuality","download_url":"https://codeload.github.com/Vizzuality/cog_worker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Vizzuality%2Fcog_worker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279019073,"owners_count":26086518,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cog","dask","geotiff","gis","raster","rasterio","remote-sensing"],"created_at":"2025-10-14T11:46:07.403Z","updated_at":"2025-10-14T11:46:13.742Z","avatar_url":"https://github.com/Vizzuality.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cog Worker\n\nScalable geospatial analysis on Cloud Optimized GeoTIFFs.\n\n - **Documentation**: https://vizzuality.github.io/cog_worker\n - **PyPI**: https://pypi.org/project/cog-worker\n\ncog_worker is a simple library to help write scripts to conduct scaleable\nanalysis of gridded data. 
It's intended to be useful for moderate- to large-scale\nGIS, remote sensing, and machine learning applications.\n\n## Installation\n\n```\npip install cog_worker\n```\n\n## Examples\n\nSee `docs/examples` for Jupyter notebook examples.\n\n## Quick start\n\n0. A simple cog_worker script\n\n```python\nfrom rasterio.plot import show\nfrom cog_worker import Manager\n\ndef my_analysis(worker):\n    arr = worker.read('roads_cog.tif')\n    return arr\n\nmanager = Manager(proj='wgs84', scale=0.083333)\narr, bbox = manager.preview(my_analysis)\nshow(arr)\n```\n\n1. Define an analysis function that receives a cog_worker.Worker as the first parameter.\n\n```python\nfrom cog_worker import Worker, Manager\nimport numpy as np\n\n# Define an analysis function to read and process COG data sources\ndef MyAnalysis(worker: Worker) -\u003e np.ndarray:\n\n    # 1. Read a COG (reprojecting, resampling and clipping as necessary)\n    array: np.ndarray = worker.read('roads_cog.tif')\n\n    # 2. Work on the array\n    # ...\n\n    # 3. Return (or post to blob storage etc.)\n    return array\n```\n\n2. Run your analysis at different scales and projections\n\n```python\n# rasterio.plot is a submodule and must be imported explicitly\nfrom rasterio.plot import show\n\n# Run your analysis using a cog_worker.Manager which handles chunking\nmanager = Manager(\n    proj = 'wgs84',       # any pyproj string\n    scale = 0.083333,  # in projection units (degrees or meters)\n    bounds = (-180, -90, 180, 90),\n    buffer = 128          # buffer pixels when chunking analysis\n)\n\n# preview analysis\narr, bbox = manager.preview(MyAnalysis, max_size=1024)\nshow(arr)\n\n# preview analysis chunks\nfor bbox in manager.chunks(chunksize=1500):\n    print(bbox)\n\n# execute analysis chunks sequentially\nfor arr, bbox in manager.chunk_execute(MyAnalysis, chunksize=1500):\n    show(arr)\n\n# generate job execution parameters\nfor params in manager.chunk_params(chunksize=1500):\n    print(params)\n```\n\n3. 
Write scale-dependent functions\n\n```python\n# import the submodule explicitly so scipy.ndimage is always available\nimport scipy.ndimage\n\ndef focal_mean(\n    worker: Worker,\n    kernel_radius: float = 1000 # radius in projection units (meters)\n) -\u003e np.ndarray:\n\n    array: np.ndarray = worker.read('sample-geotiff.tif')\n\n    # Access the pixel size at worker.scale\n    kernel_size = kernel_radius * 2 / worker.scale\n    array = scipy.ndimage.uniform_filter(array, kernel_size)\n\n    return array\n```\n\n4. Chunk your analysis and run it in a dask cluster\n\n```python\nfrom cog_worker.distributed import DaskManager\nfrom dask.distributed import LocalCluster, Client\n\n# Set up a Manager that connects to a Dask cluster\ncluster = LocalCluster()\nclient = Client(cluster)\ndistributed_manager = DaskManager(\n    client,\n    proj = 'wgs84',\n    scale = 0.083333,\n    bounds = (-180, -90, 180, 90),\n    buffer = 128\n)\n\n# Execute in worker pool and save chunks to disk as they complete.\ndistributed_manager.chunk_save('output.tif', MyAnalysis, chunksize=2048)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvizzuality%2Fcog_worker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvizzuality%2Fcog_worker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvizzuality%2Fcog_worker/lists"}