{"id":19696292,"url":"https://github.com/joshhjacobson/cosif","last_synced_at":"2026-06-17T14:32:39.640Z","repository":{"id":174637019,"uuid":"650882337","full_name":"joshhjacobson/coSIF","owner":"joshhjacobson","description":"Spatial statistical prediction of solar-induced chlorophyll ﬂuorescence (SIF) from multivariate OCO-2 data","archived":false,"fork":false,"pushed_at":"2023-10-16T04:43:00.000Z","size":31771,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-27T14:27:10.711Z","etag":null,"topics":["cokriging","multivariate-statistics","oco-2","sif","spatial-statistics","xco2"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joshhjacobson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-08T02:43:55.000Z","updated_at":"2024-11-18T09:23:49.000Z","dependencies_parsed_at":null,"dependency_job_id":"adcaeb13-508f-4109-8504-974b979668ca","html_url":"https://github.com/joshhjacobson/coSIF","commit_stats":null,"previous_names":["joshhjacobson/cosif"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/joshhjacobson/coSIF","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joshhjacobson%2FcoSIF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joshhjacobson%2FcoSIF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joshhjacobson%2FcoSIF/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/joshhjacobson%2FcoSIF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/joshhjacobson","download_url":"https://codeload.github.com/joshhjacobson/coSIF/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joshhjacobson%2FcoSIF/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34453431,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cokriging","multivariate-statistics","oco-2","sif","spatial-statistics","xco2"],"created_at":"2024-11-11T19:34:36.475Z","updated_at":"2026-06-17T14:32:39.620Z","avatar_url":"https://github.com/joshhjacobson.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# coSIF: Spatial statistical prediction of solar-induced chlorophyll fluorescence (SIF) from multivariate OCO-2 data\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8078592.svg)](https://doi.org/10.5281/zenodo.8078592)\n\nThis repository contains code to reproduce the results in the paper:\n\n\u003e Jacobson, J., Cressie, N., \u0026 Zammit-Mangion, A. (2023). Spatial statistical prediction of solar-induced chlorophyll fluorescence (SIF) from multivariate OCO-2 data. 
*Remote Sensing*, 15(16), 4038. https://doi.org/10.3390/rs15164038\n\nUnless stated otherwise, all commands are to be run in the root directory of the repository.\n\nThe resulting coSIF data product for February, April, July, and October 2021 is available at: https://doi.org/10.5281/zenodo.8078592\n\nA supplementary dataset of all fitted model parameters is available at: https://doi.org/10.5281/zenodo.8078560\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/joshhjacobson/coSIF/blob/main/cosif_202107.png\" alt=\"drawing\" width=\"75%\"/\u003e\n\u003c/p\u003e\n\n## Installation and setup\n\nSet up the `cosif` conda environment using the provided file `environment.yaml` (this may take a few minutes):\n```\nconda env create -f environment.yaml\n```\nInstall the required R packages:\n```\nRscript -e 'install.packages(c(\n  \"tidyverse\", \"FRK\", \"sp\", \"sf\", \"rnaturalearth\", \"rnaturalearthdata\"\n))'\n```\nCreate the required directories:\n```\nmkdir data figures\ncd data\nmkdir eda input intermediate output\ncd intermediate\nmkdir models validation\ncd models\nmkdir 202102 202104 202107 202110\ncd ../validation\nmkdir 202102 202104 202107 202110\n```\n\n## Getting the data\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8078476.svg)](https://doi.org/10.5281/zenodo.8078476)\n\nA compressed file containing the input datasets required for the analysis is archived [here](https://doi.org/10.5281/zenodo.8078476). This file can be downloaded and extracted into the directory `data/input`. These input datasets include both observational and auxiliary datasets, which we obtained as described below.\n\n### Observational datasets: OCO-2 SIF and XCO2\n\nBoth the SIF and XCO2 datasets are publicly available through NASA's GES DISC (Goddard Earth Sciences Data and Information Services Center).\n\n- The SIF Lite files (version 10r) are available [here](https://disc.gsfc.nasa.gov/datasets/OCO2_L2_Lite_SIF_10r/summary). 
A subset of these NetCDF files for February, April, July, and October 2021 should be located in the directory `data/input/OCO2_L2_Lite_SIF_10r`.\n- The XCO2 Lite files (version 10r) are available [here](https://disc.gsfc.nasa.gov/datasets/OCO2_L2_Lite_FP_10r/summary). A subset of these NetCDF files for March, May, August, and November 2021 should be located in the directory `data/input/OCO2_L2_Lite_FP_10r`.\n\nNOTE: To reproduce the exploratory time series in Figure 1 (see below), you will need to retrieve all of the SIF and XCO2 version 10r Lite files. Organize the SIF and XCO2 parent directories as `data/eda/OCO2_L2_Lite_SIF_10r` and `data/eda/OCO2_L2_Lite_FP_10r`, respectively.\n\n### Auxiliary datasets: MODIS LCC\n\nThe Terra and Aqua combined Moderate Resolution Imaging Spectroradiometer (MODIS) Land Cover Climate Modeling Grid (CMG) (MCD12C1) Version 6.1 data product is publicly available on NASA's [Earthdata platform](https://lpdaac.usgs.gov/products/mcd12c1v061/). The product is available from 2001, but note that only the file for 2021 is needed. The HDF file should be located in the directory `data/input/MCD12C1v061`.\n\n## Running the framework\n\nIn an initial exploratory data analysis (EDA) step, we create a bivariate time series (Figure 1) from monthly, gridded SIF and XCO2 data. This analysis is isolated in the directory `00_eda`. Note that all of the version 10r Lite files are needed for this step (see above).\n\nThere are four main steps in our multivariate spatial-statistical-prediction framework, corresponding to four numbered directories. These are: \n\n1. `01_data_preparation`: Numbered files are to be run in order. 
Notebooks create the land-cover binary mask; collect and format all daily OCO-2 Lite files into a single NetCDF file for daily, spatially irregular SIF and a single NetCDF file for daily, spatially irregular XCO2; group SIF and XCO2 datasets by month and compute an average for each 0.05-degree CMG grid cell; an R script evaluates bisquare basis functions for all CMG grid cells; a final notebook combines gridded SIF, XCO2, and basis-function datasets into a single NetCDF file. \n2. `02_modeling`: For each of February, April, July, and October 2021, notebooks compute empirical (cross-) semivariograms from the gridded SIF and XCO2 data (one month later), and fit modeled (cross-) semivariograms. Each notebook takes around 10 minutes to run, and can be run on a laptop.\n3. `03_prediction`: For each of February, April, July, and October 2021, the predictions and prediction standard errors required for the coSIF data product are produced by running either `cokriging.py` or `kriging.py` from the command line and specifying the year-month string as an argument. For example, to use cokriging in July 2021, run\n    ```\n    conda activate cosif\n    cd 03_prediction\n    python cokriging.py 202107\n    ```\n    Or, to use kriging in October 2021, run\n    ```\n    conda activate cosif\n    cd 03_prediction\n    python kriging.py 202110\n    ```\n    Note that these are long-running processes that can take several hours to one day of compute time on a 64-core server. It is advised that they be executed in a [screen session](https://linuxize.com/post/how-to-use-linux-screen/) to avoid issues with interruption. Once the predictions and prediction standard errors have been produced for each month, the coSIF data product is collected and formatted in `collect_coSIF_datasets.ipynb`.\n4. 
`04_validation`: For each of February, April, July, and October 2021, the validation predictions for the Corn Belt validation block (b1) and the Cropland validation block (b2) are produced by running `run_validation.py`. The script takes three arguments from the command line: 1) validation year-month; 2) block name; 3) number of cores for parallelization. For example, the script can be run for July 2021 in the Corn Belt validation block (b1) using 64 cores as follows:\n    ```\n    conda activate cosif\n    cd 04_validation\n    python run_validation.py 202107 b1 64\n    ``` \n    The script can take around 30 minutes to run on a 64-core server. After running the script for both blocks in all four months, metrics used to summarize the validation predictions are collected in `collect_validation_results.ipynb`.\n\nNOTE: ensure that all notebooks are run using the `cosif` conda environment.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoshhjacobson%2Fcosif","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoshhjacobson%2Fcosif","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoshhjacobson%2Fcosif/lists"}