{"id":28360236,"url":"https://github.com/epiforecasts/cityforecasts","last_synced_at":"2025-10-16T22:42:35.288Z","repository":{"id":278168874,"uuid":"934701692","full_name":"epiforecasts/cityforecasts","owner":"epiforecasts","description":"R scripts to generate city-level forecasts ","archived":false,"fork":false,"pushed_at":"2025-06-13T15:10:43.000Z","size":244,"stargazers_count":1,"open_issues_count":16,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-22T04:41:47.185Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/epiforecasts.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-02-18T09:14:06.000Z","updated_at":"2025-05-21T16:37:43.000Z","dependencies_parsed_at":"2025-04-30T06:23:23.364Z","dependency_job_id":"1bade06d-312a-4216-818a-e7fc909fd556","html_url":"https://github.com/epiforecasts/cityforecasts","commit_stats":null,"previous_names":["kaitejohnson/cityforecasts","epiforecasts/cityforecasts"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/epiforecasts/cityforecasts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epiforecasts%2Fcityforecasts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epiforecasts%2Fcityforecasts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epiforecasts%2Fcityforecasts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epif
orecasts%2Fcityforecasts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/epiforecasts","download_url":"https://codeload.github.com/epiforecasts/cityforecasts/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epiforecasts%2Fcityforecasts/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261371772,"owners_count":23148682,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-28T11:03:39.179Z","updated_at":"2025-10-16T22:42:35.282Z","avatar_url":"https://github.com/epiforecasts.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# cityforecasts\nR scripts to generate city-level forecasts\n\n## Summary\nThis repository contains R code to generate city-level forecasts for submission to the [flu-metrocast hub](https://github.com/reichlab/flu-metrocast).\nAll models are exploratory/preliminary, though we will regularly update this document to describe the latest mathematical model used in the submission.\n\nAll outputs submitted to the Hub will be archived in this repository, along with additional model metadata (such as the model definition associated with a submission and details on any additional data sources used or decisions made in the submission process).\nIf significant changes to the model are made during submission, we will rename the model in the submission file.\n\nInitially, we plan to fit the data from each state independently, using hierarchical partial pooling to jointly fit the cities within a 
state.\nThis initially includes producing forecasts of:\n| Forecast target | Location |\n|-----------|-----------|\n| ED visits due to ILI | New York City (5 boroughs, unknown, citywide) |\n| Percent of ED visits due to flu | Texas (5 metro areas) |\n\nWe plan to use the same latent model structure for both forecast targets, modifying the observation model for count data (NYC) vs proportion data (Texas).\n\n## Workflow\nBecause all data is available publicly, the forecasts generated should be completely reproducible from the specified configuration file.\nWe start by using the [`mvgam`](https://github.com/nicholasjclark/mvgam) package, which is an R package that leverages both the [`mgcv`](https://cran.r-project.org/web/packages/mgcv/index.html) and [`brms`](https://paulbuerkner.com/brms/) formula interfaces to fit Bayesian Dynamic Generalized Additive Models (GAMs).\nThese packages use metaprogramming to produce Stan files, and we also include the Stan code generated by the package.\n\nTo produce forecasts each week, we use the following workflow:\n\n1. Modify the configuration file in `input/config.toml`\n2. In the command line, run `Rscript preprocess_data.R input/config.toml {index}`, where `index` identifies the individual model run; in this case, the runs also differ in their pre-processing because they use different data sources.\n3. Next, run `Rscript models.R input/config.toml {index}`\n4. Lastly, run `Rscript postprocess_forecasts.R input/{forecast_date}/config.toml`\n5. 
This will populate the `output/cityforecasts/{forecast_date}` folder with a CSV file formatted following the Hub submission guidelines.\n\nEventually, steps 2-4 will be automated with the GitHub Action `.git/workflows/generate_forecasts` and set on a schedule to run after 12 pm CST, corresponding to the time that the `target_data` is updated on the Hub.\n\n## Model definition\n\nThe preliminary model is described below:\n### Observation model\nFor the forecasts of counts of ED visits due to ILI, we assume a Poisson observation process:\n\n$$\ny_{l,t} \\sim Poisson(exp(x_{l,t}))\n$$\n\nFor the forecasts of the percent of ED visits due to flu, we assume a Beta observation process on the proportion of ED visits due to flu:\n\n```math\n\\begin{align}\np_{l,t} = y_{l,t} \\times 100 \\\\\ny_{l,t} \\sim Beta (z_{l,t}, \\phi) \\\\\nlogit(z_{l,t}) = x_{l,t}\n\\end{align}\n```\n\n### Latent state-space model: Dynamic hierarchical GAM with independent autoregression\nWe model latent admissions with a hierarchical GAM component to capture shared seasonality and weekday effects and a univariate autoregressive component to capture trends in the dynamics within each location.\n\n```math\n\\begin{align}\nx_{l,t} \\sim Normal(\\mu_{l,t} + \\delta_{l} x_{l,t-1},  \\sigma_l)\\\\\n\\mu_{l,t} = \\beta_l + f_{global,t}(weekofyear) + f_{l,t}(weekofyear) \\\\\n\\beta_l \\sim Normal(\\beta_{global}, \\sigma_{count}) \\\\\n\\sigma_{count} \\sim exp(0.33) \\\\\n\\delta_l \\sim Normal(0.5, 0.25) T[0,1] \\\\\n\\sigma_l \\sim exp(1) \\\\\n\\end{align}\n```\n\nFor the NYC data, we have count data on a daily scale, so we add a weekday component:\n```math\n\\mu_{l,t} =  \\beta_l + f_{global,t}(week) + f_{l,t}(week) + f_{global,t}(wday)\n```\nSince $\\beta_{global}$ represents the intercept on the count scale, we place a prior on it using the mean observed count across the historical data:\n\n$$\n\\beta_{global} \\sim Normal(log(\\frac{\\sum_{l=1}^L \\sum_{t=1}^T y_{l,t}}{N_{obs}}), 1) 
\\\\\n$$\n\nwhere $N_{obs}$ is the number of observations of $y_{l,t}$.\n\nFor the TX data, $\\beta_{global}$ represents the intercept as a proportion, so we use:\n\n$$\n\\beta_{global} \\sim Normal(logit(\\frac{\\sum_{l=1}^L \\sum_{t=1}^T y_{l,t}}{N_{obs}}), 1) \\\\\n$$\n\n\n\nFor the NYC data, we have daily data, so $t$ is measured in days, whereas for the Texas data, $t$ is measured in weeks.\n\n## Additional models\nThe above model estimates a hierarchical dynamic GAM, which contains both a GAM component and an autoregressive component.\nWe can additionally fit a more traditional hierarchical GAM (with no autoregression but with tensor product splines to jointly estimate across location and time) as well as a vector ARIMA without a spline component. Eventually, we can also combine these components and estimate a hierarchical GAM with a multivariate vector autoregression. These will be areas of future work.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepiforecasts%2Fcityforecasts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fepiforecasts%2Fcityforecasts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepiforecasts%2Fcityforecasts/lists"}