{"id":19891830,"url":"https://github.com/inbo/n2khab-preprocessing","last_synced_at":"2025-03-01T05:19:47.039Z","repository":{"id":35884884,"uuid":"191627131","full_name":"inbo/n2khab-preprocessing","owner":"inbo","description":"Broadly useful data preparation for Flemish Natura 2000 habitat analyses","archived":false,"fork":false,"pushed_at":"2024-05-23T14:12:48.000Z","size":704,"stargazers_count":1,"open_issues_count":7,"forks_count":0,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-06-11T16:24:08.908Z","etag":null,"topics":["habitat","natura2000","open-data","preprocessing","reproducibility"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/inbo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-12T18:48:15.000Z","updated_at":"2024-05-14T15:38:03.000Z","dependencies_parsed_at":"2022-08-08T12:15:36.137Z","dependency_job_id":"0ef147a9-71a5-4f86-bb16-55ad92768ce8","html_url":"https://github.com/inbo/n2khab-preprocessing","commit_stats":null,"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inbo%2Fn2khab-preprocessing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inbo%2Fn2khab-preprocessing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inbo%2Fn2khab-preprocessing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/inbo%2Fn2khab-preprocessing/manifests","ow
ner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/inbo","download_url":"https://codeload.github.com/inbo/n2khab-preprocessing/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241318713,"owners_count":19943385,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["habitat","natura2000","open-data","preprocessing","reproducibility"],"created_at":"2024-11-12T18:19:47.587Z","updated_at":"2025-03-01T05:19:47.022Z","avatar_url":"https://github.com/inbo.png","language":"R","readme":"## Welcome\n\nTo support _reproducible_ and _transparent_ analyses on Flemish Natura 2000 habitats and regionally important biotopes (RIBs), this repo provides **data-generating (preprocessing) workflows** as _scripts_ or _R markdown_.\nMore specifically, it generates _those_ processed data sources that are worth saving, consolidating and distributing as such -- these data sources are then defined with an ID for further use.\nProviding readily processed datasets makes sense in the case of time-consuming calculations, despite the reproducibility given the availability of a preprocessing workflow.\n\nThe repo is a companion to the R package **[n2khab](https://inbo.github.io/n2khab)**, which provides functions that return several datasets as standardized R-objects, as well as functions to do certain preprocessing steps.\nSo, if you're just looking for a standardized way of reading existing (raw or processed) data sources into R, look no further than the package!\nThat is, unless the data source is not yet covered there -\u003e 
[contribute](#you-are-welcome-to-contribute) to this repo!\n\nThis repo is set up with a special interest in the design, review and analysis of Natura 2000 habitat monitoring programmes at the Flemish scale (each is a combination of multiple monitoring schemes).\nBut as defined in the beginning, the repo's scope is wider!\nFor more information, see the [n2khab-monitoring](https://github.com/inbo/n2khab-monitoring) repo (which centralizes planning and workflow documentation in N2KHAB monitoring).\n\nThe ultimate aim is to achieve open and reproducible data workflows. That is a prerequisite for qualifiable science, for sharing and for broad cooperation.\n\n\n\n### Find your way: repository structure\n\nThis is the structure of the repo:\n\n```\n├── n2khab_data                 \u003c- Binary or large data! Copy needed data here. IGNORED by git.\n    ├── 10_raw\n    └── 20_processed            \u003c- Either copy from a source location, or generate with code in src.\n├── src                         \u003c- Put scripts / R markdown files here.\n    ├── generate_XXX            \u003c- Put files together that focus on a common result.\n    ├── generate_YYY\n    └── miscellaneous           \u003c- For your own preparatory scripts and notebooks.\n├── n2khab-preprocessing.Rproj  \u003c- RStudio project file\n├── LICENSE\n└── README.md\n```\n\n### You are welcome to contribute!\n\n#### Managing data and generating processed data\n\nYou should definitely have a look at the distribution and setup of standard data sources for N2KHAB projects, given that the `n2khab-preprocessing` repo conforms to this as well:\n\n```r\nvignette(\"v020_datastorage\", package = \"n2khab\")\n```\n\nProcessed data, or the results of dataset-specific reading functions (see [n2khab](https://inbo.github.io/n2khab) package), are to be [tidied](https://r4ds.had.co.nz/tidy-data.html#tidy-data-1) and as much as possible internationalized:\n\n- availability of English names for types, environmental 
pressures, ...\nOther languages can be accommodated as well;\n- English names for table headings (dataframe variables).\n\nNote that the [n2khab](https://inbo.github.io/n2khab) package holds some textual reference data files itself.\nThe code to reproduce those is part of the [n2khab](https://inbo.github.io/n2khab) repository.\n\n\n#### Coding tools: it's never too late for learning!\n\nWhen writing workflows (in `src`):\n\n- please use `tidyverse`, `sf` and `raster` packages for data reading.\nDiscover the human-friendly way of coding a data processing pipeline through the use of [pipes](https://r4ds.had.co.nz/pipes.html)!\nOrganise data in R in a [tidy](https://r4ds.had.co.nz/tidy-data.html#tidy-data-1) way in order to avoid troubles later on.\nRecommended resources to get started are:\n    - [R for Data Science](https://r4ds.had.co.nz/)\n    - [Geocomputation with R](https://geocompr.robinlovelace.net)\n- have a quick look at the [tidyverse style guide](https://style.tidyverse.org/).\nThere you see how to style object, variable and function names, as well as the documentation.\nAt least keep in mind: **use lower case and 'snake_case'** for object, variable and function names.\n- preferably use `git2rdata::write_vc()` when an R _dataframe_ needs to be written to disk for later use (see \u003chttps://ropensci.github.io/git2rdata/\u003e).\nDefine the sorting order well (avoid ties) by using the `sorting` argument, in order to get meaningful _diffs_ when data are updated later.\nThe function stores the object in a version-control + R friendly format (tab separated values (.tsv) plus metadata on sorting order and variables (.yml)).\nThe R object can then be 100% recreated using `git2rdata::read_vc()`!!\n- if your function returns a dataframe, use `dplyr::as_tibble()` to return it as a tibble instead.\nA tibble is a dataframe that makes working in the tidyverse a little [easier](https://r4ds.had.co.nz/tibbles.html).\n\n\n#### How can I contribute code?\n\nMore 
detailed info on git workflows at INBO: \u003chttps://inbo.github.io/tutorials/tags/git/\u003e.\nSee also [these git workshop materials](https://inbo.github.io/git-course/index.html).\n\n1. Make commits (in your local clone of the remote repo on Github) _in your own git branch_, branched off from the `main` branch.\n(But see this in a relative manner: exactly the same process can be repeated by someone else in turn, relative to your branch.\nSo '`main`' in this protocol can be replaced by another branch name!)\nYou can push your branch to the remote as often as you like, as it will not influence other branches (first time: do `git push -u origin yourbranchname`; afterwards `git push` suffices). It serves as a backup and enables others to work with you on that branch.\n1. Meanwhile, make sure that your branch stays up to date with evolutions in `main` (i.e. in your local repo, update `main` with `git checkout main \u0026\u0026 git pull` and then, with your own branch checked out again, do `git merge --no-ff main`), in order to prevent merge conflicts with `main` later on.\nAt this stage, you need to resolve any merge conflicts that may arise in your own branch.\n1. Propose to merge your commits into `main`: this starts with making a 'pull request' (PR; actually this is a merge request) and assign at least one reviewer before a merge can be decided. At that moment, open online discussion in the repo is possible on your changes (for other open discussion that you want to start, make an _issue_). As long as no merge is performed, more commits can be added to this PR with `git push`, e.g. to implement requested changes by others.\n    - note that, if you branched off another (reference) branch than `main`, make sure to change the reference branch in the pull request (the default reference is `main`).\n1. 
After your PR is merged, pull the reference branch (usually `main`) and clean up your local repo in order to keep up with the remote.\n\n\n\n### Repository history\n\nPrior to commit `8990c23`, the code was part of the [n2khab-monitoring](https://github.com/inbo/n2khab-monitoring) repo (formerly 'n2khab-inputs'), where the original version history remains stored (and complete reproducibility is guaranteed).\nAs a convenience, the **n2khab-preprocessing** repo still holds the rewritten (shrunk) version history from before commit `8990c23`, as defined by the related files and folders.\nSee [this](https://github.com/inbo/n2khab-monitoring/issues/28) issue in the 'n2khab-monitoring' repo, where the migration is documented.\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finbo%2Fn2khab-preprocessing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finbo%2Fn2khab-preprocessing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finbo%2Fn2khab-preprocessing/lists"}