
BioTIMEx
--------

### Description

This research compendium brings together the scripts used to download,
restructure and aggregate data sets into a large meta-analysis of
communities in experimental setups sampled several times. The code found
here was originally versioned with git and stored on GitHub <>,
and was eventually submitted to Zenodo <>. This code accompanies
the article: XXXXX.

### Reproducibility and R environment

To ensure that the working environment (R version and package versions)
is documented and isolated, the package renv
(https://rstudio.github.io/renv/index.html)
was used. Running `renv::restore()` installs all missing packages at
once, using the `renv.lock` file to download the same package versions
that we used.
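
As a minimal sketch (assuming R is started in the project root), restoring
the environment looks like this:

    # renv::restore() reads renv.lock and installs the recorded package versions.
    # Install renv first if it is not already available.
    if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
    renv::restore()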

### Methods

Data sets were originally searched for among LTER data sets, and suitable
open-access data sets stored on EDI
(https://portal.edirepository.org/nis/home.jsp) were selected.

Suitable data sets were individually downloaded from within R. The scripts
managing these downloads are grouped inside `R/data download/`; they
follow the EDI process of data checking and formatting. You can run all
these scripts at once with the following commands, also found in
`R/1.0_downloading_raw_data.r`:

    # Create the output folder for the raw data if it does not exist yet
    if (!dir.exists('data/raw data/')) dir.create('data/raw data/')
    # List every download script and source each one in its own local environment
    listF <- list.files('./R/data download', pattern = "\\.[Rr]$", full.names = TRUE)
    lapply(listF, function(fullPath) source(fullPath, encoding = 'UTF-8', echo = FALSE, local = TRUE))

All downloaded data sets are saved in separate folders, named following
the `author_year` convention.
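
For instance, once the download scripts have run, the per-data-set folders
can be listed (illustrative only; the actual folder names depend on the
data sets retrieved):

    # List the per-data-set folders created under data/raw data/,
    # each named following the author_year convention
    list.dirs('data/raw data/', full.names = FALSE, recursive = FALSE)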

In a second step, each data set is restructured, or wrangled, to fit a
common format before analysis. The scripts turning the originally
heterogeneously structured data sets into comparable tables are in the
`./R/data wrangling/` folder. You can run all these scripts at once with
the following commands, also found in `R/2.0-wrangling_raw_data.r`:

    # Create the output folder for the wrangled data if it does not exist yet
    if (!dir.exists('data/wrangled data/')) dir.create('data/wrangled data/')
    # List every wrangling script and source each one in its own local environment
    listF <- list.files('R/data wrangling', pattern = "\\.[Rr]$", full.names = TRUE)
    lapply(listF, function(fullPath) source(fullPath, encoding = 'UTF-8', echo = FALSE, local = TRUE))

Finally, all restructured tables are aggregated into a single final
table by the `./R/3.0_merging_long-format_tables.r` script. The
end-product table is in long format, with each row recording the
composition of a community in one place at a given time. The format is
described, and its variables defined, in `./data/template long format.txt`.
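
A minimal sketch of this last step (it simply sources the merging script
named above, mirroring the previous steps):

    # Aggregate all wrangled tables into the final long-format table
    source('./R/3.0_merging_long-format_tables.r', encoding = 'UTF-8', echo = FALSE)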

### Analyses

Further analyses were also carried out in R, by Shane Blowes and
collaborators.