https://github.com/henrikbengtsson/dirdf
R package: dirdf - Extracts Metadata from Directory and File Names
files metadata package r r-package rstats unconf unconf16
- Host: GitHub
- URL: https://github.com/henrikbengtsson/dirdf
- Owner: HenrikBengtsson
- License: other
- Created: 2016-04-01T17:59:56.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2021-03-25T22:27:46.000Z (almost 4 years ago)
- Last Synced: 2024-12-04T09:40:06.197Z (about 1 month ago)
- Topics: files, metadata, package, r, r-package, rstats, unconf, unconf16
- Language: R
- Homepage: https://github.com/HenrikBengtsson/dirdf
- Size: 67.4 KB
- Stars: 58
- Watchers: 13
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: NEWS
- License: LICENSE
# dirdf - Extracts Metadata from Directory and File Names
[![Build Status](https://travis-ci.org/ropenscilabs/dirdf.svg)](https://travis-ci.org/ropenscilabs/dirdf)
[![Build Status](https://ci.appveyor.com/api/projects/status/egi4i7nwyvrvm160?svg=true)](https://ci.appveyor.com/project/HenrikBengtsson/dirdf)
[![codecov](https://codecov.io/gh/ropenscilabs/dirdf/badge.svg)](https://codecov.io/gh/ropenscilabs/dirdf)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)

Create tidy data frames of file metadata from directory and file names.

## Install
This package is only available on GitHub - it is _not_ available on CRAN. Install it as:
```r
remotes::install_github("ropenscilabs/dirdf")
```

## Examples

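A quick way to experiment with templates is to point `dirdf()` at a scratch directory of empty files. The sketch below is only an illustration: the file names and the `date_assay_plate.ext` template are made up for this purpose, and the call is guarded so that the script still runs when the package is not installed. Going by the examples in this README, the result should have one row per file, with one column per template field plus `pathname`.

``` r
# Scratch directory with a few empty files. The names are hypothetical and
# follow the pattern <date>_<assay>_<plate>.csv.
demo_dir <- tempfile("dirdf-demo-")
dir.create(demo_dir)
demo_files <- c(
  "2020-01-15_assayA_plate1.csv",
  "2020-02-20_assayB_plate2.csv",
  "2020-03-02_assayA_plate3.csv"
)
created <- file.create(file.path(demo_dir, demo_files))

# Only call dirdf if it is installed; otherwise skip quietly.
if (requireNamespace("dirdf", quietly = TRUE)) {
  print(dirdf::dirdf(demo_dir, template = "date_assay_plate.ext"))
}

# Remove the scratch directory again.
unlink(demo_dir, recursive = TRUE)
```
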
``` r
path <- system.file("examples", "dataset_1", package = "dirdf")
dir(path)
```

    ## [1] "2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A01.csv"
    ## [2] "2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A02.csv"
    ## [3] "2014-02-26_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41_D08.csv"
    ## [4] "2014-03-05_BRAFWTNEGASSAY_FFPEDNA-CRC-REPEAT_platefile.csv"
    ## [5] "2016-04-01_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41.csv"

``` r
dirdf::dirdf(path, template = "Date_Assay_Plasmid-Type-Fraction_WellNumber?.extension")
```

    ##         Date          Assay Plasmid     Type            Fraction
    ## 1 2013-06-26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction
    ## 2 2013-06-26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction
    ## 3 2014-02-26 BRAFWTNEGASSAY FFPEDNA      CRC                1-41
    ## 4 2014-03-05 BRAFWTNEGASSAY FFPEDNA      CRC              REPEAT
    ## 5 2016-04-01 BRAFWTNEGASSAY FFPEDNA      CRC                1-41
    ##   WellNumber extension
    ## 1        A01       csv
    ## 2        A02       csv
    ## 3        D08       csv
    ## 4  platefile       csv
    ## 5                  csv
    ##                                                                 pathname
    ## 1 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A01.csv
    ## 2 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A02.csv
    ## 3                     2014-02-26_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41_D08.csv
    ## 4             2014-03-05_BRAFWTNEGASSAY_FFPEDNA-CRC-REPEAT_platefile.csv
    ## 5                         2016-04-01_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41.csv

``` r
dirdf::dirdf(path, template = "Year-Month-Day_Assay_Plasmid-Type-Fraction_WellNumber?.extension")
```

    ##   Year Month Day          Assay Plasmid     Type            Fraction
    ## 1 2013    06  26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction
    ## 2 2013    06  26 BRAFWTNEGASSAY Plasmid Cellline 100-1MutantFraction
    ## 3 2014    02  26 BRAFWTNEGASSAY FFPEDNA      CRC                1-41
    ## 4 2014    03  05 BRAFWTNEGASSAY FFPEDNA      CRC              REPEAT
    ## 5 2016    04  01 BRAFWTNEGASSAY FFPEDNA      CRC                1-41
    ##   WellNumber extension
    ## 1        A01       csv
    ## 2        A02       csv
    ## 3        D08       csv
    ## 4  platefile       csv
    ## 5                  csv
    ##                                                                 pathname
    ## 1 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A01.csv
    ## 2 2013-06-26_BRAFWTNEGASSAY_Plasmid-Cellline-100-1MutantFraction_A02.csv
    ## 3                     2014-02-26_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41_D08.csv
    ## 4             2014-03-05_BRAFWTNEGASSAY_FFPEDNA-CRC-REPEAT_platefile.csv
    ## 5                         2016-04-01_BRAFWTNEGASSAY_FFPEDNA-CRC-1-41.csv

Inconsistent file names
-----------------------

``` r
path <- system.file("examples", "dataset_2", package = "dirdf")
dir(path)
```

    ## [1] "2011-12-16_OTHERASSAY_FFPEDNA-CRC-1-41_D08.csv"
    ## [2] "2013-06-26_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B02.csv"
    ## [3] "2014-03-05_OTHERASSAY_FFPEDNA-CRC-REPEAT_platefile.csv"
    ## [4] "2014-07-06_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B01.csv"
    ## [5] "2016-01-11_OTHERASSAY_FFPEDNA-CRC-2-41.csv"

``` r
dirdf::dirdf(path, template = "date_assay_experiment_well.ext")
```

    ## Error in dirdf_parse(pathnames, template = template, colnames = colnames, : Unexpected path(s) found:
    ## 2016-01-11_OTHERASSAY_FFPEDNA-CRC-2-41.csv

``` r
dirdf::dirdf(path, template = "date_assay_experiment_well?.ext")
```

    ##         date      assay                           experiment      well ext
    ## 1 2011-12-16 OTHERASSAY                     FFPEDNA-CRC-1-41       D08 csv
    ## 2 2013-06-26 OTHERASSAY Plasmid-Cellline-100-1MutantFraction       B02 csv
    ## 3 2014-03-05 OTHERASSAY                   FFPEDNA-CRC-REPEAT platefile csv
    ## 4 2014-07-06 OTHERASSAY Plasmid-Cellline-100-1MutantFraction       B01 csv
    ## 5 2016-01-11 OTHERASSAY                     FFPEDNA-CRC-2-41           csv
    ##                                                             pathname
    ## 1                     2011-12-16_OTHERASSAY_FFPEDNA-CRC-1-41_D08.csv
    ## 2 2013-06-26_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B02.csv
    ## 3             2014-03-05_OTHERASSAY_FFPEDNA-CRC-REPEAT_platefile.csv
    ## 4 2014-07-06_OTHERASSAY_Plasmid-Cellline-100-1MutantFraction_B01.csv
    ## 5                         2016-01-11_OTHERASSAY_FFPEDNA-CRC-2-41.csv

Metadata in directory and path names
------------------------------------

``` r
> dir("examples/", recursive = TRUE)
 [1] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv"
 [2] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv"
 [3] "LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"
 [4] "LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv"
 [5] "LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv"
 [6] "LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv"
 [7] "LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv"
 [8] "LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv"
 [9] "LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv"
[10] "LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"
> dirdf::dirdf("examples/", template = "lab,year/date_assay_experiment_well?.ext")
    lab year       date     assay           experiment well ext                                                    pathname
1  LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A01 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv
2  LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A02 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv
3  LabA 2016 2014-02-26 BRAFWTNEG     FFPEDNA-CRC-1-41  D08 csv     LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv
4  LabA 2016 2014-03-05 BRAFWTNEG   FFPEDNA-CRC-REPEAT  H03 csv   LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv
5  LabA 2016 2016-04-01 BRAFWTNEG     FFPEDNA-CRC-1-41  E12 csv     LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv
6  LabB 2015 2011-12-16     OTHER     FFPEDNA-CRC-1-41  D08 csv         LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv
7  LabB 2015 2013-06-26     OTHER Plasmid-Cellline-100  B02 csv     LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv
8  LabB 2015 2014-03-05     OTHER   FFPEDNA-CRC-REPEAT  H03 csv       LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv
9  LabB 2015 2014-07-06     OTHER Plasmid-Cellline-100  B01 csv     LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv
10 LabB 2015 2016-01-11     OTHER     FFPEDNA-CRC-2-41 <NA> csv             LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv
```

[![rOpenSci footer](http://ropensci.org/public_images/github_footer.png)](https://ropensci.org)
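
Judging from the outputs in this README, a template names the fields of a path, the characters between the names (`_`, `-`, `.`, `,`, `/`) are matched literally, and a trailing `?` marks a field that may be missing together with the delimiter in front of it. One way to picture this is as a translation from template to regular expression. The following base R sketch illustrates that idea only; it is not the package's implementation, and `template_to_regex()` and `parse_paths()` are hypothetical helpers written for this example. It assumes every optional field is preceded by a delimiter.

``` r
# Hypothetical helper, not part of dirdf: translate a template into a
# regular expression plus the field names it captures.
template_to_regex <- function(template) {
  # A token is either a field name (optionally followed by "?") or a
  # single delimiter character such as "_", "-", ".", "," or "/".
  tokens <- regmatches(
    template,
    gregexpr("[[:alnum:]]+[?]?|[^[:alnum:]?]", template)
  )[[1]]
  is_field <- grepl("^[[:alnum:]]", tokens)
  optional <- is_field & grepl("[?]$", tokens)

  pieces <- character(length(tokens))
  for (i in seq_along(tokens)) {
    if (is_field[i]) {
      # Lazy capture group; an optional field also closes the
      # non-capturing group opened by the delimiter in front of it.
      pieces[i] <- if (optional[i]) "(.+?))?" else "(.+?)"
    } else {
      literal <- paste0("\\Q", tokens[i], "\\E")
      before_optional <- i < length(tokens) && optional[i + 1]
      pieces[i] <- if (before_optional) paste0("(?:", literal) else literal
    }
  }
  list(
    regex = paste0("^", paste(pieces, collapse = ""), "$"),
    fields = sub("[?]$", "", tokens[is_field])
  )
}

# Hypothetical helper: apply the template to a character vector of paths.
parse_paths <- function(pathnames, template) {
  spec <- template_to_regex(template)
  matches <- regmatches(pathnames, regexec(spec$regex, pathnames, perl = TRUE))
  unmatched <- lengths(matches) == 0L
  if (any(unmatched)) {
    stop("Unexpected path(s) found: ",
         paste(pathnames[unmatched], collapse = ", "))
  }
  # Drop the full match and keep the captured fields, one row per path.
  values <- do.call(rbind, lapply(matches, function(m) m[-1]))
  values[!nzchar(values)] <- NA_character_  # absent optional fields
  df <- as.data.frame(values, stringsAsFactors = FALSE)
  names(df) <- spec$fields
  df$pathname <- pathnames
  df
}

files <- c(
  "2011-12-16_OTHERASSAY_FFPEDNA-CRC-1-41_D08.csv",
  "2016-01-11_OTHERASSAY_FFPEDNA-CRC-2-41.csv"
)
parsed <- parse_paths(files, "date_assay_experiment_well?.ext")
print(parsed)
```

Because each field is captured lazily and delimiters are literal, `date` can hold `2011-12-16` (the next delimiter is `_`), while `experiment` absorbs the hyphens in `FFPEDNA-CRC-1-41`. Dropping the `?` from `well` makes the second file fail to match, which mirrors the "Unexpected path(s) found" error shown earlier in this README.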