https://github.com/openpharma/filters

Host: GitHub
URL: https://github.com/openpharma/filters
Owner: openpharma
License: other
Created: 2023-07-05T11:28:59.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-04-16T15:02:04.000Z (about 1 year ago)
Last Synced: 2024-04-17T10:09:44.785Z (about 1 year ago)
Language: R
Size: 1.93 MB
Stars: 4
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.Rmd
- License: LICENSE
- Codeowners: .github/CODEOWNERS

filters
================

A "snake_case" filter system for `R`.

## Installation

``` r
if (!requireNamespace("remotes")) {
  install.packages("remotes")
}

remotes::install_github(
  repo = "openpharma/filters",
  upgrade = "never"
)
```

## Features

``` r
library(filters)
library(magrittr)
library(random.cdisc.data)
library(rtables)
library(tern)

set.seed(1)
adsl <- radsl()
adae <- radae(adsl)
vads <- list(adsl = adsl, adae = adae)
```

### Built-In Filters

`{filters}` comes with a built-in filter library. You can list them using `list_all_filters()`.

``` r
list_all_filters()
```

    # A tibble: 272 x 4
       id     title                      target condition
     1 COV    Confirmed/Suspected COVID… ADAE   ACOVFL == 'Y'
     2 COVAS  AEs Associated with COVID… ADAE   ACOVASFL == 'Y'
     3 CTC35  Grade 3-5 Adverse Events   ADAE   ATOXGR %in% c('3', '4', '5')
     4 DSC    Adverse Events Leading to… ADAE   AEACN == 'DRUG WITHDRAWN'
     5 DSM    Adverse Events Leading to… ADAE   AEACN %in% c('DOSE INCREASED',…
     6 FATAL  Fatal Adverse Events       ADAE   AESDTH == 'Y'
     7 NCOV   Excluding Confirmed/Suspe… ADAE   ACOVFL != 'Y'
     8 NCOVAS AEs not Associated with C… ADAE   ACOVASFL != 'Y'
     9 NFATAL Non-fatal Adverse Events   ADAE   AESDTH == 'N'
    10 NREL   Adverse Events not Relate… ADAE   AREL == 'N'
    # … with 262 more rows

### Adding New Filters

To add a new filter, use `add_filter()`. The `condition` argument
defines the condition used to filter the datasets later on. It is
passed to `subset()` when you call `apply_filter()`.

``` r
add_filter(
  id = "CTC34",
  title = "Grade 3-4 Adverse Events",
  target = "ADAE",
  condition = AETOXGR %in% c("4", "5")
)
```

Alternatively, you can use `load_filters()` to load filter definitions
from a YAML file. The file should be structured like this:

``` yaml
CTC4:
  title: Grade 4 Adverse Events
  target: ADAE
  condition: ATOXGR == "4"
TP53WT:
  title: TP53 Wild Type
  target: ADSL
  condition: TP53 == "WILD TYPE"
```

``` r
file_path <- system.file("filters_eg.yaml", package = "filters")
load_filters(file_path)
```
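
How `load_filters()` reads the file internally is not shown in this
README. As a rough sketch, assuming the `{yaml}` package is installed, a
file with the structure above can be parsed into a named list, with each
`condition` string turned into an unevaluated expression by `str2lang()`.
The helper `parse_filter_yaml()` and the toy data frame are hypothetical
and not part of `{filters}`.

``` r
library(yaml)

# Hypothetical helper, not part of {filters}: parse YAML text into a named
# list of filter definitions whose conditions are unevaluated expressions.
parse_filter_yaml <- function(text) {
  defs <- yaml.load(text)
  lapply(defs, function(def) {
    list(
      title = def$title,
      target = def$target,
      condition = str2lang(def$condition) # string -> language object
    )
  })
}

yaml_text <- '
CTC4:
  title: Grade 4 Adverse Events
  target: ADAE
  condition: ATOXGR == "4"
'

defs <- parse_filter_yaml(yaml_text)

# The parsed condition can be spliced into a call to subset().
toy_adae <- data.frame(USUBJID = c("a", "b", "c"), ATOXGR = c("4", "2", "4"))
result <- eval(bquote(subset(toy_adae, .(defs$CTC4$condition))))
```

Keeping the condition as a language object rather than a string means it
can be inserted into a `subset()` call without re-parsing.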

You can confirm that filters have been successfully added by using
`get_filter()`.

``` r
get_filter("CTC34")
```

    $title
    [1] "Grade 3-4 Adverse Events"

    $target
    [1] "ADAE"

    $condition
    AETOXGR %in% c("4", "5")

If you ask for a non-existing filter, `get_filter()` throws an error.

``` r
get_filter("GIDIS")
```

    Error: Filter 'GIDIS' does not exist.

To overwrite an existing filter, you have to set `overwrite = TRUE`.
Otherwise, an error is thrown.

``` r
add_filter(
  id = "FATAL",
  title = "Fatal Adverse Events",
  target = "ADAE",
  condition = ATOXGR == "5"
)
```

    Error: Filter 'FATAL' already exists. Set `overwrite = TRUE` to force overwriting the existing filter definition.

``` r
add_filter(
  id = "FATAL",
  title = "Fatal Adverse Events",
  target = "ADAE",
  condition = ATOXGR == "5",
  overwrite = TRUE
)
```

### Applying Filters to Datasets

You can use `apply_filter()` to filter a single dataset or a `list` of
multiple datasets.

``` r
adsl_se <- apply_filter(adsl, "SE")
```

    Filter 'SE' matched target ADSL.
    400/400 records matched the filter condition `SAFFL == 'Y'`.

``` r
adae_ctc34_ser <- apply_filter(adae, "CTC34_SER")
```

    Filters 'CTC34', 'SER' matched target ADAE.
    216/1967 records matched the filter condition `AETOXGR %in% c('4', '5') & AESER == 'Y'`.

``` r
filtered_datasets <- apply_filter(vads, "CTC34_SER_SE")
```

    Filter 'SE' matched target ADSL.
    400/400 records matched the filter condition `SAFFL == 'Y'`.
    Filters 'CTC34', 'SER' matched target ADAE.
    216/1967 records matched the filter condition `AETOXGR %in% c('4', '5') & AESER == 'Y'`.

As you can see, `apply_filter()` reports which IDs matched the dataset.
The match is based on the name of the input dataset and is
case-insensitive.

``` r
ADSL <- adsl
adsl_it <- apply_filter(ADSL, "IT")
```

    Filter 'IT' matched target ADSL.
    400/400 records matched the filter condition `ITTFL == 'Y'`.
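
This README does not show how the dataset name is captured. One common
base `R` idiom for it, given here as an assumption about the mechanism
rather than the package's actual code, is `deparse(substitute(x))`
combined with `toupper()`. The function `guess_target()` and the toy
`adlb` data frame are hypothetical.

``` r
# Hypothetical helper: derive a target name from the expression that the
# caller passed in, ignoring case.
guess_target <- function(data) {
  toupper(deparse(substitute(data)))
}

adlb <- data.frame(USUBJID = "a")
ADLB <- adlb
AdLb <- adlb

guess_target(adlb) # "ADLB"
guess_target(ADLB) # "ADLB"
guess_target(AdLb) # "ADLB"
```

Under this approach, the variable name alone determines the target, so a
name that does not correspond to any filter target yields no match.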

If your dataset does not have a standard name, you can tell
`apply_filter()` which dataset it is by setting the `target` argument.

``` r
sl <- adsl
sl_it1 <- apply_filter(sl, "IT")
```

    No filter matched target SL.

``` r
sl_it2 <- apply_filter(sl, "IT", target = "ADSL")
```

    Filter 'IT' matched target ADSL.
    400/400 records matched the filter condition `ITTFL == 'Y'`.

### Using {filters} for Generating Outputs

The `{filters}` package works well with the `{rtables}` and `{tern}`
packages. The following function creates an adverse events table:

``` r
t_ae <- function(datasets) {
  anl <- merge(
    x = datasets$adsl,
    y = datasets$adae,
    by = c("STUDYID", "USUBJID"),
    all = FALSE, # inner join
    suffixes = c("", "_ADAE")
  )

  split_fun <- drop_split_levels

  lyt <- basic_table(show_colcounts = TRUE) %>%
    split_cols_by(var = "ARM") %>%
    add_overall_col(label = "All Patients") %>%
    analyze_num_patients(
      vars = "USUBJID",
      .stats = c("unique", "nonunique"),
      .labels = c(
        unique = "Total number of patients with at least one adverse event",
        nonunique = "Overall total number of events"
      )
    ) %>%
    split_rows_by(
      "AEBODSYS",
      child_labels = "visible",
      nested = FALSE,
      split_fun = split_fun,
      label_pos = "topleft",
      split_label = obj_label(adae$AEBODSYS)
    ) %>%
    summarize_num_patients(
      var = "USUBJID",
      .stats = c("unique", "nonunique"),
      .labels = c(
        unique = "Total number of patients with at least one adverse event",
        nonunique = "Total number of events"
      )
    ) %>%
    count_occurrences(
      vars = "AEDECOD",
      .indent_mods = -1L
    ) %>%
    append_varlabels(adae, "AEDECOD", indent = 1L)

  result <- build_table(
    lyt,
    df = datasets$adae,
    alt_counts_df = datasets$adsl
  )

  return(result)
}
```

You can easily create multiple outputs with this function by applying
the filters to the input datasets *before* passing them to `t_ae()`.

``` r
vads %>% apply_filter("SE") %>% t_ae()
```

    Filter 'SE' matched target ADSL.
    400/400 records matched the filter condition `SAFFL == 'Y'`.

```
Body System or Organ Class                                    A: Drug X    B: Placebo    C: Combination   All Patients
  Dictionary-Derived Term                                      (N=133)       (N=141)        (N=126)         (N=400)
——————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one adverse event     111 (83.5%)   132 (93.6%)    119 (94.4%)     362 (90.5%)
Overall total number of events                                   636           755            655             2046
cl A.1
  Total number of patients with at least one adverse event   63 (47.4%)    79 (56.0%)      71 (56.3%)     213 (53.2%)
  Total number of events                                         123           144            133             400
  dcd A.1.1.1.1                                              47 (35.3%)    63 (44.7%)      50 (39.7%)     160 (40.0%)
  dcd A.1.1.1.2                                              42 (31.6%)    47 (33.3%)      44 (34.9%)     133 (33.2%)
cl B.1
  Total number of patients with at least one adverse event   47 (35.3%)    49 (34.8%)      59 (46.8%)     155 (38.8%)
  Total number of events                                         73            63              75             211
  dcd B.1.1.1.1                                              47 (35.3%)    49 (34.8%)      59 (46.8%)     155 (38.8%)
cl B.2
  Total number of patients with at least one adverse event   73 (54.9%)    88 (62.4%)      73 (57.9%)     234 (58.5%)
  Total number of events                                         132           156            137             425
  dcd B.2.1.2.1                                              44 (33.1%)    56 (39.7%)      50 (39.7%)     150 (37.5%)
  dcd B.2.2.3.1                                              48 (36.1%)    59 (41.8%)      44 (34.9%)     151 (37.8%)
cl C.1
  Total number of patients with at least one adverse event   50 (37.6%)    53 (37.6%)      42 (33.3%)     145 (36.2%)
  Total number of events                                         62            75              62             199
  dcd C.1.1.1.3                                              50 (37.6%)    53 (37.6%)      42 (33.3%)     145 (36.2%)
cl C.2
  Total number of patients with at least one adverse event   50 (37.6%)    65 (46.1%)      50 (39.7%)     165 (41.2%)
  Total number of events                                         67            87              63             217
  dcd C.2.1.2.1                                              50 (37.6%)    65 (46.1%)      50 (39.7%)     165 (41.2%)
cl D.1
  Total number of patients with at least one adverse event   74 (55.6%)    95 (67.4%)      72 (57.1%)     241 (60.2%)
  Total number of events                                         120           158            112             390
  dcd D.1.1.1.1                                              37 (27.8%)    59 (41.8%)      35 (27.8%)     131 (32.8%)
  dcd D.1.1.4.2                                              54 (40.6%)    63 (44.7%)      48 (38.1%)     165 (41.2%)
cl D.2
  Total number of patients with at least one adverse event   43 (32.3%)    54 (38.3%)      56 (44.4%)     153 (38.2%)
  Total number of events                                         59            72              73             204
  dcd D.2.1.5.3                                              43 (32.3%)    54 (38.3%)      56 (44.4%)     153 (38.2%)
```

``` r
vads %>% apply_filter("SER_SE") %>% t_ae()
```

    Filter 'SE' matched target ADSL.
    400/400 records matched the filter condition `SAFFL == 'Y'`.
    Filter 'SER' matched target ADAE.
    581/1967 records matched the filter condition `AESER == 'Y'`.

```
Body System or Organ Class                                   A: Drug X    B: Placebo    C: Combination   All Patients
  Dictionary-Derived Term                                     (N=133)       (N=141)        (N=126)         (N=400)
—————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one adverse event     93 (69.9%)   110 (78.0%)     98 (77.8%)     301 (75.2%)
Overall total number of events                                  248           280            246             774
cl A.1
  Total number of patients with at least one adverse event   42 (31.6%)   47 (33.3%)      44 (34.9%)     133 (33.2%)
  Total number of events                                         54           63              58             175
  dcd A.1.1.1.2                                              42 (31.6%)   47 (33.3%)      44 (34.9%)     133 (33.2%)
cl B.1
  Total number of patients with at least one adverse event   47 (35.3%)   49 (34.8%)      59 (46.8%)     155 (38.8%)
  Total number of events                                         73           63              75             211
  dcd B.1.1.1.1                                              47 (35.3%)   49 (34.8%)      59 (46.8%)     155 (38.8%)
cl B.2
  Total number of patients with at least one adverse event   48 (36.1%)   59 (41.8%)      44 (34.9%)     151 (37.8%)
  Total number of events                                         74           78              65             217
  dcd B.2.2.3.1                                              48 (36.1%)   59 (41.8%)      44 (34.9%)     151 (37.8%)
cl D.1
  Total number of patients with at least one adverse event   37 (27.8%)   59 (41.8%)      35 (27.8%)     131 (32.8%)
  Total number of events                                         47           76              48             171
  dcd D.1.1.1.1                                              37 (27.8%)   59 (41.8%)      35 (27.8%)     131 (32.8%)
```

## (Current) Limitations

The filters you create using `add_filter()` only persist for the
duration of your `R` session. Whenever you restart your `R` session,
you have to re-create them. The simplest way to do so is to put all your
filter definitions in a `filters.yml` file, as described above, and to
call `load_filters("path/to/filters.yml")` before creating outputs.

If you pass an existing filter that does not match your target dataset,
no warning or error is thrown. Instead, `apply_filter()` only tells you
which filters it actually used. Checking that only valid filters are
passed to `apply_filter()` is therefore up to you.

``` r
add_filter(
  id = "INFCT",
  title = "Infections and Infestations",
  target = "ADAE",
  condition = AEBODSYS == "INFECTIONS AND INFESTATIONS"
)

adsl_filtered <- apply_filter(adsl, "DIABP_IT")
```

    Filter 'IT' matched target ADSL.
    400/400 records matched the filter condition `ITTFL == 'Y'`.
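
Because unmatched IDs are skipped silently, a small pre-check can help.
The sketch below assumes that a suffix is a set of filter IDs joined by
`_` (as in `"CTC34_SER_SE"`) and that a lookup function returns a list
with a `target` element, as `get_filter()` does above. The helper
`unmatched_ids()` and the stub registry are hypothetical; in particular,
the `ADVS` target given to `DIABP` is made up for illustration.

``` r
# Hypothetical helper: return the IDs in `suffix` whose target differs from
# `target`. `lookup` is any function mapping an ID to a filter definition.
unmatched_ids <- function(suffix, target, lookup) {
  ids <- strsplit(suffix, "_", fixed = TRUE)[[1]]
  targets <- vapply(ids, function(id) lookup(id)$target, character(1))
  ids[toupper(targets) != toupper(target)]
}

# Stub registry standing in for the real filter library (illustrative only).
stub_registry <- list(
  IT = list(title = "ITT Population", target = "ADSL"),
  DIABP = list(title = "Diastolic Blood Pressure", target = "ADVS")
)
stub_lookup <- function(id) {
  if (is.null(stub_registry[[id]])) stop("Filter '", id, "' does not exist.")
  stub_registry[[id]]
}

skipped <- unmatched_ids("DIABP_IT", target = "ADSL", lookup = stub_lookup)
if (length(skipped) > 0) {
  message("IDs not applicable to ADSL: ", paste(skipped, collapse = ", "))
}
```

With `{filters}` loaded, passing `lookup = get_filter` should behave the
same way under the assumptions stated above.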

## How Does it Work?

Internally, `{filters}` stores the filter definitions inside the
`.filters` environment defined in `R/zzz.R`. When you add a filter with
`add_filter()`, a new variable named after the ID is created inside this
environment. This variable is a list that stores the title, the target,
and the condition as a quoted expression. When you use `apply_filter()`,
the function looks for variables in `.filters` that match the provided
suffixes. It then maps the filters to their target datasets and builds a
call to `subset()` with the dataset as the first argument and the
condition of the matched filters as the second. This call is evaluated
using `eval()` and the result is returned.
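
The mechanism described above can be sketched in a few lines of base
`R`. This is a simplified illustration, not the package's source: the
names `registry`, `register_filter()` and `run_filters()` are
hypothetical, the toy data is made up, and joining several conditions
with `&` is inferred from the `apply_filter()` messages shown earlier.

``` r
# Environment acting as the filter store (a stand-in for `.filters`).
registry <- new.env()

# Store title, target and the *unevaluated* condition under the filter ID.
register_filter <- function(id, title, target, condition, overwrite = FALSE) {
  if (exists(id, envir = registry, inherits = FALSE) && !overwrite) {
    stop("Filter '", id, "' already exists.")
  }
  assign(
    id,
    list(title = title, target = target, condition = substitute(condition)),
    envir = registry
  )
  invisible(NULL)
}

# Split the suffix into IDs, keep those matching `target`, join their
# conditions with `&`, then build and evaluate a call to subset().
run_filters <- function(data, suffix, target) {
  ids <- strsplit(suffix, "_", fixed = TRUE)[[1]]
  defs <- mget(ids, envir = registry, ifnotfound = list(NULL))
  defs <- Filter(function(d) !is.null(d) && d$target == toupper(target), defs)
  if (length(defs) == 0) {
    return(data)
  }
  conditions <- lapply(defs, `[[`, "condition")
  combined <- Reduce(function(a, b) call("&", a, b), conditions)
  eval(as.call(list(quote(subset), data, combined)))
}

register_filter("SER", "Serious Adverse Events", "ADAE", AESER == "Y")
register_filter("G45", "Grade 4-5 Adverse Events", "ADAE", AETOXGR %in% c("4", "5"))
register_filter("SE", "Safety-Evaluable", "ADSL", SAFFL == "Y")

toy_adae <- data.frame(
  USUBJID = c("a", "b", "c", "d"),
  AESER = c("Y", "N", "Y", "Y"),
  AETOXGR = c("4", "5", "2", "5")
)

# Only G45 and SER target ADAE; SE is ignored for this dataset.
filtered <- run_filters(toy_adae, "G45_SER_SE", target = "ADAE")
```

Storing the condition unevaluated is what allows the same definition to
be applied later to whichever dataset matches its target.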