An open API service indexing awesome lists of open source software.

https://github.com/basf/xeredar

Functions for automated statistical analysis of Xenopus Eleutherembryonic Thyroid Assays (XETA), Rapid Androgen Disruption Activity Reporter (RADAR) assays and Rapid Estrogen Activity In Vitro (REACTIV) assays.
https://github.com/basf/xeredar

endocrine-disruptor hazard-assessment mixed-models radar reactiv xeta

Last synced: 7 days ago
JSON representation

Functions for automated statistical analysis of Xenopus Eleutherembryonic Thyroid Assays (XETA), Rapid Androgen Disruption Activity Reporter (RADAR) assays and Rapid Estrogen Activity In Vitro (REACTIV) assays.

Awesome Lists containing this project

README

          

# xeredar

[![R-CMD-check](https://github.com/basf/xeredar/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/basf/xeredar/actions/workflows/R-CMD-check.yaml)

Background
==========

The package xeredar is an R-package for analysis of the New Approach
Methodology (NAM) assays of XETA (Xenopus Eleutheroembryonic Thyroid),
RADAR (Rapid Androgen Disruption Activity Reporter) and REACTIV (Rapid
Estrogen ACTivity In Vivo) for assessing endocrine effects of chemicals
on the thyroid, androgen/steroid and estrogen axis. The functionality is
based on the SAS-script recommended in the Annex 13 of OECD test
guideline No. 248 of the XETA assay
([2019](https://www.oecd-ilibrary.org/environment/tg-248-xenopus-eleutheroembryonic-thyroid-assay-xeta_a13f80ee-en)),
written by John Green.

Installation
============

You can install xeredar using one of the followig commands:

devtools::install_github("basf/xeredar")

pak::pkg_install("basf/xeredar")

Data requirements
=================

Data frames that are supposed to be analyzed with xeredar needs to
fulfill certain requirements. The data frame or tibble needs to contain
the following column headers:

knitr::kable(head(xeredar::testDataSpiked))

Replicate
Treatment
Row
Fluor
Conc

11
1
0 + T3
4
19.768
0

12
1
0 + T3
4
27.928
0

13
1
0 + T3
4
29.592
0

14
1
0 + T3
4
22.816
0

15
1
0 + T3
4
26.080
0

16
1
0 + T3
4
25.332
0

The type of each column should be accordingly:

knitr::kable(purrr::map_df(xeredar::testDataSpiked, class))

Replicate
Treatment
Row
Fluor
Conc

factor
character
character
numeric
ordered

factor
character
character
numeric
factor

**Replicate** (i.e. run), **Treatment** (i.e. a unique name for each
treatment level in either spiked or unspiked mode) and **Row**
(i.e. exposure vessel) can either be *factor* or *character* columns,
but **Fluor** (i.e. measured fluorescence) must always be *numeric* and
**Conc** (i.e. concentration of test item) must always contain *ordered
factors*. The order of the columns is not relevant.

**Replicate**, **Treatment** and **Row** can either be *factor* or
*character* columns, but **Fluor** must always be *numeric* and **Conc**
must always contain *ordered factors*. The order of the columns is not
relevant. It is important that the decimal separator is a period instead
of a comma.

When simply aiming to use the `data_prep()` function, the data frame
needs to either contain spiked treatments or unspiked treatments. When
still having spiked and unspiked treatments in one data frame they
should be separated. Imagine you have a XETA data frame (dat) which
contains spiked and unspiked treatments as well as the T4 positive
control. The T3 or T4 additions are designated by “T3” and “T4” in the
**Treatment** column. The spiked and unspiked datasets could quickly be
subset using the following code:

datSpiked <- dat[grepl("T3",dat$Treatment),] # spiked data

datUnspiked <- setdiff(dat,datSpiked)
datUnspiked <- datUnspiked[which(datUnspiked$Treatment != "T4"),] # unspiked data

Running default XETA analysis
=============================

To demonstrate how to run XETA analysis, we will use one of the data
sets by the French lab containing the spiked and unspiked measurements
from the XETA ring test as included in OECD test guideline No. 248 of
the XETA assay
([2019](https://www.oecd-ilibrary.org/environment/tg-248-xenopus-eleutheroembryonic-thyroid-assay-xeta_a13f80ee-en)).

xeta_spiked <- xeredar::valid_data_xeta[["ptu_france_spiked"]]
xeta_unspiked <- xeredar::valid_data_xeta[["ptu_france_unspiked"]]

The default XETA analysis can be run using the `data_prep()` function
with either spiked or unspiked data. This function automatically decides
whether trimming, outlier removal and/or transformations are conducted
following the manuscript ([Spyridonov et al. 2025](https://academic.oup.com/etc/advance-article/doi/10.1093/etojnl/vgaf056/8046659?searchresult=1)).
The actual analysis is carried out by the `ana()` function. The `ana()` function is
called by the `data_prep()` function and does not need to be called
separately. For this dataset, the exposure well ID (Row of the 96 well
plate) is not recorded, therefore, we set the `row` argument to `FALSE`.
In this case, we use the reduced mixed ANOVA model where the exposure
well ID is not included as a random effect. Please specify `row=TRUE` if
the exposure well ID is recorded and you want to use the full mixed
ANOVA model.

xeta_spiked_result <- xeredar::data_prep(dataframe = xeta_spiked, row = FALSE)
xeta_unspiked_result <- xeredar::data_prep(dataframe = xeta_unspiked, row= FALSE)

Here we use the spike data as an example to demonstrate the output of
the `data_prep()` function.

The outputs of the `data_prep()` function are lists containing the
following elements:

- A reasoning for the recommended transformation and trimming. The raw
data should be used for the analysis because the residuals of the
mixed ANOVA are normally distributed and show homogeneous variances
among treatment groups.

xeta_spiked_result$Justify
#> [1] "The raw data should be used for the analysis because\n the residuals of the mixed ANOVA are normally\n distributed and show homogeneous variances\n among treatment groups."

- A data frame of the processed data (e.g. raw data, trimmed,
transformed, or outlier removed) used for actual statistical testing
following the reasoning. The box plots of the processed data per
run/replicate (i.e. each panel represents each run/replicate) are
also provided for visual inspection.

knitr::kable(head(xeta_spiked_result$ProcessedData))

Replicate
Treatment
Fluor
Conc
Country
Substance
Spiked

1
FETAXT3
4989.533
0
france
ptu
TRUE

1
FETAXT3
5002.533
0
france
ptu
TRUE

1
FETAXT3
6331.533
0
france
ptu
TRUE

1
FETAXT3
4645.533
0
france
ptu
TRUE

1
FETAXT3
4977.533
0
france
ptu
TRUE

1
FETAXT3
6229.533
0
france
ptu
TRUE

xeta_spiked_result$BoxPlots

![](User_guide_files/figure-markdown_strict/unnamed-chunk-9-1.png)

- Summary tables of the processed data (per replicate and overall)

knitr::kable(xeta_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")

Summary statistics of fluorescence in different concentrations of test item per replicate

Conc
Replicate
N
Mean
Standard deviation
Coefficient of variation

0
1
19
4651.060
944.7420
0.2031240

0
2
20
4024.300
603.6731
0.1500070

0
3
19
4952.225
1179.3597
0.2381475

1
1
19
3854.137
828.3944
0.2149364

1
2
20
4557.867
558.4921
0.1225337

1
3
19
4720.819
619.8992
0.1313118

3
1
19
4671.428
976.9674
0.2091368

3
2
19
4486.538
783.8641
0.1747147

3
3
20
5421.433
703.9490
0.1298456

10
1
20
4573.083
946.2029
0.2069070

10
2
19
4329.538
827.4898
0.1911266

10
3
19
5106.849
876.5940
0.1716507

30
1
20
5391.183
809.5849
0.1501683

30
2
19
4556.275
694.1698
0.1523547

30
3
19
5164.691
763.8837
0.1479050

100
1
19
5181.060
858.5454
0.1657085

100
2
19
5124.381
733.4611
0.1431317

100
3
20
6048.083
959.5900
0.1586602

knitr::kable(xeta_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")

Summary statistics of fluorescence in different concentrations of test item of all replicates

Conc
N
Mean
Standard deviation
Coefficient of variation

0
58
4533.593
998.2901
0.2201984

1
58
4380.715
764.2147
0.1744498

3
58
4869.483
910.7568
0.1870336

10
58
4668.156
928.9066
0.1989879

30
58
5043.483
825.4425
0.1636652

100
58
5461.466
945.7370
0.1731654

- Tables of results evaluated using increasing/decreasing Williams
test and/or Dunnett’s test, if applicable.

In the the Williams’ test result tables, *Y.Tilde* is the amalgamated
mean of the fluorescence in each treatment group, *Y0* is the mean of
the control fluorescence, *DIFF* is the estimated difference between the
treatment and the control, *SE\_DIFF* is the standard error of the
Williams’ test, *DF* is the degrees of freedom for Williams’ test, *WILL
Incr* or *Will Decr* is the Williams’ test statistic, *crit Val* is the
critical value of Williams distribution, *Sign* suggests if there is
significant difference between the treatment and the control, and
*%Incr* is the percent increase of the fluorescence compared to the
control.

In the Dunnett’s test result table, *Estimate* is the estimated
difference between the treatment and the control, *SE* is the standard
error of the mixed ANOVA model, *t value* is the Dunnett’s test
statistic, *adj p* is the adjusted p value and *%Incr* is the percent
increase of the fluorescence compared to the control.

knitr::kable(xeta_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")

Increasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Incr
SE_DIFF
DF
WILL Incr
crit Val
Sign
% Incr

Conc100 - Conc0
100
5461.47
4533.593
927.8769
249.7868
10
3.7146758
1.971
TRUE
20.466613

Conc30 - Conc0
30
5043.48
4533.593
509.8869
249.7868
10
2.0412886
1.965
TRUE
11.246933

Conc10 - Conc0
10
4768.82
4533.593
235.2269
249.7868
10
0.9417108
1.956
FALSE
2.968123

Conc3 - Conc0
3
4768.82
4533.593
235.2269
249.7868
10
0.9417108
1.940
FALSE
7.408918

Conc1 - Conc0
1
4380.72
4533.593
-152.8731
249.7845
10
-0.6120201
1.908
FALSE
-3.372111

knitr::kable(xeta_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")

Decreasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Decr
SE_DIFF
DF
WILL Decr
crit Val
Sign
% Incr

Conc100 - Conc0
100
4884.66
4533.593
-351.0669
249.7868
10
-1.405466
1.971
FALSE
20.466613

Conc30 - Conc0
30
4884.66
4533.593
-351.0669
249.7868
10
-1.405466
1.965
FALSE
11.246933

Conc10 - Conc0
10
4884.66
4533.593
-351.0669
249.7868
10
-1.405466
1.956
FALSE
2.968123

Conc3 - Conc0
3
4884.66
4533.593
-351.0669
249.7868
10
-1.405466
1.940
FALSE
7.408918

Conc1 - Conc0
1
4884.66
4533.593
-351.0669
249.7845
10
-1.405479
1.908
FALSE
-3.372111

knitr::kable(xeta_spiked_result$Dunnetts, caption="Dunnett's test")

Dunnett’s test

Estimate
SE
df
t value
adj p
% Incr

Conc1 - Conc0
-160.4067
249.7845
10
-0.6421806
0.9458404
-3.372111

Conc3 - Conc0
320.1063
249.7868
10
1.2815181
0.5970718
7.408918

Conc10 - Conc0
128.8281
249.7868
10
0.5157522
0.9769722
2.968123

Conc30 - Conc0
499.2993
249.7868
10
1.9989021
0.2367993
11.246933

Conc100 - Conc0
911.7088
249.7868
10
3.6499482
0.0174131
20.466613

- Further information about the normality test (Shapiro-Wilk), the
homogeneity of variance test (Levene’s test) of residuals of the
mixed ANOVA model, the monotonicity test and the model fit.

xeta_spiked_result$NormalityTest
#>
#> Shapiro-Wilk normality test
#>
#> data: stats::resid(mixedaov)
#> W = 0.99186, p-value = 0.05343
xeta_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#> statistic p.value df df.residual
#>
#> 1 0.759 0.580 5 342
xeta_spiked_result$`Monotonicity Test`
#> Test t value Pr(>|t|) Significance
#> 1 Linear 6.49 <0.0001 ***
#> 2 Quadratic 2.14 0.0335 *
xeta_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula:
#> Fluor ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#> Data: dataframe
#> REML criterion at convergence: 5608.294
#> Random effects:
#> Groups Name Std.Dev.
#> Replicate:Conc (Intercept) 241.1
#> Replicate (Intercept) 350.5
#> Residual 827.7
#> Number of obs: 348, groups:
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept) Conc.L Conc.Q Conc.C
#> 4824.2 758.5 264.5 52.6
#> Conc^4 Conc^5
#> 149.8 -270.8

*The list output from running the data\_prep() function can be
summarized with the data\_summary() function.*

xeredar::data_summary(xeta_spiked_result) |> knitr::kable()

1
3
10
30
100

Replicate 1
-17.13
0.44
-1.68
15.91
11.4

Replicate 2
13.26
11.49
7.58
13.22
27.34

Replicate 3
-4.67
9.47
3.12
4.29
22.13

Pooled
-3.37
7.41
2.97
11.25
20.47

Dunnett
ns
ns
ns
ns
*

IncreasingWilliams
ns
ns
ns
*
*

DecreasingWilliams
ns
ns
ns
ns
ns

Running default RADAR analysis
==============================

To demonstrate how to run RADAR analysis, we will use one of the data
sets by the Pos\_mDHT\_Fraunhofer\_RADAR containing the spiked and
unspiked measurements from the RADAR study validation in the lab
Fraunhofer with an androgen axis active chemical.

radar_spiked <- xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Spiked"]]
radar_unspiked <- xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Unspiked"]]

The default radar analysis can be run using the `data_prep()` function
with either spiked or unspiked data. This function automatically decides
whether trimming, outlier removal and/or transformations are conducted
following the manuscript (Spyridonov et al. unpublished). The actual
analysis is carried out by the `ana()` function. The `ana()` function is
called by the `data_prep()` function and does not need to be called
separately. the analysis the RADAR assay follows the description of the
Method 2 (the mixed ANOVA approach) in the Annex 8: methods for the
statistical analysis of RADAR assay data of the OECD TG 251
([2022](https://www.oecd-ilibrary.org/environment/test-no-251-rapid-androgen-disruption-activity-reporter-radar-assay_da264d82-en)).
For this dataset, the exposure well ID (Row of the 96 well plate) is not
recorded, therefore, we set the `row` argument to `FALSE`. In this case,
we use the reduced mixed ANOVA model where the exposure well ID is not
included as a random effect. Please specify `row=TRUE` if the exposure
well ID is recorded and you want to use the full mixed ANOVA model.
Trimming is not required.

radar_spiked_result <- xeredar::data_prep(dataframe = radar_spiked, row = FALSE, trimming=FALSE)
radar_unspiked_result <- xeredar::data_prep(dataframe = radar_unspiked, row= FALSE, trimming=FALSE)

Here we use the spike data as an example to demonstrate the output of
the `data_prep()` function.

The outputs of the `data_prep()` function are lists containing the
following elements:

- A reasoning for the recommended transformation and trimming. The raw
data (without trimming or outlier removal) where the fluorescence
values are log transformed should be used for the analysis because
only after log transformation, the residuals of the mixed ANOVA are
normally distributed and have homogeneous variances among treatment
groups.

radar_spiked_result$Justify
#> [1] "The raw data (without trimming or outlier removal) where\n the fluorescence values are log transformed should be used\n for the analysis because only after log transformation,\n the residuals of the mixed ANOVA are normally distributed\n and have homogeneous variances among treatment groups."

- A data frame of the processed data (e.g. raw data, transformed, or
outlier removed) used for actual statistical testing following the
reasoning. The box plots of the processed data per run/replicate
(i.e. each panel represents each run/replicate) are also provided
for visual inspection.

knitr::kable(head(radar_spiked_result$ProcessedData))

lab
Compound
Treatment
Fluor
Conc
Replicate

FRAUNH
mDHT
FETAX MT
113152
0
1

FRAUNH
mDHT
mDHT-1µg/L + 17MT
708
1
1

FRAUNH
mDHT
mDHT-2µg/L + 17MT
2498
2
1

FRAUNH
mDHT
mDHT-4µg/L + 17MT
10152
4
1

FRAUNH
mDHT
mDHT-8µg/L + 17MT
356
8
1

FRAUNH
mDHT
mDHT-16µg/L + 17MT
4621
16
1

radar_spiked_result$BoxPlots

![](User_guide_files/figure-markdown_strict/unnamed-chunk-17-1.png)

- Summary tables of the processed data (per replicate and overall)

knitr::kable(radar_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")

Summary statistics of fluorescence in different concentrations of test item per replicate

Conc
Replicate
N
Mean
Standard deviation
Coefficient of variation

0
1
17
23818.471
43072.456
1.808364

0
2
17
15009.824
37062.085
2.469189

0
3
17
29657.294
37428.116
1.262021

1
1
17
6505.471
10835.321
1.665571

1
2
17
10271.000
14904.156
1.451091

1
3
17
20555.882
29472.508
1.433775

2
1
17
29385.941
47143.811
1.604298

2
2
17
8911.118
10565.718
1.185678

2
3
17
41815.706
54205.056
1.296285

4
1
17
15968.941
33341.972
2.087926

4
2
17
14532.765
21802.622
1.500239

4
3
17
19419.471
28057.724
1.444824

8
1
17
5178.118
8983.787
1.734952

8
2
17
8220.882
15084.270
1.834872

8
3
17
16399.647
23315.385
1.421700

16
1
17
6119.000
8448.481
1.380696

16
2
17
7955.706
15390.405
1.934512

16
3
17
17665.412
19842.708
1.123252

knitr::kable(radar_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")

Summary statistics of fluorescence in different concentrations of test item of all replicates

Conc
N
Mean
Standard deviation
Coefficient of variation

0
51
22828.529
38967.63
1.706971

1
51
12444.118
20556.80
1.651929

2
51
26704.255
43299.93
1.621462

4
51
16640.392
27641.60
1.661115

8
51
9932.882
17189.94
1.730609

16
51
10580.039
15836.94
1.496870

- Tables of results evaluated using increasing/decreasing Williams
test and/or Dunnett’s test, if applicable.

In the the Williams’ test result tables, *Y.Tilde* is the amalgamated
mean of the fluorescence in each treatment group, *Y0* is the mean of
the control fluorescence, *DIFF* is the estimated difference between the
treatment and the control, *SE\_DIFF* is the standard error of the
Williams’ test, *DF* is the degrees of freedom for Williams’ test, *WILL
Incr* or *Will Decr* is the Williams’ test statistic, *crit Val* is the
critical value of Williams distribution, *Sign* suggests if there is
significant difference between the treatment and the control, and
*%Incr* is the percent increase of the fluorescence compared to the
control.

In the Dunnett’s test result table, *Estimate* is the estimated
difference between the treatment and the control, *SE* is the standard
error of the mixed ANOVA model, *t value* is the Dunnett’s test
statistic, *adj p* is the adjusted p value and *%Incr* is the percent
increase of the fluorescence compared to the control.

knitr::kable(radar_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")

Increasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Incr
SE_DIFF
DF
WILL Incr
crit Val
Sign
% Incr

Conc16 - Conc0
16
8.38858
8.51927
-0.13069
0.3478327
10
-0.3757267
1.971
FALSE
-53.65431

Conc8 - Conc0
8
8.38858
8.51927
-0.13069
0.3478327
10
-0.3757267
1.965
FALSE
-56.48917

Conc4 - Conc0
4
8.38858
8.51927
-0.13069
0.3478327
10
-0.3757267
1.956
FALSE
-27.10703

Conc2 - Conc0
2
8.38858
8.51927
-0.13069
0.3478327
10
-0.3757267
1.940
FALSE
16.97755

Conc1 - Conc0
1
8.07433
8.51927
-0.44494
0.3478327
10
-1.2791783
1.908
FALSE
-45.48875

knitr::kable(radar_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")

Decreasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Decr
SE_DIFF
DF
WILL Decr
crit Val
Sign
% Incr

Conc16 - Conc0
16
7.99566
8.51927
0.52361
0.3478327
10
1.5053503
1.971
FALSE
-53.65431

Conc8 - Conc0
8
7.99566
8.51927
0.52361
0.3478327
10
1.5053503
1.965
FALSE
-56.48917

Conc4 - Conc0
4
8.54577
8.51927
-0.02650
0.3478327
10
-0.0761861
1.956
FALSE
-27.10703

Conc2 - Conc0
2
8.54577
8.51927
-0.02650
0.3478327
10
-0.0761861
1.940
FALSE
16.97755

Conc1 - Conc0
1
8.54577
8.51927
-0.02650
0.3478327
10
-0.0761861
1.908
FALSE
-45.48875

knitr::kable(radar_spiked_result$Dunnetts, caption="Dunnett's test")

Dunnett’s test

Estimate
SE
df
t value
adj p
% Incr

Conc1 - Conc0
-0.4449358
0.3478327
10
-1.279166
0.5986692
-45.48875

Conc2 - Conc0
0.4707977
0.3478327
10
1.353518
0.5520005
16.97755

Conc4 - Conc0
0.0536514
0.3478327
10
0.154245
0.9999132
-27.10703

Conc8 - Conc0
-0.5776610
0.3478327
10
-1.660744
0.3794252
-56.48917

Conc16 - Conc0
-0.4695496
0.3478327
10
-1.349930
0.5543160
-53.65431

- Further information about the normality test (Shapiro-Wilk), the
homogeneity of variance test (Levene’s test) of residuals of the
mixed ANOVA model, the monotonicity test and the model fit.

radar_spiked_result$NormalityTest
#>
#> Shapiro-Wilk normality test
#>
#> data: stats::resid(mixedaov)
#> W = 0.99071, p-value = 0.05007
radar_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#> statistic p.value df df.residual
#>
#> 1 0.474 0.796 5 300
radar_spiked_result$`Monotonicity Test`
#> Test t value Pr(>|t|) Significance
#> 1 Linear -1.57 0.1167 .
#> 2 Quadratic -1.48 0.1400 .
radar_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula:
#> log(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#> Data: dataframe
#> REML criterion at convergence: 1218.932
#> Random effects:
#> Groups Name Std.Dev.
#> Replicate:Conc (Intercept) 0.000
#> Replicate (Intercept) 0.466
#> Residual 1.756
#> Number of obs: 306, groups:
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept) Conc.L Conc.Q Conc.C
#> 8.35799 -0.37806 -0.37347 0.01863
#> Conc^4 Conc^5
#> 0.68924 -0.25055
#> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings

*The list output from running the data\_prep() function can be
summarized with the data\_summary() function.*

xeredar::data_summary(radar_spiked_result) |>
knitr::kable()

1
2
4
8
16

Replicate 1
-72.69
23.37
-32.96
-78.26
-74.31

Replicate 2
-31.57
-40.63
-3.18
-45.23
-47

Replicate 3
-30.69
41
-34.52
-44.7
-40.43

Pooled
-45.49
16.98
-27.11
-56.49
-53.65

Dunnett
ns
ns
ns
ns
ns

IncreasingWilliams
ns
ns
ns
ns
ns

DecreasingWilliams
ns
ns
ns
ns
ns

Running default REACTIV analysis
================================

To demonstrate how to run REACTIV analysis, we will use one artificial
data set containing the spiked and unspiked measurements.

reactiv_spiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Spiked
reactiv_unspiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Unspiked

The default REACTIV analysis can be run using the `data_prep()` function
with either spiked or unspiked data. This function automatically decides
whether trimming, outlier removal and/or transformations are conducted
following the manuscript (Spyridonov et al. unpublished). The actual
analysis is carried out by the `ana()` function. The `ana()` function is
called by the `data_prep()` function and does not need to be called
separately. The analysis the REACTIV assay follows the description of
the Method 2 (the mixed ANOVA approach) in the Annex 8: methods for the
statistical analysis of REACTIV assay data of the Amended Draft new Test
Guideline for the REACTIV assay for second WNT-review
([30.01.2024](https://www.oecd.org/chemicalsafety/testing/amended-draft-new-test-guideline-for-the-rapid-estrogen-ACTivity-in-vivo-assay.pdf)).
For this assay, the `row` argument should be set to `FALSE`. Trimming is
not required. In case there are residuals deviate from normality and
variance homogeneity, outlier removal (e.g. by applying the Tukey rule
(Green et al., 2018) and data transformation (for example log- or
square-root) can be conducted.

reactiv_spiked_result <- xeredar::data_prep(dataframe = reactiv_spiked, row = FALSE, trimming=FALSE, boxcox = FALSE)
reactiv_unspiked_result <- xeredar::data_prep(dataframe = reactiv_unspiked, row= FALSE, trimming=FALSE, boxcox = FALSE)

Here we use the spiked data as an example to demonstrate the output of
the `data_prep()` function.

The outputs of the `data_prep()` function are lists containing the
following elements:

- A reasoning for the recommended transformation and trimming. The
data from which outliers were removed with the Tukey-rule where the
fluorescence values are square-root transformed, should be used for
the analysis because only after outlier removal and sqrt
transformation, the residuals of the mixed ANOVA are normally
distributed and have homogeneous variances among treatment groups

reactiv_spiked_result$Justify
#> [1] "The data from which outliers were removed with the\n Tukey-rule where the fluorescence values are square-root\n transformed, should be used for the analysis because only\n after outlier removal and sqrt transformation, the\n residuals of the mixed ANOVA are normally distributed\n and have homogeneous variances among treatment groups"

- A data frame of the processed data (e.g. raw data, transformed, or
outlier removed) used for actual statistical testing following the
reasoning. The box plots of the processed data per run/replicate
(i.e. each panel represents each run/replicate) are also provided
for visual inspection.

knitr::kable(head(reactiv_spiked_result$ProcessedData))

Treatment
Fluor
Conc
Replicate
Compound
Country

AN-0,18mg/L + Testostérone
27037.33
0.18
1
Anastrozole
UK

AN-0,36mg/L + Testostérone
201316.67
0.36
1
Anastrozole
UK

AN-0,73mg/L + Testostérone
34651.00
0.73
1
Anastrozole
UK

AN-1,45mg/L + Testostérone
289984.67
1.45
1
Anastrozole
UK

AN-2,9mg/L + Testostérone
233271.00
2.9
1
Anastrozole
UK

AN-0,18mg/L + Testostérone
64846.33
0.18
1
Anastrozole
UK

reactiv_spiked_result$BoxPlots

![](User_guide_files/figure-markdown_strict/unnamed-chunk-25-1.png)

- Summary tables of the processed data (per replicate and overall)

knitr::kable(reactiv_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")

Summary statistics of fluorescence in different concentrations of test item per replicate

Conc
Replicate
N
Mean
Standard deviation
Coefficient of variation

0
1
6
1417505.4
367130.47
0.2589976

0
2
5
1299380.0
241756.84
0.1860555

0
3
6
1699476.3
382759.66
0.2252221

0.18
1
8
216154.0
211883.73
0.9802445

0.18
2
7
149156.5
116877.63
0.7835909

0.18
3
8
255036.8
162183.70
0.6359228

0.36
1
8
149004.8
96508.01
0.6476841

0.36
2
7
230237.3
105711.23
0.4591404

0.36
3
7
199221.1
116329.38
0.5839210

0.73
1
8
169331.4
115443.55
0.6817610

0.73
2
7
187208.8
215340.83
1.1502710

0.73
3
8
292332.4
139121.01
0.4759001

1.45
1
8
116438.8
102761.27
0.8825347

1.45
2
7
173724.2
134626.51
0.7749439

1.45
3
7
236771.7
100056.61
0.4225869

2.9
1
8
159732.2
78955.44
0.4942987

2.9
2
6
234475.2
170916.42
0.7289317

2.9
3
8
247387.4
127553.16
0.5156009

knitr::kable(reactiv_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")

Summary statistics of fluorescence in different concentrations of test item of all replicates

Conc
N
Mean
Standard deviation
Coefficient of variation

0
17
1482281.8
363637.5
0.2453228

0.18
23
209287.9
168250.6
0.8039193

0.36
22
190829.4
106636.7
0.5588065

0.73
23
217555.3
161918.4
0.7442633

1.45
22
172953.7
118883.2
0.6873702

2.9
22
211991.3
126959.5
0.5988900

- Tables of results evaluated using increasing/decreasing Williams
test and/or Dunnett’s test, if applicable.

In the Williams’ test result tables, *Y.Tilde* is the amalgamated mean
of the fluorescence in each treatment group, *Y0* is the mean of the
control fluorescence, *DIFF* is the estimated difference between the
treatment and the control, *SE\_DIFF* is the standard error of the
Williams’ test, *DF* is the degrees of freedom for Williams’ test, *WILL
Incr* or *Will Decr* is the Williams’ test statistic, *crit Val* is the
critical value of Williams distribution, *Sign* suggests if there is
significant difference between the treatment and the control, and
*%Incr* is the percent increase of the fluorescence compared to the
control.

In the Dunnett’s test result table, *Estimate* is the estimated
difference between the treatment and the control, *SE* is the standard
error of the mixed ANOVA model, *t value* is the Dunnett’s test
statistic, *adj p* is the adjusted p value and *%Incr* is the percent
increase of the fluorescence compared to the control.

knitr::kable(reactiv_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")

Increasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Incr
SE_DIFF
DF
WILL Incr
crit Val
Sign
% Incr

Conc2.9 - Conc0
2.9
439.658
1208.915
-769.2567
48.94620
12
-15.71637
1.933
FALSE
-85.69831

Conc1.45 - Conc0
1.45
413.890
1208.915
-795.0246
48.95169
12
-16.24101
1.927
FALSE
-88.33193

Conc0.73 - Conc0
0.73
413.890
1208.915
-795.0246
48.47768
12
-16.39981
1.918
FALSE
-85.32295

Conc0.36 - Conc0
0.36
413.890
1208.915
-795.0246
48.95169
12
-16.24101
1.903
FALSE
-87.12597

Conc0.18 - Conc0
0.18
413.890
1208.915
-795.0246
48.47768
12
-16.39981
1.873
FALSE
-85.88070

knitr::kable(reactiv_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")

Decreasing Williams’ test

Conc
Y.Tilde
Y0
DIFF Decr
SE_DIFF
DF
WILL Decr
crit Val
Sign
% Incr

Conc2.9 - Conc0
2.9
411.240
1208.915
797.6747
48.94620
12
16.29697
1.933
TRUE
-85.69831

Conc1.45 - Conc0
1.45
411.240
1208.915
797.6747
48.95169
12
16.29514
1.927
TRUE
-88.33193

Conc0.73 - Conc0
0.73
424.957
1208.915
783.9576
48.47768
12
16.17152
1.918
TRUE
-85.32295

Conc0.36 - Conc0
0.36
424.957
1208.915
783.9576
48.95169
12
16.01493
1.903
TRUE
-87.12597

Conc0.18 - Conc0
0.18
424.957
1208.915
783.9576
48.47768
12
16.17152
1.873
TRUE
-85.88070

knitr::kable(reactiv_spiked_result$Dunnetts, caption="Dunnett's test")

Dunnett’s test

Estimate
SE
df
t value
adj p
% Incr

Conc0.18 - Conc0
-786.2696
48.47768
12
-16.21921
0
-85.88070

Conc0.36 - Conc0
-786.3430
48.95169
12
-16.06365
0
-87.12597

Conc0.73 - Conc0
-776.5006
48.47768
12
-16.01769
0
-85.32295

Conc1.45 - Conc0
-822.2317
48.95169
12
-16.79680
0
-88.33193

Conc2.9 - Conc0
-769.7654
48.94620
12
-15.72677
0
-85.69831

- Further information about the normality test (Shapiro-Wilk), the
homogeneity of variance test (Levene’s test) of residuals of the
mixed ANOVA model, the monotonicity test and the model fit.

reactiv_spiked_result$NormalityTest
#>
#> Shapiro-Wilk normality test
#>
#> data: stats::resid(mixedaov)
#> W = 0.97621, p-value = 0.0227
reactiv_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#> statistic p.value df df.residual
#>
#> 1 0.829 0.532 5 123
reactiv_spiked_result$`Monotonicity Test`
#> Test t value Pr(>|t|) Significance
#> 1 Linear -6.34 <0.0001 ***
#> 2 Quadratic 6.05 <0.0001 ***
reactiv_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula:
#> sqrt(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#> Data: wt_outlier
#> REML criterion at convergence: 1607.975
#> Random effects:
#> Groups Name Std.Dev.
#> Replicate:Conc (Intercept) 0.00
#> Replicate (Intercept) 51.74
#> Residual 151.56
#> Number of obs: 129, groups:
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept) Conc.L Conc.Q Conc.C
#> 551.13 -471.74 437.64 -271.05
#> Conc^4 Conc^5
#> 175.76 -30.96
#> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings

*The list output from running the data\_prep() function can be
summarized with the data\_summary() function.*

xeredar::data_summary(reactiv_spiked_result) |>
knitr::kable()

0.18
0.36
0.73
1.45
2.9

Replicate 1
-84.75
-89.49
-88.05
-91.79
-88.73

Replicate 2
-88.52
-82.28
-85.59
-86.63
-81.95

Replicate 3
-84.99
-88.28
-82.8
-86.07
-85.44

Pooled
-85.88
-87.13
-85.32
-88.33
-85.7

Dunnett
*
*
*
*
*

IncreasingWilliams
ns
ns
ns
ns
ns

DecreasingWilliams
*
*
*
*
*

The shiny app
=============

xeredar contains an integrated shiny app that is available to the users
by writing `run_app()` into the console. The only requirement to use the
app is the successful installation of xeredar. When the app started, the
user has a couple of options to analyse the data. These options can be
adjusted by clicking on the little gear sign next to *Inputs*. The
following options are available:

- **Use d’Agostino test?**
- The default here is that this is not selected meaning the
Shapiro-Wilk test is utilized to check for residual normality.
However, the RADAR TG mentions the d’Agostino test which is why
it is also available to the users.
- **Apply 10% Trimming**
- The default here is that this is selected, meaning that 10%
Trimming is conducted when the ANOVA assumptions are not
fulfilled by the raw data. Please be aware that this does not
mean that 10% Trimming is always conducted. It is only
conducted, when the raw data is violating the residual normality
and variance homogeneity assumptions, tested with the respective
test with an adjustable alpha level.
- **Remove outliers**
- The default here is that this is selected, meaning that outlier
removal is conducted when the ANOVA assumptions are not
fulfilled by the raw data or by 10% Trimming, if it is selected
above.
- **Try Box-Cox transformation**
- The default here is that this is selected, meaning that box-cox
transformation is carried out when the ANOVA assumptions are not
fulfilled by the raw or processed data (10% Trimming or outlier
removal) nor by log- or square-root transformation. Box-Cox
transformation is not mentioned in any of the TGs which is why
it is left to the choice of the user.
- **alpha for residual normality and variance homogeneity tests**
- The default here is 0.05. In the TGs the alpha level is
discussed so the user is adviced to inspect the requirements of
the respective study to analyze.

The next option in the little box on the top-left of the app is to set a
hook or remove the hook to decide whether the exposure well ID is
regarded as random term in the underlying mixed ANOVA model. In case a
REACTIV study is investigated the hook should be removed. For RADAR and
XETA, the hook should be placed as long as information about the
exposure well ID was documented. Of course, reducing the complexity of
the random term might also make sense for XETA and RADAR studies in case
no variance is explained by the exposure well-ID. However, this choice
is left to the user.

When own data is supposed to be analyzed xlsx files with either spiked
or unspiked data can be uploaded. Please make sure that the uploaded
data fulfills the Data requirements explained above. In case of
insecurity, inspect the uploaded data structure of the pre-loaded
example data.

The app contains several output boxes about the required data
processing, a conclusion table, the uploaded data, residual diagnostics
plots, boxplots, a data summary table, the output of the residual
normality and homogeneity tests, the monotonicity test result, the
Dunnett’s and increasing and decreasing Williams test result tables
along with the summary table of the underlying mixed ANOVA model.

The depicted information can be downloaded in a simple report by
clicking on the button \*\* Word report\*\*. It takes a couple of
seconds until the final docx file is produced and ready to download.
Avoid clicking the button several times as this can lead to long waiting
times and the production of several reports.

References
==========

OECD. 2019a. Validation Report of the Xenopus Eleutheroembryonic Thyroid
Signaling Assay (XETA) for the Detection of Thyroid Active Substances.
OECD.

OECD. 2019b. TG 248: Xenopus Eleutheroembryonic Thyroid Assay (XETA).
OECD.
https://doi.org/10.1787/a13f80ee-en.

OECD. 2022b. Test No. 251: Rapid Androgen Disruption Activity Reporter
(RADAR) Assay. OECD.
https://doi.org/10.1787/da264d82-en.

OECD. 2022a. Rapid Estrogen Activity in Vivo (REACTIV) Assay (OECD Draft
TG): Guideline for the Testing of Chemicals, Section 2: Effects on
Biotic System. OECD.

Inka Marie Spyridonov, Lijuan Yan, Eduard Szöcs, Ana Filipa Pereira Miranda, Carsten Lange,
Andrew Tindall, David Du Pasquier, Gregory Lemkine, Lennart Weltje, Maike Habekost,
Pernille Thorbek. 2025. Xeredar: An open-source R-package for the statistical analysis
of endocrine new approach methods (NAMs) using fish or amphibian eleutheroembryos,
Environmental Toxicology and Chemistry.
https://doi.org/10.1093/etojnl/vgaf056