Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jokergoo/statistics-multtest
Control false discovery rate in multiple test
https://github.com/jokergoo/statistics-multtest
Last synced: 23 days ago
JSON representation
Control false discovery rate in multiple test
- Host: GitHub
- URL: https://github.com/jokergoo/statistics-multtest
- Owner: jokergoo
- Created: 2012-08-28T12:43:31.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2017-08-18T03:13:31.000Z (over 7 years ago)
- Last Synced: 2024-10-14T11:11:20.675Z (2 months ago)
- Language: Perl
- Homepage:
- Size: 12.7 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: Changes
Awesome Lists containing this project
README
## Control false discovery rate in multiple test
### install
```
cpan -i Statistics::Multtest
```### usage
```perl
use Statistics::Multtest qw(bonferroni holm hommel hochberg BH BY qvalue);
use Statistics::Multtest qw(:all);
use strict;
my $p;
# p-values can be stored in an array by reference
$p = [0.01, 0.02, 0.05,0.41,0.16,0.51];
# @$res has the same order as @$p
my $res = BH($p);
print join "\n", @$res;
# p-values can also be stored in a hash by reference
$p = {"a" => 0.01,
"b" => 0.02,
"c" => 0.05,
"d" => 0.41,
"e" => 0.16,
"f" => 0.51 };
# $res is also a hash reference which is the same as $p
$res = holm($p);
foreach (sort {$res->{a} <=> $res->{$b}} keys %$res) {
print "$_ => $res->{$_}\n";
}
# since qvalue does not always run successfully,
# it should be embeded in 'eval'
$res = eval 'qvalue($p)';
if($@) {
print $@;
}
else {
print join "\n", @$res;
}
```### description
For statistical test, p-value is the probability of false positives. While there are many hypothesis for testing simultaneously, the probability of getting at least one false positive would be large. Therefore the origin p-values should be adjusted to decrease the false discovery rate.
Seven procedures to controlling false positive rates is provided. The names of the methods are derived from `p.adjust` in `stat` package and `qvalue` in `qvalue` package (http://www.bioconductor.org/packages/release/bioc/html/qvalue.html) in R. Code is translated directly from R to Perl using `List::Vectorize` module.
All seven subroutines receive one argument which can either be an array reference or a hash reference, and return the adjusted p-values in corresponding data structure. The order of items in the array does not change after the adjustment.
### subroutines
- `bonferroni($pvalue)`
Bonferroni single-step process.
- `hommel($pvalue)`
Hommel singlewise process.
Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika, 75, 383¨C386.
- `holm($pvalue)`
Holm step-down process.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65¨C70.
- `hochberg($pvalue)`
Hochberg step-up process.Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75, 800¨C803.
- `BH($pvalue)`
Benjamini and Hochberg, controlling the FDR.Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B, 57, 289¨C300.
- `BY($pvalue)`
Use Benjamini and Yekutieli.Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29, 1165¨C1188.
- `qvalue($pvalue, %setup)`
Storey and Tibshirani.
Storey JD and Tibshirani R. (2003) Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 100: 9440-9445.
The default method for estimating `pi0` in the origin `qvalue` package is to utilize cubic spline interpolation. However, there is no suitable perl module to do such work (external libraries should be installed if using `Math::GSL::Spline` and there seems to be some mistakes when I using `Math::Spline`). Therefore, in this module, we only provide 'bootstrap' method to estimate `pi0`, which is also the second `pi0` method in `qvalue` package. Some arguments which are the same in `qvalue` package can be set in `%setup` as follows.
```perl
lambda => multiply(seq(0, 90, 5), # The value of the tuning parameter
0.01), # to estimate pi_0. Must be in [0,1).
# It should be an array reference
robust => 0, # An indicator of whether it is desired to make the estimate
# more robust for small p-values and a direct finite sample
# estimate of pFDR
```For details, please see the Storey (2003) and the qvalue document in R.
NOTE: The results of this subroutine are not always exactly consistent to the `qvalue` package due to the floating point number calculation.
In some circumstance, the estimated `pi0 <= 0`, and the subroutine would throw an error. (try p-value list: `[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]`). So you should embeded this subroutine in 'eval' such as:
```perl
my $qvalue;
eval '$qvalue = qvalue($pvalue, %setup)';
if($@) {
# do something
}
else {
print join ", ", @$qvalue;
}
```