https://github.com/davejacobs/stats
An experiment with stats, the Ruby way
- Host: GitHub
- URL: https://github.com/davejacobs/stats
- Owner: davejacobs
- License: other
- Created: 2011-03-15T23:58:12.000Z (almost 14 years ago)
- Default Branch: master
- Last Pushed: 2017-04-02T07:02:38.000Z (over 7 years ago)
- Last Synced: 2024-08-10T14:17:31.798Z (5 months ago)
- Topics: ruby, statistics
- Language: Ruby
- Homepage:
- Size: 420 KB
- Stars: 39
- Watchers: 2
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: COPYING
# Stats #
## Description ##
This is a prototype of a statistical library for Ruby. Starting out, the purpose of the library is to be readable (for people studying statistics), to be well-tested (against R and Python statistical functions), and to be useful for Small Data. Big Data can come later, if I have enough fun. With `stats`, I aim to create an API that makes statistics intuitive and harder to mess up. For example, I'd like to take a stab at an assumption framework that can tag specific functions with assumptions that will throw warnings if they're not met.
---
## Try it out ##
Once this is stable and fully tested (it is so far for all the functions listed below), I'll consider publishing it as a gem. Until then, you can play around with `master`:

    brew install gsl
    git clone https://github.com/davejacobs/stats.git
    cd stats
    bundle

## Running tests ##
I've started integrating R into my tests to make testing as easy and repeatable as possible. I'm also planning to incorporate something like Rantly to expand the range of values that I test.
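As a rough illustration of checking results against R, the sketch below compares a few basic functions with values that R reports for `c(1, 2, 3, 4)`. It does not call R; the reference numbers are hard-coded, and the helper names (`mean`, `variance`, `standard_deviation`, `close_to?`) are assumptions made for this example rather than the library's API.

```ruby
# Hypothetical helpers for illustration; not the stats library's API.
def mean(xs)
  xs.inject(:+).to_f / xs.size
end

# Sample variance (n - 1 denominator), which is what R's var() computes.
def variance(xs)
  m = mean(xs)
  xs.map { |x| (x - m)**2 }.inject(:+) / (xs.size - 1)
end

def standard_deviation(xs)
  Math.sqrt(variance(xs))
end

# True when actual is within tolerance of the expected reference value.
def close_to?(actual, expected, tolerance = 1e-6)
  (actual - expected).abs <= tolerance
end

data = [1, 2, 3, 4]

# Reference values as printed by R for x <- c(1, 2, 3, 4):
#   mean(x) = 2.5, var(x) = 1.666667, sd(x) = 1.290994
checks = {
  "mean"               => [mean(data), 2.5],
  "variance"           => [variance(data), 1.666667],
  "standard deviation" => [standard_deviation(data), 1.290994]
}

checks.each do |name, (actual, expected)|
  status = close_to?(actual, expected) ? "ok" : "MISMATCH"
  puts "#{name}: #{actual} (R: #{expected}) #{status}"
end
```

The tolerance matters here because R prints seven significant digits by default, so an exact equality check against the printed value would fail.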
To run tests:

    brew install homebrew/science/r
    rspec

## Progress ##
### For developers ###
- [x] Get Ruby GSL bindings (`gem install gsl`) to work on Ruby 2.0/OS X
- [ ] Implement gemspec so this is installable via git URL

### Distribution functions ###
I've added a wrapper around GSL distribution functions, for more intuitive access and testing.
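To give a feel for what a more intuitive wrapper could look like, here is a minimal sketch for the normal distribution. It is an assumption-laden stand-in: the `NormalDistribution` class and its methods are hypothetical, and the math is done in pure Ruby with `Math.erf` rather than through the GSL bindings the library actually wraps.

```ruby
# Hypothetical wrapper; pure Ruby stands in for the GSL calls here.
class NormalDistribution
  attr_reader :mean, :sd

  def initialize(mean = 0.0, sd = 1.0)
    raise ArgumentError, "sd must be positive" unless sd > 0
    @mean = mean.to_f
    @sd = sd.to_f
  end

  # Probability density at x.
  def pdf(x)
    z = (x - mean) / sd
    Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math::PI))
  end

  # Cumulative probability P(X <= x), computed via the error function.
  def cdf(x)
    z = (x - mean) / sd
    0.5 * (1 + Math.erf(z / Math.sqrt(2)))
  end
end

standard = NormalDistribution.new
puts standard.pdf(0)    # about 0.3989
puts standard.cdf(1.96) # about 0.975
```

Holding the parameters in an object means callers state the mean and standard deviation once instead of passing them to every function call.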
- [x] Normal distribution - PDF & CDF
- [x] Chi square distribution - PDF & CDF
- [x] T distribution - PDF & CDF
- [x] F distribution - PDF & CDF

### Basic functions ###
- [x] Mean, arithmetic
- [x] Mean, geometric
- [x] Median
- [x] Mode
- [x] Variance
- [x] Standard deviation
- [x] Standard error of the mean (for samples only)
- [x] Relative standard error of the mean (for samples only)
- [x] Coefficient of variation

### Significance tests ###
- [x] Chi square
- [x] T-test, single sample
- [x] T-test, two-sample
- [x] T-test, repeated measures
- [x] Wilcoxon rank sum test
- [ ] Wilcoxon signed rank test
- [ ] Median test
- [ ] Kruskal-Wallis H test
- [ ] Friedman test
- [x] ANOVA, one-way
- [ ] Factorial ANOVA, two-way
- [ ] Factorial ANOVA, three-way
- [ ] ANOVA, repeated measures
- [ ] MANOVA
- [ ] ANCOVA
- [ ] Welch's ANOVA
- [ ] Fisher's least significant difference

### Regressions ###
- [ ] Linear regression
- [ ] Multiple linear regression
- [ ] Pearson's correlation
- [ ] Spearman correlation

### Support & other ###
- [x] Basic assumption framework
- [ ] Confidence intervals (general idea)
- [ ] Basic data structures
- [ ] Significance methods on data structures
- [ ] Test using R integration and something like [Rantly](https://github.com/hayeah/rantly)

## Resources ##
- [How to choose the right statistical test](http://www.graphpad.com/support/faqid/1790/)
- [Wilkinson's *Statistics Quiz* (RTF)](http://tspintl-test.com/products/tsp/benchmarks/wilk.rtf)
- Assessing the reliability of statistical software
  - [Part 1](http://www.questia.com/googleScholar.qst?docId=5001390400)
  - [Part 2](http://www.questia.com/googleScholar.qst?docId=5001888610)