https://github.com/mamba413/ball
Statistical Inference and Sure Independence Screening via Ball Statistics
Ball Statistics
===========

[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/Mamba413/Ball?branch=master&svg=true)](https://ci.appveyor.com/project/Mamba413/Ball)
[![CRAN Status Badge](http://www.r-pkg.org/badges/version/Ball)](https://CRAN.R-project.org/package=Ball)
[![PyPI version](https://badge.fury.io/py/Ball.svg)](https://pypi.python.org/pypi/Ball/)

Introduction
----------
The fundamental problems in data mining, statistical analysis, and machine learning are:
- Do several distributions differ?
- Are random variables dependent?
- How can useful variables/features be picked out of high-dimensional data?

These problems can be tackled with Ball statistics, which offer the following advantages:
- applicable to most datasets (e.g., traditional tabular data, brain shapes, functional connectomes, wind directions, and so on);
- insensitive to outliers, distribution-free, and model-free;
- theoretically guaranteed and computationally efficient.

Software
----------
### R package
Install the **Ball** package from CRAN:
```R
install.packages("Ball")
```
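Once installed, the three problems listed in the introduction map onto three functions of the package. The following is a minimal sketch on simulated data, not an excerpt from the package documentation: the variable names and the simulated models are illustrative assumptions, and the calls rely on `bd.test`, `bcov.test`, and `bcorsis` accepting named `x` and `y` arguments with their default settings.

```R
library(Ball)

set.seed(1)

# Two-sample test of equal distributions (Ball Divergence).
# Assumption: two univariate samples that differ by a location shift.
x <- rnorm(50)
y <- rnorm(50, mean = 1)
two_sample <- bd.test(x = x, y = y)
two_sample$p.value

# Test of independence (Ball Covariance).
# Assumption: v depends on u through a nonlinear (quadratic) relation.
u <- rnorm(50)
v <- u^2 + rnorm(50, sd = 0.1)
indep <- bcov.test(x = u, y = v)
indep$p.value

# Feature screening (Ball Correlation based sure independence screening).
# Assumption: only columns 1 and 3 of the simulated features drive the response.
n <- 100
p <- 200
features <- matrix(rnorm(n * p), nrow = n)
response <- 3 * features[, 1] + 5 * features[, 3]^2 + rnorm(n)
screen <- bcorsis(x = features, y = response)
head(screen$ix, n = 5)
```

Both tests are permutation based, so the reported p-values vary with the random seed. The `ix` element of the screening result holds the indices of the retained features.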
Comparison with selected R packages that support datasets in metric spaces:

| | [fastmit](https://cran.r-project.org/web/packages/fastmit) | [energy](https://cran.r-project.org/web/packages/energy) | [HHG](https://cran.r-project.org/web/packages/HHG) | [Ball](https://cran.r-project.org/web/packages/Ball) |
| :-------------------------------- | :----------------------------------------------------------: | :--------------------------------------------------------: | :--------------------------------------------------: | :----------------------------------------------------: |
| Test of equal distributions | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Test of independence | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Test of joint independence | :x: | :x: | :x: | :heavy_check_mark: |
| Feature screening / Sure Independence Screening (SIS) | :x: | :x: | :x: | :heavy_check_mark: |
| Iterative Feature screening / Iterative SIS | :x: | :x: | :x: | :heavy_check_mark: |
| Datasets in metric spaces | :heavy_check_mark: | SNT | :heavy_check_mark: | :heavy_check_mark: |
| Robustness | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| Parallel programming | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| Computational efficiency | :running::running::running: | :running::running::running: | :running::running: | :running::running::walking: |

*SNT is the abbreviation of strong negative type.*

See the following documents for more details about the **[Ball](https://cran.r-project.org/web/packages/Ball)** package:
- [github page](https://github.com/Mamba413/Ball/tree/master/R-package) (short)
- [vignette](https://cran.r-project.org/web/packages/Ball/vignettes/Ball.html) (moderate)
- [JSS paper](https://arxiv.org/abs/1811.03750) (detailed)

### Python package
Install the **Ball** package from PyPI:
```shell
pip install Ball
```

Citation
----------
If you use Ball or reference our vignettes in a presentation or publication, we would appreciate citations of our package.
> Zhu J, Pan W, Zheng W, Wang X (2021). “Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces.” Journal of Statistical Software, 97(6), 1–31. doi: 10.18637/jss.v097.i06.

Here is the corresponding BibTeX entry:
```
@Article{,
title = {{Ball}: An {R} Package for Detecting Distribution Difference and Association in Metric Spaces},
author = {Jin Zhu and Wenliang Pan and Wei Zheng and Xueqin Wang},
journal = {Journal of Statistical Software},
year = {2021},
volume = {97},
number = {6},
pages = {1--31},
doi = {10.18637/jss.v097.i06},
}
```

References
----------
- Pan, Wenliang; Tian, Yuan; Wang, Xueqin; Zhang, Heping. [Ball Divergence: Nonparametric two sample test](https://projecteuclid.org/euclid.aos/1525313077). Ann. Statist. 46 (2018), no. 3, 1109--1137. doi:10.1214/17-AOS1579.
- Wenliang Pan, Xueqin Wang, Weinan Xiao & Hongtu Zhu (2018) [A Generic Sure Independence Screening Procedure](https://amstat.tandfonline.com/doi/full/10.1080/01621459.2018.1462709#.WupWaoiFM2x), Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1462709
- Wenliang Pan, Xueqin Wang, Heping Zhang, Hongtu Zhu & Jin Zhu (2019) [Ball Covariance: A Generic Measure of Dependence in Banach Space](https://doi.org/10.1080/01621459.2018.1543600), Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1543600
- Zhu, J., Pan, W., Zheng, W., and Wang, X. (2018). Ball: An R package for detecting distribution difference and association in metric spaces. arXiv preprint arXiv:1811.03750. URL http://arxiv.org/abs/1811.03750.

Bug report
----------
Open an [issue](https://github.com/Mamba413/Ball/issues) or send an email to Jin Zhu at [email protected]