https://github.com/pnnl/ddks
A high-dimensional Kolmogorov-Smirnov distance for comparing high dimensional distributions
machine-learning physics statistics
- Host: GitHub
- URL: https://github.com/pnnl/ddks
- Owner: pnnl
- License: other
- Created: 2021-06-24T14:50:30.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-09-20T14:27:38.000Z (almost 4 years ago)
- Last Synced: 2025-06-08T19:05:49.792Z (about 1 year ago)
- Topics: machine-learning, physics, statistics
- Language: Jupyter Notebook
- Homepage:
- Size: 2.15 MB
- Stars: 27
- Watchers: 5
- Forks: 7
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# ddKS - a d-dimensional Kolmogorov-Smirnov Test
*Alex Hagen<sup>1</sup>, Shane Jackson<sup>1</sup>, James Kahn<sup>2</sup>, Jan Strube<sup>1</sup>, Isabel Haide<sup>2</sup>, Karl Pazdernik<sup>1</sup>, and Connor Hainje<sup>1</sup>*
<sup>1</sup> Pacific Northwest National Laboratory,
<sup>2</sup> Karlsruhe Institute of Technology
This code accompanies our paper submitted to IEEE Transactions on
Pattern Analysis and Machine Intelligence titled "Accelerated Computation of a
High Dimensional Kolmogorov-Smirnov Distance" ([arXiv](https://arxiv.org/abs/2106.13706)).
As of 6/25/2021, three methods are implemented:
* ddKS - d-dimensional KS test calculated per
  * Variable splitting of space (all points, subsample, grid spacing)
* rdKS - ddKS approximation using distance from (d+1) corners
* vdKS - ddKS approximation calculating the ddKS distance between voxels instead of points
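To make the idea of "splitting space" concrete, here is a minimal pure-Python sketch of an orthant-based d-dimensional KS distance. It is not the package's implementation: it assumes the distance is the largest difference in orthant occupancy between the two samples, with every point of both samples used as a split origin (the "all points" variant). The names `orthant_fractions` and `ddks_sketch` are hypothetical.

```python
from itertools import product


def orthant_fractions(origin, sample):
    """Fraction of `sample` lying in each of the 2^d orthants around `origin`."""
    d = len(origin)
    counts = {signs: 0 for signs in product((False, True), repeat=d)}
    for point in sample:
        # An orthant is identified by whether the point is >= origin on each axis.
        key = tuple(x >= o for x, o in zip(point, origin))
        counts[key] += 1
    n = len(sample)
    return {key: count / n for key, count in counts.items()}


def ddks_sketch(pred, true):
    """Largest orthant-occupancy gap over all split origins (illustrative only)."""
    best = 0.0
    # Assumption: every point from both samples serves as a split origin.
    for origin in list(pred) + list(true):
        f_pred = orthant_fractions(origin, pred)
        f_true = orthant_fractions(origin, true)
        best = max(best, max(abs(f_pred[k] - f_true[k]) for k in f_pred))
    return best


same = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.3)]
far_a = [(0.0, 0.0), (0.1, 0.1)]
far_b = [(5.0, 5.0), (6.0, 6.0)]
print(ddks_sketch(same, same))
print(ddks_sketch(far_a, far_b))
```

This brute-force version visits every split origin and every point for both samples, which is the kind of cost the subsampled and approximate variants are presumably meant to reduce.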
# Quickstart
To install `ddks`, simply run
```bash
pip install git+https://github.com/pnnl/DDKS
```
or, if you want to develop on DDKS, simply clone this repository into a safe
spot on your computer and run
```bash
pip install -e .
```
from the top level of the repository.
Then, you can get started by instantiating a `ddKS` object and performing the
distance calculation on any pair of torch tensors of shape
`(sample_size, dimension)`.
```python
import torch
import ddks
p = torch.rand((100, 3))
t = torch.rand((50, 3))
calculation = ddks.methods.ddKS()
distance = calculation(p, t)
print(f"The ddKS distance is {distance}")
```
To operate on GPU, all you need to do is move the tensors to the device before
calculation:
```python
import torch
import ddks
# Both tensors must live on the same device before the call
p = torch.rand((100, 3)).to('cuda:0')
t = torch.rand((50, 3)).to('cuda:0')
calculation = ddks.methods.ddKS()
distance = calculation(p, t)
```
To use one of the approximate methods instead, substitute
`ddks.methods.rdKS` or `ddks.methods.vdKS` for `ddks.methods.ddKS`. Note that
rdKS and vdKS cannot run on GPU.
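For intuition about the corner-based approximation, the following pure-Python sketch reduces each sample to distances from (d+1) corners and compares those distances with a one-dimensional KS distance. It is an illustration under stated assumptions, not the code in `ddks.methods.rdKS`: the choice of corners (the origin plus one unit corner per axis), taking the maximum over corners, and the names `ks_1d` and `rdks_sketch` are all hypothetical.

```python
import math


def ks_1d(a, b):
    """Classical two-sample KS distance: largest gap between empirical CDFs."""
    best = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        best = max(best, abs(cdf_a - cdf_b))
    return best


def rdks_sketch(pred, true):
    """Max 1D KS distance between corner-distance distributions (illustrative)."""
    d = len(pred[0])
    # Assumed corners: the origin plus one unit corner per axis, d + 1 in total.
    corners = [tuple(0.0 for _ in range(d))]
    corners += [tuple(1.0 if i == j else 0.0 for j in range(d)) for i in range(d)]
    return max(
        ks_1d([math.dist(c, p) for p in pred], [math.dist(c, t) for t in true])
        for c in corners
    )


near = [(0.1, 0.1), (0.2, 0.2)]
far = [(0.8, 0.8), (0.9, 0.9)]
print(rdks_sketch(near, near))
print(rdks_sketch(near, far))
```

Each corner costs only a sort-like pass over scalar distances, which suggests why an approach of this kind is cheaper than comparing orthant counts around every point.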
# Package Structure:
1. methods - Callable classes for the xdKS methods (x = d, r, v)
1. data - Contains several data generators to experiment with
1. run_scripts - Contains an example run script
1. Unit_tests - Contains unit tests for the repository