https://github.com/ngiann/fastparzenwindows.jl
Fast Parzen Windows: a kernel-based method for non-parametric probability density function.
julia kernel-density-estimation statistics
- Host: GitHub
- URL: https://github.com/ngiann/fastparzenwindows.jl
- Owner: ngiann
- License: mit
- Created: 2022-01-19T20:33:13.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-01-18T11:34:31.000Z (10 months ago)
- Last Synced: 2024-10-03T05:06:01.444Z (about 1 month ago)
- Topics: julia, kernel-density-estimation, statistics
- Language: Julia
- Homepage:
- Size: 139 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# FastParzenWindows.jl
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
![GitHub](https://img.shields.io/github/license/ngiann/FastParzenWindows.jl)

## ℹ What is this?
This is a Julia implementation of the **Fast Parzen Window Density Estimator** described in
*X. Wang, P. Tino, M. A. Fardal, S. Raychaudhury and A. Babul, "Fast parzen window density estimator," 2009 International Joint Conference on Neural Networks, 2009, pp. 3267-3274.*
This is a technique for estimating a probability density from an observed set of data points. The data space is partitioned into hyper-discs of fixed radius `r`, and each partition is modelled with a Gaussian density. The final model is a mixture of Gaussians, with each Gaussian fitted locally to one partition.
The algorithm presented in the paper has two versions called 'hard' and 'soft'. This repository provides an implementation of the 'soft' version.
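In symbols, the resulting model has the generic form of a Gaussian mixture. The notation below is introduced here for illustration and is not taken from the paper: $K$ is the number of partitions, $\mu_k$ and $\Sigma_k$ are the mean and covariance of the Gaussian fitted to partition $k$, and $\pi_k$ are the mixing weights, whose exact definition this README does not spell out.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x \mid \mu_k, \Sigma_k\right),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1
```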
## 💾 How to install
Apart from cloning the repository, the easiest way to install the package is to switch into "package mode" by typing `]` in the Julia REPL and then running `add FastParzenWindows`.
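The same installation can also be scripted through Julia's standard `Pkg` API, which is convenient in setup scripts. A minimal sketch, assuming the package is available from the registry under the name given above:

```julia
using Pkg

# Equivalent to typing `add FastParzenWindows` in package mode
Pkg.add("FastParzenWindows")

# Load the package once it is installed
using FastParzenWindows
```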
## ▶ How to use
There are two functions of interest: `fpw` and `cvfpw`.
- `fpw` takes two arguments: an N×D data matrix `X` and a scalar `r`, the radius of the hyper-discs into which the data space is partitioned. The output is an object of type `Distributions.MixtureModel`.
- `cvfpw` takes two arguments: an N×D data matrix `X` and a range of candidate radii for the hyper-discs. It performs cross-validation for each candidate `r` and returns a matrix of out-of-sample log-likelihoods of dimensions (number of `r` candidates)×(number of folds).

## ▶ Example
We use a dataset taken from the paper. We generate 300 data points using:
```
using FastParzenWindows
using PyPlot # must be independently installed, needed only for plotting present example
using Statistics

X = spiraldata(300)
plot(X[:,1], X[:,2], "bo", label="dataset")
```

We want to find out which `r` works well for this dataset:
```
# define range of 100 candidate radii
r_range = LinRange(0.01, 2.0, 100)

# perform cross-validation
cvresults = cvfpw(X, r_range)

# which is the best r?
r_perf = mean(cvresults, dims=2)
best_index = argmax(r_perf)
r_best = r_range[best_index]
```

Estimate final model:
```
mix = fpw(X, r_best)

# generate observations and plot them
x = rand(mix, 1000)'
plot(x[:,1], x[:,2], "r.", label="generated")
legend()
```
![Spiral example](spiral.png)
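Because `fpw` returns a `Distributions.MixtureModel`, the fitted model can be queried with the usual Distributions.jl functions, not only sampled from. The sketch below illustrates this on a hand-built two-component mixture that stands in for the output of `fpw`; the means, covariances and weights are made up for illustration and are not what `fpw` would produce for the spiral data.

```julia
using Distributions
using LinearAlgebra

# Hand-built stand-in for the result of fpw(X, r): two 2-D Gaussian components.
# All parameter values here are hypothetical.
gaussians = [
    MvNormal([0.0, 0.0], Matrix(1.0I, 2, 2)),
    MvNormal([3.0, 3.0], Matrix(0.5I, 2, 2)),
]
mix = MixtureModel(gaussians, [0.6, 0.4])

# Inspect the mixture: number of components and mixing weights
K = ncomponents(mix)
w = probs(mix)

# Evaluate the density and log-density at a single D-dimensional point
p  = pdf(mix, [0.0, 0.0])
lp = logpdf(mix, [0.0, 0.0])

# Sampling returns a D×N matrix, which is why the example above
# transposes the result before plotting
samples = rand(mix, 1000)
size(samples)   # (2, 1000)
```

With a model obtained from `fpw`, the same calls should apply directly to the returned `mix` object.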