{"id":20669486,"url":"https://github.com/iitis/cumulantsfeatures.jl","last_synced_at":"2026-02-12T15:05:28.692Z","repository":{"id":61797637,"uuid":"150100381","full_name":"iitis/CumulantsFeatures.jl","owner":"iitis","description":"Cumulants based features selection and outlier detection","archived":false,"fork":false,"pushed_at":"2023-05-17T08:31:26.000Z","size":177,"stargazers_count":6,"open_issues_count":0,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-07-11T00:04:26.107Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iitis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-24T12:39:44.000Z","updated_at":"2024-07-09T09:18:03.000Z","dependencies_parsed_at":"2025-01-17T13:30:17.326Z","dependency_job_id":"60e5d628-8d64-4351-b22d-bf8a2120bbbf","html_url":"https://github.com/iitis/CumulantsFeatures.jl","commit_stats":{"total_commits":158,"total_committers":2,"mean_commits":79.0,"dds":0.006329113924050667,"last_synced_commit":"015dc44d4406fb21e3e5e212483dccfd16e7dd0c"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/iitis/CumulantsFeatures.jl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitis%2FCumulantsFeatures.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitis%2FCumulantsFeatures.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitis%2FCumulantsFeatures.
jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitis%2FCumulantsFeatures.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iitis","download_url":"https://codeload.github.com/iitis/CumulantsFeatures.jl/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitis%2FCumulantsFeatures.jl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271378657,"owners_count":24749188,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-20T02:00:09.606Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-16T20:14:31.557Z","updated_at":"2026-02-12T15:05:28.657Z","avatar_url":"https://github.com/iitis.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CumulantsFeatures.jl\n\n[![Coverage Status](https://coveralls.io/repos/github/iitis/CumulantsFeatures.jl/badge.svg?branch=master)](https://coveralls.io/github/iitis/CumulantsFeatures.jl?branch=master)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7944059.svg)](https://doi.org/10.5281/zenodo.7944059)\n\n\nCumulantsFeatures.jl uses multivariate cumulants to provide the algorithms for the outliers detection and the features selection given the multivariate data represented in the form of `t x n` matrix of Floats, `t` 
indexes the realisations, while `n` indexes the marginals.\n\nRequires SymmetricTensors.jl, Cumulants.jl and CumulantsUpdates.jl to compute and update multivariate cumulants of data.\n\nAs of 24/09/2018 [@kdomino](https://github.com/kdomino) is the lead maintainer of this package.\n\n## Installation\n\nWithin Julia, run\n\n```julia\npkg\u003e add CumulantsFeatures\n```\n\nParallel computation is supported.\n\n## Features selection\n\nGiven `n`-variate data, iteratively determines the `k` marginals that carry the least information.\nUses `C2`, the covariance matrix, and `CN`, the `N`th cumulant tensor, both of the `SymmetricTensor` type (see SymmetricTensors.jl). Uses one of the following optimisation functions\n`f`: `[\"hosvd\", \"norm\", \"mev\"]`.\n\n```julia\n\njulia\u003e function cumfsel(C2::SymmetricTensor{T,2}, CN::SymmetricTensor{T, N}, f::String, k::Int = n) where {T \u003c: AbstractFloat, N}\n\n```\n\nThe \"norm\" method uses the norm of the higher-order cumulant tensor; it is a benchmark method for comparison.\n\nThe \"mev\" method uses only the correlation matrix, see: C. Sheffield, 'Selecting band combinations from multispectral data', Photogrammetric Engineering and Remote Sensing, vol. 51 (1985).\n\nThe \"hosvd\" method uses the Higher Order Singular Value Decomposition of the cumulant tensor to extract information. For the `N = 3` case this is the Joint Skewness Band Selection (JSBS), see X. Geng, K. Sun, L. Ji, H. Tang \u0026 Y. Zhao, 'Joint Skewness and Its Application in Unsupervised Band Selection for Small Target Detection', Sci Rep. vol. 5 (2015) (https://www.nature.com/articles/srep09915). For the JSBS application in biomedical data analysis see: M. Domino, K. Domino, Z. Gajewski, 'An application of higher order multivariate cumulants in modelling of myoelectrical activity of porcine uterus during early pregnancy', Biosystems (2018), (https://doi.org/10.1016/j.biosystems.2018.10.019). For `N = 4` and `N = 5` see also P. Głomb, K. Domino, M. Romaszewski, M. 
Cholewa, 'Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images' (2018) (https://arxiv.org/abs/1808.03513).\n\n```julia\n\njulia\u003e Random.seed!(42);\n\njulia\u003e using Cumulants\n\njulia\u003e using SymmetricTensors\n\njulia\u003e x = rand(12,10);\n\njulia\u003e c = cumulants(x, 4);\n\njulia\u003e cumfsel(c[2], c[4], \"hosvd\")\n10-element Array{Any,1}:\n (Bool[true, true, true, false, true, true, true, true, true, true], 27.2519, 4)\n (Bool[true, true, false, false, true, true, true, true, true, true], 22.6659, 3)\n (Bool[true, true, false, false, false, true, true, true, true, true], 18.1387, 5)\n (Bool[false, true, false, false, false, true, true, true, true, true], 14.4492, 1)\n (Bool[false, true, false, false, false, true, true, false, true, true], 11.2086, 8)\n (Bool[false, true, false, false, false, true, true, false, true, false], 7.84083, 10)\n (Bool[false, false, false, false, false, true, true, false, true, false], 5.15192, 2)\n (Bool[false, false, false, false, false, false, true, false, true, false], 2.56748, 6)\n (Bool[false, false, false, false, false, false, true, false, false, false], 0.30936, 9)\n (Bool[false, false, false, false, false, false, false, false, false, false], 0.0, 7)\n\n```\n\nThe output is an Array of tuples `(ind::Array{Bool}, fval::Float64, i::Int)`; each tuple corresponds to one step\nof the features selection. Marginals are removed in order of information content, starting from the least informative and ending with the most informative.\n\nIn the vector `ind`, `false` marks a removed marginal and `true` marks a remaining one. 
\n\nThe `fval` is the value of the target function.\n\nThe `i` is the index of the marginal removed at the given step.\n\nTo limit the number of steps, use the optional parameter `k`:\n\n```julia\n\njulia\u003e cumfsel(Array(c[2]), Array(c[4]), \"hosvd\", 2)\n2-element Array{Any,1}:\n (Bool[true, true, true, false, true, true, true, true, true, true], 27.2519, 4)\n (Bool[true, true, false, false, true, true, true, true, true, true], 22.6659, 3)\n\n```\n\nFor the \"mev\" optimization, run:\n\n```julia\n\njulia\u003e cumfsel(Σ::SymmetricTensor{T,2}, k::Int = Σ.dats)\n\n```\n\n\n## The higher-order cross-correlation matrix\n\n```julia\n\n  cum2mat(c::SymmetricTensor{T, N}) where {T \u003c: AbstractFloat, N}\n\n```\n\nReturns the higher-order cross-correlation matrix in the form of `SymmetricTensor{T, 2}`. This matrix is the contraction of the corresponding higher-order cumulant tensor `c::SymmetricTensor{T, N}`\nwith itself in all modes but one.\n\n```julia\n\njulia\u003e Random.seed!(42);\n\njulia\u003e t = rand(SymmetricTensor{Float64, 3}, 4);\n\njulia\u003e cum2mat(t)\nSymmetricTensor{Float64,2}(Union{Nothing, Array{Float64,2}}[[7.69432 4.9757; 4.9757 5.72935] [6.09424 4.92375; 5.05157 3.17723]; nothing [7.33094 4.93128; 4.93128 4.7921]], 2, 2, 4, true)\n\n```\n\n## Outliers detection\n\nLet `X` be the multivariate data represented in the form of a `t x n` matrix of Floats, where `t` indexes the realisations and `n` indexes the marginals.\n\n### RX detector\n\n```julia\n\n  rxdetect(X::Matrix{T}, α::Float64 = 0.99)\n\n```\n\nThe RX (Reed-Xiaoli) anomaly detector returns an array of Bool, where `true`\ncorresponds to outlier realisations and `false` corresponds to ordinary data. 
The parameter `α` is the sensitivity (threshold) parameter of the RX detector.\n\n\n```julia\njulia\u003e Random.seed!(42);\n\njulia\u003e x = vcat(rand(8,2), 20*rand(2,2))\n10×2 Array{Float64,2}:\n  0.533183    0.956916\n  0.454029    0.584284\n  0.0176868   0.937466\n  0.172933    0.160006\n  0.958926    0.422956\n  0.973566    0.602298\n  0.30387     0.363458\n  0.176909    0.383491\n 11.8582      5.25618\n 14.9036     10.059\n\njulia\u003e rxdetect(x, 0.95)\n10-element Array{Bool,1}:\n false\n false\n false\n false\n false\n false\n false\n false\n  true\n  true\n```\n\n### The 4th order multivariate cumulant outlier detector\n\n```julia\n\n  function hosvdc4detect(X::Matrix{T}, β::Float64 = 4.1, r::Int = 3)\n\n```\n\nThe 4th order multivariate cumulant outlier detector returns an array of Bool, where `true`\ncorresponds to outlier realisations and `false` corresponds to ordinary data. The parameter `β` is the sensitivity parameter, and the parameter `r` is the number of specific directions (those with a high 4th order cumulant) onto which the data are projected. See K. Domino: 'Multivariate cumulants in outlier detection for financial data analysis', [arXiv:1804.00541](https://arxiv.org/abs/1804.00541). 
\n\n```julia\n\njulia\u003e Random.seed!(42);\n\njulia\u003e x = vcat(rand(8,2), 20*rand(2,2))\n10×2 Array{Float64,2}:\n  0.533183    0.956916\n  0.454029    0.584284\n  0.0176868   0.937466\n  0.172933    0.160006\n  0.958926    0.422956\n  0.973566    0.602298\n  0.30387     0.363458\n  0.176909    0.383491\n 11.8582      5.25618\n 14.9036     10.059\n\njulia\u003e rxdetect(x, 0.95)\n10-element Array{Bool,1}:\n false\n false\n false\n false\n false\n false\n false\n false\n  true\n  true\n```\n\n## Tests on artificial data\n\nThe folders `benchmarks/outliers_detect` and `benchmarks/features_select` contain executable Julia files for testing outliers detection and features selection, respectively, on artificial data.\n\n### Features selection\n\nIn `./benchmarks/features_select` the executable file `gendat4selection.jl` generates multivariate data where the subset of `informative` marginals is modelled by the t-Student copula with `--nu` degrees of freedom (by default `4`). All univariate marginal distributions are t-Student with `-nuu` degrees of freedom (by default `25`).\n\n\nThe `gendat4selection.jl` returns a `.jld2` file with data. Run `jkfs_selection.jl` on this file to display the characteristics of features selection, plotted in `./benchmarks/features_select/pics/`.\n\n### Outlier detection\n\nIn `./benchmarks/outliers_detect/` the executable file `gendat4detection.jl` generates multivariate data with outliers modelled by the t-Student copula with `--nu` degrees of freedom (by default `6`). All univariate marginal distributions are t-Student with `--nuu` degrees of freedom (by default `6`). The number of test realisations is `--reals` (by default `5`).\n\nThe `gendat4detection.jl` returns a `.jld2` file with data. 
Run `detect_outliers.jl` on this file to display the characteristics of outlier detection, plotted in `./benchmarks/outliers_detect/pics/`.\n\n# Citing this work\n\nThis project was partially financed by the National Science Centre, Poland – project number 2014/15/B/ST6/05204.\n\nWhen using `hosvdc4detect()`, please cite: K. Domino: 'Multivariate cumulants in outlier detection for financial data analysis', Physica A: Statistical Mechanics and its Applications, Volume 558, 15 November 2020, 124995 (https://doi.org/10.1016/j.physa.2020.124995).\n\n\nWhen using `cumfsel()`, please cite: P. Głomb, K. Domino, M. Romaszewski, M. Cholewa, 'Band selection with Higher Order Multivariate Cumulants for small target detection in hyperspectral images', Wroclaw University of Science and Technology, Conference Proceedings: PP-RAI'2019 (2019), ISBN: 978-83-943803-2-8; [arXiv:1808.03513](https://arxiv.org/abs/1808.03513).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiitis%2Fcumulantsfeatures.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiitis%2Fcumulantsfeatures.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiitis%2Fcumulantsfeatures.jl/lists"}