https://github.com/advestis/nmf-rank-determination
- Host: GitHub
- URL: https://github.com/advestis/nmf-rank-determination
- Owner: Advestis
- Created: 2023-10-20T12:56:46.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-13T09:25:23.000Z (about 1 year ago)
- Last Synced: 2023-11-13T10:34:06.967Z (about 1 year ago)
- Language: Python
- Size: 3.16 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## On Rank Selection in Non-Negative Matrix Factorization using Concordance

Official *GitHub* repository for the original *MDPI Mathematics* paper
![Graphical_abstract](images/graphical_abstract.png)
*Credits*: DALL·E 3

**On Rank Selection in Non-Negative Matrix Factorization using Concordance**
Paul Fogel, Christophe Geissler, Nicolas Morizet and George Luta
**DOI**: https://doi.org/10.3390/math11224611

This article belongs to the Special Issue [Advances in Applied Probability and Statistical Inference](https://www.mdpi.com/journal/mathematics/special_issues/Advances_Applied_Probability_Statistical_Inference).
**Abstract**: The choice of the factorization rank of a matrix is critical, e.g., in dimensionality reduction,
filtering, clustering, deconvolution, etc., because selecting a rank that is too high amounts to adjusting the noise,
while selecting a rank that is too low results in the oversimplification of the signal.
Numerous methods for selecting the factorization rank of a non-negative matrix have been proposed.
One of them is the cophenetic correlation coefficient (*ccc*), widely used in data science to evaluate the number of
clusters in a hierarchical clustering. In previous work, it was shown that *ccc* performs better than other methods
for rank selection in non-negative matrix factorization (NMF) when the underlying structure of the matrix consists
of orthogonal clusters. In this article, we show that using the ratio of *ccc* to the approximation error significantly
improves the accuracy of the rank selection. We also propose a new criterion, *concordance*, which, like *ccc*,
benefits from the stochastic nature of NMF; its accuracy is also improved by using its ratio-to-error form.
Using real and simulated data, we show that *concordance*, with a CUSUM-based automatic detection algorithm
for its original or ratio-to-error forms, significantly outperforms *ccc*.
It is important to note that the new criterion works for a broader class of matrices,
where the underlying clusters are not assumed to be orthogonal.

**Datasets used in the study**
- Swimmer
- Sausage
- Brunet
- MNIST

**Requirements**
- Python 3.9.0
- scikit-learn
- pandas
- numpy
- scipy
- tqdm

## Citation
```
@Article{math11224611,
AUTHOR = {Fogel, Paul and Geissler, Christophe and Morizet, Nicolas and Luta, George},
TITLE = {On Rank Selection in Non-Negative Matrix Factorization Using Concordance},
JOURNAL = {Mathematics},
VOLUME = {11},
YEAR = {2023},
NUMBER = {22},
ARTICLE-NUMBER = {4611},
URL = {https://www.mdpi.com/2227-7390/11/22/4611},
ISSN = {2227-7390},
DOI = {10.3390/math11224611}
}
```
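
**Illustrative sketch: *ccc* and its ratio-to-error form**

The abstract refers to the cophenetic correlation coefficient (*ccc*) computed over the stochastic restarts of NMF, and to its ratio to the approximation error. The snippet below is a minimal sketch of that idea using the packages listed under Requirements; it is not the code from this repository. The helper names (`make_block_matrix`, `ccc_and_error`), the toy block matrix, the argmax cluster assignment, and the use of the relative Frobenius error are all assumptions made for illustration. The *concordance* criterion and the CUSUM-based detection step are defined in the paper and are not reproduced here.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform
from sklearn.decomposition import NMF


def make_block_matrix(n_blocks=3, rows_per_block=10, cols_per_block=4,
                      noise=0.05, seed=0):
    """Toy non-negative matrix with non-overlapping blocks (hypothetical data)."""
    rng = np.random.default_rng(seed)
    X = noise * rng.random((n_blocks * rows_per_block, n_blocks * cols_per_block))
    for b in range(n_blocks):
        rows = slice(b * rows_per_block, (b + 1) * rows_per_block)
        cols = slice(b * cols_per_block, (b + 1) * cols_per_block)
        X[rows, cols] += 1.0
    return X


def ccc_and_error(X, rank, n_runs=10, seed=0):
    """Return (ccc, mean relative error) over n_runs randomly initialised NMF fits."""
    n_samples = X.shape[0]
    consensus = np.zeros((n_samples, n_samples))
    errors = []
    for run in range(n_runs):
        model = NMF(n_components=rank, init="random",
                    random_state=seed + run, max_iter=500)
        W = model.fit_transform(X)
        # Assumption: each row is assigned to its dominant component.
        labels = W.argmax(axis=1)
        consensus += labels[:, None] == labels[None, :]
        # reconstruction_err_ is the Frobenius norm of X - WH for the default loss.
        errors.append(model.reconstruction_err_ / np.linalg.norm(X))
    consensus /= n_runs

    # Turn co-clustering frequencies into distances and compare them with the
    # cophenetic distances of an average-linkage tree built on the same distances.
    dist = 1.0 - consensus
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    ccc, _ = cophenet(linkage(condensed, method="average"), condensed)
    return float(ccc), float(np.mean(errors))


X = make_block_matrix()
for rank in (2, 3, 4):
    ccc, err = ccc_and_error(X, rank)
    print(f"rank={rank}  ccc={ccc:.3f}  error={err:.3f}  ccc/error={ccc / err:.2f}")
```

Candidate ranks start at 2 because with a single component every row falls in the same cluster, the consensus distances are constant, and the correlation is undefined.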