Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/NKI-CCB/won-parafac
Weighted orthogonal non-negative (WON) parallel factor analsyis (PARAFAC)
https://github.com/NKI-CCB/won-parafac
bioinformatics genomics matlab parafac unsupervised-learning
Last synced: about 8 hours ago
JSON representation
Weighted orthogonal non-negative (WON) parallel factor analsyis (PARAFAC)
- Host: GitHub
- URL: https://github.com/NKI-CCB/won-parafac
- Owner: NKI-CCB
- License: apache-2.0
- Created: 2019-09-11T09:31:52.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-05-17T15:41:28.000Z (over 3 years ago)
- Last Synced: 2024-08-02T20:43:39.694Z (3 months ago)
- Topics: bioinformatics, genomics, matlab, parafac, unsupervised-learning
- Language: MATLAB
- Size: 13.8 MB
- Stars: 1
- Watchers: 4
- Forks: 2
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-multi-omics - WON-PARAFAC - Kim - weighted orthogonal nonnegative parallel factor analysis - [paper](https://doi.org/10.1038/s41467-019-13027-2) (Software packages and methods / Multi-omics correlation or factor analysis)
README
# WON-PARAFAC
### Weighted orthogonal non-negative (WON) parallel factor analsyis (PARAFAC)
WON-PARAFAC is a variant of parallel factor analysis (PARAFAC), a tensor factorization method.
WON-PARAFAC impose the following three constraints on the standard PARAFAC:
1. Weighting scheme
- For balanced integration of the multiple data types
2. Orthogonality constraint
- To reduce overlapping between a factor (originally used on gene mode). This also introduces extra sparcity on the mode.
3. Non-negativity
- To induce sparse and parts-based representation.### Implementation / Dependency
A multiplicative update rule was used to derive the algorithm, as in the original NMF implementation from [Lee & Seung (Nature, 1999)](https://www.nature.com/articles/44565).
The code requires [tensor toolbox version 2.6 (by Tamara Kolda)](https://www.sandia.gov/~tgkolda/TensorToolbox/index-2.6.html
), freely available for non-commercial use upon registration.For running the code, tenstor toobox must be avilable on the path environment, using `addpath` command in MATLAB.
### Demo code and data
You can load demo data, which contains pan-cancer multiomics data produced in [GDSC1000 project (Sanger)](https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources/Home.html).
You can load the data by:```matlab
load Demo.mat
```
The command will load a varialbe `X`, a 3-way tensor (1815 gene by 935 cell lines and 5 data types).
Note that the 5 data types corresponds to below:
- positive gene expression levels (non-negative continuous; GE(+))
- absolute value of negative gene expression levels (non-negative continuous; GE(-))
- mutation (binary; MT)
- copy number gain (binary; CN(+))
- copy number loss (binary; CN(-))The list of genes names in `X` is indicated in `genenames`, which will also be loaded together with `X`.
`Demo.m` will perform WON-PARAFAC analysis using random 100 genes by default, and varying number of factors and strength of orthogonal constraint on gene factor matrix.
- Number of basis: 10, 20, 30, ..., 200
- Strength of orthogonal constraint: 0 (no constraint), 0.2, 0.5, 1Finally, a plot will be generated to show the performance of WON-PARAFAC for reconstructing input tensor (see below for an example).
![alt text](https://github.com/NKI-CCB/won-parafac/blob/master/Demo_plot.png "Demo plot")