https://github.com/htpusa/ptd-cca
Tensor Canonical Correlation Analysis (TCCA) via penalised tensor decomposition
canonical-correlation-analysis cca scca stcca tcca tensor-canonical-correlation-analysis
- Host: GitHub
- URL: https://github.com/htpusa/ptd-cca
- Owner: htpusa
- Created: 2024-02-06T13:45:37.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-04-17T11:12:22.000Z (9 months ago)
- Last Synced: 2024-04-17T12:33:13.659Z (9 months ago)
- Topics: canonical-correlation-analysis, cca, scca, stcca, tcca, tensor-canonical-correlation-analysis
- Language: MATLAB
- Homepage:
- Size: 10.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# PTD-CCA: Sparse Tensor Canonical Correlation Analysis (STCCA) via penalised tensor decomposition
PTD-CCA is an unsupervised dimensionality reduction method for 2 or more views/data modalities. The algorithm is a straightforward extension of the penalised matrix decomposition CCA (PMD-CCA) proposed by Witten et al. (2009) to more than 2 views. It maximises the "higher-order covariance" between the linear projections `X_m*w_m`, where each `X_m` is a data matrix and `w_m` is a vector of coefficients. It reduces to PMD-CCA if there are just 2 views.
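
For orientation, the two-view PMD-CCA criterion of Witten et al. (2009) can be written as the first expression below, and one natural way to write a multi-view analogue is the second. The exact constraints and scaling used by `PTDCCA` are not spelled out in this README, so the second expression, the number of samples $n$, the number of views $M$ and the bounds $c_m$ should be read as assumptions, not as the package's definition.

$$
\max_{w_1,\,w_2}\; w_1^\top X_1^\top X_2\, w_2
\quad \text{s.t.} \quad \|w_m\|_2^2 \le 1,\;\; \|w_m\|_1 \le c_m,\;\; m = 1, 2
$$

$$
\max_{w_1,\dots,w_M}\; \sum_{i=1}^{n} \prod_{m=1}^{M} (X_m w_m)_i
\quad \text{s.t.} \quad \|w_m\|_2^2 \le 1,\;\; \|w_m\|_1 \le c_m,\;\; m = 1,\dots,M
$$

For $M = 2$ the sum of products equals $w_1^\top X_1^\top X_2\, w_2$, which is consistent with the method reducing to PMD-CCA in the two-view case.
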
## EXAMPLE
Set up some synthetic data
```MATLAB
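% Ground-truth weight vectors: 100 variables per view, 40 of them non-zero
a = [ones(20,1); -ones(20,1); zeros(60,1)];
b = [zeros(60,1); -ones(20,1); ones(20,1)];
c = [ones(20,1); zeros(60,1); -ones(20,1)];
d = [-ones(10,1); ones(10,1); zeros(60,1); -ones(10,1); ones(10,1)];
% Latent variables: 100 samples, each row normalised to sum to 1
Z = rand(100,4); Z = Z./sum(Z,2);
% Each view is a rank-1 signal plus Gaussian noise with standard deviation 0.1
% (normrnd requires the Statistics and Machine Learning Toolbox)
X1 = normrnd(Z(:,1)*a',0.1);
X2 = normrnd(Z(:,2)*b',0.1);
X3 = normrnd(Z(:,3)*c',0.1);
X4 = normrnd(Z(:,4)*d',0.1);
% Collect the views into a cell array
X = {X1;X2;X3;X4};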
```
Run `PTDCCA` with "intermediate" sparsity and compare the model to the ground truth.
```MATLAB
W = PTDCCA(X,0.5);
wtrue = [a,b,c,d];
figure
for m=1:4
subplot(2,4,m);bar(wtrue(:,m));title(sprintf('True w_%d',m))
xlabel('variable');ylabel('coefficient')
subplot(2,4,4+m);bar(W{m});title(sprintf('Inferred w_%d',m))
end
```
Sparsity can also be set for each view separately:
```MATLAB
% One sparsity value per view (this overwrites the ground-truth vector c from above)
c = [0.05,0.25,0.75,1];
W = PTDCCA(X,c);
figure
for m=1:4
subplot(1,4,m);bar(W{m});title(sprintf('c = %.2f',c(m)))
xlabel('variable');ylabel('coefficient')
end
```
To calculate multiple canonical variable tuples, use the name-value input `D`:
```MATLAB
W = PTDCCA(X,0.5,'D',3);
```
If you've run the examples above, you may have noticed that the function takes some time to return. Most of the running time is in fact spent calculating the cross-covariance tensor, which is used to initialise the algorithm. This can be avoided by using a random initialisation instead:
```MATLAB
W = PTDCCA(X,0.5,'initType','random');
```
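
To get a feel for why the cross-covariance tensor is costly, the sketch below first counts the entries of a dense tensor for the example data, then builds a tiny three-view tensor by hand. It assumes the tensor has one mode per view, with length equal to that view's number of variables, and entries `C(i,j,k) = sum_n X1(n,i)*X2(n,j)*X3(n,k)`. That definition and all variable names here are illustrative assumptions, not taken from the `PTDCCA` source.

```MATLAB
% Dense tensor size for the example above: four views with 100 variables
% each, stored as doubles (8 bytes per entry).
dims = [100 100 100 100];
nEntries = prod(dims);      % 1e8 entries
memGB = nEntries*8/1e9;     % 0.8 GB
fprintf('%d entries, about %.1f GB\n', nEntries, memGB);

% Tiny three-view illustration (assumed definition, see the text above).
rng(1);
n = 50;
A = randn(n,3); A = A - mean(A,1);   % column-centred view 1
B = randn(n,4); B = B - mean(B,1);   % column-centred view 2
G = randn(n,2); G = G - mean(G,1);   % column-centred view 3
C = zeros(3,4,2);
for i = 1:n
    % Outer product of the i-th rows, accumulated over samples
    C = C + reshape(kron(G(i,:), kron(B(i,:), A(i,:))), [3 4 2]);
end

% Contracting C with one weight vector per mode gives the same number as
% summing the element-wise product of the projections X_m*w_m.
w1 = randn(3,1); w2 = randn(4,1); w3 = randn(2,1);
viaTensor = kron(w3, kron(w2, w1))' * C(:);
direct = sum((A*w1).*(B*w2).*(G*w3));
fprintf('difference: %g\n', abs(viaTensor - direct));
```

The entry count is the product of the view dimensions, so it grows quickly with both the number of views and the number of variables per view.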
The covariance tensor also takes up a lot of memory, and if the dimensions of the data are high enough, might exceed the largest allowed array size. If this happens, `PTDCCA` defaults to the random initialisation.
## References
Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis." Biostatistics 10.3 (2009): 515-534.