
# Tutorial on probabilistic PCA in Python and Mathematica

[You can read a complete tutorial on Medium here.](https://medium.com/practical-coding/the-simplest-generative-model-you-probably-missed-c840d68b704)

## Running

* Python: `python prob_pca.py`. The figures are output to the [figures_py](figures_py) directory.
* Mathematica: Run the notebook `prob_pca.nb`. The figures are output to the [figures_ma](figures_ma) directory.

## Description

You can find more information in the original paper: ["Probabilistic principal component analysis" by Tipping & Bishop](https://www.jstor.org/stable/2680726?seq=1#metadata_info_tab_contents).

### Import data

Let's import and plot some 2D data:
```
import numpy as np

data = import_data()

d = data.shape[1]

print("\n---\n")

mu_ml = np.mean(data,axis=0)
print("Data mean:")
print(mu_ml)

data_cov = np.cov(data,rowvar=False)
print("Data cov:")
print(data_cov)
```
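The repository ships its own `import_data`. If you only want to run the snippets standalone, a minimal stand-in that generates synthetic 2D Gaussian data (hypothetical, not the repo's loader) could look like this:
```
import numpy as np

def import_data(no_samples: int = 1000) -> np.ndarray:
    # Hypothetical stand-in for the repo's loader: correlated 2D Gaussian data
    mean = np.array([2.0, -1.0])
    cov = np.array([[2.0, 0.8],
                    [0.8, 0.5]])
    return np.random.multivariate_normal(mean, cov, size=no_samples)
```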
You can see the plotting functions in the complete repo. Visualizing the data shows the 2D distribution:

*(Figure: the imported 2D data.)*

### Max likelihood

Assume the latent distribution has the form:

$$p(z) = \mathcal{N}(z \mid 0, I)$$

The visibles are sampled from the conditional distribution:

$$p(x \mid z) = \mathcal{N}(x \mid W z + \mu, \sigma^2 I)$$

From these, the marginal distribution for the visibles is:

$$p(x) = \mathcal{N}(x \mid \mu, C), \qquad C = W W^\top + \sigma^2 I$$

We are interested in maximizing the log likelihood:

$$\mathcal{L} = -\frac{N}{2}\left( d \ln(2\pi) + \ln|C| + \mathrm{tr}\left(C^{-1} S\right) \right)$$

where `N` is the number of data points and `S` is the covariance matrix of the data. The solution is obtained from the eigendecomposition of the data covariance matrix `S`:

$$S = U D U^\top$$

where the columns of `U` are the eigenvectors of `S` and `D` is a diagonal matrix of the eigenvalues `\lambda_i`. Let `d` be the dimension of the data (here `d=2`) and `q` the number of latent variables. The maximum likelihood solution is then:

$$W_{\mathrm{ML}} = U_q \left( D_q - \sigma^2 I \right)^{1/2} R$$

$$\sigma^2_{\mathrm{ML}} = \frac{1}{d-q} \sum_{j=q+1}^{d} \lambda_j$$

and `\mu` is simply set to the mean of the data.

The eigenvalues `\lambda_i` are sorted from largest to smallest, with `\lambda_1` the largest and `\lambda_d` the smallest. `U_q` is the matrix whose columns are the eigenvectors corresponding to the `q` largest eigenvalues, and `D_q` is the diagonal matrix of those eigenvalues. `R` is an arbitrary rotation matrix; for simplicity we can take `R=I` (see Bishop for a detailed discussion). The dimensions beyond `q` are discarded, and the ML variance `\sigma^2` is the average variance of these discarded dimensions.

Here is the corresponding Python code to calculate these max-likelihood solutions:
```
# Number of latent variables q < number of visible dimensions d
q = 1

# Eigendecomposition of the data covariance, sorted by decreasing eigenvalue
lambdas, eigenvecs = np.linalg.eig(data_cov)
idx = lambdas.argsort()[::-1]
lambdas = lambdas[idx]
eigenvecs = - eigenvecs[:,idx] # eigenvectors are only defined up to sign
print(eigenvecs)
# Sanity check: eigenvecs @ np.diag(lambdas) @ np.transpose(eigenvecs) reproduces data_cov

# ML variance: average of the discarded eigenvalues
var_ml = (1.0 / (d-q)) * sum([lambdas[j] for j in range(q,d)])
print("Var ML:")
print(var_ml)

# Weight matrix: W_ML = U_q (D_q - sigma^2 I)^{1/2}, taking R = I
uq = eigenvecs[:,:q]
print("uq:")
print(uq)

lambdaq = np.diag(lambdas[:q])
print("lambdaq")
print(lambdaq)

# The matrix under the square root is diagonal, so the elementwise sqrt is its matrix square root
weight_ml = uq @ np.sqrt(lambdaq - var_ml * np.eye(q))
print("Weight matrix ML:")
print(weight_ml)
```
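As a quick sanity check (not part of the original script), the model's marginal covariance `C = W W^T + \sigma^2 I` can be compared against the data covariance. With `d=2` and `q=1`, the single discarded eigenvalue equals `\sigma^2`, so the two should agree essentially exactly:
```
# Compare the model's implied marginal covariance with the data covariance
model_cov = weight_ml @ np.transpose(weight_ml) + var_ml * np.eye(d)
print("Model covariance W W^T + sigma^2 I:")
print(model_cov)
print("Data covariance:")
print(data_cov)
```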

### Sampling latent variables

After determining the ML parameters, we can sample the latent variables given the visible variables from the posterior:

$$p(z \mid x) = \mathcal{N}\left(z \mid M^{-1} W^\top (x - \mu),\; \sigma^2 M^{-1}\right), \qquad M = W^\top W + \sigma^2 I$$

You can implement it in Python as follows:
```
act_hidden = sample_hidden_given_visible(
    weight_ml=weight_ml,
    mu_ml=mu_ml,
    var_ml=var_ml,
    visible_samples=data
    )
```
where we have defined:
```
def sample_hidden_given_visible(
    weight_ml : np.ndarray,
    mu_ml : np.ndarray,
    var_ml : float,
    visible_samples : np.ndarray
    ) -> np.ndarray:

    q = weight_ml.shape[1]
    # M = W^T W + sigma^2 I
    m = np.transpose(weight_ml) @ weight_ml + var_ml * np.eye(q)

    # Posterior covariance: sigma^2 M^{-1}
    cov = var_ml * np.linalg.inv(m)
    act_hidden = []
    for data_visible in visible_samples:
        # Posterior mean: M^{-1} W^T (x - mu)
        mean = np.linalg.inv(m) @ np.transpose(weight_ml) @ (data_visible - mu_ml)
        sample = np.random.multivariate_normal(mean,cov,size=1)
        act_hidden.append(sample[0])

    return np.array(act_hidden)
```
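As another quick check (not in the original script), the sampled latent variables should have roughly zero mean and unit covariance:
```
# The latent samples should be approximately standard normal
print("Mean hidden (sampled):")
print(np.mean(act_hidden, axis=0))
print("Cov hidden (sampled):")
print(np.cov(act_hidden, rowvar=False))
```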

The resulting latent samples look a lot like draws from a standard normal distribution:

*(Figure: the sampled latent variables.)*

### Sample new data points

We can sample new data points by first drawing new samples from the hidden distribution (a standard normal):
```
mean_hidden = np.full(q,0)
cov_hidden = np.eye(q)

no_samples = len(data)
samples_hidden = np.random.multivariate_normal(mean_hidden,cov_hidden,size=no_samples)
```

*(Figure: latent samples drawn from the standard normal prior.)*

and then sample new visible samples from those:
```
act_visible = sample_visible_given_hidden(
    weight_ml=weight_ml,
    mu_ml=mu_ml,
    var_ml=var_ml,
    hidden_samples=samples_hidden
    )

print("Covariance visibles (data):")
print(data_cov)
print("Covariance visibles (sampled):")
print(np.cov(act_visible,rowvar=False))

print("Mean visibles (data):")
print(np.mean(data,axis=0))
print("Mean visibles (sampled):")
print(np.mean(act_visible,axis=0))
```
where we have defined:
```
def sample_visible_given_hidden(
    weight_ml : np.ndarray,
    mu_ml : np.ndarray,
    var_ml : float,
    hidden_samples : np.ndarray
    ) -> np.ndarray:

    d = weight_ml.shape[0]

    # Isotropic noise covariance sigma^2 I
    cov = var_ml * np.eye(d)
    act_visible = []
    for data_hidden in hidden_samples:
        # Conditional mean: W z + mu
        mean = weight_ml @ data_hidden + mu_ml
        sample = np.random.multivariate_normal(mean,cov,size=1)
        act_visible.append(sample[0])

    return np.array(act_visible)
```
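Sampling one point at a time keeps the correspondence with the equations explicit. An equivalent vectorized version (a sketch, not the repo's implementation) draws the isotropic noise for all samples at once:
```
def sample_visible_given_hidden_vectorized(
    weight_ml : np.ndarray,
    mu_ml : np.ndarray,
    var_ml : float,
    hidden_samples : np.ndarray
    ) -> np.ndarray:
    # x = W z + mu + eps with eps ~ N(0, sigma^2 I), for all hidden samples at once
    means = hidden_samples @ np.transpose(weight_ml) + mu_ml
    noise = np.sqrt(var_ml) * np.random.standard_normal(means.shape)
    return means + noise
```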

The result is a set of points that closely resembles the original data distribution:

*(Figure: the newly sampled visible points.)*

### Rescaling the latent distribution

Finally, we can rescale the latent variables to have any Gaussian distribution, say with mean `m` and covariance `\Sigma`:

$$p(z) = \mathcal{N}(z \mid m, \Sigma)$$

For example:
```
mean_hidden = np.array([120.0])
cov_hidden = np.array([[23.0]])
no_samples = len(data)
samples_hidden = np.random.multivariate_normal(mean_hidden,cov_hidden,size=no_samples)
```

*(Figure: latent samples drawn from the rescaled Gaussian.)*

We can simply transform the parameters and **still** sample valid new visible points:

$$\tilde{W} = W \Sigma^{-1/2}, \qquad \tilde{\mu} = \mu - \tilde{W} m$$

Note that `\sigma^2` is unchanged from before.

In Python we can do the rescaling:
```
import scipy.linalg as sl  # for the matrix square root sqrtm

weight_ml_rescaled = weight_ml @ np.linalg.inv(sl.sqrtm(cov_hidden))
mu_ml_rescaled = mu_ml - weight_ml_rescaled @ mean_hidden

print("Mean ML rescaled:")
print(mu_ml_rescaled)

print("Weight matrix ML rescaled:")
print(weight_ml_rescaled)
```
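As a quick check (not in the original script), the rescaled parameters leave the marginal distribution of the visibles unchanged, since `\tilde{W} \Sigma \tilde{W}^T = W W^T` and `\tilde{W} m + \tilde{\mu} = \mu`:
```
# The rescaled model implies the same marginal over the visibles
print("Rescaled W Sigma W^T vs original W W^T:")
print(weight_ml_rescaled @ cov_hidden @ np.transpose(weight_ml_rescaled))
print(weight_ml @ np.transpose(weight_ml))

print("Rescaled W m + mu vs original mu:")
print(weight_ml_rescaled @ mean_hidden + mu_ml_rescaled)
print(mu_ml)
```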
and then repeat the sampling with the new weights & mean:
```
act_visible = sample_visible_given_hidden(
    weight_ml=weight_ml_rescaled,
    mu_ml=mu_ml_rescaled,
    var_ml=var_ml,
    hidden_samples=samples_hidden
    )
```

Again, the samples look like they could come from the original data distribution:

*(Figure: visible points sampled via the rescaled latent distribution.)*
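As before, we can also compare the sample moments against the data (a check along the lines of the earlier one, not in the original script):
```
# Moments of the visibles sampled via the rescaled latent distribution
print("Mean visibles (rescaled sampling):")
print(np.mean(act_visible, axis=0))
print("Covariance visibles (rescaled sampling):")
print(np.cov(act_visible, rowvar=False))
```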