Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Medical Image Latent Space Visualization Using VQ-VAE
https://github.com/reshalfahsi/medical-image-latent-space-visualization
image-processing latent-space medical-image-processing medmnist pytorch pytorch-lightning vq-vae
- Host: GitHub
- URL: https://github.com/reshalfahsi/medical-image-latent-space-visualization
- Owner: reshalfahsi
- Created: 2024-01-07T01:20:19.000Z (10 months ago)
- Default Branch: master
- Last Pushed: 2024-01-11T03:58:28.000Z (10 months ago)
- Last Synced: 2024-01-11T06:48:04.170Z (10 months ago)
- Topics: image-processing, latent-space, medical-image-processing, medmnist, pytorch, pytorch-lightning, vq-vae
- Language: Jupyter Notebook
- Homepage:
- Size: 257 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Medical Image Latent Space Visualization Using VQ-VAE
In this project, VQ-VAE (Vector Quantized VAE) is leveraged to learn the latent representation _z_ of various medical image datasets _x_ from MedMNIST. Like VAE (Variational Autoencoder), VQ-VAE consists of an encoder _q_(_z_|_x_) and a decoder _p_(_x_|_z_). But unlike VAE, which generally relies on the Gaussian reparameterization trick, VQ-VAE utilizes vector quantization to sample the latent representation _z_ ~ _q_(_z_|_x_). Vector quantization allows VQ-VAE to replace each latent variable generated by the encoder with a learned embedding from a codebook __C__ ∈ ℝ^(_E_ × _D_), where _E_ is the number of embeddings and _D_ is the number of latent variable dimensions (or channels, in the context of image data).

Let __X__ ∈ ℝ^(_H_ × _W_ × _D_) be the output feature map of the encoder, where _H_ is the height and _W_ is the width. To transform the raw latent variables into discrete ones, we first compute the squared Euclidean distance between __X__ and __C__, which determines the embedding closest to each raw latent variable. Expanded, this computation is roughly ‖__X__‖² + ‖__C__‖² − 2(__X__ ⋅ __C__ᵀ). Taking the argmin over the embedding axis yields __Z__ ∈ ℝ^(_H_ × _W_), where each element denotes the index of the nearest embedding for the corresponding latent variable. Indexing __C__ with __Z__ then gives the final discrete representation. Inspired by the centroid update of K-means clustering, EMA (exponential moving average) updates are applied during training, adjusting the codebook embeddings and the estimated number of members per cluster in an online fashion.
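To make the quantization step concrete, below is a minimal PyTorch sketch of the nearest-embedding lookup and the EMA codebook update described above. The tensor shapes, function names, and the straight-through gradient trick are illustrative assumptions, not this project's exact code.

```python
import torch
import torch.nn.functional as F

def quantize(x, codebook):
    """Nearest-embedding lookup. x: (B, H, W, D); codebook C: (E, D)."""
    flat = x.reshape(-1, codebook.shape[1])                # (B*H*W, D)
    # Squared Euclidean distance, expanded: ||X||^2 + ||C||^2 - 2(X . C^T)
    dist = (flat.pow(2).sum(1, keepdim=True)
            + codebook.pow(2).sum(1)
            - 2 * flat @ codebook.t())                     # (B*H*W, E)
    indices = dist.argmin(1)                               # Z: nearest-embedding ids
    quantized = codebook[indices].view_as(x)               # index C with Z
    # Straight-through estimator: copy gradients past the non-differentiable argmin
    quantized = x + (quantized - x).detach()
    return quantized, indices

@torch.no_grad()
def ema_update(codebook, ema_count, ema_sum, flat, indices, decay=0.99, eps=1e-5):
    """K-means-style EMA update of the codebook and per-cluster member counts."""
    onehot = F.one_hot(indices, codebook.shape[0]).type_as(flat)   # (N, E)
    ema_count.mul_(decay).add_(onehot.sum(0), alpha=1 - decay)     # cluster sizes
    ema_sum.mul_(decay).add_(onehot.t() @ flat, alpha=1 - decay)   # centroid sums
    n = ema_count.sum()
    count = (ema_count + eps) / (n + codebook.shape[0] * eps) * n  # Laplace smoothing
    codebook.copy_(ema_sum / count.unsqueeze(1))                   # updated embeddings
```

The straight-through trick lets reconstruction gradients reach the encoder even though the argmin itself has no gradient.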
## Experiment
To discern the latent space, see the notebook [here](https://github.com/reshalfahsi/medical-image-latent-space-visualization/blob/master/Medical_Image_Latent_Space_Visualization_Using_VQ-VAE.ipynb).
## Result
## Evaluation Metric Curve
Loss of the model at the training stage.
MAE on the training and validation sets.
PSNR on the training and validation sets.
SSIM on the training and validation sets.
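As a reference point, such curves can be produced with off-the-shelf torchmetrics objects; the sketch below assumes images scaled to [0, 1] and a model whose forward pass returns the reconstruction.

```python
import torch
from torchmetrics import MeanAbsoluteError
from torchmetrics.image import PeakSignalNoiseRatio, StructuralSimilarityIndexMeasure

mae = MeanAbsoluteError()
psnr = PeakSignalNoiseRatio(data_range=1.0)
ssim = StructuralSimilarityIndexMeasure(data_range=1.0)

@torch.no_grad()
def reconstruction_metrics(model, batch):
    """Score one batch of reconstructions against the inputs."""
    x, _ = batch               # MedMNIST loaders yield (image, label) pairs
    x_hat = model(x)           # assumed: forward pass returns the reconstruction
    return {"mae": mae(x_hat, x), "psnr": psnr(x_hat, x), "ssim": ssim(x_hat, x)}
```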
## Qualitative Result
Here is the visualization of the latent space:
The latent space of five distinct datasets, i.e., DermaMNIST, PneumoniaMNIST, RetinaMNIST, BreastMNIST, and BloodMNIST.
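A rough sketch of how such a plot can be produced: encode a batch from each dataset, flatten the latents, and project them to 2-D. The per-dataset loaders, the separately exposed encoder, and the choice of PCA for the projection are assumptions for illustration.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

@torch.no_grad()
def plot_latent_space(encoder, loaders):
    """Project encoder latents from several MedMNIST datasets onto 2-D via PCA."""
    feats, names = [], []
    for name, loader in loaders.items():      # e.g. {"DermaMNIST": loader, ...}
        x, _ = next(iter(loader))             # one batch of (image, label) pairs
        z = encoder(x)                        # (B, D, H, W) feature map (assumed)
        feats.append(z.flatten(1).cpu())      # one flat vector per image
        names += [name] * x.shape[0]
    coords = PCA(n_components=2).fit_transform(torch.cat(feats).numpy())
    names = np.array(names)
    for name in loaders:
        pts = coords[names == name]
        plt.scatter(pts[:, 0], pts[:, 1], s=8, label=name)
    plt.legend()
    plt.title("VQ-VAE latent space")
    plt.show()
```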
## Credit
- [Neural Discrete Representation Learning](https://arxiv.org/pdf/1711.00937.pdf)
- [Vector-Quantized Contrastive Predictive Coding](https://github.com/bshall/VectorQuantizedCPC)
- [Variational AutoEncoder](https://keras.io/examples/generative/vae/)
- [Vector-Quantized Variational Autoencoders](https://keras.io/examples/generative/vq_vae/)
- [MedMNIST](https://medmnist.com/)
- [PyTorch Lightning](https://lightning.ai/docs/pytorch/latest/)