Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dansolombrino/kindergarten-vq-vae
Improving disentanglement properties in off-the-shelf Transformer models
- Host: GitHub
- URL: https://github.com/dansolombrino/kindergarten-vq-vae
- Owner: dansolombrino
- Created: 2024-01-15T08:44:23.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-17T18:46:41.000Z (11 months ago)
- Last Synced: 2024-12-14T09:34:49.201Z (2 months ago)
- Topics: autoencoders, deep-learning, deep-neural-networks, natural-language-processing, nlp, transformers
- Language: Python
- Size: 4.92 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Kindergarten-VQ-VAE
## Getting started
- Create a Python virtual environment using your preferred environment manager
- Install the dependencies listed in `requirements.txt`

## Folder structure
This is the project folder structure, highlighting the purpose of each directory:

- `Kindergarten-VQ-VAE` $\to$ project root directory
  - `analyses` $\to$ utility scripts to perform project analyses
    - `unsupervised_vq_disentanglement` $\to$ Analysis 1: does unsupervised Vector Quantization disentangle the 9 dSentences generative factors?
      - `results` $\to$ data gathered from Analysis 1, grouped by run ID
  - `common` $\to$ common utility scripts, classes and constants
  - `datasets` $\to$ dataset files
    - `dSentences` $\to$ dSentences dataset files
  - `data` $\to$ data-related files (e.g. pre-processing, PyTorch `Dataset` implementation)
    - `dSentences` $\to$ dSentences data-related files
  - `models` $\to$ PyTorch models
    - `bagon` $\to$ Attempt 1 $\to$ pre-trained BERT Encoder, fine-tuned BERT Decoder LM head
    - `bagon` $\to$ Attempt 2 $\to$ pre-trained BERT Encoder, fine-tuned BERT Decoder LM head, VQ from scratch

## How to train
### Bagon model
- Activate the virtual environment for this project
- `cd` into project root folder
- Change hyperparameters in `models/bagon/config.py`, if desired
- Run `PYTHONPATH=. python3 models/bagon/main.py` from the project root folder. Alternatively:
  - run `export PYTHONPATH=.` once at the start of the terminal session, then
  - run `python3 models/bagon/main.py`
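As background for the "VQ from scratch" component mentioned in Attempt 2: the core of a vector-quantization bottleneck is a nearest-neighbour lookup into a learnable codebook. The sketch below is illustrative only (NumPy, hypothetical `vq_lookup` helper), not the project's actual implementation; a real VQ-VAE layer would additionally use a straight-through gradient estimator and a commitment loss during training.

```python
import numpy as np

def vq_lookup(z, codebook):
    """Quantize each row of z to its nearest codebook vector (L2 distance).

    z        : (batch, dim) continuous encoder outputs
    codebook : (K, dim) code vectors
    returns  : (quantized vectors, integer code indices)
    """
    # Squared L2 distance between every latent and every code: shape (batch, K)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)       # index of the nearest code for each latent
    return codebook[idx], idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2]])
q, idx = vq_lookup(z, codebook)
print(idx.tolist())  # → [0, 1]
```

Each continuous latent is replaced by its closest discrete code, which is what makes it possible to ask whether individual codes align with individual generative factors, as in Analysis 1.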