Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/crisostomi/latent-aggregation
- Host: GitHub
- URL: https://github.com/crisostomi/latent-aggregation
- Owner: crisostomi
- License: mit
- Created: 2023-03-17T10:31:18.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2023-11-11T14:04:49.000Z (about 1 year ago)
- Last Synced: 2024-10-26T22:24:53.311Z (3 months ago)
- Language: Jupyter Notebook
- Size: 11.7 MB
- Stars: 2
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Latent Aggregation
Aggregating seemingly different latent spaces.
## Quickstart
## Development installation
Set up the development environment:
```bash
git clone [email protected]:crisostomi/latent-aggregation.git
cd latent-aggregation
conda env create -f env.yaml
conda activate la
pre-commit install
```

Run the tests:
```bash
pre-commit run --all-files
```
We use HuggingFace Datasets throughout the project; assuming you already have a HF account (create one if you don't), you will have to log in via
```bash
huggingface-cli login
```
which will prompt you to either create a new token or paste an existing one.
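For a non-interactive setup (e.g. on a remote machine), recent versions of the CLI can also take the token directly; `HF_TOKEN` here stands for a token you created on the HF website:

```bash
# Non-interactive alternative: pass an existing token (from https://huggingface.co/settings/tokens)
huggingface-cli login --token "$HF_TOKEN"

# Sanity check: prints your HF username if authentication succeeded
huggingface-cli whoami
```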
### Update the dependencies

Re-install the project in editable mode:
```bash
pip install -e '.[dev]'
```

## Experiment flow
Each experiment `exp_name` in {`part_shared_part_novel`, `same_classes_disj_samples`, `totally_disjoint`} has three scripts (see the full-sweep sketch at the end of this section):
- `prepare_data_${exp_name}.py` divides the data into tasks according to what the experiment expects;
- `run_${exp_name}.py` trains the task-specific models and uses them to embed the data for each task;
- `analyze_${exp_name}.py` obtains the results for the experiment.

Each script has a corresponding conf file in `conf/` with the same name.
So, to run the `part_shared_part_novel` experiment, you first have to configure it in `conf/prepare_data_part_shared_part_novel.yaml`; in this case, choose a value for `num_shared_classes` and `num_novel_classes_per_task`, for example:
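A minimal sketch of the relevant fields (hypothetical values; the actual file may contain further options):

```yaml
# Hypothetical excerpt of conf/prepare_data_part_shared_part_novel.yaml;
# only the two options named above are shown, the real file may define more.
num_shared_classes: 5
num_novel_classes_per_task: 2
```

With the configuration in place, prepare the data via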
```bash
python src/la/scripts/prepare_data_part_shared_part_novel.py
```
This will populate the `data/${dataset_name}/part_shared_part_novel/` folder. Then you'll embed the data by running
```bash
python src/la/scripts/run_part_shared_part_novel.py
```
You will now have the encoded data in `data/${dataset_name}/part_shared_part_novel/S${num_shared_classes}_N${num_novel_classes_per_task}`.
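For instance, with the hypothetical values from the configuration sketch above (`num_shared_classes: 5`, `num_novel_classes_per_task: 2`) and a hypothetical `dataset_name` of `cifar100`, you could inspect the output with:

```bash
# Hypothetical paths: dataset_name=cifar100, num_shared_classes=5, num_novel_classes_per_task=2
ls data/cifar100/part_shared_part_novel/S5_N2
```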
Having all the latent spaces, you can now run the actual experiment and collect the results by running
```bash
python src/la/scripts/analyze_part_shared_part_novel.py
```
The results can now be found in `results/part_shared_part_novel`.
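The other two experiments follow the same three-step pattern. As a rough sketch (assuming each script reads its settings from the matching file in `conf/`, as described above), a full sweep over all three experiments could look like:

```bash
# Sketch of a full sweep: prepare, embed, and analyze each experiment in turn.
for exp_name in part_shared_part_novel same_classes_disj_samples totally_disjoint; do
    python "src/la/scripts/prepare_data_${exp_name}.py"
    python "src/la/scripts/run_${exp_name}.py"
    python "src/la/scripts/analyze_${exp_name}.py"
done
```

The results for each experiment should then end up under the corresponding `results/${exp_name}` folder.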