https://github.com/borgwardtlab/rnaglib-experiments
https://github.com/borgwardtlab/rnaglib-experiments
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/borgwardtlab/rnaglib-experiments
- Owner: BorgwardtLab
- Created: 2024-05-16T11:02:19.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-09-23T07:34:47.000Z (8 months ago)
- Last Synced: 2025-09-23T09:24:23.994Z (8 months ago)
- Language: Python
- Size: 19 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# rnaglib-experiments
This repository contains all code used for our experiments conducted with the `rnaglib` task suite. For full documentation please refer to [rnaglib.org](rnaglib.org) and [REDACTED].
# Reproducing experiments
This repository provides the necessary code to reproduce the three main experiments reported in our preprint: number of layers ablation study, splitting strategy ablation study and representation ablation study
* Prior to running any of the three experiments detailed below, you need to create the datasets. To do so, please run `python create_datasets.py`. This will create the datasets relevant to each of the tasks and store them in a folder named `roots`
* To reproduce the number of layers ablation study, run `python run_exp_nb_layers.py`. Once this is done, JSON files containing the training data will be stored in `results`. Then, you will be able to run `python plotting_scripts/make_plot_nb_layers.py` which will create (if default parameters are kept) a file named `plotting_scripts/nb_layers_ablation_2.5D.pdf` corresponding to Figure 4a of our preprint. Please note that you can tune some parameters, for instance `representation` in order to visualize the impact of the number of layers when using a different representation than 2.5D. In this case, you need to change accordingly the parameters in `run_exp_nb_layers.py` and in `make_plot_nb_layers.py`
* To reproduce the splitting strategy ablation study, run `python run_exp_splitting.py`, which will train models and dump the relevant JSONs in `results`. In order to reproduce the associated plot, run `python plotting_scripts/make_plot_splitting.py`. This will create a file named `plotting_scripts/splitting_ablation.pdf` reproducing Figure 3a of our preprint.
* To reproduce the representation ablation study, run `python run_exp_representations.py`, which will train models and dump the relevant JSONs in `results`. In order to reproduce the associated plot, run `python plotting_scripts/make_plot_representation.py`. This will create a file named `plotting_scripts/representation_ablation.pdf` reproducing Figure 4b of our preprint.
* To reproduce the benchmark table , run `python run_exp_splitting.py`, which will train each model with its default splitting and 2.5D representations with its best hyperparameters. Then run `python plotting_scripts/make_table_benchmark.py`. This will create a file named `plotting_scripts/final_benchmark.pdf` reproducing Table 2 of our preprint, alongside a CSV file `plotting_scripts/final_benchmark.csv`.
Once a training has been made in specific conditions, it won't be re-run if a subsequently used script needs to run it, unless you change the `retrain` parameter to `True`. Therefore, running experiments which trainings partially overlap won't lead to a waste of time.
* To reproduce the timing results, run `python timing.py`