Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
SRBench Results and Scripts for Symbolic GPT
https://github.com/kenwuqianghao/srbench
Last synced: 12 days ago
- Host: GitHub
- URL: https://github.com/kenwuqianghao/srbench
- Owner: KenWuqianghao
- License: gpl-3.0
- Created: 2024-10-08T13:59:47.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-11-03T21:46:50.000Z (15 days ago)
- Last Synced: 2024-11-03T22:26:49.724Z (15 days ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 8.99 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# SRBench: A Living Benchmark for Symbolic Regression
The methods for symbolic regression (SR) have come a long way since the days of Koza-style genetic programming (GP).
Our goal with this project is to keep a living benchmark of modern symbolic regression, in the context of state-of-the-art ML methods.
Currently, the challenges as we see them are:
- Lack of cross-pollination between the GP community and the ML community (different conferences, journals, societies etc)
- Lack of strong benchmarks in SR literature (small problems, toy datasets, weak comparator methods)
- Lack of a unified framework for SR or GP

We are addressing the lack of cross-pollination by making these comparisons open source, reproducible, and public, and by sharing them widely with the entire ML research community.
We are trying to address the lack of strong benchmarks by providing open source benchmarking of many SR methods on large sets of problems, with strong baselines for comparison.
To handle the lack of a unified framework, we've specified minimal requirements for contributing a method to this benchmark: a scikit-learn compatible API.
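For illustration, the block below is a minimal, hypothetical sketch of what a scikit-learn compatible regressor looks like. It is not an SRBench method; it only shows the `fit`/`predict` interface contributed methods are expected to expose.

```python
# Hypothetical, minimal sketch of a scikit-learn compatible regressor.
# It is NOT an actual SRBench method; it only illustrates the fit/predict
# interface that contributed methods are expected to expose.
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin


class ToySymbolicRegressor(BaseEstimator, RegressorMixin):
    def __init__(self, max_evaluations=10000):
        # hyperparameters are plain constructor arguments
        self.max_evaluations = max_evaluations

    def fit(self, X, y):
        # a real method would search for a symbolic model here;
        # this toy stand-in just memorizes the target mean
        y = np.asarray(y, dtype=float)
        self.offset_ = float(np.mean(y))
        # a string form of the final model is also useful for the symbolic
        # assessment (see the Contribution Guide for the exact requirements)
        self.model_str_ = f"{self.offset_}"
        return self

    def predict(self, X):
        return np.full(np.asarray(X).shape[0], self.offset_)
```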
# Results

[Browse the Current Results](postprocessing/)
This benchmark currently consists of **14** symbolic regression methods, **7** other ML methods, and **252** datasets from [PMLB](https://github.com/EpistasisLab/penn-ml-benchmarks), including real-world and synthetic datasets from processes with and without ground-truth models.
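As a quick way to inspect one of these datasets, the `pmlb` Python package can fetch them by name. This is only a convenience sketch (it assumes `pmlb` is installed via pip, and the dataset name is an example); the benchmark itself clones the PMLB repository directly, as described under Installation.

```python
# Peek at a single PMLB dataset by name (the name below is an example;
# see the PMLB repository for the full list). Assumes `pip install pmlb`.
from pmlb import fetch_data

X, y = fetch_data('strogatz_bacres1', return_X_y=True)
print(X.shape, y.shape)
```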
Methods currently benchmarked:
- Age-Fitness Pareto Optimization (Schmidt and Lipson 2009): [paper](https://dl.acm.org/doi/pdf/10.1145/1830483.1830584), [code](https://github.com/cavalab/ellyn)
- Age-Fitness Pareto Optimization with Co-evolved Fitness Predictors (Schmidt and Lipson 2009): [paper](https://dl.acm.org/doi/pdf/10.1145/1830483.1830584?casa_token=8fAFUrPlfuUAAAAA:u0QJvX-cC8rPtdZri-Jd4ZxcnRSIF_Fu2Vn5n-oXVNu_i71J6ZECx28ucLPOLQY628drsEbg4aFvTw), [code](https://github.com/cavalab/ellyn)
- AIFeynman 2.0 (Udrescu et al. 2020): [paper](https://arxiv.org/abs/2006.10782), [code](https://github.com/SJ001/AI-Feynman)
- Bayesian Symbolic Regression (Jin et al. 2020): [paper](https://arxiv.org/abs/1910.08892), [code](https://github.com/ying531/MCMC-SymReg)
- Deep Symbolic Regression (Petersen et al. 2020): [paper](https://arxiv.org/pdf/1912.04871), [code](https://github.com/brendenpetersen/deep-symbolic-optimization)
- Fast Function Extraction (McConaghy 2011): [paper](http://trent.st/content/2011-GPTP-FFX-paper.pdf), [code](https://github.com/natekupp/ffx)
- Feature Engineering Automation Tool (La Cava et al. 2017): [paper](https://arxiv.org/abs/1807.00981), [code](https://github.com/lacava/feat)
- epsilon-Lexicase Selection (La Cava et al. 2016): [paper](https://arxiv.org/abs/1905.13266), [code](https://github.com/cavalab/ellyn)
- GP-based Gene-pool Optimal Mixing Evolutionary Algorithm (Virgolin et al. 2017): [paper](https://dl.acm.org/doi/pdf/10.1145/3071178.3071287?casa_token=CHa8EK_ic5gAAAAA:mOAOCu6CL-jHobGWKD2wco4NbpCyS-XTY5thb1dPPsyUkTkLHzmLMF41MWMGWLyFv1G8n-VFaqmXSw), [code](https://github.com/marcovirgolin/GP-GOMEA/)
- gplearn (Stephens): [code](https://github.com/trevorstephens/gplearn)
- Interaction-Transformation Evolutionary Algorithm (de Franca and Aldeia, 2020): [paper](https://www.mitpressjournals.org/doi/abs/10.1162/evco_a_00285), [code](https://github.com/folivetti/ITEA/)
- Multiple Regression GP (Arnaldo et al. 2014): [paper](https://dl.acm.org/doi/pdf/10.1145/2576768.2598291?casa_token=Oh2e7jDBgl0AAAAA:YmYJhFniOrU0yIhsqrHGzUN_60veH56tfwizre94uImDpYyp9RcadUyv_VZf8gH7v3uo5SxjjIPPUA), [code](https://github.com/flexgp/gp-learners)
- Operon (Burlacu et al. 2020): [paper](https://dl.acm.org/doi/pdf/10.1145/3377929.3398099?casa_token=HJgFp342K0sAAAAA:3Xbelm-5YjcIgjMvqLcyoTYdB0wNR0S4bYcQBGUiwOuwqbFfV6YnE8YKGINija_V6wCi6dahvQ3Pxg), [code](https://github.com/heal-research/operon)
- Semantic Backpropagation GP (Virgolin et al. 2019): [paper](https://dl.acm.org/doi/pdf/10.1145/3321707.3321758?casa_token=v43VobsGalkAAAAA:Vj8S9mHAv-H4tLm_GCL4DJdfW3e5SVUtD6J3gIQh0vrNzM3s6psjl-bwO2NMnxLN0thRJ561OZ0sQA), [code](https://github.com/marcovirgolin/GP-GOMEA)

# Contribute
We are actively updating and expanding this benchmark.
Want to add your method?
See our [Contribution Guide](CONTRIBUTING.md).

# Cite
A pre-print of the current version of the benchmark is available:
La Cava, W., Orzechowski, P., Burlacu, B., de França, F. O., Virgolin, M., Jin, Y., Kommenda, M., & Moore, J. H. (2021).
Contemporary Symbolic Regression Methods and their Relative Performance.
To appear in _Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks._
[arXiv](https://arxiv.org/abs/2107.14351)

[v1.0](https://github.com/EpistasisLab/regression-benchmark/releases/tag/v1.0) was reported in our GECCO 2018 paper:
Orzechowski, P., La Cava, W., & Moore, J. H. (2018).
Where are we now? A large benchmark study of recent symbolic regression methods.
GECCO 2018. [DOI](https://doi.org/10.1145/3205455.3205539), [Preprint](https://www.researchgate.net/profile/Patryk_Orzechowski/publication/324769381_Where_are_we_now_A_large_benchmark_study_of_recent_symbolic_regression_methods/links/5ae779b70f7e9b837d392dc9/Where-are-we-now-A-large-benchmark-study-of-recent-symbolic-regression-methods.pdf)

# Contact
William La Cava (@lacava), william dot lacava at childrens dot harvard dot edu
Patryk Orzechowski (@athril), patryk dot orzechowski at gmail dot com
# Using SRBench
## Installation
We have provided a [conda environment](environment.yml), [configuration script](configure.sh) and [installation script](install.sh) that should make installation straightforward.
We've currently tested this on Ubuntu and CentOS.
Steps:

1. Install the conda environment:
```bash
conda env create -f environment.yml
conda activate srbench
```

2. Install the benchmark methods:
```bash
bash install.sh
```

3. Download the PMLB datasets:
```bash
git clone https://github.com/EpistasisLab/pmlb/ [/path/to/pmlb/]
cd /path/to/pmlb
git lfs pull
```

## Reproducing the benchmark results
Experiments are launched from the `experiments/` folder via the script `analyze.py`.
The script can be configured to run the experiment in parallel locally, on an LSF job scheduler, or on a SLURM job scheduler.
To see the full set of options, run `python analyze.py -h`.

**WARNING**: running some of the commands below will submit tens of thousands of experiments.
Use accordingly.

### Black-box experiment
After installing and configuring the conda environment, the complete black-box experiment can be started via the command:

```bash
python analyze.py /path/to/pmlb/datasets -n_trials 10 -results ../results_blackbox -time_limit 48:00
```

### Ground-truth experiment
**Train the models**: we train the models subject to varying levels of noise using the options below.
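For intuition about the `-target_noise` levels swept below, one common convention is to add Gaussian noise to the training targets with a standard deviation proportional to the target's root-mean-square value. Treat the sketch below as an assumption about what the option means; the authoritative definition is in the experiment code.

```python
# Hedged sketch of what a target_noise level such as 0.01 could mean:
# Gaussian noise whose scale is a fraction of the target's RMS value.
# The authoritative definition lives in the SRBench experiment scripts.
import numpy as np

def add_target_noise(y, target_noise, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    scale = target_noise * np.sqrt(np.mean(y ** 2))  # fraction of RMS(y)
    return y + rng.normal(0.0, scale, size=y.shape)

y_noisy = add_target_noise(np.array([1.0, 2.0, 3.0]), target_noise=0.01)
```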
```bash
# submit the ground-truth dataset experiment
# Flag reference:
#   -results       where the results will be saved
#   -target_noise  level of noise to add to the target
#   -sym_data      for datasets with symbolic (ground-truth) models
#   -m             memory limit in MB
#   -time_limit    time limit in hrs
#   -job_limit     restricts how many jobs actually get submitted
#   -tuned         use the tuned version of the estimators, rather than performing hyperparameter tuning
for data in "/path/to/pmlb/datasets/strogatz_" "/path/to/pmlb/datasets/feynman_" ; do # feynman and strogatz datasets
    for TN in 0 0.001 0.01 0.1; do # noise levels
        python analyze.py \
            $data"*" \
            -results ../results_sym_data \
            -target_noise $TN \
            -sym_data \
            -n_trials 10 \
            -m 16384 \
            -time_limit 9:00 \
            -job_limit 100000 \
            -tuned
        if [ $? -gt 0 ] ; then
            break
        fi
    done
done
```

**Symbolic Assessment**: Following model training, the trained models are assessed for symbolic equivalence with the ground-truth data-generating processes.
This is handled in [assess_symbolic_model.py](experiment/assess_symbolic_model.py).
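As a rough illustration of what a symbolic-equivalence check involves, sympy can test whether the difference between a candidate model and the ground-truth expression simplifies to zero or a constant. This is a sketch only; the criteria actually used by the benchmark are defined in `assess_symbolic_model.py`.

```python
# Minimal sketch of a symbolic-equivalence check with sympy (illustrative only;
# the criteria used by the benchmark live in assess_symbolic_model.py).
import sympy as sp

x0, x1 = sp.symbols('x0 x1')

true_model = 2 * x0 + sp.sin(x1)
candidate = sp.sin(x1) + 2 * x0 + 1e-9  # hypothetical model returned by a method

diff = sp.simplify(true_model - candidate)
# If the difference simplifies to a constant, the candidate matches the
# ground-truth model up to an additive constant.
print(diff, diff.is_constant())
```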
Use `analyze.py` to generate batch calls to this function as follows:

```bash
# assess the ground-truth models that were produced using sympy
# flags are as in the training loop above, plus -script to run the assessment
for data in "/path/to/pmlb/datasets/strogatz_" "/path/to/pmlb/datasets/feynman_" ; do # feynman and strogatz datasets
    for TN in 0 0.001 0.01 0.1; do # noise levels
        python analyze.py \
            -script assess_symbolic_model \
            $data"*" \
            -results ../results_sym_data \
            -target_noise $TN \
            -sym_data \
            -n_trials 10 \
            -m 8192 \
            -time_limit 1:00 \
            -job_limit 100000 \
            -tuned
        if [ $? -gt 0 ] ; then
            break
        fi
    done
done
```

**Output**: next to each `.json` file, an additional file named `.json.updated` is saved with the symbolic assessment included.
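These output files are plain JSON, so they can be inspected directly. The snippet below is a sketch only: the results directory and the available fields depend on the experiment you ran, so the path and keys here are assumptions.

```python
# Inspect the symbolic-assessment output files (sketch; the result paths and
# JSON fields depend on your -results directory and the methods run).
import glob
import json

for path in glob.glob('../results_sym_data/**/*.json.updated', recursive=True):
    with open(path) as f:
        result = json.load(f)
    print(path, sorted(result.keys()))
```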
### Post-processing
Navigate to the [postprocessing](postprocessing) folder to begin postprocessing the experiment results.
The following two scripts collate the `.json` files into two `.feather` files to share results more easily.
You will notice these `.feather` files are loaded to generate figures in the notebooks.
The collation scripts also perform some cleanup, such as shortening algorithm names.

```bash
python collate_blackbox_results.py
python collate_groundtruth_results.py
```

**Visualization**
- [groundtruth_results.ipynb](postprocessing/groundtruth_results.ipynb): ground-truth results comparisons
- [blackbox_results.ipynb](postprocessing/blackbox_results.ipynb): black-box results comparisons
- [statistical_comparisons.ipynb](postprocessing/statistical_comparisons.ipynb): post-hoc statistical comparisons
- [pmlb_plots](postprocessing/pmlb_plots.ipynb): visualizations of the [PMLB](https://github.com/EpistasisLab/pmlb) datasets
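For programmatic access to the collated results outside the notebooks, the `.feather` files can be read with pandas. This is a sketch; the file name below is an assumed example, so check the `postprocessing/` folder for the actual output names.

```python
# Load collated results for custom analysis (sketch; the file name below is an
# assumed example, check postprocessing/ for the actual output names).
import pandas as pd

results = pd.read_feather('blackbox_results.feather')
print(results.columns.tolist())
print(results.head())
```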