https://github.com/amlalejini/gecco-2024-phylogeny-informed-subsampling
Repository associated with 2024 GECCO paper submission.
- Host: GitHub
- URL: https://github.com/amlalejini/gecco-2024-phylogeny-informed-subsampling
- Owner: amlalejini
- License: mit
- Created: 2024-01-27T19:47:08.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-19T14:56:02.000Z (over 1 year ago)
- Last Synced: 2024-04-21T02:19:13.230Z (over 1 year ago)
- Topics: evolutionary-computation, genetic-programming, subsampling
- Language: HTML
- Homepage: https://lalejini.com/GECCO-2024-phylogeny-informed-subsampling/bookdown/book/
- Size: 14.7 MB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Phylogeny-informed subsampling
[Bookdown book](https://lalejini.com/GECCO-2024-phylogeny-informed-subsampling/bookdown/book/)
[DOI: 10.5281/zenodo.10576330](https://doi.org/10.5281/zenodo.10576330)
[OSF](https://osf.io/h3f52/)

## Overview
### Abstract
> Phylogenies (ancestry trees) tell the evolutionary history of an evolving population.
In evolutionary computing, phylogenies reveal how evolutionary algorithms steer populations through a search space by illuminating the step-by-step evolution of solutions.
To date, phylogenetic analyses have almost exclusively been applied in post-hoc analyses of evolutionary algorithms for performance tuning and research.
Here, we apply phylogenetic information at runtime to augment parent selection procedures that use training sets to assess candidate solution quality.
We propose phylogeny-informed fitness estimation, thinning a fraction of costly training case evaluations by substituting the fitness profiles of near relatives as a heuristic estimate.
We evaluate phylogeny-informed fitness estimation in the context of the down-sampled lexicase and cohort lexicase selection algorithms on two diagnostic analyses and four genetic programming (GP) problems.
Our results indicate that phylogeny-informed fitness estimation can mitigate the drawbacks of down-sampled lexicase, improving diversity maintenance and search space exploration.
However, the extent to which phylogeny-informed fitness estimation improves problem-solving success for GP varies by problem, subsampling method, and subsampling level.
This work serves as an initial step toward improving evolutionary algorithms by exploiting runtime phylogenetic analysis.

## Repository guide
- `docs/` contains supplemental documentation for our methods.
- `experiments/` contains HPC job submission scripts, configuration files, and data analyses for all experiments.
- `include/` contains C++ implementations of experiment software (header only).
- `scripts/` contains general-purpose scripts used in this work.
- `source/` contains .cpp files that can be compiled to run our experiments.
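
For readers new to the idea, the fitness estimation described in the abstract can be illustrated with a short C++ sketch. It is not taken from `include/`: the `Taxon` struct, the `EstimateScore` function, and the `max_dist` parameter are hypothetical names chosen for illustration, and the sketch assumes an ancestor-only search (walking up a lineage) rather than a search over all near relatives.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical ancestry-tree node (an assumption, not the repository's data structure).
struct Taxon {
  const Taxon* parent = nullptr;              // nullptr marks a root of the phylogeny
  std::vector<std::optional<double>> scores;  // one slot per training case; empty if not evaluated
};

// Estimate a score on one training case by borrowing it from the nearest
// evaluated taxon on the lineage (the taxon itself counts as distance 0).
// max_dist bounds how many parent links may be followed.
std::optional<double> EstimateScore(const Taxon& taxon,
                                    std::size_t test_id,
                                    std::size_t max_dist) {
  const Taxon* cur = &taxon;
  for (std::size_t dist = 0; cur != nullptr && dist <= max_dist; ++dist) {
    if (test_id < cur->scores.size() && cur->scores[test_id].has_value()) {
      return cur->scores[test_id];  // nearest evaluated relative found
    }
    cur = cur->parent;  // step one generation back
  }
  return std::nullopt;  // no evaluated ancestor within range
}
```

Returning an empty `std::optional` when no evaluated ancestor is in range leaves the fallback policy (for example, treating the case as failed) to the caller.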