https://github.com/ecrl/ecabc

Artificial Bee Colony for generic feature tuning
artificial-bee-colony feature-tuning hyperparameter-optimization neural-network


[![UML Energy & Combustion Research Laboratory](https://sites.uml.edu/hunter-mack/files/2021/11/ECRL_final.png)](http://faculty.uml.edu/Hunter_Mack/)

# ECabc: optimization algorithm for tuning user-defined parametric functions
[![GitHub version](https://badge.fury.io/gh/ECRL%2FECabc.svg)](https://badge.fury.io/gh/ECRL%2FECabc)
[![PyPI version](https://badge.fury.io/py/ecabc.svg)](https://badge.fury.io/py/ecabc)
[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/ECRL/ecabc/blob/master/LICENSE)
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01420/status.svg)](https://doi.org/10.21105/joss.01420)

**ECabc** is an open source Python package used to tune parameters for user-supplied functions, based on the [Artificial Bee Colony algorithm by D. Karaboğa](http://scholarpedia.org/article/Artificial_bee_colony_algorithm). ECabc optimizes user-supplied functions, or **fitness functions**, using a set of variables that exist within a search space. The bee colony consists of three types of bees: employers, onlookers and scouts. An **employer bee** exploits a solution composed of a permutation of the variables in the search space, and evaluates the viability of that solution. An **onlooker bee** chooses an employer bee with a well-performing solution and searches for new solutions near it. The **scout bee**, a variant of the employer bee, searches for a new solution if it has stayed too long at its current solution.



### Research applications
Among other applications, ECabc has been used by the Energy and Combustion Research Laboratory (ECRL) at the University of Massachusetts Lowell to tune the hyperparameters of ECNet, an open source Python package tailored to predicting fuel properties. ECNet gives scientists an open source tool for predicting key properties of potential next-generation biofuels, reducing the need for costly fuel synthesis and experimentation. By efficiently increasing the accuracy of ECNet and similar models, ECabc helps provide a higher degree of confidence in discovering new, optimal fuels. A single run of ECabc on ECNet yielded a lower average root mean square error (RMSE) for cetane number (CN) and yield sooting index (YSI) than a year of manual tuning: manual tuning produced an RMSE of 10.13, while ECabc reached an RMSE of 8.06 in one run of 500 iterations.

# Installation

### Prerequisites:
- Have Python 3.X installed
- Have the ability to install Python packages

### Method 1: pip
If you are working in a Linux/Mac environment:
```
sudo pip install ecabc
```

Alternatively, in a Windows or virtualenv environment:
```
pip install ecabc
```

To update your version of ECabc to the latest release version, use
```
pip install --upgrade ecabc
```

Note: if multiple Python releases are installed on your system (e.g. 2.7 and 3.7), you may need to execute the correct version of pip. For Python 3.X, change **"pip install ecabc"** to **"pip3 install ecabc"**.

### Method 2: From source
- Download the ECabc repository, navigate to the download location on the command line/terminal, and execute:
```
pip install .
```

There are currently no additional dependencies for ECabc.

# Usage

To start using ECabc, you need two things:
- a fitness function (cost function) to optimize
- parameters used by the fitness function

For example, let's define a fitness function to minimize the sum of three integers:

```python
def minimize_integers(integers):

    return sum(integers)

```

Your fitness function must accept a **list** from ECabc. The list values represent the current "food source", i.e. parameter values, being exploited by a given bee.
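As a minimal sketch of this contract, a fitness function with more structure can unpack the list into named values. The function `squared_distance` and its target point are hypothetical and are not part of ECabc; the sketch also assumes the list order matches the order in which parameters are added to the colony.

```python
def squared_distance(values):
    """Hypothetical fitness function: squared distance from a target point.

    `values` is a plain list; by assumption, its order matches the order in
    which the parameters were added to the colony.
    """
    x, y, z = values    # unpack the three parameter values
    target = (1, 2, 3)  # illustrative target, chosen arbitrarily
    return (x - target[0]) ** 2 + (y - target[1]) ** 2 + (z - target[2]) ** 2


# Call the function directly with a list, the same way a bee would
print(squared_distance([1, 2, 3]))  # 0
print(squared_distance([0, 0, 0]))  # 14
```

Because the function only depends on receiving a list, it can be tested on its own before handing it to the colony.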

Now that we have our fitness function, let's import the ABC object from ECabc, initialize the artificial bee colony, and add our parameters:

```python
from ecabc import ABC

def minimize_integers(integers):

    return sum(integers)

abc = ABC(10, minimize_integers)
abc.add_param(0, 10, name='Int_1')
abc.add_param(0, 10, name='Int_2')
abc.add_param(0, 10, name='Int_3')
```

Here we initialize the colony with 10 employer bees, supply our fitness function, and add our parameters. Each parameter is added with the minimum and maximum values of its search space and, optionally, a name. By default, parameter mutations (searching a neighboring food source) will not exceed the specified parameter bounds [min_val, max_val]; if this limitation is not desired, supply the "restrict=False" argument:

```python
abc.add_param(0, 10, restrict=False, name='Int_1')
```
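To illustrate what bounding a mutation means, here is a minimal sketch that assumes out-of-range values are simply clamped back into [min_val, max_val]. The helper `mutate_value` and its `step` argument are hypothetical and do not describe ECabc's internal implementation; they only show the difference between a restricted and an unrestricted mutation.

```python
def mutate_value(value, step, min_val, max_val, restrict=True):
    """Hypothetical mutation: shift `value` by `step`, optionally clamping."""
    candidate = value + step
    if restrict:
        # Clamp the candidate into the closed interval [min_val, max_val]
        candidate = max(min_val, min(max_val, candidate))
    return candidate


print(mutate_value(9, 4, 0, 10))                  # 10 (clamped to max_val)
print(mutate_value(9, 4, 0, 10, restrict=False))  # 13 (allowed to leave the bounds)
```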

Once we have created our colony and added our parameters, we then need to "initialize" the colony's bees:

```python
from ecabc import ABC

def minimize_integers(integers):

    return sum(integers)

abc = ABC(10, minimize_integers)
abc.add_param(0, 10, name='Int_1')
abc.add_param(0, 10, name='Int_2')
abc.add_param(0, 10, name='Int_3')
abc.initialize()
```

Initializing the colony deploys the employer bees (10 in this example) to random food sources, i.e. randomly generated parameter values, and evaluates their fitness (in this example, a lower sum is better). Onlooker bees, equal in number to the employers, are then deployed proportionally to food sources neighboring well-performing bees.
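The initialization step can be sketched in plain Python. This is not ECabc's code: the score formula `1 / (1 + return value)` and the use of `random.choices` for proportional selection are assumptions made for illustration, and the sketch stops at choosing which employer each onlooker follows.

```python
import random


def minimize_integers(integers):
    return sum(integers)


rng = random.Random(0)  # seeded only so the sketch is repeatable
bounds = [(0, 10), (0, 10), (0, 10)]
num_employers = 10

# Each employer starts at a random food source inside the bounds
employers = [[rng.randint(lo, hi) for lo, hi in bounds]
             for _ in range(num_employers)]

# Assumed score: a lower (non-negative) return value gives a higher score
scores = [1 / (1 + minimize_integers(source)) for source in employers]

# Each onlooker picks an employer with probability proportional to its score
chosen = rng.choices(range(num_employers), weights=scores, k=num_employers)
onlooker_targets = [employers[i] for i in chosen]

print(len(employers), len(onlooker_targets))
```

Employers with smaller sums receive larger weights, so onlookers tend to cluster around the better food sources while weaker ones can still be picked.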

We then send the colony through a predetermined number of "search cycles":

```python
from ecabc import ABC

def minimize_integers(integers):

    return sum(integers)

abc = ABC(10, minimize_integers)
abc.add_param(0, 10, name='Int_1')
abc.add_param(0, 10, name='Int_2')
abc.add_param(0, 10, name='Int_3')
abc.initialize()
for _ in range(10):
    abc.search()
```

A search cycle consists of:
- each bee searches a neighboring food source (performs a mutation on one parameter)
  - if the food source produces a better fitness than the bee's current food source, the bee moves there
  - otherwise, the bee stays at its current food source
    - if the bee has stayed for (NE * D) cycles (NE = number of employers, D = dimension of the function, 3 in our example), it abandons the food source
      - if the bee is an employer, it goes to a new random food source
      - if the bee is an onlooker, it goes to a food source neighboring a well-performing bee
We can access the colony's average fitness score, average fitness function return value, best fitness score, best fitness function return value and best parameters at any time:

```python
print(abc.average_fitness)
print(abc.average_ret_val)
print(abc.best_fitness)
print(abc.best_ret_val)
print(abc.best_params)
```

ECabc can utilize multiple CPU cores for concurrent processing:

```python
abc = ABC(10, minimize_integers, num_processes=8)
```

Tying everything together, we have:

```python
from ecabc import ABC

def minimize_integers(integers):

    return sum(integers)

abc = ABC(10, minimize_integers)
abc.add_param(0, 10, name='Int_1')
abc.add_param(0, 10, name='Int_2')
abc.add_param(0, 10, name='Int_3')
abc.initialize()
for _ in range(10):
    abc.search()
    print('Average fitness: {}'.format(abc.average_fitness))
    print('Average obj. fn. return value: {}'.format(abc.average_ret_val))
    print('Best fitness score: {}'.format(abc.best_fitness))
    print('Best obj. fn. return value: {}'.format(abc.best_ret_val))
    print('Best parameters: {}\n'.format(abc.best_params))
```

Running this script produces:

```
Average fitness: 0.08244866244866243
Average obj. fn. return value: 11.65
Best fitness score: 0.125
Best obj. fn. return value: 7
Best parameters: {'Int_1': 4, 'Int_2': 3, 'Int_3': 0}

Average fitness: 0.0885855117105117
Average obj. fn. return value: 10.8
Best fitness score: 0.125
Best obj. fn. return value: 7
Best parameters: {'Int_1': 4, 'Int_2': 3, 'Int_3': 0}

Average fitness: 0.10361832611832611
Average obj. fn. return value: 9.4
Best fitness score: 0.16666666666666666
Best obj. fn. return value: 5
Best parameters: {'Int_1': 2, 'Int_2': 3, 'Int_3': 0}

Average fitness: 0.11173502151443326
Average obj. fn. return value: 8.8
Best fitness score: 0.2
Best obj. fn. return value: 4
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 4}

Average fitness: 0.12448879551820731
Average obj. fn. return value: 7.95
Best fitness score: 0.2
Best obj. fn. return value: 4
Best parameters: {'Int_1': 1, 'Int_2': 3, 'Int_3': 0}

Average fitness: 0.1767694805194805
Average obj. fn. return value: 6.7
Best fitness score: 1.0
Best obj. fn. return value: 0
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 0}

Average fitness: 0.183255772005772
Average obj. fn. return value: 6.3
Best fitness score: 1.0
Best obj. fn. return value: 0
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 0}

Average fitness: 0.20172799422799423
Average obj. fn. return value: 5.65
Best fitness score: 1.0
Best obj. fn. return value: 0
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 0}

Average fitness: 0.23827561327561328
Average obj. fn. return value: 4.95
Best fitness score: 1.0
Best obj. fn. return value: 0
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 0}

Average fitness: 0.28456349206349213
Average obj. fn. return value: 4.35
Best fitness score: 1.0
Best obj. fn. return value: 0
Best parameters: {'Int_1': 0, 'Int_2': 0, 'Int_3': 0}
```

To run this script yourself, head over to our [examples](https://github.com/ecrl/ecabc/tree/master/examples) directory.
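The best fitness scores and best return values in the output above are consistent with the mapping commonly used in Artificial Bee Colony implementations, where a non-negative return value `f` gives a score of `1 / (1 + f)`. The sketch below checks that relationship against the printed values. The function name `fitness_score` is hypothetical, and the branch for negative return values follows the usual ABC convention rather than anything shown in this output.

```python
def fitness_score(ret_val):
    """Assumed mapping from a fitness function return value to a score."""
    if ret_val >= 0:
        return 1 / (1 + ret_val)
    # Usual ABC convention for negative values (not confirmed by the output)
    return 1 + abs(ret_val)


# (best return value, best fitness score) pairs copied from the output above
reported = [(7, 0.125), (5, 0.16666666666666666), (4, 0.2), (0, 1.0)]

for ret_val, score in reported:
    print(ret_val, fitness_score(ret_val), abs(fitness_score(ret_val) - score) < 1e-12)
```

Under this mapping a higher fitness score always corresponds to a lower return value, which is why the best score reaches 1.0 once the sum of the three integers reaches 0.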

# Contributing, Reporting Issues and Other Support:

To contribute to ECabc, make a pull request. Contributions should include tests for new features added, as well as extensive documentation.

To report problems with the software or feature requests, file an issue. When reporting problems, include information such as error messages, your OS/environment and Python version.

For additional support/questions, contact Sanskriti Sharma ([email protected]), Hernan Gelaf-Romer ([email protected]), or Travis Kessler ([email protected]).