https://github.com/facultyai/faculty-xval

Cross-validation of Keras and scikit-learn models with the Faculty platform

cross-validation faculty-platform keras machine-learning python scikit-learn

![img|small](img/logo.png)

# faculty-xval

Cross-validation of machine-learning models on the Faculty platform. At present,
the package mainly offers a way to cross-validate models in parallel by means of
Faculty jobs. This functionality is exposed through the class:

```python
faculty_xval.validation.JobsCrossValidator
```

Additional information can be found in the example notebooks provided; see the
section `Try out the examples` below.
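
To make the idea of running folds in parallel concrete, here is a minimal, standard-library sketch. It does not use `faculty-xval` or Faculty jobs: the function names are hypothetical, the "model" is a toy majority-label predictor, and a thread pool merely stands in for one job per fold.

```python
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, near-equal folds."""
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1  # spread the remainder over the first folds
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def evaluate_fold(split, features, targets):
    """Toy model: predict the majority training label, return test accuracy.

    `features` is unused by this toy model; a real model would fit on it.
    """
    train, test = split
    train_labels = [targets[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(targets[i] == majority for i in test) / len(test)


def cross_validate_parallel(features, targets, n_splits=3):
    """Score every fold concurrently; each worker stands in for one job."""
    splits = list(k_fold_indices(len(targets), n_splits))
    with ThreadPoolExecutor(max_workers=n_splits) as pool:
        return list(pool.map(lambda s: evaluate_fold(s, features, targets), splits))
```

Because each fold depends only on its own train-test split, the folds can be scored independently, which is what makes it natural to distribute them across separate jobs.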

The package supports `keras` and `sklearn` models. Whilst one can write custom
models that are compatible with `faculty-xval`, no guarantee is given that the
package handles these situations correctly, in particular because of issues
concerning the randomisation of weights.
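
One plausible reading of this caveat is that every fold should start from freshly initialised weights rather than from weights trained on an earlier fold. The sketch below illustrates that requirement with a hypothetical `ToyModel` class; neither the class nor its `reinitialised` method is part of `faculty-xval`, and the package may handle this differently.

```python
import copy
import random


class ToyModel:
    """Hypothetical stand-in for a custom model; not part of faculty-xval."""

    def __init__(self, n_weights, seed=None):
        self.n_weights = n_weights
        self._rng = random.Random(seed)
        self.weights = self._random_weights()
        self.n_updates = 0

    def _random_weights(self):
        return [self._rng.uniform(-1.0, 1.0) for _ in range(self.n_weights)]

    def fit(self, n_steps):
        # Placeholder for training: shrink every weight a little per step.
        for _ in range(n_steps):
            self.weights = [0.9 * w for w in self.weights]
            self.n_updates += 1

    def reinitialised(self):
        """Return an untrained copy with newly drawn weights."""
        clone = copy.copy(self)
        clone.weights = clone._random_weights()
        clone.n_updates = 0
        return clone


def fit_per_fold(model, n_folds, n_steps):
    """Train an independent, freshly initialised model for every fold."""
    fitted = []
    for _ in range(n_folds):
        fold_model = model.reinitialised()
        fold_model.fit(n_steps)
        fitted.append(fold_model)
    return fitted
```

If a custom model cannot be re-initialised in this way, state learned on one fold could leak into the next and bias the cross-validation scores.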

Two sets of installation instructions are provided below:

- If you would like to simply use `faculty-xval`, please follow the
`User installation instructions`.
- If you would like to develop `faculty-xval` further, please follow the
`Developer installation instructions`.

## User installation instructions

### Create an environment

In your project on the Faculty platform, create an environment named `faculty_xval`.
In the `PYTHON` section, select `Python 3` and `pip` from the dropdown menus.
Then, type `faculty-xval` in the text box, and click on the `ADD` button.

The environment installs the package `faculty-xval`, and should be applied to
every server that you create; this includes both interactive servers and job
servers, as explained next.

### Create a job definition

Create a new job definition named `cross_validation`. In the `COMMAND` section,
paste the following:

`faculty_xval_jobs_xval $in_paths`

Then, add a `PARAMETER` with the name `in_paths`, and ensure that the
`Make field mandatory` box is checked.

Finally, under `SERVER SETTINGS`, add `faculty_xval` to the `ENVIRONMENTS`
section.

For cross-validation jobs that are computationally intensive, we recommend using
dedicated servers as opposed to running on shared infrastructure. To achieve
this, click on `Large and GPU servers` under `SERVER RESOURCES`, and select an
appropriate server type from the dropdown menu.

Remember to click `SAVE` when you are finished.
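
As a conceptual illustration of how the `in_paths` parameter fills the job command above, the sketch below performs the substitution with Python's `string.Template`. This is not the platform's implementation, and the example path is hypothetical; the expected value of `in_paths` is shown in the example notebooks.

```python
import shlex
from string import Template

# The command from the job definition, with `$in_paths` as a placeholder.
COMMAND = Template("faculty_xval_jobs_xval $in_paths")


def render_command(in_paths):
    """Fill the `in_paths` parameter and split the result into argv tokens."""
    return shlex.split(COMMAND.substitute(in_paths=in_paths))


# Hypothetical value, for illustration only.
argv = render_command("/project/tmp/fold_0.json")
```

Calling `COMMAND.substitute()` without a value raises a `KeyError`, which mirrors why the parameter is marked as mandatory: the command is incomplete without it.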

## Developer installation instructions

### Select a username

Before beginning the installation process, pick an appropriate username, such as
`foo`. This does not necessarily need to match your Faculty platform username.
In the following instructions, your selected username will be referred to as
`<USER_NAME>`.

### Clone the repository

Create the folder `/project/<USER_NAME>`. Then, run the commands:

```bash
cd /project/<USER_NAME>
git clone https://github.com/facultyai/faculty-xval.git
```

### Create an environment

Next, create an environment in your project named `faculty_xval_<USER_NAME>`.

In this environment, under `SCRIPTS`, paste the following code into the `BASH`
section, remembering to change the `USER_NAME` definition on the second line to
your selected `<USER_NAME>`:

```bash
# Remember to change username!
USER_NAME="<USER_NAME>"

# Install faculty-xval from local repository.
pip install "/project/$USER_NAME/faculty-xval/"

# Turn USER_NAME into an environment variable.
echo "export USER_NAME=$USER_NAME" > /etc/faculty_environment.d/app.sh
# Restart Jupyter, if the service is present, so that it sees the new variable.
if [[ -d /etc/service/jupyter ]] ; then
    sudo sv restart jupyter
fi
```

This environment should be applied to every server that you create; this
includes both interactive servers and job servers, as explained next.
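
Once the environment has been applied, code running on the server can read the exported `USER_NAME` variable. The following is a small sketch of how a notebook or script might locate the cloned repository from it; `repository_path` is a hypothetical helper, not something the package provides.

```python
import os
from pathlib import PurePosixPath


def repository_path(environ=None):
    """Build the path of the local faculty-xval clone from USER_NAME."""
    environ = os.environ if environ is None else environ
    user_name = environ.get("USER_NAME")
    if not user_name:
        # The variable is only defined once the environment script has run.
        raise RuntimeError("USER_NAME is not set; has the environment been applied?")
    return PurePosixPath("/project") / user_name / "faculty-xval"
```

Passing a plain dictionary such as `{"USER_NAME": "foo"}` instead of relying on `os.environ` makes the helper easy to try outside the platform.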

### Create a job definition

Next, create a new job definition named `cross_validation_<USER_NAME>`. In the
`COMMAND` section, paste the following:

`faculty_xval_jobs_xval $in_paths`

Then, add a `PARAMETER` with the name `in_paths`, and ensure that the
`Make field mandatory` box is checked.

Finally, under `SERVER SETTINGS`, add `faculty_xval_<USER_NAME>` to the
`ENVIRONMENTS` section.

For cross-validation jobs that are computationally intensive, we recommend using
dedicated servers as opposed to running in the cluster. To achieve this, click
on `Large and GPU servers` under `SERVER RESOURCES`, and select an appropriate
server type from the dropdown menu.

Remember to click `SAVE` when you are finished.

## Try out the examples

Please clone this repository. Examples of cross-validation with `faculty-xval`
for the different types of model are provided in the directories
`examples/keras` and `examples/sklearn`. In each directory, the usage
instructions are split across two notebooks:

- `jobs_cross_validator_run.ipynb` loads the data, instantiates the model, and
starts a Faculty job that carries out the cross validation.
- `jobs_cross_validator_analyse.ipynb` gathers the results from the cross
validation, reloads the target data, and calculates the model accuracy over
multiple train-test splits.

Note that the example notebooks must be run in the order just defined.
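
The analysis step described above, computing accuracy over multiple train-test splits, can be sketched in plain Python as follows. The function names and the input layout (one `(targets, predictions)` pair per split) are assumptions made for illustration; the notebooks may organise the results differently.

```python
from statistics import mean, stdev


def accuracy(targets, predictions):
    """Fraction of predictions that match the targets."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum(t == p for t, p in zip(targets, predictions)) / len(targets)


def summarise_splits(splits):
    """Summarise accuracy over a list of (targets, predictions) pairs."""
    scores = [accuracy(t, p) for t, p in splits]
    # The sample standard deviation needs at least two scores.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return {"scores": scores, "mean": mean(scores), "std": spread}
```

Reporting the spread alongside the mean shows how sensitive the model's accuracy is to the particular train-test split.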