https://github.com/DataCanvasIO/Cooka

A lightweight and visual AutoML system

automated-feature-engineering automated-machine-learning automl data-science deep-learning hyperparameter-optimization machine-learning neural-network

[![Python Versions](https://img.shields.io/pypi/pyversions/cooka.svg)](https://pypi.org/project/cooka)
[![Downloads](https://pepy.tech/badge/cooka)](https://pepy.tech/project/cooka)
[![PyPI Version](https://img.shields.io/pypi/v/cooka.svg)](https://pypi.org/project/cooka)

[Doc](https://cooka.readthedocs.io) | [简体中文](README_zh_CN.md)

Cooka is a lightweight visual toolkit for managing datasets and designing model-learning experiments through a web UI.
It uses [DeepTables](https://github.com/DataCanvasIO/DeepTables) and [HyperGBM](https://github.com/DataCanvasIO/HyperGBM) as experiment engines to perform feature engineering, neural architecture search, and hyperparameter tuning automatically.

![DataCanvas AutoML Toolkit](https://raw.githubusercontent.com/DataCanvasIO/Cooka/main/docs/static/DAT_latest.png)

## Features overview
Through the web UI provided by Cooka you can:

- Add and analyze datasets
- Design experiments
- View experiment progress and results
- Use trained models
- Export experiments to Jupyter notebooks


The machine learning algorithms supported are:
- XGBoost
- LightGBM
- CatBoost

The neural networks supported are:
- WideDeep
- DeepFM
- xDeepFM
- AutoInt
- DCN
- FGCNN
- FiBiNet
- PNN
- AFM
- [...](https://deeptables.readthedocs.io/en/latest/models.html)

The search algorithms supported are:
- Evolution
- MCTS (Monte Carlo Tree Search)
- [...](https://github.com/DataCanvasIO/HyperGBM)
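
To give a feel for what an evolutionary search does, here is a minimal, self-contained sketch of the general idea: keep the best configurations, mutate them, and repeat. It is not HyperGBM's implementation. The search space, the `score` stand-in objective, and all function names are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical search space: each hyperparameter maps to its candidate values.
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.03, 0.1, 0.3],
    "max_depth": [3, 5, 7, 9],
    "n_estimators": [50, 100, 200, 400],
}


def score(config):
    """Stand-in objective (higher is better).

    A real experiment would train a model and return a validation metric.
    """
    return -(
        abs(config["learning_rate"] - 0.1)
        + abs(config["max_depth"] - 7) / 10
        + abs(config["n_estimators"] - 200) / 1000
    )


def mutate(config, rng):
    """Return a copy of config with one randomly chosen hyperparameter resampled."""
    child = dict(config)
    key = rng.choice(sorted(SEARCH_SPACE))
    child[key] = rng.choice(SEARCH_SPACE[key])
    return child


def evolve(generations=30, population_size=8, seed=0):
    """Run a simple elitist evolutionary search and return the best config found."""
    rng = random.Random(seed)
    population = [
        {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Keep the better half, then refill with mutated copies of the survivors.
        population.sort(key=score, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=score)


best = evolve()
print(best)
```

Because the survivors are carried over unchanged, the best score never gets worse from one generation to the next; only the mutated children explore new configurations.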

The supported feature engineering methods, provided by [scikit-learn](https://scikit-learn.org) and [featuretools](https://github.com/alteryx/featuretools), are:

- Scaler
  - StandardScaler
  - MinMaxScaler
  - RobustScaler
  - MaxAbsScaler
  - Normalizer

- Encoder
  - LabelEncoder
  - OneHotEncoder
  - OrdinalEncoder

- Discretizer
  - KBinsDiscretizer
  - Binarizer

- Dimension Reduction
  - PCA

- Feature derivation
  - featuretools

- Missing value filling
  - SimpleImputer

The search space can also be extended to support more feature engineering methods and modeling algorithms.

## Installation

### Using pip

Cooka requires Python 3.6 or later. Upgrade pip and install the package (on CentOS, install any required system packages first):

```shell script
pip install --upgrade pip
pip install cooka
```

Start the web server:
```shell script
cooka server
```

Then open `http://` in your browser to use Cooka.

By default, the Cooka configuration file is located at `~/.config/cooka/cooka.py`. To generate a template:
```shell script
mkdir -p ~/.config/cooka/
cooka generate-config > ~/.config/cooka/cooka.py
```
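
The shell redirect above overwrites any existing `cooka.py`. If you want to keep an existing configuration, the same step can be sketched in Python as follows. This is a minimal sketch: `ensure_config` and `generate_template` are hypothetical helper names, and the only Cooka interface used is the `cooka generate-config` command shown above.

```python
import subprocess
from pathlib import Path


def generate_template():
    """Return the template printed by `cooka generate-config`.

    Requires the `cooka` command to be installed and on PATH.
    """
    result = subprocess.run(
        ["cooka", "generate-config"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,
    )
    return result.stdout


def ensure_config(path=None, generate=generate_template):
    """Write a template config to `path` unless a file already exists there.

    Returns True if a new file was written, False if an existing one was kept.
    """
    if path is None:
        path = Path.home() / ".config" / "cooka" / "cooka.py"
    path = Path(path)
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(generate())
    return True
```

Calling `ensure_config()` with no arguments targets the default location and leaves an existing file untouched.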

### Using Docker

Launch a Cooka docker container:

```shell script
docker run -ti -p 8888:8888 -p 8000:8000 -p 9001:9001 -e COOKA_NOTEBOOK_PORTAL=http://:8888 datacanvas/cooka:latest
```

Open `http://` in your browser to visit Cooka.
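
If the page does not load, it can help to check that the ports published by the `docker run` command are reachable. The sketch below uses only the Python standard library and assumes the container runs on the local machine (the `localhost` default is an assumption). The function names are hypothetical, and the source does not state which service listens on which port.

```python
import socket

# Ports published by the `docker run` command above.
PUBLISHED_PORTS = (8888, 8000, 9001)


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host="localhost", ports=PUBLISHED_PORTS):
    """Map each port to whether it currently accepts TCP connections."""
    return {port: port_is_open(host, port) for port in ports}
```

Running `print(check_ports())` after starting the container shows, per port, whether a TCP connection could be made.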

## Citation

If you use Cooka in your research, please cite us as follows:

Haifeng Wu, Jian Yang. Cooka: A lightweight and visual AutoML system. https://github.com/DataCanvasIO/Cooka, 2021. Version 0.1.x
```
@misc{cooka,
author={Haifeng Wu and Jian Yang},
title={{Cooka}: {A lightweight and visual AutoML system}},
howpublished={https://github.com/DataCanvasIO/Cooka},
note={Version 0.1.x},
year={2021}
}
```

## DataCanvas

![](https://raw.githubusercontent.com/DataCanvasIO/Cooka/main/docs/static/dc_logo_1.png)

Cooka is an open source project created by [DataCanvas](https://www.datacanvas.com/).