https://github.com/vida-nyu/alpha-automl
Alpha-AutoML is a Python library for automatically generating end-to-end machine learning pipelines.
- Host: GitHub
- URL: https://github.com/vida-nyu/alpha-automl
- Owner: VIDA-NYU
- License: apache-2.0
- Created: 2023-02-15T21:30:19.000Z (over 3 years ago)
- Default Branch: devel
- Last Pushed: 2024-05-28T20:58:07.000Z (about 2 years ago)
- Last Synced: 2024-05-29T07:45:38.399Z (about 2 years ago)
- Topics: automl, data-science, machine-learning, python
- Language: Python
- Homepage: https://alpha-automl.readthedocs.io
- Size: 48 MB
- Stars: 17
- Watchers: 12
- Forks: 3
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt

Alpha-AutoML is an AutoML system that automatically searches for models and derives end-to-end pipelines that read
and pre-process the data and train the model. Alpha-AutoML leverages recent advances in deep reinforcement learning and
can adapt to different application domains and problems through incremental learning.
Alpha-AutoML provides data scientists and data engineers the flexibility to address complex problems by leveraging the
Python ecosystem, including open-source libraries and tools, support for collaboration, and infrastructure that enables
transparency and reproducibility.
This repository is part of New York University's implementation of the
[Data Driven Discovery project (D3M)](https://datadrivendiscovery.org/).
## Documentation
Documentation is available [here](https://alpha-automl.readthedocs.io/).
## Installation
This package works with Python 3.6+ on Linux, macOS, and Windows.
You can install the latest stable version of this library from [PyPI](https://pypi.org/project/alpha-automl/):
```
pip install alpha-automl
```
To install the latest development version:
```
pip install git+https://github.com/VIDA-NYU/alpha-automl@devel
```
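After either install, one quick way to confirm that the package is visible to the current interpreter is to look up its installed version. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is ours and is not part of Alpha-AutoML, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="alpha-automl"):
    """Return the installed version string of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"alpha-automl version: {found}")
    else:
        print("alpha-automl is not installed in this environment")
```

If the lookup reports that the package is missing, check that `pip` and `python` point to the same environment.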
## Docker
### Pre-built Docker Image
We provide pre-built Docker images with Jupyter and Alpha-AutoML pre-installed, so you can quickly test Alpha-AutoML.
To try it, run the following command on your machine and open Jupyter Notebook in your browser:
```
docker run -p 8888:8888 ghcr.io/vida-nyu/alpha-automl
```
With this command, Jupyter Notebook auto-generates a security token. The correct URL to access Jupyter Notebook is printed in the console output and looks like: `http://127.0.0.1:8888/?token=70ace7fa017c35ba0134dc7931add12bf55a69d4d4e6e54f`.
Alternatively, to provide a custom security token, set the `JUPYTER_TOKEN` environment variable to the token of your choice (the value is left empty in the command below):
```
docker run -p 8888:8888 -e JUPYTER_TOKEN="" ghcr.io/vida-nyu/alpha-automl
```
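If you do not already have a token in mind, the following sketch shows one way to generate a random one and assemble the corresponding `docker run` command. It uses only the Python standard library (3.8+ for `shlex.join`). The helper name `docker_run_command` is hypothetical, and the 48-character length simply mirrors the auto-generated token shown earlier; it is a choice, not a requirement.

```python
import secrets
import shlex


def docker_run_command(token, image="ghcr.io/vida-nyu/alpha-automl", port=8888):
    """Build the `docker run` argument list with JUPYTER_TOKEN set to `token`.

    Hypothetical helper for illustration; it only assembles the command
    shown in this README and does not run Docker itself.
    """
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-e", f"JUPYTER_TOKEN={token}",
        image,
    ]


# 24 random bytes rendered as 48 hexadecimal characters
token = secrets.token_hex(24)
command = docker_run_command(token)

# Print a shell-safe command line and the URL to open once the container is up
print(shlex.join(command))
print(f"http://127.0.0.1:8888/?token={token}")
```

Copy the printed command into a terminal, or pass the `command` list to `subprocess.run` if you prefer to launch the container from Python.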
If you are running Jupyter Notebook in a secure environment, you can disable authentication as follows:
```
docker run -p 8888:8888 ghcr.io/vida-nyu/alpha-automl --NotebookApp.token=''
```
### Docker Image From Scratch
If you need to build an image from source, you can use our [Dockerfile](https://github.com/VIDA-NYU/alpha-automl/blob/devel/Dockerfile). Use a build argument to select the packages that will be installed in the image (e.g., `full`, `timeseries`, `nlp`, etc.) as follows:
```
docker build -t alpha-automl --build-arg BUILD_OPTION=full .
```
Alternatively, build only the base version, which uses less disk space but does not support some tasks such as NLP and time series:
```
docker build -t alpha-automl:latest --target alpha-automl .
```
You can also build an image to use with JupyterHub as follows:
```
docker build -t alpha-automl:latest-jupyterhub --target alpha-automl-jupyterhub .
```
See also the documentation on how to set up Alpha-AutoML + JupyterHub on [Kubernetes](https://github.com/VIDA-NYU/alpha-automl/tree/devel/kubernetes).
## Others
Documentation for the Streamlit app for image triage developed by Jataware Corp is available [here](https://github.com/jataware/st-image-triage). See also this [video demo](https://drive.google.com/file/d/1h3o0C0wNfT2AQduhqfgGEWl8fInFkdFZ/view).
## Acknowledgment
The development of Alpha-AutoML was supported by the DARPA D3M Program. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.