https://github.com/kxsystems/automl
Automated Machine Learning Framework for kdb+
- Host: GitHub
- URL: https://github.com/kxsystems/automl
- Owner: KxSystems
- License: apache-2.0
- Created: 2020-03-03T17:15:58.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-10-10T16:03:05.000Z (9 months ago)
- Last Synced: 2025-02-16T21:40:14.327Z (4 months ago)
- Topics: automated-machine-learning, kdb, machine-learning, q
- Language: q
- Homepage:
- Size: 2.38 MB
- Stars: 26
- Watchers: 6
- Forks: 19
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# ⚠️ **This repository is outdated!** ⚠️
---
Since 9th October 2024 (the 4.0.0 release of the ML Toolkit), this project has been merged into the ml toolkit mono-repo. For the latest updates and active development, please visit [https://github.com/KxSystems/ml](https://github.com/KxSystems/ml).
This repository is preserved only to maintain old links and project history but will no longer be actively maintained.
---
# Automated machine learning in kdb+
The automated machine learning library described here is built largely on the tools available within the [Machine Learning Toolkit](https://github.com/kxsystems/ml). The purpose of this framework is to help you automate the process of applying machine learning techniques to real-world problems. In the absence of expert machine-learning engineers, it handles the following processes within a traditional workflow:
- Data preprocessing
- Feature engineering and feature selection
- Model selection
- Hyperparameter Tuning
- Report generation and model persistence

Each of these steps is outlined in depth within the [documentation](docs). This allows you to understand how decisions are being made and the transformations which your data undergoes during the production of the output models.

At present, the machine-learning frameworks supported are based on:
1. One-to-one feature to target non-time-series
2. FRESH-based feature extraction and model production
3. NLP-based feature creation and word2vec transformation

The problems which can be solved by this framework will be expanded over time, as will the available functionality.
## Requirements
The following requirements cover all those needed to run the libraries in the current build of the toolkit.
- [embedPy](https://github.com/KxSystems/embedPy)
- ML-Toolkit ≥ 3.0.0

A number of Python dependencies are also needed to run embedPy functions within both the machine-learning utilities and the FRESH libraries. Install as follows:
pip:
```bash
pip install -r requirements.txt
```

or via conda:
```bash
conda install --file requirements.txt
```

### Optional requirements for advanced modules
The above requirements allow you to access the base functionality of AutoML. Additional modules are available – including Sobol sequence hyperparameter search, LaTeX report generation and Keras, PyTorch and NLP models. However, given the large memory requirement for the dependencies of these modules, they are not included in the base functionality and must be installed if required.
**Sobol search** - via pip (see package details [here](https://pypi.org/project/sobol-seq/)):
```bash
pip install sobol-seq
```

**LaTeX** - via conda or pip:
```bash
pylatex
```

**Keras** - via conda or pip:
```bash
keras
tensorflow
```

**PyTorch** - via conda or pip:
```bash
torch
```

**Theano** - via conda or pip:
```bash
theano
```

**NLP**
The NLP functionality in AutoML requires the [KX NLP library](https://github.com/KxSystems/nlp) along with `gensim` which can be installed using conda or pip.
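
The optional Python packages listed above can be gathered into a single install command. The sketch below is a hypothetical helper, not part of AutoML: the function names `optional_packages` and `optional_pip_command` and the module keys (`sobol`, `latex`, `keras`, `pytorch`, `theano`, `nlp`) are assumptions made for illustration. It only prints the `pip` command so that it can be reviewed before being run, and it does not install the KX NLP library, which must be set up separately.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of AutoML): map an optional module key
# to the Python packages listed for it in this README.
optional_packages() {
  case "$1" in
    sobol)   echo "sobol-seq" ;;
    latex)   echo "pylatex" ;;
    keras)   echo "keras tensorflow" ;;
    pytorch) echo "torch" ;;
    theano)  echo "theano" ;;
    nlp)     echo "gensim" ;;   # the KX NLP library is installed separately
    *)       echo "unknown module: $1" >&2; return 1 ;;
  esac
}

# Build, but do not run, one pip command covering the chosen modules.
optional_pip_command() {
  local pkgs="" module found
  if [ "$#" -eq 0 ]; then
    echo "usage: optional_pip_command MODULE..." >&2
    return 1
  fi
  for module in "$@"; do
    found=$(optional_packages "$module") || return 1
    pkgs="$pkgs $found"
  done
  echo "pip install${pkgs}"
}

# Print the command for the Sobol search and Keras modules.
optional_pip_command sobol keras
```

Printing rather than executing keeps the choice between pip and conda open: the same package names can be passed to `conda install` if that is the preferred environment manager.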
## Installation
Place the library in `$QHOME` and load it into a q instance using `automl/automl.q`.
This will load all the functions contained within the `.ml` namespace:
```q
$ q automl/automl.q
q).automl.loadfile`:init.q
```

## Documentation
:open_file_folder: [`docs`](docs)
## Status
Automated machine learning in kdb+ is still in development and is available here as a beta release. Further functionality and improvements will be made to the library in the coming months.
Any issues with the framework should be raised in the issues section of this repository. Functionality suggestions or more general questions should be submitted via email to [email protected]