https://github.com/vmware-samples/efficient-multiclass-classification

Duet is a scikit-learn classifier for resource-efficient multiclass classification that incorporates the advantages of bagging and boosting decision-tree-based ensemble methods (DTEMs) by using two classifiers instead of a monolithic one. A simple bagging model is trained using the entire training dataset and is responsible for capturing the easier concepts. Then, a boosting model is trained using only a fraction of the dataset representing the concepts the bagging model finds hard.

# Duet scikit classifier (v1.0)

## Overview

Duet is a multiclass classification framework built on decision-tree ensemble
methods. It uses resources more efficiently than standard monolithic models
while preserving, and in some cases improving, classification accuracy.

Duet combines a small bagging ensemble model with a boosting model.

The current implementation of Duet is based on Random Forest and XGBoost.
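
The two-model idea can be illustrated with a minimal sketch. This is not the code in `duet_classifier.py`. The function names `fit_two_stage` and `predict_two_stage`, the `hard_fraction` and `threshold` parameters, and the rule that "hard" means "lowest top-class probability from the bagging model" are all assumptions made for illustration. scikit-learn's `GradientBoostingClassifier` stands in for XGBoost to keep the sketch to one library, and the sketch assumes the hard subset contains at least two classes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier


def fit_two_stage(X, y, hard_fraction=0.3, random_state=0):
    """Fit a small bagging model on all data, then a boosting model on the hard part."""
    # Stage 1: a small random forest trained on the entire training set.
    bagging = RandomForestClassifier(
        n_estimators=10, max_depth=5, random_state=random_state
    )
    bagging.fit(X, y)

    # Assumption: a sample is "hard" when the bagging model's top-class
    # probability for it is low. Keep the hard_fraction least confident samples.
    confidence = bagging.predict_proba(X).max(axis=1)
    n_hard = max(1, int(hard_fraction * len(X)))
    hard_idx = np.argsort(confidence)[:n_hard]

    # Stage 2: a boosting model trained only on the hard fraction.
    boosting = GradientBoostingClassifier(n_estimators=50, random_state=random_state)
    boosting.fit(X[hard_idx], y[hard_idx])
    return bagging, boosting


def predict_two_stage(bagging, boosting, X, threshold=0.8):
    """Use the bagging prediction when it is confident, otherwise the boosting one."""
    proba = bagging.predict_proba(X)
    pred = bagging.classes_[proba.argmax(axis=1)]
    unsure = proba.max(axis=1) < threshold
    if unsure.any():
        # Hypothetical routing rule: low-confidence samples go to the boosting model.
        pred[unsure] = boosting.predict(X[unsure])
    return pred


if __name__ == "__main__":
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=6, n_classes=3, random_state=0
    )
    bagging, boosting = fit_two_stage(X[:400], y[:400])
    pred = predict_two_stage(bagging, boosting, X[400:])
    print("held-out accuracy:", (pred == y[400:]).mean())
```

In this sketch only the low-confidence samples reach the second model at prediction time, which is one plausible way a two-model design could save resources compared with a single large ensemble.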

## Documentation

More details about Duet can be found in the following paper:

"Efficient Multiclass Classification with Duet" [EuroMLSys '22]


## Files:

#### duet_classifier.py
Duet scikit classifier

#### classification_example.py
Basic classification example using Duet

#### grid_search_example.py
Basic grid search example with Duet

## Prerequisites:
numpy

pandas

scikit-learn

xgboost

or alternatively, run:

$ pip3 install -r requirements.txt
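
After installing, a quick check like the following can confirm that the prerequisites are importable. It is a small convenience sketch and not part of the repository. The helper name `missing_packages` is hypothetical, and the list uses import names, so scikit-learn appears as `sklearn`.

```python
from importlib.util import find_spec

# Import names of the prerequisites (scikit-learn is imported as "sklearn").
REQUIRED = ("numpy", "pandas", "sklearn", "xgboost")


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```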

## Contributing

The efficient-multiclass-classification project team welcomes contributions from the community. Before you start working with efficient-multiclass-classification, please
read our [Developer Certificate of Origin](https://cla.vmware.com/dco). All contributions to this repository must be
signed as described on that page. Your signature certifies that you wrote the patch or have the right to pass it on
as an open-source patch. For more detailed information, refer to [CONTRIBUTING.md](CONTRIBUTING.md).

## License

BSD-3 License

## Contact us

For more information, support, and advanced examples, contact:

Yaniv Ben-Itzhak, [ybenitzhak@vmware.com](mailto:ybenitzhak@vmware.com)

Shay Vargaftik, [shayv@vmware.com](mailto:shayv@vmware.com)