https://github.com/stewartpark/scikit-small-ensemble

scikit-small-ensemble is a library that gives your ensemble models (Random Forest Classifier, etc.) a small memory footprint.

compression ensemble-learning lz4 mmap random-forest-classifier scikit-learn

Host: GitHub
URL: https://github.com/stewartpark/scikit-small-ensemble
Owner: stewartpark
Created: 2017-01-13T07:58:19.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2018-03-29T17:12:59.000Z (almost 7 years ago)
Last Synced: 2024-11-15T10:55:26.936Z (2 months ago)
Topics: compression, ensemble-learning, lz4, mmap, random-forest-classifier, scikit-learn
Language: Python
Size: 5.86 KB
Stars: 3
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

scikit-small-ensemble
=====================

scikit-small-ensemble is a library that gives your ensemble models (Random Forest Classifier, etc.) a small memory footprint.

Introduction
============

Ensemble models can be very memory-intensive. In a tree-based ensemble, for example, memory usage grows with the number of estimators and their depths. This library wraps each estimator and keeps its contents compressed with LZ4 until the estimator is used. It trades prediction speed for reduced memory usage.
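
The wrap-and-compress idea can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `CompressedEstimator`, `compress_ensemble`, and the dict-based decision stumps are hypothetical names invented for the example, `zlib` stands in for LZ4 so that only the standard library is needed, and reading `ratio` as "the fraction of estimators to wrap" is an assumption.

```python
import pickle
import zlib


class CompressedEstimator:
    """Hypothetical wrapper: holds a picklable estimator as compressed bytes."""

    def __init__(self, estimator):
        # zlib stands in for LZ4 here so the sketch has no third-party dependency.
        self._blob = zlib.compress(pickle.dumps(estimator))

    def load(self):
        # Decompress a temporary copy; it can be garbage-collected after use.
        return pickle.loads(zlib.decompress(self._blob))


def predict_stump(stump, x):
    """Toy estimator: a one-split decision stump stored as a plain dict."""
    if x[stump["feature"]] <= stump["threshold"]:
        return stump["left"]
    return stump["right"]


def compress_ensemble(stumps, ratio):
    """Wrap the first int(len(stumps) * ratio) stumps (assumed meaning of ratio)."""
    n = int(len(stumps) * ratio)
    return [CompressedEstimator(s) if i < n else s for i, s in enumerate(stumps)]


def predict_ensemble(members, x):
    """Majority vote; wrapped members are decompressed only for their own vote."""
    votes = []
    for member in members:
        stump = member.load() if isinstance(member, CompressedEstimator) else member
        votes.append(predict_stump(stump, x))
    return max(set(votes), key=votes.count)


stumps = [
    {"feature": 0, "threshold": 0.5, "left": 0, "right": 1},
    {"feature": 1, "threshold": 0.2, "left": 0, "right": 1},
    {"feature": 0, "threshold": 0.8, "left": 0, "right": 1},
]
members = compress_ensemble(stumps, ratio=0.5)  # wraps int(3 * 0.5) = 1 stump
prediction = predict_ensemble(members, [0.6, 0.9])
```

Every prediction pays for a decompress-and-unpickle step on each wrapped member, which is where the speed cost of the trade-off comes from.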

Installation
============

```
$ pip install scikit-small-ensemble
```

Usage
=====

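```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

from scikit_small_ensemble import compress, memory_map

# Random forest ensemble model (synthetic data, for illustration only).
X, y = make_classification(n_samples=1000, random_state=0)
model = RandomForestClassifier(n_estimators=100).fit(X, y)

# WARNING: This changes the model object itself.
# ratio is in [0.0, 1.0], where 1.0 is most compressed.
compress(model, ratio=0.2)

# Or, you can memory-map estimators from the disk on demand.
# memory_map(model, ratio=1.0)

# Use it like nothing happened.
# Memory usage becomes 10x lower with ratio=1.0
Y = model.predict_proba(X)
```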
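
The memory-mapping alternative can be pictured in a similar way. The sketch below is an assumption about the general idea, not the library's code: `MemoryMappedEstimator` is a hypothetical name, the estimator is a plain dict standing in for a fitted tree, and the file layout (one pickle per estimator in a temporary directory) is chosen only for the example.

```python
import mmap
import os
import pickle
import tempfile


class MemoryMappedEstimator:
    """Hypothetical wrapper: keeps a pickled estimator on disk and maps it on demand."""

    def __init__(self, estimator, directory):
        fd, self._path = tempfile.mkstemp(suffix=".pkl", dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(estimator, f)

    def load(self):
        # Map the file read-only and unpickle from it; the mapping is released
        # as soon as the with-blocks exit, so nothing stays resident between calls.
        with open(self._path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
                return pickle.loads(view[:])  # the slice copies the mapped bytes


with tempfile.TemporaryDirectory() as tmp:
    stump = {"feature": 0, "threshold": 0.5, "left": 0, "right": 1}
    wrapped = MemoryMappedEstimator(stump, tmp)
    restored = wrapped.load()
```

In this form the process holds only a file path per estimator between predictions, at the price of a disk read on each use.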