https://github.com/stewartpark/scikit-small-ensemble
scikit-small-ensemble is a library to make your ensemble models(Random Forest Classifier, etc) have a small memory footprint/usage.
- Host: GitHub
- URL: https://github.com/stewartpark/scikit-small-ensemble
- Owner: stewartpark
- Created: 2017-01-13T07:58:19.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-03-29T17:12:59.000Z (almost 7 years ago)
- Last Synced: 2024-11-15T10:55:26.936Z (2 months ago)
- Topics: compression, ensemble-learning, lz4, mmap, random-forest-classifier, scikit-learn
- Language: Python
- Size: 5.86 KB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
scikit-small-ensemble
=====================

scikit-small-ensemble is a library that gives your ensemble models (Random Forest Classifier, etc.) a smaller memory footprint.
Introduction
============

Ensemble models can be very memory-intensive. For a tree-based ensemble, for example, memory usage grows with the number of estimators and their depth. This library wraps each estimator and keeps its contents compressed with LZ4 except while the estimator is in use. It trades runtime performance for reduced memory usage.
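
The wrapping idea described above can be illustrated with a small, generic sketch. This is not the library's actual implementation: the class name `CompressedProxy` is hypothetical, the standard-library `zlib` stands in for LZ4 so the example has no third-party dependencies, and a plain dictionary stands in for a fitted estimator.

```python
import pickle
import zlib


class CompressedProxy:
    """Hold an object as compressed bytes and rebuild it on attribute access.

    Hypothetical illustration only; zlib is used here in place of LZ4.
    """

    def __init__(self, obj):
        # Serialize once and keep only the compressed bytes in memory.
        raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        self._blob = zlib.compress(raw)

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so private
        # names are rejected to avoid recursing on a missing _blob.
        if name.startswith("_"):
            raise AttributeError(name)
        # Decompress on demand; the rebuilt object is discarded afterwards.
        obj = pickle.loads(zlib.decompress(self._blob))
        return getattr(obj, name)


# A dictionary stands in for an estimator with a large internal state.
payload = {"depth": 12, "weights": [0.5] * 100_000}
raw_size = len(pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL))

proxy = CompressedProxy(payload)
compressed_size = len(proxy._blob)
depth = proxy.get("depth")

print(raw_size, compressed_size, depth)
```

Every attribute access pays for a full decompression, which is the speed cost of the memory saving.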
Installation
============

```
$ pip install scikit-small-ensemble
```

Usage
=====

```python
# `model` is a fitted random forest ensemble model.
from scikit_small_ensemble import compress, memory_map

# WARNING: This changes the model object itself.
# ratio is in [0.0, 1.0], where 1.0 is most compressed.
compress(model, ratio=0.2)

# Or, you can memory-map estimators from the disk on demand.
# memory_map(model, ratio=1.0)

# Use it like nothing happened.
# Memory usage becomes 10x lower with ratio=1.0
Y = model.predict_proba(X)
```
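
To give a rough picture of what memory-mapping an estimator from disk on demand could look like, here is a minimal sketch using only the standard library. It is an assumption about the general technique, not the code behind `memory_map`: the class `MappedProxy` and its `close` method are hypothetical, and a dictionary again stands in for an estimator.

```python
import mmap
import os
import pickle
import tempfile


class MappedProxy:
    """Keep an object pickled on disk and map the file only while needed.

    Hypothetical illustration only, not the library's implementation.
    """

    def __init__(self, obj, directory=None):
        # Write the pickled object to a temporary file and remember its path.
        fd, self._path = tempfile.mkstemp(suffix=".pkl", dir=directory)
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)

    def _load(self):
        # Map the file read-only and unpickle straight from the mapping.
        with open(self._path, "rb") as fh:
            with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return pickle.loads(mm)

    def __getattr__(self, name):
        # Reject private names so a missing _path cannot cause recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._load(), name)

    def close(self):
        # Remove the backing file once the proxy is no longer needed.
        os.remove(self._path)


proxy = MappedProxy({"depth": 12, "weights": [0.5] * 1000})
depth = proxy.get("depth")
n_weights = len(proxy.get("weights"))
proxy.close()

print(depth, n_weights)
```

In this sketch nothing but the file path stays resident between calls, so memory use is lowest, at the price of a disk read and an unpickle on every access.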