{"id":19946126,"url":"https://github.com/quantco/slim-trees","last_synced_at":"2025-05-03T16:32:53.191Z","repository":{"id":152243137,"uuid":"601665070","full_name":"Quantco/slim-trees","owner":"Quantco","description":"Pickle your ML models more efficiently for deployment 🚀","archived":false,"fork":false,"pushed_at":"2025-05-01T18:04:30.000Z","size":3645,"stargazers_count":20,"open_issues_count":9,"forks_count":1,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-05-01T19:24:16.001Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Quantco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-02-14T14:57:50.000Z","updated_at":"2025-05-01T18:04:33.000Z","dependencies_parsed_at":"2023-07-26T10:48:53.656Z","dependency_job_id":"b93f7dbc-109d-4fe8-8e64-2c1fb9d2229b","html_url":"https://github.com/Quantco/slim-trees","commit_stats":null,"previous_names":["quantco/slim-trees","pavelzw/pickle-compression"],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantco%2Fslim-trees","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantco%2Fslim-trees/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantco%2Fslim-trees/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Quantco%2Fslim-trees/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/Quantco","download_url":"https://codeload.github.com/Quantco/slim-trees/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251931180,"owners_count":21666942,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T00:28:23.319Z","updated_at":"2025-05-03T16:32:50.467Z","avatar_url":"https://github.com/Quantco.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Slim Trees\n\n[![CI](https://github.com/quantco/slim-trees/actions/workflows/ci.yml/badge.svg)](https://github.com/quantco/slim-trees/actions/workflows/ci.yml)\n[![conda-forge](https://img.shields.io/conda/vn/conda-forge/slim-trees?logoColor=white\u0026logo=conda-forge)](https://anaconda.org/conda-forge/slim-trees)\n[![pypi-version](https://img.shields.io/pypi/v/slim-trees.svg?logo=pypi\u0026logoColor=white)](https://pypi.org/project/slim-trees)\n[![python-version](https://img.shields.io/pypi/pyversions/slim-trees?logoColor=white\u0026logo=python)](https://pypi.org/project/slim-trees)\n\n`slim-trees` is a Python package for saving and loading compressed `sklearn` tree-based and `lightgbm` models.\nThe compression is performed by modifying how the model is pickled by Python's `pickle` module.\n\nWe presented this library at PyData Berlin 2023; check out the [slides](.github/assets/slim-trees-presentation.pdf)!\n\n## Installation\n\n```bash\npip install slim-trees\n# or\nmicromamba install slim-trees -c conda-forge\n# or\npixi add slim-trees\n```\n\n## Usage\n\n
Using `slim-trees` does not affect your training pipeline.\nSimply call `dump_sklearn_compressed` or `dump_lgbm_compressed` to save your model.\n\n\u003e [!WARNING]\n\u003e `slim-trees` does not save all the data that would be saved by `sklearn`:\n\u003e only the parameters that are relevant for inference are saved. If you want to save the full model, including\n\u003e `impurity` etc., for analytic purposes, we suggest saving both the original using `pickle.dump` for analytics\n\u003e and the slimmed-down version using `slim-trees` for production.\n\nExample for a `RandomForestClassifier`:\n\n```python\n# example, you can also use other tree-based models\nfrom sklearn.ensemble import RandomForestClassifier\nfrom slim_trees import dump_sklearn_compressed\n\n# load training data\nX, y = ...\nmodel = RandomForestClassifier()\nmodel.fit(X, y)\n\ndump_sklearn_compressed(model, \"model.pkl\")\n# or, with compression\ndump_sklearn_compressed(model, \"model.pkl.lzma\")\n```\n\nExample for a `LGBMRegressor`:\n\n```python\nfrom lightgbm import LGBMRegressor\nfrom slim_trees import dump_lgbm_compressed\n\n# load training data\nX, y = ...\nmodel = LGBMRegressor()\nmodel.fit(X, y)\n\ndump_lgbm_compressed(model, \"model.pkl\")\n# or, with compression\ndump_lgbm_compressed(model, \"model.pkl.lzma\")\n```\n\nLater, you can load the model using `load_compressed` or `pickle.load`.\n\n```python\nimport pickle\nfrom slim_trees import load_compressed\n\nmodel = load_compressed(\"model.pkl\")\n\n# or, with pickle.load\nwith open(\"model.pkl\", \"rb\") as f:\n    model = pickle.load(f)\n```\n\n### Save your model as `bytes`\n\nYou can also save the model as `bytes` instead of to a file, similar to the `pickle.dumps` method.\n\n```python\nfrom sklearn.ensemble import RandomForestClassifier\nfrom slim_trees import dumps_sklearn_compressed, loads_compressed\n\nX, y = ...\nmodel = RandomForestClassifier()\nmodel.fit(X, y)\n\ndata = dumps_sklearn_compressed(model, compression=\"lzma\")\n...\nmodel_loaded = loads_compressed(data, compression=\"lzma\")\n```\n\n
### Drop-in replacement for pickle\n\nYou can also use the `slim_trees.sklearn_tree.dump` or `slim_trees.lgbm_booster.dump` functions as drop-in replacements for `pickle.dump`.\n\n```python\nfrom slim_trees import sklearn_tree, lgbm_booster\n\n# for sklearn models\nwith open(\"model.pkl\", \"wb\") as f:\n    sklearn_tree.dump(model, f)  # instead of pickle.dump(...)\n\n# for lightgbm models\nwith open(\"model.pkl\", \"wb\") as f:\n    lgbm_booster.dump(model, f)  # instead of pickle.dump(...)\n```\n\n## Development Installation\n\nYou can install the package in development mode using the conda package manager [`pixi`](https://github.com/prefix-dev/pixi):\n\n```bash\n❯ git clone https://github.com/quantco/slim-trees.git\n❯ cd slim-trees\n\n❯ pixi install\n❯ pixi run postinstall\n❯ pixi run test\n[...]\n❯ pixi run py312 python\n\u003e\u003e\u003e import slim_trees\n[...]\n```\n\n## Benchmark\n\nAs a general overview of the savings you can expect:\nthis is a 1.2 GB sklearn `RandomForestRegressor`.\n\n![benchmark](.github/assets/benchmark.png)\n\nThe compressed file is 9x smaller than the original pickle file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantco%2Fslim-trees","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantco%2Fslim-trees","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantco%2Fslim-trees/lists"}