Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/BayesWitnesses/m2cgen
Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
- Host: GitHub
- URL: https://github.com/BayesWitnesses/m2cgen
- Owner: BayesWitnesses
- License: mit
- Created: 2019-01-13T02:32:55.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-08-03T17:30:36.000Z (6 months ago)
- Last Synced: 2024-10-17T12:54:13.506Z (3 months ago)
- Topics: c, csharp, dartlang, go, haskell, java, javascript, lightgbm, lightning, machine-learning, php, python, r, ruby, rust, scikit-learn, statistical-learning, statsmodels, xgboost
- Language: Python
- Homepage:
- Size: 1.22 MB
- Stars: 2,809
- Watchers: 50
- Forks: 241
- Open Issues: 57
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-go - m2cgen - A CLI tool to transpile trained classic ML models into a native Go code with zero dependencies, written in Python with Go language support. (Machine Learning / Search and Analytic Databases)
- zero-alloc-awesome-go - m2cgen - A CLI tool to transpile trained classic ML models into a native Go code with zero dependencies, written in Python with Go language support. (Machine Learning / Search and Analytic Databases)
- fucking-awesome-elixir - m2cgen - A CLI tool to transpile trained classic ML models into a native Elixir code with zero dependencies. (Artificial Intelligence)
- awesome-rust - BayesWitnesses/m2cgen
- awesome-rust-cn - BayesWitnesses/m2cgen - (Development tools / Transpiling)
- awesome-list - m2cgen - Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies. (Deep Learning Framework / Deployment & Distribution)
- awesome-golang-repositories - m2cgen
- awesome-fsharp - m2cgen - A CLI tool to transpile trained classic ML models into a native F# code with zero dependencies. [MIT] (Data Science)
- awesome-go-extra - m2cgen (Machine Learning / Advanced Console UIs)
- awesome-haskell - m2cgen - A CLI tool to transpile trained classic ML models into a native Haskell code with zero dependencies. (Data Science)
- awesome-dart - m2cgen - A CLI tool to transpile trained classic ML models into a native Dart code with zero dependencies. (Tools)
- awesome-rust - BayesWitnesses/m2cgen - A CLI tool to transpile trained classic machine learning models into a native Rust code with zero dependencies. [![GitHub Actions Status](https://github.com/BayesWitnesses/m2cgen/workflows/GitHub%20Actions/badge.svg?branch=master)](https://github.com/BayesWitnesses/m2cgen/actions) (Development tools / Transpiling)
- awesome-elixir - m2cgen - A CLI tool to transpile trained classic ML models into a native Elixir code with zero dependencies. (Artificial Intelligence)
- awesome-hacking-lists - BayesWitnesses/m2cgen - Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies (Python)
- awesome-go - m2cgen - A CLI tool to transpile trained classic ML models into a native Go code with zero dependencies, written in Python with Go language support. Stars:`2.8K`. (Machine Learning / Search and Analytic Databases)
- awesome-python-machine-learning-resources - GitHub - 26% open · ⏱️ 14.08.2022): (Model serialization and conversion)
- my-awesome - BayesWitnesses/m2cgen - learning,php,python,r,ruby,rust,scikit-learn,statistical-learning,statsmodels,xgboost pushed_at:2024-08 star:2.8k fork:0.2k Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies (Python)
- awesome-production-machine-learning - m2cgen - A lightweight library which allows to transpile trained classic machine learning models into a native code of C, Java, Go, R, PHP, Dart, Haskell, Rust and many other programming languages. (Deployment and Serving)
- StarryDivineSky - BayesWitnesses/m2cgen
- fucking-awesome-rust - BayesWitnesses/m2cgen - A CLI tool to transpile trained classic machine learning models into a native Rust code with zero dependencies. [![GitHub Actions Status](https://github.com/BayesWitnesses/m2cgen/workflows/GitHub%20Actions/badge.svg?branch=master)](https://github.com/BayesWitnesses/m2cgen/actions) (Development tools / Transpiling)
README
# m2cgen
[![GitHub Actions Status](https://github.com/BayesWitnesses/m2cgen/workflows/GitHub%20Actions/badge.svg?branch=master)](https://github.com/BayesWitnesses/m2cgen/actions)
[![Coverage Status](https://codecov.io/gh/BayesWitnesses/m2cgen/branch/master/graph/badge.svg)](https://codecov.io/gh/BayesWitnesses/m2cgen)
[![License: MIT](https://img.shields.io/github/license/BayesWitnesses/m2cgen.svg)](https://github.com/BayesWitnesses/m2cgen/blob/master/LICENSE)
[![Python Versions](https://img.shields.io/pypi/pyversions/m2cgen.svg?logo=python&logoColor=white)](https://pypi.org/project/m2cgen)
[![PyPI Version](https://img.shields.io/pypi/v/m2cgen.svg?logo=pypi&logoColor=white)](https://pypi.org/project/m2cgen)
[![Downloads](https://pepy.tech/badge/m2cgen)](https://pepy.tech/project/m2cgen)

**m2cgen** (Model 2 Code Generator) is a lightweight library that provides an easy way to transpile trained statistical models into native code (Python, C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP, Dart, Haskell, Ruby, F#, Rust, Elixir).

* [Installation](#installation)
* [Development](#development)
* [Supported Languages](#supported-languages)
* [Supported Models](#supported-models)
* [Classification Output](#classification-output)
* [Usage](#usage)
* [CLI](#cli)
* [FAQ](#faq)

## Installation
Supported Python version is >= **3.7**.
```
pip install m2cgen
```

## Development
Make sure the following command runs successfully before submitting a PR:
```
make pre-pr
```
Alternatively you can run the Docker version of the same command:
```
make docker-build docker-pre-pr
```

## Supported Languages
- C
- C#
- Dart
- F#
- Go
- Haskell
- Java
- JavaScript
- PHP
- PowerShell
- Python
- R
- Ruby
- Rust
- Visual Basic (VBA-compatible)
- Elixir

## Supported Models
| | Classification | Regression |
| --- | --- | --- |
| **Linear** | **scikit-learn:** LogisticRegression, LogisticRegressionCV, PassiveAggressiveClassifier, Perceptron, RidgeClassifier, RidgeClassifierCV, SGDClassifier<br>**lightning:** AdaGradClassifier, CDClassifier, FistaClassifier, SAGAClassifier, SAGClassifier, SDCAClassifier, SGDClassifier | **scikit-learn:** ARDRegression, BayesianRidge, ElasticNet, ElasticNetCV, GammaRegressor, HuberRegressor, Lars, LarsCV, Lasso, LassoCV, LassoLars, LassoLarsCV, LassoLarsIC, LinearRegression, OrthogonalMatchingPursuit, OrthogonalMatchingPursuitCV, PassiveAggressiveRegressor, PoissonRegressor, RANSACRegressor (only supported regression estimators can be used as a base estimator), Ridge, RidgeCV, SGDRegressor, TheilSenRegressor, TweedieRegressor<br>**StatsModels:** Generalized Least Squares (GLS), Generalized Least Squares with AR Errors (GLSAR), Generalized Linear Models (GLM), Ordinary Least Squares (OLS), [Gaussian] Process Regression Using Maximum Likelihood-based Estimation (ProcessMLE), Quantile Regression (QuantReg), Weighted Least Squares (WLS)<br>**lightning:** AdaGradRegressor, CDRegressor, FistaRegressor, SAGARegressor, SAGRegressor, SDCARegressor, SGDRegressor |
| **SVM** | **scikit-learn:** LinearSVC, NuSVC, OneClassSVM, SVC<br>**lightning:** KernelSVC, LinearSVC | **scikit-learn:** LinearSVR, NuSVR, SVR<br>**lightning:** LinearSVR |
| **Tree** | DecisionTreeClassifier, ExtraTreeClassifier | DecisionTreeRegressor, ExtraTreeRegressor |
| **Random Forest** | ExtraTreesClassifier, LGBMClassifier (rf booster only), RandomForestClassifier, XGBRFClassifier | ExtraTreesRegressor, LGBMRegressor (rf booster only), RandomForestRegressor, XGBRFRegressor |
| **Boosting** | LGBMClassifier (gbdt/dart/goss booster only), XGBClassifier (gbtree (including boosted forests) / gblinear booster only) | LGBMRegressor (gbdt/dart/goss booster only), XGBRegressor (gbtree (including boosted forests) / gblinear booster only) |

The package versions with which compatibility is guaranteed by CI tests are listed [here](https://github.com/BayesWitnesses/m2cgen/blob/master/requirements-test.txt#L1).
Other versions may also work, but they are untested.
## Classification Output
### Linear / Linear SVM / Kernel SVM
#### Binary
Scalar value; signed distance of the sample to the hyperplane for the second class.
#### Multiclass
Vector value; signed distance of the sample to the hyperplane for each class.
#### Comment
The output is consistent with the output of ```LinearClassifierMixin.decision_function```.
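Since the generated code returns raw decision values rather than labels, the caller has to map them to classes. The sketch below shows one way to do that, following the usual scikit-learn convention (a positive binary score selects the second class; in the multiclass case the largest score wins). The function names and the `classes` argument are hypothetical helpers, not part of m2cgen.

```python
def predict_binary(score, classes=(0, 1)):
    """Map a scalar decision value to a label.

    Assumption: a positive score means the second class, as with
    scikit-learn's linear classifiers.
    """
    return classes[1] if score > 0 else classes[0]


def predict_multiclass(scores, classes=None):
    """Return the label whose decision value is largest."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if classes is None else classes[best]
```

For example, `predict_multiclass([-1.2, 0.4, 0.1], classes=["a", "b", "c"])` selects `"b"`.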
### SVM
#### Outlier detection
Scalar value; signed distance of the sample to the separating hyperplane: positive for an inlier and negative for an outlier.
#### Binary
Scalar value; signed distance of the sample to the hyperplane for the second class.
#### Multiclass
Vector value; one-vs-one score for each class, shape (n_samples, n_classes * (n_classes-1) / 2).
#### Comment
The output is consistent with the output of ```BaseSVC.decision_function``` when the `decision_function_shape` is set to `ovo`.
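To turn one-vs-one scores into a single label, a simple majority vote can be used. The sketch below assumes the scores are ordered by class pair `(0, 1), (0, 2), ..., (1, 2), ...` and that a positive score is a vote for the first class of the pair, which is the convention scikit-learn documents for `ovo`; verify this against your own model before relying on it. `ovo_vote` is a hypothetical helper, not an m2cgen function.

```python
from itertools import combinations


def ovo_vote(scores, n_classes):
    """Pick a class index from one-vs-one decision values by majority vote.

    Assumptions: scores follow the pair order (0,1), (0,2), ..., and a
    positive score votes for the first class of the pair. Ties go to the
    lowest class index.
    """
    pairs = list(combinations(range(n_classes), 2))
    if len(scores) != len(pairs):
        raise ValueError(
            f"expected {len(pairs)} scores for {n_classes} classes, got {len(scores)}"
        )
    votes = [0] * n_classes
    for (first, second), score in zip(pairs, scores):
        votes[first if score > 0 else second] += 1
    return max(range(n_classes), key=votes.__getitem__)
```

With three classes the generated code returns `3 * 2 / 2 = 3` scores, so `ovo_vote([1.0, 1.0, -1.0], 3)` counts two votes for class 0 and one for class 2.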
### Tree / Random Forest / Boosting
#### Binary
Vector value; class probabilities.
#### Multiclass
Vector value; class probabilities.
#### Comment
The output is consistent with the output of the `predict_proba` method of `DecisionTreeClassifier` / `ExtraTreeClassifier` / `ExtraTreesClassifier` / `RandomForestClassifier` / `XGBRFClassifier` / `XGBClassifier` / `LGBMClassifier`.
## Usage
Here's a simple example of how a linear model trained in a Python environment can be represented in Java code:
```python
from sklearn.datasets import load_diabetes
from sklearn import linear_model
import m2cgen as m2c
X, y = load_diabetes(return_X_y=True)
estimator = linear_model.LinearRegression()
estimator.fit(X, y)
code = m2c.export_to_java(estimator)
```
Generated Java code:
```java
public class Model {
    public static double score(double[] input) {
        return ((((((((((152.1334841628965) + ((input[0]) * (-10.012197817470472))) + ((input[1]) * (-239.81908936565458))) + ((input[2]) * (519.8397867901342))) + ((input[3]) * (324.39042768937657))) + ((input[4]) * (-792.1841616283054))) + ((input[5]) * (476.74583782366153))) + ((input[6]) * (101.04457032134408))) + ((input[7]) * (177.06417623225025))) + ((input[8]) * (751.2793210873945))) + ((input[9]) * (67.62538639104406));
    }
}
```
**You can find more examples of generated code for different models/languages [here](https://github.com/BayesWitnesses/m2cgen/tree/master/generated_code_examples).**
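To make the structure of the generated Java easier to read, here is a hand-written Python equivalent of the `score` method above. It is a sketch for illustration only, not actual m2cgen output: the names `INTERCEPT` and `COEFFICIENTS` are assumptions, while the numeric values are copied from the Java code.

```python
# Intercept and per-feature weights copied from the generated Java above.
INTERCEPT = 152.1334841628965
COEFFICIENTS = [
    -10.012197817470472,
    -239.81908936565458,
    519.8397867901342,
    324.39042768937657,
    -792.1841616283054,
    476.74583782366153,
    101.04457032134408,
    177.06417623225025,
    751.2793210873945,
    67.62538639104406,
]


def score(features):
    """Evaluate the linear model: intercept plus the weighted sum of features."""
    if len(features) != len(COEFFICIENTS):
        raise ValueError(f"expected {len(COEFFICIENTS)} features, got {len(features)}")
    result = INTERCEPT
    # Accumulate left to right, in the same order as the Java expression.
    for value, weight in zip(features, COEFFICIENTS):
        result += value * weight
    return result
```

An all-zero input returns the intercept, which is a quick sanity check when comparing a generated model against the original estimator.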
## CLI
`m2cgen` can be used as a CLI tool to generate code using serialized model objects (pickle protocol):
```
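$ m2cgen <pickle_file> --language <language> [--indent <indent>] [--function_name <function_name>]
         [--class_name <class_name>] [--module_name <module_name>] [--package_name <package_name>]
         [--namespace <namespace>] [--recursion-limit <recursion_limit>]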
```
Don't forget that for unpickling serialized model objects their classes must be defined in the top level of an importable module in the unpickling environment.
Piping is also supported:
```
$ cat <pickle_file> | m2cgen --language <language>
```
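The CLI reads a model that was serialized with the pickle protocol. A minimal sketch of writing and reading such a file with the standard library is shown below; `save_model`, `load_model`, and the file path are hypothetical names, and `model` stands for any trained estimator from the supported list.

```python
import pickle


def save_model(model, path):
    """Serialize a trained model to `path` using the pickle protocol."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a pickled model; its class must be importable in this environment."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Calling `load_model` in the environment where the CLI will run is a cheap way to confirm that the model's class can be imported there before invoking `m2cgen`.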
## FAQ
**Q: Generation fails with `RecursionError: maximum recursion depth exceeded` error.**
A: If this error occurs while generating code using an ensemble model, try to reduce the number of trained estimators within that model. Alternatively you can increase the maximum recursion depth with `sys.setrecursionlimit()`.
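If you prefer not to change the recursion limit for the whole process, one option is to raise it only around the export call and restore it afterwards. The context manager below is a sketch using only the standard library; `recursion_limit` is a hypothetical helper name.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter recursion limit, then restore it."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below its current value.
    sys.setrecursionlimit(max(limit, previous))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)
```

It would be used as `with recursion_limit(10000): code = m2c.export_to_java(estimator)`. Very large limits can still crash the interpreter on deep models, so reducing the number of estimators remains the safer fix. When using the CLI, the `--recursion-limit` option listed above serves the same purpose.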
**Q: Generation fails with `ImportError: No module named <module_name>` error while transpiling model from a serialized model object.**
A: This error indicates that the pickle protocol cannot deserialize the model object. To unpickle a serialized model object, its class must be defined at the top level of an importable module in the unpickling environment. Installing the package that provides the model's class definition should solve the problem.
**Q: Code generated by m2cgen gives different results for some inputs compared to the original Python model from which the code was obtained.**
A: Some models force input data to be of a particular type during the prediction phase in their native Python libraries. Currently, m2cgen works only with the ``float64`` (``double``) data type. You can try to cast your input data to another type manually and check the results again. Small differences can also arise from the specific implementation of floating-point arithmetic in the target language.
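The effect of such a type mismatch can be reproduced with the standard library alone. The sketch below rounds a value to `float32` and shows how a hypothetical tree split (`x <= threshold`) can go a different way depending on whether the input is compared as `float64` or first cast to `float32`. The threshold and input values are made up for illustration.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Hypothetical split threshold, as a model working in float32 would store it.
THRESHOLD = to_float32(0.1)

# A float64 input slightly above the threshold.
x = 0.1000000016

# Compared as float64, the sample goes to the right branch.
goes_left_float64 = x <= THRESHOLD

# Cast to float32 first, the input rounds onto the threshold and goes left.
goes_left_float32 = to_float32(x) <= THRESHOLD
```

Here `goes_left_float64` is `False` while `goes_left_float32` is `True`, so applying the same cast to the inputs before calling the generated code is worth trying when results disagree near a decision boundary.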