https://github.com/sf-tec/openmodels
OpenModels is a flexible and extensible library for serializing and deserializing machine learning models. It's designed to support any serialization format through a plugin-based architecture, providing a safe and transparent solution for exporting and sharing predictive models.
json python scikit-learn serialization sklearn
- Host: GitHub
- URL: https://github.com/sf-tec/openmodels
- Owner: SF-Tec
- License: MIT
- Created: 2024-06-07T11:52:02.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-03T10:08:53.000Z (over 1 year ago)
- Last Synced: 2025-04-14T03:12:11.698Z (about 1 year ago)
- Topics: json, python, scikit-learn, serialization, sklearn
- Language: Python
- Homepage:
- Size: 252 KB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 8
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# OpenModels
[PyPI version](https://badge.fury.io/py/openmodels)
[License: MIT](https://opensource.org/licenses/MIT)
[PyPI project](https://pypi.org/project/openmodels/)
OpenModels is a flexible and extensible library for serializing and deserializing machine learning models. It's designed to support any serialization format through a plugin-based architecture, providing a safe and transparent solution for exporting and sharing predictive models.
## Key Features
- **Format Agnostic**: Supports any serialization format through a plugin-based system.
- **Extensible**: Easily add support for new model types and serialization formats.
- **Safe**: Provides alternatives to potentially unsafe serialization methods like Pickle.
- **Transparent**: Supports human-readable formats for easy inspection of serialized models.
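To make the "Safe" point concrete, here is a minimal sketch that uses only the Python standard library (it does not use OpenModels). It shows that loading a pickle can execute code chosen by whoever produced the file, while loading JSON only yields plain data. The `Payload` class and the sample JSON keys are made up for this illustration.

```python
import json
import pickle


class Payload:
    """Illustrative object whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object itself,
        # so pickle.loads() executes that call. A real attacker could put
        # any importable callable and arguments here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs eval("6 * 7") during loading
print(result)

# JSON decoding can only produce dicts, lists, strings, numbers, booleans, and None.
data = json.loads('{"coef": [0.5, -1.2], "intercept": 0.1}')
print(data)
```

The same property is what makes text formats such as JSON easy to inspect before loading: the file describes data, not instructions.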
## Installation
```bash
pip install openmodels
```
## Quick Start
```python
from openmodels import SerializationManager, SklearnSerializer
from sklearn.decomposition import PCA
from sklearn.datasets import make_classification

# Create and train a scikit-learn model
X, _ = make_classification(
    n_samples=1000,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    random_state=0,
    shuffle=False,
)
model = PCA(n_components=2, random_state=0)
model.fit(X)

# Create a SerializationManager
manager = SerializationManager(SklearnSerializer())

# Serialize the model (default format is JSON)
serialized_model = manager.serialize(model)

# Deserialize the model
deserialized_model = manager.deserialize(serialized_model)

# Use the deserialized model
transformed_data = deserialized_model.transform(X[:5])
print(transformed_data)
```
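A serialized model is usually written to disk so it can be shared. The sketch below shows one way to do that with the standard library, assuming `manager.serialize(model)` returns a string (the Quick Start only states that the default format is JSON). The helper names `save_serialized` and `load_serialized` are hypothetical, and the payload is a placeholder rather than OpenModels' actual output.

```python
import json
import tempfile
from pathlib import Path


def save_serialized(text: str, path: Path) -> None:
    """Write an already-serialized model (a string) to disk as UTF-8."""
    path.write_text(text, encoding="utf-8")


def load_serialized(path: Path) -> str:
    """Read the serialized text back, ready to hand to a deserializer."""
    return path.read_text(encoding="utf-8")


# Placeholder payload: in real use this would be `manager.serialize(model)`.
# The keys below are invented for illustration, not OpenModels' schema.
serialized_model = json.dumps({"estimator": "PCA", "params": {"n_components": 2}})

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.json"
    save_serialized(serialized_model, model_path)
    restored = load_serialized(model_path)

# `restored` could then be passed to `manager.deserialize(restored)`.
print(restored == serialized_model)
```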
## Extensibility
OpenModels is designed to be easily extended with new serialization formats and model types.
### Adding a New Format
To add a new serialization format, create a class that implements the `FormatConverter` protocol and register it with the `FormatRegistry`:
```python
from typing import Any, Dict

from openmodels.format_registry import FormatRegistry
from openmodels.protocols import FormatConverter


class YAMLConverter(FormatConverter):
    # Requires PyYAML (pip install pyyaml); imported lazily inside each method.

    @staticmethod
    def serialize_to_format(data: Dict[str, Any]) -> str:
        import yaml

        return yaml.dump(data)

    @staticmethod
    def deserialize_from_format(formatted_data: str) -> Dict[str, Any]:
        import yaml

        # safe_load only builds plain Python data, never arbitrary objects
        return yaml.safe_load(formatted_data)


FormatRegistry.register("yaml", YAMLConverter)
```
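For readers unfamiliar with the plugin pattern, the following self-contained sketch shows the general idea of looking up a converter by format name. It is not OpenModels' implementation: `Converter`, `MiniRegistry`, and `JSONConverter` are hypothetical stand-ins that only mirror the two method names used above.

```python
import json
from typing import Any, Dict, Protocol, Type


class Converter(Protocol):
    """Hypothetical stand-in for a format converter protocol."""

    @staticmethod
    def serialize_to_format(data: Dict[str, Any]) -> str: ...

    @staticmethod
    def deserialize_from_format(formatted_data: str) -> Dict[str, Any]: ...


class MiniRegistry:
    """Hypothetical registry mapping a format name to a converter class."""

    _converters: Dict[str, Type[Converter]] = {}

    @classmethod
    def register(cls, name: str, converter: Type[Converter]) -> None:
        cls._converters[name] = converter

    @classmethod
    def get(cls, name: str) -> Type[Converter]:
        try:
            return cls._converters[name]
        except KeyError:
            raise ValueError(f"Unsupported format: {name!r}") from None


class JSONConverter:
    @staticmethod
    def serialize_to_format(data: Dict[str, Any]) -> str:
        # sort_keys keeps the output stable across runs
        return json.dumps(data, sort_keys=True)

    @staticmethod
    def deserialize_from_format(formatted_data: str) -> Dict[str, Any]:
        return json.loads(formatted_data)


MiniRegistry.register("json", JSONConverter)
converter = MiniRegistry.get("json")
text = converter.serialize_to_format({"n_components": 2})
round_trip = converter.deserialize_from_format(text)
print(text, round_trip)
```

Because callers only depend on the two converter methods, a new format can be added without touching the code that serializes models.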
### Adding a New Model Serializer
To add support for a new type of model, create a class that implements the `ModelSerializer` protocol:
```python
from typing import Any, Dict

from openmodels.protocols import ModelSerializer


class TensorFlowSerializer(ModelSerializer):
    def serialize(self, model: Any) -> Dict[str, Any]:
        # Implementation for serializing TensorFlow models
        ...

    def deserialize(self, data: Dict[str, Any]) -> Any:
        # Implementation for deserializing TensorFlow models
        ...
```
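As a rough illustration of what a `serialize`/`deserialize` pair might do, here is a self-contained sketch for a toy model. `LinearModel` and `LinearModelSerializer` are hypothetical and do not import OpenModels; the dictionary layout (including the `"type"` key) is an assumption made for this example, not the library's schema.

```python
from typing import Any, Dict, List


class LinearModel:
    """Toy model used only for this illustration."""

    def __init__(self, coef: List[float], intercept: float) -> None:
        self.coef = coef
        self.intercept = intercept

    def predict(self, x: List[float]) -> float:
        return sum(c * v for c, v in zip(self.coef, x)) + self.intercept


class LinearModelSerializer:
    """Follows the serialize/deserialize shape described above."""

    def serialize(self, model: Any) -> Dict[str, Any]:
        # Keep only plain data so the dict can go to any format converter.
        return {
            "type": "LinearModel",
            "coef": list(model.coef),
            "intercept": model.intercept,
        }

    def deserialize(self, data: Dict[str, Any]) -> Any:
        # Reject payloads that were not produced for this model type.
        if data.get("type") != "LinearModel":
            raise ValueError(f"Unexpected model type: {data.get('type')!r}")
        return LinearModel(coef=list(data["coef"]), intercept=data["intercept"])


model = LinearModel(coef=[2.0, -1.0], intercept=0.5)
serializer = LinearModelSerializer()
restored = serializer.deserialize(serializer.serialize(model))
print(restored.predict([3.0, 4.0]))
```

Validating the `"type"` field before rebuilding the object is a simple guard against feeding a payload to the wrong serializer.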
## Supported Models
OpenModels currently supports a wide range of scikit-learn models, including:
- Classification: LogisticRegression, RandomForestClassifier, SVC, etc.
- Regression: LinearRegression, RandomForestRegressor, SVR, etc.
- Clustering: KMeans
- Dimensionality Reduction: PCA
For a full list of supported models, please refer to the `SUPPORTED_ESTIMATORS` dictionary in `serializers/sklearn_serializer.py`.
## Contributing
We welcome contributions to OpenModels! Whether you want to add support for new models, implement new serialization formats, or improve the existing codebase, your help is appreciated.
Please refer to our [Contributing Guidelines](https://github.com/SF-Tec/openmodels/blob/main/CONTRIBUTING.md) for more information on how to get started.
## Running Tests
To run the tests:
1. Clone the repository:
```bash
git clone https://github.com/SF-Tec/openmodels.git
cd openmodels
```
2. Install the package and its dependencies:
```bash
pip install -e .
```
3. Run the tests:
```bash
pytest
```
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/SF-Tec/openmodels/blob/main/LICENSE) file for details.
## Changelog
For a detailed changelog, please see the [CHANGELOG.md](https://github.com/SF-Tec/openmodels/blob/main/CHANGELOG.md) file.
## Support
If you encounter any issues or have questions, please [file an issue](https://github.com/SF-Tec/openmodels/issues/new) on our GitHub repository.
We're always looking to improve OpenModels. If you have any suggestions or feature requests, please let us know!