Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/milenkovicm/lightfusion
LightGBM Inference on Datafusion
https://github.com/milenkovicm/lightfusion
batch-inference datafusion inference lightgbm machine-learning rust sql udf userdefined-functions
Last synced: about 2 months ago
JSON representation
LightGBM Inference on Datafusion
- Host: GitHub
- URL: https://github.com/milenkovicm/lightfusion
- Owner: milenkovicm
- License: mit
- Created: 2024-03-23T19:07:22.000Z (10 months ago)
- Default Branch: master
- Last Pushed: 2024-11-06T20:57:33.000Z (about 2 months ago)
- Last Synced: 2024-11-06T21:42:35.915Z (about 2 months ago)
- Topics: batch-inference, datafusion, inference, lightgbm, machine-learning, rust, sql, udf, userdefined-functions
- Language: Rust
- Homepage:
- Size: 9.82 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# LightFusion (LightGBM Inference on Datafusion)
LightFusion is an very opinionated [LightGBM](https://lightgbm.readthedocs.io/en/stable/) datafusion integration, implemented to demonstrate datafusion `FunctionFactory` functionality merge request ([arrow-datafusion/pull#9333](https://github.com/apache/arrow-datafusion/pull/9333)).
> [!NOTE]
> It has not been envisaged as a actively maintained library.Other project utilizing `FunctionFactory`:
- [Torchfusion, Opinionated Torch Inference on DataFusion](https://github.com/milenkovicm/torchfusion)
- [DataFusion JVM User Defined Functions (UDF)](https://github.com/milenkovicm/adhesive)## How to use
A LightGBM model can be defined as and SQL UDF definition:
```sql
CREATE FUNCTION f0(DOUBLE[])
RETURNS DOUBLE[]
LANGUAGE LIGHTGBM
AS 'multiclass.lgbm'
```and called from sql like:
```sql
SELECT f0([0.109, 1.261, -0.274, 2.605, 0.472, -0.429, -0.983, 1.000, -0.095, -1.219, -0.369, -0.312,
-0.840, 1.281, -0.618, -0.532, -0.132, 0.443, 0.028, 2.201, 0.044, 1.671, 0.660, -0.114,
0.574, 0.276, 0.680, -0.670]) as inferred
```> [!WARNING]
> Model used for tests could probably be way better.## Configuration
FunctionFactor exposes set of configuration options which can be retrieved querying system catalog:
```text
+------------------------+-------+---------------------------------------------------------------------------+
| name | value | description |
+------------------------+-------+---------------------------------------------------------------------------+
| lightfusion.batch_size | 1 | Batch size to be used. Valid value positive non-zero integers. Default: 1 |
+------------------------+-------+---------------------------------------------------------------------------+
```Available configuration options can be changed:
```sql
SET lightfusion.batch_size = 16
```