https://github.com/feathr-ai/feathr-online



# Feathr Online Transformation

[![PyPI version](https://badge.fury.io/py/feathrpiper.svg)](https://badge.fury.io/py/feathrpiper) [![codecov](https://codecov.io/gh/feathr-ai/feathr-online/branch/main/graph/badge.svg?token=1PX7CBZCZK)](https://codecov.io/gh/feathr-ai/feathr-online)

This project includes four components:

* The transformation core, a shared component used by all the other components.
* The standalone executable, an HTTP server that can be used to transform data. It doesn't support UDFs. Its Docker image is published to DockerHub as `feathrfeaturestore/feathrpiper:latest`.
* The Python package, which supports UDFs written in Python. It is published to PyPI as `feathrpiper` and can be installed with `pip`.
* The Java package, which supports UDFs written in Java. It is published to GitHub Package Registry as `com.linkedin.feathr.online:feathrpiper`.

## To start the docker container

Run the following command:

```bash
docker run -p 8000:8000 feathrfeaturestore/feathrpiper:latest
```
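
Once the container is up, it can help to confirm that something is accepting connections on the published port before sending requests. The following is a minimal sketch and not part of `feathrpiper`: `is_listening` is a hypothetical helper name, and it only checks that a TCP connection to `127.0.0.1:8000` succeeds. It does not call any endpoint of the service.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; the default port matches the
    `-p 8000:8000` mapping used in the `docker run` command above.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("listening" if is_listening() else "not reachable")
```

A successful check only shows that the port is open. It says nothing about whether the pipeline config loaded correctly.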

The service listens on port 8000, and you can send HTTP requests to it to transform data. By default it uses the pre-packaged config located under the `conf` directory.
To use your own config, you can mount a volume to the container, for example:

```bash
mkdir conf

cat > conf/pipeline.conf < conf/lookup.json < -l [--address ] [--port ]
```

## TODO:

- [x] Aggregation, group by, count, avg, etc.
- [x] Join
- [ ] Error tracing: for now only a string representation of the error is recorded; the full stack trace should be recorded in debug mode.
- [ ] Hosted data, Parquet, CSV, Delta Lake, etc.?
- [ ] UDF in WASM?