Ylem is an open-source platform for real-time data streaming orchestration
https://github.com/ylem-co/ylem

Ylem. The open-source data streaming platform

![GitHub branch check runs](https://img.shields.io/github/check-runs/ylem-co/ylem/main?color=green)
![Static Badge](https://img.shields.io/badge/Go-1.23-black)
![Static Badge](https://img.shields.io/badge/React-18.3.1-black)
![Static Badge](https://img.shields.io/badge/license-Apache%202.0-black)
![Static Badge](https://img.shields.io/badge/tag-v0.0.1_pre_release-black)
![Static Badge](https://img.shields.io/badge/website-ylem.co-black)
![Static Badge](https://img.shields.io/badge/documentation-docs.ylem.co-black)
![Static Badge](https://img.shields.io/badge/community-join%20Slack-black)

# Ylem
Ylem is an open-source data streaming platform: a one-stop solution for orchestrating data streams on top of Apache Kafka, Amazon SQS, Google Pub/Sub, RabbitMQ, various APIs, and data stores.

*Screenshots: user dashboard, pipeline running, pipeline log.*

# Installation

## Install Docker 4

If you don't have Docker 4 installed yet, [install it](https://www.docker.com/products/docker-desktop/) from the official Docker website for your OS.

## Install Ylem

### Option 1. Install from pre-built containers

The best way to install Ylem is to clone the repository https://github.com/ylem-co/ylem-installer and follow its installation instructions. It installs Ylem from the latest version of the pre-built containers stored on Docker Hub.

Ylem will be available at http://localhost:7331/

### Option 2. Build and install from the source

If you want to compile Ylem from source, run `docker compose up` or `docker compose up -d` (to run the containers in the background) from this repository. This compiles the code and starts all the necessary containers.

Ylem will then be available at http://127.0.0.1:7330/

:warning: Compiling from source can take a while and will keep your machine's resources busy during the build.
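
Because the build can take a while, it may be convenient to poll the address above until the UI responds. The following is a minimal sketch, not part of Ylem: the `wait_for_http` helper and its default attempt count and delay are hypothetical, and it assumes `curl` is installed and that the UI answers at the root URL with a non-error HTTP status.

```bash
#!/usr/bin/env bash
# Poll a URL until it answers without an HTTP error, or give up.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  local url="$1"
  local attempts="${2:-60}"   # hypothetical default: 60 tries
  local delay="${3:-5}"       # hypothetical default: 5 seconds between tries
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: fail on HTTP status >= 400, -o /dev/null: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "up: $url"
      return 0
    fi
    # Sleep between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}
```

For example, after `docker compose up -d`, calling `wait_for_http http://127.0.0.1:7330/` blocks until the UI answers or the attempts run out.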

#### To rebuild a particular container

If you want to rebuild a particular container from source locally, run the following:

``` bash
docker compose build --no-cache %%CONTAINER_NAME%%
```

For example:

``` bash
docker compose build --no-cache ylem_users
```
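
When several services change at once, the same command can be wrapped in a small loop. This is a sketch under stated assumptions rather than a script shipped with Ylem: the `run` and `rebuild` helpers and the `DRY_RUN` variable are hypothetical names. By default it only prints the commands it would execute. The service names defined in the compose file can be listed with `docker compose config --services`.

```bash
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Rebuild each given service without the build cache, then restart it.
rebuild() {
  local service
  for service in "$@"; do
    run docker compose build --no-cache "$service"
    run docker compose up -d "$service"
  done
}
```

Calling `rebuild ylem_users` prints the two commands for that service; setting `DRY_RUN=0` beforehand executes them instead.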

# Using your own Apache Kafka cluster

Ylem uses Apache Kafka to exchange messages for processing pipelines and tasks. By default, Ylem comes with a pre-configured Apache Kafka container. However, you might already have an Apache Kafka cluster in your infrastructure and want to reuse it.

In this case, follow the steps described [in our documentation](https://docs.ylem.co/open-source-edition/usage-of-apache-kafka).
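
Before switching to an existing cluster, it can help to confirm that its brokers are reachable from the machine running the Ylem containers. The sketch below is only an illustration: the broker addresses, the `BROKERS` variable, and the `split_brokers` and `check_brokers` helpers are hypothetical, and it assumes a `nc` (netcat) variant that supports `-z` and `-w`. A successful TCP connection does not prove that the Kafka configuration itself is correct.

```bash
#!/usr/bin/env bash
# Hypothetical broker list; replace with your own bootstrap servers.
BROKERS="${BROKERS:-kafka-1.example.internal:9092,kafka-2.example.internal:9092}"

# Print one "host port" pair per line from a comma-separated host:port list.
split_brokers() {
  local IFS=','
  local broker
  for broker in $1; do
    echo "${broker%:*} ${broker##*:}"
  done
}

# Try a plain TCP connection to every broker; return non-zero if any fails.
check_brokers() {
  local host port failed=0
  while read -r host port; do
    # </dev/null keeps nc from consuming the loop's input
    if nc -z -w 3 "$host" "$port" </dev/null; then
      echo "reachable:   $host:$port"
    else
      echo "unreachable: $host:$port"
      failed=1
    fi
  done <<EOF
$(split_brokers "$1")
EOF
  return "$failed"
}
```

Running `check_brokers "$BROKERS"` reports each broker and returns a non-zero status if any of them cannot be reached.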

# Configuring environment variables in .env files

Some integrations require extra setup steps and values defined in `.env` files. Configure them only for the integrations you plan to use.

The list of such integrations and more information about them is in [our documentation](https://docs.ylem.co/open-source-edition/configuring-integrations-with-.env-variables).
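
A quick way to catch a forgotten variable is to check a `.env` file for the keys an integration needs before starting the containers. This is a minimal sketch: the `missing_env_keys` helper and the key names in the usage example are hypothetical, and the real variable names are listed in the documentation linked above.

```bash
#!/usr/bin/env bash
# Print every key from the argument list that has no KEY=... line in the file.
# A key with an empty value (KEY=) still counts as defined.
# Usage: missing_env_keys ENV_FILE KEY [KEY...]
missing_env_keys() {
  local env_file="$1" key
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$env_file" 2>/dev/null || echo "$key"
  done
}
```

For example, `missing_env_keys .env EXAMPLE_API_KEY EXAMPLE_API_SECRET` prints the names of the keys that are not yet defined in `.env`, and prints nothing when all of them are present.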

# Folder structure in this repository

Ylem is a set of microservices. Each microservice runs in one or more containers on the same network, and the microservices communicate with each other via APIs.

``` bash
|-- api # api microservice
|-- backend
|--|-- integrations # integrations with external APIs, databases and other software
|--|-- pipelines # pipelines, tasks, connectors
|--|-- statistics # statistics of pipeline and task runs
|--|-- users # users and organizations
|-- database # a container for storing databases for all the microservices
|-- processor
|--|-- python_processor # processor of the Python code written in pipelines
|--|-- taskrunner # task runner and load balancer
|-- server # Nginx container in front of all the microservice APIs, which avoids CORS issues on the UI side
|-- ui # user interface
```

Each microservice has its own README file containing more information about its usage and functionality.

# Documentation

The user and developer documentation of Ylem is available at https://docs.ylem.co/.

The [open-source section](https://docs.ylem.co/open-source-edition) contains information about the [task-processing architecture](https://docs.ylem.co/open-source-edition/task-processing-architecture) and [configuration of integrations](https://docs.ylem.co/open-source-edition/configuring-integrations-with-.env-variables) using .env files and parameters.

# Explore our additional integration packages

| Integration | Repository | Description |
| ------------- | ---------- | ----------- |
| Apache Kafka | https://github.com/ylem-co/ylem-kafka-trigger | Containerized Apache Kafka listener to stream data to Ylem |
| RabbitMQ | https://github.com/ylem-co/ylem-rabbitmq-consumer | Containerized RabbitMQ consumer to stream data to Ylem |
| AWS S3 | https://github.com/ylem-co/s3-lambda-trigger | AWS Lambda function to stream data from AWS S3 to Ylem |
| Tableau | https://github.com/ylem-co/tableau-http-wrapper | Containerized HTTP wrapper to stream data from Ylem to Tableau |

# Key contributors

* [olschaefer](https://github.com/olschaefer)
* [schneekatze](https://github.com/schneekatze)
* [lunoshot](https://github.com/lunoshot)
* [Ardem](https://github.com/Ardem)