https://github.com/esnet/stardust-flow-pipeline

The data processing pipeline the ESnet Stardust project uses to enrich metadata.

# Stardust Flow Pipeline

This repository contains the Stardust Flow Pipeline, which uses [Logstash](https://www.elastic.co/logstash/) to enrich flow data. It is based on the [NetSage flow pipeline](https://github.com/netsage-project/netsage-pipeline) but contains additional features added by ESnet as part of its internal efforts.

In production, this is the component that reads flow data generated by [pmacct](https://github.com/pmacct/pmacct) and sent to [Kafka](https://kafka.apache.org), adds metadata from various sources, and sends the enriched records to [Elasticsearch](https://www.elastic.co).

This repository is made available primarily for informational purposes; getting it working in a production environment takes significant effort. Specifically, it assumes you have a working Kafka cluster and a working Elasticsearch cluster, both of which are significant undertakings. Example settings used to configure Elasticsearch are in `docs/elastic`.

## Running the Docker container
This repository can be customized and used to build a Docker image of the basic pipeline. Most of the logic is in the `pipeline` directory. Some basic instructions are below:

1. Copy the `env.example` file to `.env`
```
cp env.example .env
```
2. Edit `.env` with the credentials for your Kafka and Elasticsearch deployments.
3. Copy the server certificate for your Elasticsearch instance to `pipeline_etc/certificates/elastic.cer`. _NOTE: You can alternatively edit `pipeline/99-outputs.conf` if you need to further adjust the Elasticsearch SSL settings._
4. Copy the keystore containing the server certificate for your Kafka instance to `pipeline_etc/certificates/kafka_ca.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to further adjust the Kafka SSL settings._
5. Copy the keystore containing the client certificate used to authenticate to your Kafka instance to `pipeline_etc/certificates/kafka_user.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to adjust the Kafka authentication settings._
6. Edit any metadata in `pipeline_etc` as needed.
7. Build and start the docker container:
```
docker-compose up --build -d
```
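
Because the build depends on several files being in place, it can help to check for them first. The following is a minimal sketch of such a preflight check. The function name `check_pipeline_files` is hypothetical and not part of the repository, and the sketch assumes you kept the default paths from steps 1 to 5 rather than changing them in `pipeline/01-inputs.conf` or `pipeline/99-outputs.conf`.

```shell
#!/usr/bin/env bash
# Illustrative preflight check (hypothetical helper, not shipped with the repo):
# confirm the files named in steps 1-5 exist before building the image.

check_pipeline_files() {
    local root="${1:-.}"   # path to the repository checkout; defaults to "."
    local missing=0
    local f
    for f in \
        ".env" \
        "pipeline_etc/certificates/elastic.cer" \
        "pipeline_etc/certificates/kafka_ca.p12" \
        "pipeline_etc/certificates/kafka_user.p12"
    do
        if [ ! -f "$root/$f" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $f" >&2
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

After sourcing the function, one possible use is `check_pipeline_files . && docker-compose up --build -d`, which starts the build only when all four files are present. The check covers file presence only; it does not validate the certificates or the credentials in `.env`.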