https://github.com/esnet/stardust-flow-pipeline
The data processing pipeline the ESnet Stardust project uses to enrich metadata.
- Host: GitHub
- URL: https://github.com/esnet/stardust-flow-pipeline
- Owner: esnet
- License: other
- Created: 2022-11-16T17:26:02.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-11-16T17:49:05.000Z (over 3 years ago)
- Last Synced: 2025-02-28T09:11:54.324Z (over 1 year ago)
- Language: Ruby
- Size: 37.2 MB
- Stars: 0
- Watchers: 33
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Stardust Flow Pipeline
This repository contains the Stardust Flow Pipeline, based on [Logstash](https://www.elastic.co/logstash/), for flow data enrichment. It is derived from the [NetSage flow pipeline](https://github.com/netsage-project/netsage-pipeline) but contains additional features added by ESnet as part of its internal efforts.
In production, this is the component that reads flow data generated by [pmacct](https://github.com/pmacct/pmacct) and sent to [Kafka](https://kafka.apache.org), adds metadata from various sources, and sends the enriched records to [Elasticsearch](https://www.elastic.co).
This repository is made available primarily for informational purposes; getting it working in a production environment takes significant effort. Specifically, it assumes you already have a working Kafka cluster and a working Elasticsearch cluster, each of which is a significant undertaking. Example settings used for configuring Elasticsearch are in `docs/elastic`.
## Running Docker container
This repository can be customized and used to build a Docker image of the basic pipeline. Most of the logic is in the `pipeline` directory. Basic instructions follow:
1. Copy the `env.example` file to `.env`
```
cp env.example .env
```
2. Edit `.env` with credentials for your Kafka and Elasticsearch deployments.
3. Copy the server certificate for your Elasticsearch instance to `pipeline_etc/certificates/elastic.cer`. _NOTE: You can alternatively edit `pipeline/99-outputs.conf` if you need to further adjust Elasticsearch SSL settings._
4. Copy the keystore containing the server certificate for your Kafka instance to `pipeline_etc/certificates/kafka_ca.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to further adjust Kafka SSL settings._
5. Copy the keystore containing the client certificate used to authenticate to your Kafka instance to `pipeline_etc/certificates/kafka_user.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to adjust Kafka authentication settings._
6. Edit any metadata in `pipeline_etc` as needed
7. Build and start the docker container:
```
docker-compose up --build -d
```
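
Before building, it can help to confirm that the files from steps 1 to 5 are in place. The following is a minimal sketch and is not part of the repository: the function name `check_pipeline_files` is hypothetical, and it only checks that the paths named in the steps above exist, not that the certificates or credentials are valid.

```shell
#!/bin/sh
# Hypothetical preflight check (an illustration, not shipped with the repository).
# Verifies that the files named in steps 1-5 exist under the given repository
# root (defaults to the current directory). Prints each missing path and
# returns non-zero if any file is absent.
check_pipeline_files() {
    root="${1:-.}"
    missing=0
    for f in \
        .env \
        pipeline_etc/certificates/elastic.cer \
        pipeline_etc/certificates/kafka_ca.p12 \
        pipeline_etc/certificates/kafka_user.p12
    do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

With the function defined in your shell, one possible usage from the repository root is `check_pipeline_files . && docker-compose up --build -d`, which only starts the build if all four files are present.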