{"id":19756651,"url":"https://github.com/esnet/stardust-flow-pipeline","last_synced_at":"2025-07-19T03:32:27.002Z","repository":{"id":77092385,"uuid":"566916931","full_name":"esnet/stardust-flow-pipeline","owner":"esnet","description":"The data processing pipeline the ESnet Stardust project uses to enrich metadata.","archived":false,"fork":false,"pushed_at":"2022-11-16T17:49:05.000Z","size":39011,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":33,"default_branch":"main","last_synced_at":"2025-02-28T09:11:54.324Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/esnet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-16T17:26:02.000Z","updated_at":"2022-11-23T16:48:18.000Z","dependencies_parsed_at":"2023-02-24T22:15:28.340Z","dependency_job_id":null,"html_url":"https://github.com/esnet/stardust-flow-pipeline","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/esnet/stardust-flow-pipeline","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/esnet%2Fstardust-flow-pipeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/esnet%2Fstardust-flow-pipeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/esnet%2Fstardust-flow-pipeline/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/esnet%2Fstardust-flow-pipeline/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/esnet","download_url":"https://codeload.github.com/esnet/stardust-flow-pipeline/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/esnet%2Fstardust-flow-pipeline/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265883613,"owners_count":23843792,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T03:16:33.001Z","updated_at":"2025-07-19T03:32:26.981Z","avatar_url":"https://github.com/esnet.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Stardust Flow Pipeline\n\nThis repository contains the Stardust Flow Pipeline based on [Logstash](https://www.elastic.co/logstash/) for flow data enrichment. It is based on the [NetSage flow pipeline](https://github.com/netsage-project/netsage-pipeline) but contains additional features added by ESnet as part of its internal efforts.\n\nIn production, this is the component that reads flow data generated by [pmacct](https://github.com/pmacct/pmacct) and sent to [Kafka](https://kafka.apache.org), adds metadata from various sources, and sends the result to [Elasticsearch](https://www.elastic.co).\n\nThis repository is made available primarily for informational purposes and takes significant effort to get working in a production environment. Specifically, it assumes you have a working Kafka cluster and a working Elasticsearch cluster, each of which is a significant undertaking. 
You can find some example settings that are used for configuring Elasticsearch in `docs/elastic`.\n\n## Running the Docker container\nThis repository can be customized and used to build a Docker image of the basic pipeline. Most of the logic can be found in the `pipeline` directory. Some basic instructions are below:\n\n1. Copy the `env.example` file to `.env`\n```\ncp env.example .env\n```\n2. Edit `.env` with credentials for your Kafka and Elasticsearch deployments.\n3. Copy the server certificate for your Elasticsearch instance to `pipeline_etc/certificates/elastic.cer`. _NOTE: You can alternatively edit `pipeline/99-outputs.conf` if you need to further adjust Elasticsearch SSL settings._\n4. Copy the keystore containing the server certificate for your Kafka instance to `pipeline_etc/certificates/kafka_ca.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to further adjust Kafka SSL settings._\n5. Copy the keystore containing the client certificate used to authenticate to your Kafka instance to `pipeline_etc/certificates/kafka_user.p12`. _NOTE: You can alternatively edit `pipeline/01-inputs.conf` if you need to adjust Kafka authentication settings._\n6. Edit any metadata in `pipeline_etc` as needed.\n7. Build and start the Docker container:\n```\ndocker-compose up --build -d\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fesnet%2Fstardust-flow-pipeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fesnet%2Fstardust-flow-pipeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fesnet%2Fstardust-flow-pipeline/lists"}