{"id":40943301,"url":"https://github.com/stratosphereips/netflowlabeler","last_synced_at":"2026-01-22T04:37:30.220Z","repository":{"id":44366824,"uuid":"372807768","full_name":"stratosphereips/netflowlabeler","owner":"stratosphereips","description":"A configurable rule-based labeling tool for network flow files.","archived":false,"fork":false,"pushed_at":"2023-05-22T08:18:16.000Z","size":359,"stargazers_count":16,"open_issues_count":12,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-09-05T01:35:54.420Z","etag":null,"topics":["data-science","dataset-generation","datasets","labeler","netflow","network-traffic","tool","zeek","zeek-analysis"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stratosphereips.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-06-01T11:37:16.000Z","updated_at":"2025-08-25T16:59:01.000Z","dependencies_parsed_at":"2025-04-13T04:52:03.374Z","dependency_job_id":"6ac3e770-82f6-4479-8413-79e372771289","html_url":"https://github.com/stratosphereips/netflowlabeler","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/stratosphereips/netflowlabeler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stratosphereips%2Fnetflowlabeler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stratosphereips%2Fnetflowlabeler/tags","releases_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/stratosphereips%2Fnetflowlabeler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stratosphereips%2Fnetflowlabeler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stratosphereips","download_url":"https://codeload.github.com/stratosphereips/netflowlabeler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stratosphereips%2Fnetflowlabeler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28654886,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T01:17:37.254Z","status":"online","status_checked_at":"2026-01-22T02:00:07.137Z","response_time":144,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","dataset-generation","datasets","labeler","netflow","network-traffic","tool","zeek","zeek-analysis"],"created_at":"2026-01-22T04:37:30.159Z","updated_at":"2026-01-22T04:37:30.207Z","avatar_url":"https://github.com/stratosphereips.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NetflowLabeler\n[![Docker Image CI](https://github.com/stratosphereips/netflowlabeler/actions/workflows/docker-image.yml/badge.svg)](https://github.com/stratosphereips/netflowlabeler/actions/workflows/docker-image.yml)\n![GitHub last commit 
(branch)](https://img.shields.io/github/last-commit/stratosphereips/netflowlabeler/main)\n![Docker Pulls](https://img.shields.io/docker/pulls/stratosphereips/netflowlabeler?color=green)\n\n\n_Authors: Sebastian Garcia and Veronica Valeros, Stratosphere Laboratory, CTU in Prague_\n\nNetflowLabeler is a Python tool to add labels to text-based network flow files. To label a netflow file, simply add the labels and conditions to a configuration file, then use this tool to assign them. The assignment of labels adheres to our own label ontology, which is structured as a customizable configuration file. Within the configuration file, you have the ability to incorporate both generic and detailed labels. Currently, the tool supports Zeek files that are delimited by TABS. However, future updates will expand its capabilities to include Zeek files in JSON and CSV formats, Argus files in CSV and TABS formats, Nfdump files in CSV format, and Suricata files in JSON format.\n\n- __netflowlabeler.py__ can label conn.log files based on a configuration file.\n- __zeek-files-labeler.py__ can label the rest of the Zeek log files, using the labels in the conn.log file.\n\n\n## Usage\n\nTo label a conn.log file from a configuration file:\n\n```python\nnetflowlabeler.py -c \u003cconfigFile\u003e [-v \u003cverbose\u003e] [-d DEBUG] -f \u003cnetflowFile\u003e [-h]\n```\nTo label the rest of the Zeek files using an already labeled conn.log file (conn.log.labeled):\n\n```python\nzeek-files-labeler.py -l conn.log.labeled -f folder-with-zeek-log-files\n```\n\n## Features\n\n- You can have AND and OR conditions\n- You can have generic labels and detailed labels\n- You can use negative conditions\n- All columns that can be interpreted as numbers can be compared with \u003c, \u003e, \u003c= and \u003e=\n- You can add comments in any place\n- You can use CIDR notation for IP ranges\n- You can label all the Zeek log files, by using the labels you put in the conn.log file\n\n## Example Configuration 
File of Labels\n\nAn example of the configuration file syntax is shown below:\n\n```yaml\nBackground:\n    - srcIP=all\n# Here the generic label is Background and the detailed label is ARP\nBackground, ARP: \n    - Proto=ARP\nMalicious, From_Malware:\n    - srcIP=10.0.0.34\nMalicious-More, From_Other_Malware:\n    - srcIP!=10.0.0.34 \u0026 dstPort=23\nMalicious-HEre, From_This_Malware:\n    - srcIP=10.0.0.34 \u0026 State=SF\nMalicious, From_Local_Link_IPv6:\n    - srcIP=fe80::1dfe:6c38:93c9:c808\nTest-State:\n    - srcIP=10.0.0.34 \u0026 State=S0\nTest-largebytes:\n   - Bytes\u003e=100\nTest-smallbytes:\n   - Bytes\u003c=100\nBenign, FromWindows:\n    - Proto=UDP \u0026 srcIP=147.32.84.165 \u0026 dstPort=53     # (AND conditions go in one line)\n    - Proto=TCP \u0026 dstIP=1.1.1.1 \u0026 dstPort=53           # (all new lines are OR conditions)\n```\n\n0. The first part of the label is the generic label (Benign); the part after the comma is the detailed description (FromWindows). Avoid using colons, commas, spaces, or TABs in the detailed description.\n1. If there is no comma, then the detailed label is empty.\n2. Don't use quotes for the text.\n3. Labels are assigned from top to bottom.\n4. Each new label supersedes and overwrites the previous match.\n\nThe position of a rule determines its priority. Rules are checked in order: if the first rule matches, its label is assigned; then the second rule is checked, and so on. Because each new match overwrites the previous one, the last matching rule determines the final label.\n\n\nThese are the possible fields that you can use in a configuration file to create the rules used for labeling.\n\n- Date\n- start\n- Duration\n- Proto\n- srcIP\n- srcPort\n- dstIP\n- dstPort\n- State\n- Tos\n- Packets\n- Bytes\n- Flows\n\nThe fields 'Bytes', 'Packets' and 'IPBytes' are computed in Zeek from the fields for the src and dst values. For example, Bytes=srcbytes + dstbytes\n\n## Docker Image\n\nNetflow labeler has a public docker image with the latest version. \n\nTo test that the labeler is working correctly, run the following command. 
The command will run the netflow labeler tool on a Zeek example conn.log file and then cat the labeled file to the standard output. You should see the fresh labels in the output (e.g.: search for the string 'Test-smallbytes').\n\n```bash\ndocker run --tty -it stratosphereips/netflowlabeler:latest /bin/bash -c 'python3 netflowlabeler.py -c labels.config  -f examples/conn.tab.log ; cat examples/conn.tab.log.labeled'\n```\n\nTo mount your logs path to the container and run the netflow labeler interactively:\n```bash\ndocker run -v /full/path/to/logs/:/netflowlabeler/data --rm -it stratosphereips/netflowlabeler:latest /bin/bash\n```\n\nTo mount your logs path to the container and automatically run the netflow labeler on it with your own labels.config file:\n```bash\ndocker run -v /full/path/to/logs/:/netflowlabeler/data --rm -it stratosphereips/netflowlabeler:latest python3 netflowlabeler.py -c data/labels.config -f data/conn.log\n```\n\n## Netflow Labeler High Level Diagram\n\n```mermaid\nflowchart LR;\n    NetFlow[\"Netflow File\"]--\u003elabeler;\n    Config[\"Labels Config\"]--\u003elabeler;\n    subgraph ONE[\"Interpret Input File\"]\n        labeler--\u003eload_conditions;\n        load_conditions--\u003eprocess_netflow;\n        process_netflow--\u003edefine_type;\n        define_type--\u003edefine_columns;\n    end\n    subgraph TWO[\"Label NetFlow File\"]\n        define_columns-.-\u003eprocess_argus;\n        define_columns-.-\u003eprocess_nfdump;\n        define_columns--\u003eprocess_zeek;\n        process_argus-.-\u003eoutput_netflow_line_to_file;\n        process_nfdump-.-\u003eoutput_netflow_line_to_file;\n        process_zeek--\u003eoutput_netflow_line_to_file;\n    end\n    output_netflow_line_to_file--\u003eOutput[\"Labeled NetFlow 
File\"];\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstratosphereips%2Fnetflowlabeler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstratosphereips%2Fnetflowlabeler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstratosphereips%2Fnetflowlabeler/lists"}