{"id":13509889,"url":"https://github.com/cloudflare/goflow","last_synced_at":"2025-02-26T06:13:40.816Z","repository":{"id":43522923,"uuid":"123599091","full_name":"cloudflare/goflow","owner":"cloudflare","description":"The high-scalability sFlow/NetFlow/IPFIX collector used internally at Cloudflare.","archived":false,"fork":false,"pushed_at":"2025-01-23T17:59:58.000Z","size":196,"stargazers_count":898,"open_issues_count":35,"forks_count":180,"subscribers_count":42,"default_branch":"master","last_synced_at":"2025-02-19T05:15:17.951Z","etag":null,"topics":["cisco","flow","go","ipfix","juniper","kafka","netflow","sflow"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cloudflare.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-02T15:55:12.000Z","updated_at":"2025-02-19T05:00:11.000Z","dependencies_parsed_at":"2024-01-25T04:47:46.860Z","dependency_job_id":"a8b67c33-db84-4b24-96a2-9454ac3078c3","html_url":"https://github.com/cloudflare/goflow","commit_stats":{"total_commits":72,"total_committers":18,"mean_commits":4.0,"dds":0.5416666666666667,"last_synced_commit":"742cddc5dc37bac910151b9519e2720776ba404d"},"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudflare%2Fgoflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudflare%2Fgoflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudflare%2Fg
oflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudflare%2Fgoflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cloudflare","download_url":"https://codeload.github.com/cloudflare/goflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240801103,"owners_count":19859729,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cisco","flow","go","ipfix","juniper","kafka","netflow","sflow"],"created_at":"2024-08-01T02:01:16.104Z","updated_at":"2025-02-26T06:13:40.738Z","avatar_url":"https://github.com/cloudflare.png","language":"Go","funding_links":[],"categories":["Go","kafka","Network Monitoring","Networking \u0026 Performance"],"sub_categories":["SD-WAN","Traffic Analysis \u0026 Filtering"],"readme":"# GoFlow\n\n\u003e [!WARNING]\n\u003e This software is no longer maintained. 
We advise replacing your production use of this software with the fork [goflow2](https://github.com/netsampler/goflow2).\n\nThis application is a NetFlow/IPFIX/sFlow collector in Go.\n\nIt gathers network information (IP, interfaces, routers) from different flow protocols,\nserializes it in a protobuf format and sends the messages to Kafka using the Sarama library.\n\n## Why\n\nThe diversity of devices and the volume of network samples at Cloudflare required its own pipeline.\nWe focused on building tools that could be easily monitored and maintained.\nThe main goal is to have full visibility of a network while allowing other teams to develop on it.\n\n### Modularity\n\nIn order to enable load-balancing and optimizations, the GoFlow library has a `decoder` which converts\nthe payload of a flow packet into a Go structure.\n\nThe `producer` functions (one per protocol) then convert those structures into a protobuf (`pb/flow.pb`)\nwhich contains the fields a network engineer is interested in.\nA flow packet usually contains multiple samples;\nthe protobuf acts as an abstraction of a single sample.\n\nThe `transport` provides different ways of processing the protobuf. 
Either sending it via Kafka or\nprinting it on the console.\n\nFinally, `utils` provides functions that are used directly by the CLI tools.\nGoFlow wraps all of these functions and chains them to produce bytes into Kafka.\nThere is also one CLI tool per protocol.\n\nYou can build your own collector using this base and replace parts:\n* Use a different transport (eg: RabbitMQ instead of Kafka)\n* Convert to another format (eg: Cap'n Proto, Avro, instead of protobuf)\n* Decode different samples (eg: not only IP networks, add MPLS)\n* Use a different metrics system (eg: [expvar](https://golang.org/pkg/expvar/) instead of Prometheus)\n\n### Protocol difference\n\nThe sampling protocols can be very different:\n\n**sFlow** is a stateless protocol which sends the full header of a packet with router information\n(interfaces, destination AS) while **NetFlow/IPFIX** rely on templates that contain fields (eg: source IPv6).\n\nThe sampling rate in NetFlow/IPFIX is provided by **Option Data Sets**. 
This is why it can take a few minutes\nbefore packets can be decoded: all the templates (**Option Template** and **Data Template**) must be received first.\n\nBoth of these protocols bundle multiple samples (**Data Set** in NetFlow/IPFIX and **Flow Sample** in sFlow)\nin one packet.\n\nThe advantage of using an abstract network flow format, such as protobuf, is that it enables summing over the\nprotocols (eg: per ASN or per port, rather than per (ASN, router) and (port, router)).\n\n## Features\n\nCollection:\n* NetFlow v5\n* IPFIX/NetFlow v9\n  * Handles sampling rate provided by the Option Data Set\n* sFlow v5: RAW, IPv4, IPv6, Ethernet samples, Gateway data, router data, switch data\n\nProduction:\n* Converts to protobuf\n* Sends to Kafka\n* Prints to the console\n\nMonitoring:\n* Prometheus metrics\n* Time to decode\n* Sample rates\n* Payload information\n* NetFlow Templates\n\n## Run\n\nDownload the latest release and just run the following command:\n\n```\n./goflow -h\n```\n\nEnable or disable a protocol using `-nf=false` or `-sflow=false`.\nDefine the port and addresses of the protocols using `-nf.addr`, `-nf.port` for NetFlow and `-sflow.addr`, `-sflow.port` for sFlow.\n\nSet the brokers or the Kafka brokers SRV record using: `-kafka.brokers 127.0.0.1:9092,[::1]:9092` or `-kafka.srv`.\nDisable Kafka sending with `-kafka=false`.\nYou can hash the protobuf by key when you send it to Kafka.\n\nYou can collect NetFlow/IPFIX, NetFlow v5 and sFlow using the same collector\nor use the single-protocol collectors.\n\nYou can define the number of workers per protocol using `-workers`.\n\n## Docker\n\nWe also provide an all-in-one Docker container. 
To run it in debug mode without sending to Kafka:\n\n```\n$ sudo docker run --net=host -ti cloudflare/goflow:latest -kafka=false\n```\n\n## Environment\n\nFor an example pipeline, check out [flow-pipeline](https://github.com/cloudflare/flow-pipeline).\n\n### How is it used at Cloudflare\n\nThe samples flowing into Kafka are **processed** and special fields are inserted using other databases:\n* User plan\n* Country\n* ASN and BGP information\n\nThe extended protobuf shares the same base as the one in this repo. **Compatibility** with other software\nis preserved when adding new fields (although the new fields will be lost if re-serialized).\n\nOnce the updated flows are back in Kafka, they are **consumed** by **database inserters** (ClickHouse, Amazon Redshift, Google BigTable...)\nto allow for static analysis. Other teams access the network data just like any other log (SQL query).\n\n### Output format\n\nIf you want to develop applications, compile `pb/flow.proto` for the language you want:\n\nExample in Go:\n\n```\nPROTOCPATH=$HOME/go/bin/ make proto\n```\n\nExample in Java:\n\n```\nexport SRC_DIR=\"path/to/goflow-pb\"\nexport DST_DIR=\"path/to/java/app/src/main/java\"\nprotoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/flow.proto\n```\n\nThe fields are listed in the following table.\n\nYou can find information on how they are populated from the original source:\n* For [sFlow](https://sflow.org/developers/specifications.php)\n* For [NetFlow v5](https://www.cisco.com/c/en/us/td/docs/net_mgmt/netflow_collection_engine/3-6/user/guide/format.html)\n* For [NetFlow v9](https://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html)\n* For [IPFIX](https://www.iana.org/assignments/ipfix/ipfix.xhtml)\n\n| Field | Description | NetFlow v5 | sFlow | NetFlow v9 | IPFIX |\n| - | - | - | - | - | - |\n|Type|Type of flow message|NETFLOW_V5|SFLOW_5|NETFLOW_V9|IPFIX|\n|TimeReceived|Timestamp of when the message was 
received|Included|Included|Included|Included|\n|SequenceNum|Sequence number of the flow packet|Included|Included|Included|Included|\n|SamplingRate|Sampling rate of the flow|Included|Included|Included|Included|\n|FlowDirection|Direction of the flow| | |DIRECTION (61)|flowDirection (61)|\n|SamplerAddress|Address of the device that generated the packet|IP source of packet|Agent IP|IP source of packet|IP source of packet|\n|TimeFlowStart|Time the flow started|System uptime and first|=TimeReceived|System uptime and FIRST_SWITCHED (22)|flowStartXXX (150, 152, 154, 156)|\n|TimeFlowEnd|Time the flow ended|System uptime and last|=TimeReceived|System uptime and LAST_SWITCHED (21)|flowEndXXX (151, 153, 155, 157)|\n|Bytes|Number of bytes in flow|dOctets|Length of sample|IN_BYTES (1) OUT_BYTES (23)|octetDeltaCount (1) postOctetDeltaCount (23)|\n|Packets|Number of packets in flow|dPkts|=1|IN_PKTS (2) OUT_PKTS (24)|packetDeltaCount (2) postPacketDeltaCount (24)|\n|SrcAddr|Source address (IP)|srcaddr (IPv4 only)|Included|IPV4_SRC_ADDR (8) IPV6_SRC_ADDR (27)|sourceIPv4Address (8) sourceIPv6Address (27)|\n|DstAddr|Destination address (IP)|dstaddr (IPv4 only)|Included|IPV4_DST_ADDR (12) IPV6_DST_ADDR (28)|destinationIPv4Address (12) destinationIPv6Address (28)|\n|Etype|Ethernet type (0x86dd for IPv6...)|IPv4|Included|Included|Included|\n|Proto|Protocol (UDP, TCP, ICMP...)|prot|Included|PROTOCOL (4)|protocolIdentifier (4)|\n|SrcPort|Source port (when UDP/TCP/SCTP)|srcport|Included|L4_SRC_PORT (7)|sourceTransportPort (7)|\n|DstPort|Destination port (when UDP/TCP/SCTP)|dstport|Included|L4_DST_PORT (11)|destinationTransportPort (11)|\n|InIf|Input interface|input|Included|INPUT_SNMP (10)|ingressInterface (10)|\n|OutIf|Output interface|output|Included|OUTPUT_SNMP (14)|egressInterface (14)|\n|SrcMac|Source MAC address| |Included|IN_SRC_MAC (56)|sourceMacAddress (56)|\n|DstMac|Destination MAC address| |Included|OUT_DST_MAC (57)|postDestinationMacAddress (57)|\n|SrcVlan|Source 
VLAN ID| |From ExtendedSwitch|SRC_VLAN (58)|vlanId (58)|\n|DstVlan|Destination VLAN ID| |From ExtendedSwitch|DST_VLAN (59)|postVlanId (59)|\n|VlanId|802.1Q VLAN ID| |Included|SRC_VLAN (58)|postVlanId (59)|\n|IngressVrfID|VRF ID| | | |ingressVRFID (234)|\n|EgressVrfID|VRF ID| | | |egressVRFID (235)|\n|IPTos|IP Type of Service|tos|Included|SRC_TOS (5)|ipClassOfService (5)|\n|ForwardingStatus|Forwarding status| | |FORWARDING_STATUS (89)|forwardingStatus (89)|\n|IPTTL|IP Time to Live| |Included|IPTTL (52)|minimumTTL (52)|\n|TCPFlags|TCP flags|tcp_flags|Included|TCP_FLAGS (6)|tcpControlBits (6)|\n|IcmpType|ICMP Type| |Included|ICMP_TYPE (32)|icmpTypeXXX (176, 178) icmpTypeCodeXXX (32, 139)|\n|IcmpCode|ICMP Code| |Included|ICMP_TYPE (32)|icmpCodeXXX (177, 179) icmpTypeCodeXXX (32, 139)|\n|IPv6FlowLabel|IPv6 Flow Label| |Included|IPV6_FLOW_LABEL (31)|flowLabelIPv6 (31)|\n|FragmentId|IP Fragment ID| |Included|IPV4_IDENT (54)|fragmentIdentification (54)|\n|FragmentOffset|IP Fragment Offset| |Included|FRAGMENT_OFFSET (88)|fragmentOffset (88) and fragmentFlags (197)|\n|BiFlowDirection|BiFlow Identification| | | |biflowDirection (239)|\n|SrcAS|Source AS number|src_as|From ExtendedGateway|SRC_AS (16)|bgpSourceAsNumber (16)|\n|DstAS|Destination AS number|dst_as|From ExtendedGateway|DST_AS (17)|bgpDestinationAsNumber (17)|\n|NextHop|Nexthop address|nexthop|From ExtendedGateway|IPV4_NEXT_HOP (15) BGP_IPV4_NEXT_HOP (18) IPV6_NEXT_HOP (62) BGP_IPV6_NEXT_HOP (63)|ipNextHopIPv4Address (15) bgpNextHopIPv4Address (18) ipNextHopIPv6Address (62) bgpNextHopIPv6Address (63)|\n|NextHopAS|Nexthop AS number| |From ExtendedGateway| | |\n|SrcNet|Source address mask|src_mask|From ExtendedRouter|SRC_MASK (9) IPV6_SRC_MASK (29)|sourceIPv4PrefixLength (9) sourceIPv6PrefixLength (29)|\n|DstNet|Destination address mask|dst_mask|From ExtendedRouter|DST_MASK (13) IPV6_DST_MASK (30)|destinationIPv4PrefixLength (13) destinationIPv6PrefixLength (30)|\n|HasEncap|Indicates if has GRE 
encapsulation||Included|||\n|xxxEncap fields|Same as field but inside GRE||Included|||\n|HasMPLS|Indicates the presence of an MPLS header||Included|||\n|MPLSCount|Count of MPLS layers||Included|||\n|MPLSxTTL|TTL of the MPLS label||Included|||\n|MPLSxLabel|MPLS label||Included|||\n\nIf you are implementing flow processors to add more data to the protobuf,\nwe suggest you use field IDs ≥ 1000.\n\n### Implementation notes\n\nThe pipeline at Cloudflare connects collectors with flow processors\nthat add more information: given an IP address, they add the country, ASN, etc.\n\nFor aggregation, we use materialized tables in ClickHouse.\nDictionaries help correlate flows with countries and ASNs.\nA few collectors can handle hundreds of thousands of samples.\n\nWe also successfully experimented with flow aggregation in Flink using a\n[Keyed Session Window](https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#session-windows):\nthis sums the `Bytes x SamplingRate` and `Packets x SamplingRate` received during a 5-minute **window** while allowing 2 more minutes\nbefore closing the **session** in case some flows were delayed.\n\nThe BGP information provided by routers can be unreliable (if the router does not have a full BGP table or the route is static).\nYou can use MaxMind [prefix to ASN](https://dev.maxmind.com/geoip/geoip2/geolite2/) to work around this issue.\n\n## License\n\nLicensed under the BSD 3-Clause License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudflare%2Fgoflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcloudflare%2Fgoflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudflare%2Fgoflow/lists"}