{"id":17287444,"url":"https://github.com/utoni/ndpid","last_synced_at":"2025-04-06T03:05:52.756Z","repository":{"id":41399429,"uuid":"254873584","full_name":"utoni/nDPId","owner":"utoni","description":"Tiny nDPI based deep packet inspection daemons / toolkit.","archived":false,"fork":false,"pushed_at":"2025-03-06T18:07:47.000Z","size":154647,"stargazers_count":76,"open_issues_count":7,"forks_count":18,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-30T02:04:23.665Z","etag":null,"topics":["daemon","dpi","json","json-serialization","libndpi","linux","ndpi","toolkit"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/utoni.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-11T13:27:53.000Z","updated_at":"2025-03-15T08:32:18.000Z","dependencies_parsed_at":"2024-05-13T09:47:18.034Z","dependency_job_id":"743d5e5d-f71d-43e2-9af3-d58c7808f9dc","html_url":"https://github.com/utoni/nDPId","commit_stats":{"total_commits":745,"total_committers":31,"mean_commits":"24.032258064516128","dds":"0.18120805369127513","last_synced_commit":"9fc35e7a7e29d9a346865651387a02514080c6b4"},"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utoni%2FnDPId","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utoni%2FnDPId/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utoni%2FnDPId/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utoni%2FnDPId/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/utoni","download_url":"https://codeload.github.com/utoni/nDPId/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427005,"owners_count":20937200,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["daemon","dpi","json","json-serialization","libndpi","linux","ndpi","toolkit"],"created_at":"2024-10-15T10:02:35.960Z","updated_at":"2025-04-06T03:05:52.743Z","avatar_url":"https://github.com/utoni.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build](https://github.com/utoni/nDPId/actions/workflows/build.yml/badge.svg)](https://github.com/utoni/nDPId/actions/workflows/build.yml)\n[![Gitlab-CI](https://gitlab.com/utoni/nDPId/badges/main/pipeline.svg)](https://gitlab.com/utoni/nDPId/-/pipelines)\n[![Circle-CI](https://circleci.com/gh/utoni/nDPId.svg?style=shield \"Circle-CI\")](https://app.circleci.com/pipelines/github/utoni/nDPId)\n[![Lines of Code](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=ncloc)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n[![Code 
Smells](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=code_smells)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n[![Bugs](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=bugs)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n[![Vulnerabilities](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=vulnerabilities)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n[![Reliability Rating](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=reliability_rating)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n[![Quality Gate Status](https://sonarcloud.io/api/project_badges/measure?project=lnslbrty_nDPId\u0026metric=alert_status)](https://sonarcloud.io/summary/new_code?id=lnslbrty_nDPId)\n![Docker Automated build](https://img.shields.io/docker/automated/utoni/ndpid)\n\n# References\n\n[ntop Webinar 2022](https://www.ntop.org/webinar/ntop-webinar-on-dec-14th-community-meeting-and-future-plans/)\n[ntopconf 2023](https://www.ntop.org/ntopconf2023/)\n\n# Disclaimer\n\nPlease respect \u0026 protect the privacy of others.\n\nThe purpose of this software is not to spy on others, but to detect network anomalies and malicious traffic.\n\n# Abstract\n\nnDPId is a set of daemons and tools to capture, process and classify network traffic.\nIts minimal dependencies (besides a half-way modern C library and POSIX threads) are libnDPI (\u003e=4.13.0 or current github dev branch) and libpcap.\n\nThe daemon `nDPId` is capable of multithreading for packet processing, but w/o mutexes for performance reasons.\nInstead, synchronization is achieved by a packet distribution mechanism.\nTo balance the workload to all threads (more or less) equally, a unique identifier represented as hash value is calculated using a 3-tuple consisting of: IPv4/IPv6 src/dst address; IP header value of the layer4 protocol; and (for 
TCP/UDP) src/dst port. Other protocols e.g. ICMP/ICMPv6 lack relevance for DPI, thus nDPId does not distinguish between different ICMP/ICMPv6 flows coming from the same host. This saves memory and improves performance, but might change in the future.\n\n`nDPId` uses libnDPI's JSON serialization interface to generate a JSON message for each event it receives from the library, which it then sends to a UNIX-socket (default: `/tmp/ndpid-collector.sock`). From such a socket, `nDPIsrvd` (or other custom applications) can retrieve incoming JSON-messages and process them further or distribute them to higher-level applications.\n\nUnfortunately, `nDPIsrvd` does not yet support any encryption/authentication for TCP connections (TODO!).\n\n# Architecture\n\nThis project uses a kind of microservice architecture.\n\n```text\n                connect to UNIX socket [1]        connect to UNIX/TCP socket [2]                \n_______________________   |                                 |   __________________________\n|     \"producer\"      |___|                                 |___|       \"consumer\"       |\n|---------------------|      _____________________________      |------------------------|\n|                     |      |        nDPIsrvd           |      |                        |\n| nDPId --- Thread 1 \u003e| ---\u003e |\u003e           |             \u003c| ---\u003e |\u003c example/c-json-stdout |\n| (eth0) `- Thread 2 \u003e| ---\u003e |\u003e collector | distributor \u003c| ---\u003e |________________________|\n|        `- Thread N \u003e| ---\u003e |\u003e    \u003e\u003e\u003e forward \u003e\u003e\u003e      \u003c| ---\u003e |                        |\n|_____________________|  ^   |____________|______________|   ^  |\u003c example/py-flow-info  |\n|                     |  |                                   |  |________________________|\n| nDPId --- Thread 1 \u003e|  `- send serialized data [1]         |  |                        |\n| (eth1) `- Thread 2 
\u003e|                                      |  |\u003c example/...           |\n|        `- Thread N \u003e|         receive serialized data [2] -'  |________________________|\n|_____________________|                                                                   \n\n```\nwhere:\n\n* `nDPId` capture traffic, extract traffic data (with libnDPI) and send a JSON-serialized output stream to an already existing UNIX-socket;\n* `nDPIsrvd`:\n\n    * create and manage an \"incoming\" UNIX-socket (ref [1] above), to fetch data from a local `nDPId`;\n    * apply a buffering logic to received data;\n    * create and manage an \"outgoing\" UNIX or TCP socket (ref [2] above) to relay matched events\n      to connected clients\n\n* `consumers` are common/custom applications being able to receive selected flows/events, via both UNIX-socket or TCP-socket.\n\n\n# JSON stream format\n\nJSON messages streamed by both `nDPId` and `nDPIsrvd` are presented with:\n\n* a 5-digit-number describing (as decimal number) the **entire** JSON message including the newline `\\n` at the end;\n* the JSON messages\n\n```text\n[5-digit-number][JSON message]\n```\n\nas with the following example:\n\n```text\n01223{\"flow_event_id\":7,\"flow_event_name\":\"detection-update\",\"thread_id\":12,\"packet_id\":307,\"source\":\"wlan0\", ...snip...}\n00458{\"packet_event_id\":2,\"packet_event_name\":\"packet-flow\",\"thread_id\":11,\"packet_id\":324,\"source\":\"wlan0\", ...snip...}\n00572{\"flow_event_id\":1,\"flow_event_name\":\"new\",\"thread_id\":11,\"packet_id\":324,\"source\":\"wlan0\", ...snip...}\n```\n\nThe full stream of `nDPId` generated JSON-events can be retrieved directly from `nDPId`, without relying on `nDPIsrvd`, by providing a properly managed UNIX-socket.\n\nTechnical details about the JSON-message format can be obtained from the related `.schema` file included in the `schema` directory\n\n\n# Events\n\n`nDPId` generates JSON messages whereby each string is assigned to a certain 
event.\nThose events specify the contents (key-value-pairs) of the JSON message.\nThey are divided into four categories, each with a number of subevents.\n\n## Error Events\nThere are 17 distinct events, indicating that layer2 or layer3 packet processing failed or that not enough flow memory was available:\n\n1. Unknown datalink layer packet\n2. Unknown L3 protocol\n3. Unsupported datalink layer\n4. Packet too short\n5. Unknown packet type\n6. Packet header invalid\n7. IP4 packet too short\n8. Packet smaller than IP4 header\n9. nDPI IPv4/L4 payload detection failed\n10. IP6 packet too short\n11. Packet smaller than IP6 header\n12. nDPI IPv6/L4 payload detection failed\n13. TCP packet smaller than expected\n14. UDP packet smaller than expected\n15. Captured packet size is smaller than expected packet size\n16. Max flows to track reached\n17. Flow memory allocation failed\n\nDetailed JSON-schema is available [here](schema/error_event_schema.json)\n\n## Daemon Events\nThere are 4 distinct events indicating startup/shutdown or status events as well as a reconnect event if there was a previous connection failure (collector):\n\n1. init: `nDPId` startup\n2. reconnect: (UNIX) socket connection was lost previously and has been established again\n3. shutdown: `nDPId` terminates gracefully\n4. status: statistics about the daemon itself e.g. memory consumption, zLib compressions (if enabled)\n\nDetailed JSON-schema is available [here](schema/daemon_event_schema.json)\n\n\n## Packet Events\nThere are 2 events containing base64 encoded packet payloads either belonging to a flow or not:\n\n1. packet: does not belong to any flow\n2. packet-flow: belongs to a flow e.g. TCP/UDP or ICMP\n\nDetailed JSON-schema is available [here](schema/packet_event_schema.json)\n\n## Flow Events\nThere are 9 distinct events related to a flow:\n\n1. new: a new TCP/UDP/ICMP flow seen which will be tracked\n2. end: a TCP connection terminates\n3. 
idle: a flow timed out, because there was no packet on the wire for a certain amount of time\n4. update: inform nDPIsrvd or other apps about a long-lasting flow, whose detection was finished a long time ago but is still active\n5. analyse: provide some information about extracted features of a flow (Experimental; disabled by default, enable with `-A`)\n6. guessed: `libnDPI` was not able to reliably detect a layer7 protocol and falls back to IP/Port based detection\n7. detected: `libnDPI` successfully detected a layer7 protocol\n8. detection-update: `libnDPI` dissected more layer7 protocol data (after detection already done)\n9. not-detected: neither detected nor guessed\n\nDetailed JSON-schema is available [here](schema/flow_event_schema.json). Also, a graphical representation of the *Flow Events* timeline is available [here](schema/flow_events_diagram.png). \n\n# Flow States\n\nA flow can have three different states while it is being tracked by `nDPId`.\n\n1. skipped: the flow will be tracked, but no detection will happen to reduce memory usage.\n   See command line arguments `-I` and `-E`\n2. finished: detection finished and the memory used for the detection is freed\n3. info: detection is in progress and all flow memory required for `libnDPI` is allocated (this state consumes most memory)\n\n# Build (CMake)\n\nThe `nDPId` build system is based on [CMake](https://cmake.org/).\n\n```shell\ngit clone https://github.com/utoni/nDPId.git\n[...]\ncd nDPId\nmkdir build\ncd build\ncmake ..\n[...]\nmake\n```\n\nSee below for a full live test session:\n\n![](examples/ndpid_install_and_run.gif)\n\nDepending on your build environment and/or preferences, you may need:\n\n```shell\nmkdir build\ncd build\nccmake ..\n```\n\nor to build with a statically linked libnDPI:\n\n```shell\ncmake -S . 
-B ./build \\\n    -DSTATIC_LIBNDPI_INSTALLDIR=[path/to/your/libnDPI/installdir] \\\n    -DNDPI_NO_PKGCONFIG=ON\ncmake --build ./build\n```\n\nIf you use the latter, make sure that you've configured libnDPI with `./configure --prefix=[path/to/your/libnDPI/installdir]`\nand remember to set the all-necessary CMake variables to link against shared libraries used by your nDPI build.\nYou'll also need to use `-DNDPI_NO_PKGCONFIG=ON` if `STATIC_LIBNDPI_INSTALLDIR` does not contain a pkg-config file.\n\ne.g.:\n\n```shell\ncmake -S . -B ./build \\\n    -DSTATIC_LIBNDPI_INSTALLDIR=[path/to/your/libnDPI/installdir] \\\n    -DNDPI_NO_PKGCONFIG=ON \\\n    -DNDPI_WITH_GCRYPT=ON -DNDPI_WITH_PCRE=OFF -DNDPI_WITH_MAXMINDDB=OFF\ncmake --build ./build\n```\n\nOr let a shell script do the work for you:\n\n```shell\ncmake -S . -B ./build \\\n    -DBUILD_NDPI=ON\ncmake --build ./build\n```\n\nThe CMake cache variable `-DBUILD_NDPI=ON` builds a version of `libnDPI` residing as a git submodule in this repository.\n\n# run\n\nAs mentioned above, in order to run `nDPId`, a UNIX-socket needs to be provided in order to stream our related JSON-data.\n\nSuch a UNIX-socket can be provided by both the included `nDPIsrvd` daemon, or, if you simply need a quick check, with the [ncat](https://nmap.org/book/ncat-man.html) utility, with a simple `ncat -U /tmp/listen.sock -l -k`. Remember that OpenBSD `netcat` is not able to handle multiple connections reliably.\n\nOnce the socket is ready, you can run `nDPId` capturing and analyzing your own traffic, with something similar to: `sudo nDPId -c /tmp/listen.sock`\nIf you're using OpenBSD `netcat`, you need to run: `sudo nDPId -c /tmp/listen.sock -o max-reader-threads=1`\nMake sure that the UNIX socket is accessible by the user (see -u) to whom nDPId changes to, default: nobody.\n\nOf course, both `ncat` and `nDPId` need to point to the same UNIX-socket (`nDPId` provides the `-c` option, exactly for this. 
By default, `nDPId` refers to `/tmp/ndpid-collector.sock`, and the same default path is also used by `nDPIsrvd` for the incoming socket).\n\nGive `nDPId` some real traffic. You can capture your own traffic with something similar to:\n\n```shell\nsocat -u UNIX-Listen:/tmp/listen.sock,fork - # does the same as `ncat`\nsudo chown nobody:nobody /tmp/listen.sock # default `nDPId` user/group, see `-u` and `-g`\nsudo ./nDPId -c /tmp/listen.sock -l\n```\n\n`nDPId` also supports UDP collector endpoints:\n\n```shell\nnc -d -u 127.0.0.1 7000 -l -k\nsudo ./nDPId -c 127.0.0.1:7000 -l\n```\n\nor you can generate an nDPId-compatible JSON dump with:\n\n```shell\n./nDPId-test [path-to-a-PCAP-file]\n```\n\nYou can also start both `nDPId` and `nDPIsrvd` automatically with:\n\nDaemons:\n```shell\nmake -C [path-to-a-build-dir] daemon\n```\n\nOr a manual approach with:\n\n```shell\n./nDPIsrvd -d\nsudo ./nDPId -d\n```\n\nor for a usage printout:\n```shell\n./nDPIsrvd -h\n./nDPId -h\n```\n\nAnd why not a flow-info example?\n```shell\n./examples/py-flow-info/flow-info.py\n```\n\nor anything below `./examples`.\n\n# nDPId tuning\n\nIt is possible to change `nDPId` internals w/o recompiling by using `-o subopt=value`.\nBut be careful: changing the default values is not well tested and may render `nDPId` useless.\n\nSuboptions for `-o`:\n\nFormat: `subopt` (unit, comment): description\n\n * `max-flows-per-thread` (N, caution advised): affects max. memory usage\n * `max-idle-flows-per-thread` (N, safe): max. allowed idle flows whose memory gets freed after `flow-scan-interval`\n * `max-reader-threads` (N, safe): number of packet processing threads, every thread can have a max. 
of `max-flows-per-thread` flows\n * `daemon-status-interval` (ms, safe): specifies how often daemon event `status` is generated\n * `compression-scan-interval` (ms, untested): specifies how often `nDPId` scans for inactive flows ready for compression\n * `compression-flow-inactivity` (ms, untested): the shortest period of time elapsed before `nDPId` considers compressing a flow (e.g. nDPI flow struct) that neither sent nor received any data\n * `flow-scan-interval` (ms, safe): min. amount of time after which `nDPId` scans for idle or long-lasting flows\n * `generic-max-idle-time` (ms, untested): time after which a non TCP/UDP/ICMP flow times out\n * `icmp-max-idle-time` (ms, untested): time after which an ICMP flow times out\n * `udp-max-idle-time` (ms, caution advised): time after which an UDP flow times out\n * `tcp-max-idle-time` (ms, caution advised): time after which a TCP flow times out\n * `tcp-max-post-end-flow-time` (ms, caution advised): a TCP flow that received a FIN or RST waits this amount of time before flow tracking stops and the flow memory is freed\n * `max-packets-per-flow-to-send` (N, safe): max. `packet-flow` events generated for the first N packets of each flow\n * `max-packets-per-flow-to-process` (N, caution advised): max. amount of packets processed by `libnDPI`\n * `max-packets-per-flow-to-analyze` (N, safe): max. packets to analyze before sending an `analyse` event, requires `-A`\n * `error-event-threshold-n` (N, safe): max. error events to send until threshold time has passed\n * `error-event-threshold-time` (N, safe): time after which the error event threshold resets\n\n# test\n\nThe recommended way to run regression / diff tests:\n\n```shell\ncmake -S . 
-B ./build-like-ci \\\n    -DBUILD_NDPI=ON -DENABLE_ZLIB=ON -DBUILD_EXAMPLES=ON\n# optional: -DENABLE_CURL=ON -DENABLE_SANITIZER=ON\n./test/run_tests.sh ./libnDPI ./build-like-ci/nDPId-test\n# or: make -C ./build-like-ci test\n```\n\nRun `./test/run_tests.sh` to see some usage information.\n\nRemember that all test results are tied to a specific libnDPI commit hash\nas part of the `git submodule`. Using `test/run_tests.sh` for other commit hashes\nwill most likely result in PCAP diffs.\n\n# Code Coverage\n\nYou may generate code coverage by using:\n\n```shell\ncmake -S . -B ./build-coverage \\\n    -DENABLE_COVERAGE=ON -DENABLE_ZLIB=ON\n# optional: -DBUILD_NDPI=ON\nmake -C ./build-coverage coverage-clean\nmake -C ./build-coverage clean\nmake -C ./build-coverage all\n./test/run_tests.sh ./libnDPI ./build-coverage/nDPId-test\nmake -C ./build-coverage coverage\nmake -C ./build-coverage coverage-view\n```\n\n# Contributors\n\nSpecial thanks to Damiano Verzulli ([@verzulli](https://github.com/verzulli)) from [GARRLab](https://www.garrlab.it) for providing server and test infrastructure.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Futoni%2Fndpid","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Futoni%2Fndpid","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Futoni%2Fndpid/lists"}