{"id":19543510,"url":"https://github.com/hitsz-ids/duetector","last_synced_at":"2025-04-26T17:32:29.097Z","repository":{"id":187387862,"uuid":"676793330","full_name":"hitsz-ids/duetector","owner":"hitsz-ids","description":"duetector🔍: Data Usage Extensible Detector for data usage observability.","archived":false,"fork":false,"pushed_at":"2024-11-04T14:23:50.000Z","size":2071,"stargazers_count":10,"open_issues_count":12,"forks_count":8,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-04T15:30:37.263Z","etag":null,"topics":["bcc","data-usage","ebpf","kata-containers","observability"],"latest_commit_sha":null,"homepage":"https://dataucon.idslab.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hitsz-ids.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-10T03:08:35.000Z","updated_at":"2024-11-04T14:23:53.000Z","dependencies_parsed_at":"2023-12-26T01:44:15.246Z","dependency_job_id":"7b0554de-df49-4dce-932a-9ef63a20fd24","html_url":"https://github.com/hitsz-ids/duetector","commit_stats":null,"previous_names":["hitsz-ids/duetector"],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitsz-ids%2Fduetector","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitsz-ids%2Fduetector/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hitsz-ids%2Fduetector/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hi
tsz-ids%2Fduetector/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hitsz-ids","download_url":"https://codeload.github.com/hitsz-ids/duetector/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224041285,"owners_count":17245874,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bcc","data-usage","ebpf","kata-containers","observability"],"created_at":"2024-11-11T03:19:20.729Z","updated_at":"2024-11-11T03:19:22.310Z","avatar_url":"https://github.com/hitsz-ids.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/hitsz-ids/dataucon\"\u003e\u003cimg alt=\"DataUCon\" src=\"https://raw.githubusercontent.com/hitsz-ids/dataucon/main/img/white-icon-simple.png\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch2 align=\"center\"\u003eduetector🔍: Data Usage Extensible detector(eBPF Support)\u003c/h2\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector/actions\"\u003e\u003cimg alt=\"Actions Status\" src=\"https://github.com/hitsz-ids/duetector/actions/workflows/python-package.yml/badge.svg\"\u003e\u003c/a\u003e\n\u003ca href='https://duetector.readthedocs.io/en/latest/?badge=latest'\u003e\u003cimg src='https://readthedocs.org/projects/duetector/badge/?version=latest' alt='Documentation Status' /\u003e\u003c/a\u003e\n\u003ca href=\"https://results.pre-commit.ci/latest/github/hitsz-ids/duetector/main\"\u003e\u003cimg alt=\"pre-commit.ci status\" 
src=\"https://results.pre-commit.ci/badge/github/hitsz-ids/duetector/main.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector/blob/main/LICENSE\"\u003e\u003cimg alt=\"LICENSE\" src=\"https://img.shields.io/github/license/hitsz-ids/duetector\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector/releases/\"\u003e\u003cimg alt=\"Releases\" src=\"https://img.shields.io/github/v/release/hitsz-ids/duetector\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector/releases/\"\u003e\u003cimg alt=\"Pre Releases\" src=\"https://img.shields.io/github/v/release/hitsz-ids/duetector?include_prereleases\u0026label=pre-release\u0026logo=github\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector\"\u003e\u003cimg alt=\"Last Commit\" src=\"https://img.shields.io/github/last-commit/hitsz-ids/duetector\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector\"\u003e\u003cimg alt=\"Python version\" src=\"https://img.shields.io/pypi/pyversions/duetector\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/hitsz-ids/duetector/contributors\"\u003e\u003cimg alt=\"contributors\" src=\"https://img.shields.io/github/all-contributors/hitsz-ids/duetector?color=ee8449\u0026style=flat-square\"\u003e\u003c/a\u003e\n\u003ca href=\"https://join.slack.com/t/hitsz-ids/shared_invite/zt-2395mt6x2-dwf0j_423QkAgGvlNA5E1g\"\u003e\u003cimg alt=\"slack\" src=\"https://img.shields.io/badge/slack-join%20chat-ff69b4.svg?style=flat-square\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n \u003ca href=\"./README.md\"\u003eEnglish\u003c/a\u003e | \u003ca href=\"./README_zh.md\"\u003e中文\u003c/a\u003e\n\u003c/p\u003e\n\n## Introduction\n\n\u003e duetector is one of the components in the DataUCON project, which is designed to provide support for data usage control. 
[Intro DataUCON](https://dataucon.idslab.io/).\n\nduetector🔍 is an extensible data usage control detector that provides support for data usage control by probing for data usage behavior in the Linux kernel (based on eBPF).\n\n**🐛🐞🧪 The project is under heavy development; bug reports, feature requests, and pull requests are welcome!**\n\nIn the [ABAUC control model](https://github.com/hitsz-ids/dataucon), duetector can be used as a PIP (Policy Information Point) to obtain data usage behavior and provide that information to the PDP (Policy Decision Point).\n\nTry a simple use case: [Simplest Open Count](./docs/usercases/simplest-open-count/README.md).\n\nJoin our [Slack channel](https://join.slack.com/t/hitsz-ids/shared_invite/zt-2395mt6x2-dwf0j_423QkAgGvlNA5E1g).\n\n## Table of Contents\n\n- [Features](#Features)\n- [Installation](#Installation)\n- [Quick Start](#quick-start)\n- [API](#API-documentation)\n- [Maintainers](#Maintainers)\n- [Contributors](#Contributors)\n- [How to contribute](#How-to-contribute)\n- [License](#License)\n\n## Features\n\n- Plug-in system support, see [examples](./examples/README.md) for more details\n  - [x] Custom `Tracer` and `TracerManager`\n  - [x] Custom `Filters` and `FilterManager`\n  - [x] Custom `Collector` and `CollectorManager`\n  - [x] Custom `Analyzer` and `AnalyzerManager`\n- Configuration Management\n  - [x] Configuration using a single configuration file\n  - [x] Generate Plugin Configuration\n  - [ ] Support for dynamically loading configurations\n- `Tracer` Support\n  - [x] eBPF-based tracer\n  - [x] Shell command tracer\n  - [x] Subprocess tracer\n- `Filter` Support\n  - [x] Pattern matching, based on regular expressions\n- Data Collection and Analysis\n  - [x] `Analyzer` supports SQL databases\n  - [x] `Collector` supports SQL databases and *OpenTelemetry (Experimental)*\n- User Interface\n  - [x] CLI Tools\n  - [x] PIP 
Service\n  - [ ] Control Panel\n- Enhancements\n  - [ ] `RunC` container identification\n\nThe eBPF program requires kernel support; see [Kernel Support](./docs/kernel_config.md).\n\n## Installation\n\nThe package is distributed via PyPI and can be installed with the following command:\n\n```bash\npip install duetector\n```\n\nCurrently, the code relies on [BCC](https://github.com/iovisor/bcc) for on-the-fly compilation of eBPF code, so we recommend [installing the latest BCC compiler](https://github.com/iovisor/bcc/blob/master/INSTALL.md).\n\nAlternatively, use the Docker image that we provide, which uses [JupyterLab](https://github.com/jupyterlab/jupyterlab) as the **example** user application. You can modify the [Dockerfile](./docker/Dockerfile) and [startup script](./docker/start.sh) to customize the user application.\n\n```bash\ndocker pull dataucon/duetector:latest\n```\n\nPre-releases are not published under the `latest` tag; to pull one, specify its tag, e.g. `v0.0.1a`:\n\n```bash\ndocker pull dataucon/duetector:v0.0.1a\n```\n\nFor more details on running with Docker images, see [here](./docs/how-to/run-with-docker.md).\n\n## Quick start\n\n\u003e More documentation and examples can be found [here](./docs/).\n\n### Start detector\n\nStart monitoring from the command line. Because BCC requires root privileges, we use `sudo`. This starts all probes and writes the collected data to the `duetector-dbcollector.sqlite3` file in the current directory:\n\n```bash\nsudo duectl start\n```\n\nPress `Ctrl+C` to stop monitoring; a summary is then printed to the screen:\n\n```\n{'DBCollector': {'OpenTracer': {'count': 31, 'first at': 249920233249912, 'last': Tracking(tracer='OpenTracer', pid=641616, uid=1000, gid= 1000, comm='node', cwd=None, fname='SOME-FILE', timestamp=249923762308577, extended={})}}}\n```\n\nEnable `DEBUG` logging:\n\n```bash\nsudo DUETECTOR_LOG_LEVEL=DEBUG duectl start\n```\n\nAt startup, the configuration file will be automatically generated at 
`~/.config/duetector`; to use a different configuration file, pass `--config`.\n\n```bash\nsudo duectl start --config \u003cconfig-file-path\u003e\n```\n\nConfiguration using environment variables is also supported:\n\n```bash\nUsage: duectl start [OPTIONS]\n\n  Start A bcc monitor and wait for KeyboardInterrupt\n\nOptions:\n  ...\n  --load_env BOOLEAN            Weather load env variables,Prefix: DUETECTOR_,\n                                Separator:__, e.g. DUETECTOR_config__a means\n                                config.a, default: True\n  ...\n```\n\nWhen using a plugin, the default configuration file does not contain the plugin's configuration. Use the `generate-dynamic-config` command to generate a configuration file that includes it; this command also supports merging existing configuration files and environment variables.\n\n```bash\nduectl generate-dynamic-config --help\n```\n\nUse `generate-config` to restore the default state in case of configuration file errors.\n\n```bash\nduectl generate-config\n```\n\nTo run in the background, use the `duectl-daemon start` command, which starts a daemon; stop it with `duectl-daemon stop`.\n\nUse `duectl-daemon --help` for more details:\n\n```bash\nUsage: duectl-daemon [OPTIONS] COMMAND [ARGS]...\n\nOptions:\n  --help  Show this message and exit.\n\nCommands:\n  start   Start a background process of command `duectl start`.\n  status  Show status of process.\n  stop    Stop the process.\n```\n\n### Analyzing with analyzer\n\nWe provide an [Analyzer](https://duetector.readthedocs.io/en/latest/analyzer/index.html) that can query the data in storage. Try it in the [use case](./docs/usercases/simplest-open-count/README.md).\n\n### Using duetector server\n\nWe provide a Duetector Server as an external PIP service and control interface.\n\nA Duetector Server can be started using `duectl-server` and will listen on `0.0.0.0:8120` 
by default; you can change the address using `--host` and `--port`.\n\n```bash\n$ duectl-server start --help\nUsage: duectl-server start [OPTIONS]\n\n  Start duetector server\n\nOptions:\n  --config TEXT       Config file path, default:\n                      ``~/.config/duetector/config.toml``.\n  --load_env BOOLEAN  Weather load env variables, Prefix: ``DUETECTOR_``,\n                      Separator:``__``, e.g. ``DUETECTOR_config__a`` means\n                      ``config.a``, default: True\n  --workdir TEXT      Working directory, default: ``.``.\n  --host TEXT         Host to listen, default: ``0.0.0.0``.\n  --port INTEGER      Port to listen, default: ``8120``.\n  --workers INTEGER   Number of worker processes, default: ``1``.\n  --help              Show this message and exit.\n```\n\nAfter the service has started, visit `http://{ip}:{port}/docs` to see the API documentation.\n\nSimilarly, you can run a Duetector Server in the background using `duectl-server-daemon start` and stop it using `duectl-server-daemon stop`.\n\n```bash\n$ duectl-server-daemon\nUsage: duectl-server-daemon [OPTIONS] COMMAND [ARGS]...\n\nOptions:\n  --help  Show this message and exit.\n\nCommands:\n  start   Start a background process of command ``duectl-server start``.\n  status  Show status of process.\n  stop    Stop the process.\n```\n\n## API documentation\n\nSee the [duetector documentation](https://duetector.readthedocs.io/).\n\n## Maintainers\n\nThis project was initiated by the **Institute of Data Security, Harbin Institute of Technology (Shen Zhen)**. If you are interested in this project and the [DataUCON](https://dataucon.idslab.io/) project and would like to help improve them, you are welcome to join our open source community.\n\n## Contributors\n\n\u003c!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section --\u003e\n\n\u003c!-- prettier-ignore-start --\u003e\n\n\u003c!-- markdownlint-disable --\u003e\n\n\u003ctable\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd 
align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/wh1isper\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/43375501?v=4?s=100\" width=\"100px;\" alt=\"wh1isper\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003ewh1isper\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#code-wh1isper\" title=\"Code\"\u003e💻\u003c/a\u003e\u003c/td\u003e\n      \u003ctd align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/WYXsb\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/62527555?v=4?s=100\" width=\"100px;\" alt=\"MayDown\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eMayDown\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#code-WYXsb\" title=\"Code\"\u003e💻\u003c/a\u003e\u003c/td\u003e\n      \u003ctd align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/tsdsnk\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/93241244?v=4?s=100\" width=\"100px;\" alt=\"tsdsnk\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003etsdsnk\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#doc-tsdsnk\" title=\"Documentation\"\u003e📖\u003c/a\u003e\u003c/td\u003e\n      \u003ctd align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/zhemulin\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/89471919?v=4?s=100\" width=\"100px;\" alt=\"zhemulin\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003ezhemulin\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#doc-zhemulin\" title=\"Documentation\"\u003e📖\u003c/a\u003e\u003c/td\u003e\n      \u003ctd align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/aklly\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/87172923?v=4?s=100\" width=\"100px;\" alt=\"Mortal\"/\u003e\u003cbr 
/\u003e\u003csub\u003e\u003cb\u003eMortal\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#doc-aklly\" title=\"Documentation\"\u003e📖\u003c/a\u003e\u003c/td\u003e\n      \u003ctd align=\"center\" valign=\"top\" width=\"14.28%\"\u003e\u003ca href=\"https://github.com/mingzhedream\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/58738872?v=4?s=100\" width=\"100px;\" alt=\"mingzhedream\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003emingzhedream\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#doc-mingzhedream\" title=\"Documentation\"\u003e📖\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\n\u003c!-- markdownlint-restore --\u003e\n\n\u003c!-- prettier-ignore-end --\u003e\n\n\u003c!-- ALL-CONTRIBUTORS-LIST:END --\u003e\n\n## How to contribute\n\nStart with the [good first issue](https://github.com/hitsz-ids/duetector/issues/70) and read our [contributing guidelines](./CONTRIBUTING.md).\n\nLearn about the design and architecture of this project here: [docs/design](./docs/design/README.md).\n\n## License\n\nThis project uses the Apache-2.0 license; please refer to [LICENSE](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhitsz-ids%2Fduetector","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhitsz-ids%2Fduetector","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhitsz-ids%2Fduetector/lists"}