{"id":15192461,"url":"https://github.com/dcso/balboa","last_synced_at":"2025-10-27T18:30:20.412Z","repository":{"id":44535582,"uuid":"152093398","full_name":"DCSO/balboa","owner":"DCSO","description":"server for indexing and querying passive DNS observations ","archived":false,"fork":false,"pushed_at":"2025-07-07T21:36:36.000Z","size":3003,"stargazers_count":46,"open_issues_count":13,"forks_count":7,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-07-08T00:56:10.998Z","etag":null,"topics":["api","dns","golang","graphql","graphql-api","hacktoberfest","monitoring","passive","passive-dns","passivedns","pdns","rocksdb","security","suricata"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DCSO.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-10-08T14:25:40.000Z","updated_at":"2025-04-17T16:10:02.000Z","dependencies_parsed_at":"2024-06-20T02:44:42.288Z","dependency_job_id":"b15c9e81-a904-4153-8a42-dad376633d1c","html_url":"https://github.com/DCSO/balboa","commit_stats":{"total_commits":178,"total_committers":12,"mean_commits":"14.833333333333334","dds":0.5674157303370786,"last_synced_commit":"f3d1bd62580713ec8394f830be496a3fa2dea363"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/DCSO/balboa","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DCSO%2Fbalboa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DCSO%2Fbalboa/tags","releases_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DCSO%2Fbalboa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DCSO%2Fbalboa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DCSO","download_url":"https://codeload.github.com/DCSO/balboa/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DCSO%2Fbalboa/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281319623,"owners_count":26481035,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-27T02:00:05.855Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","dns","golang","graphql","graphql-api","hacktoberfest","monitoring","passive","passive-dns","passivedns","pdns","rocksdb","security","suricata"],"created_at":"2024-09-27T21:40:21.870Z","updated_at":"2025-10-27T18:30:19.653Z","avatar_url":"https://github.com/DCSO.png","language":"C","readme":"# 📑 balboa\n\n![Build Status](https://github.com/DCSO/balboa/actions/workflows/go.yml/badge.svg)\n[![Go Report Card](https://goreportcard.com/badge/github.com/DCSO/balboa)](https://goreportcard.com/report/github.com/DCSO/balboa)\n\n\nbalboa is the BAsic Little Book Of Answers. 
It consumes and indexes\nobservations from [passive DNS](https://www.farsightsecurity.com/technical/passive-dns/)\ncollection, providing a [GraphQL](https://graphql.org/) interface to access\nthe aggregated contents of the observations database. We built balboa to handle\npassive DNS data aggregated from metadata gathered by\n[Suricata](https://suricata-ids.org).\n\nThe API should be suitable for integration into existing multi-source\nobservable integration frameworks. It is possible to produce results in a\n[Common Output Format](https://datatracker.ietf.org/doc/draft-dulaunoy-dnsop-passive-dns-cof/)\ncompatible schema using either a GraphQL API (see below) or a REST API compatible with\n[CIRCL's](https://www.circl.lu/services/passive-dns/).\n\nThe balboa software...\n\n- is fast for queries and input/updates\n- implements storage using pluggable backends, potentially on separate machines\n- supports tracking and specifically querying multiple sensors\n- makes use of multiple cores for query and ingest\n- accepts input from multiple sources simultaneously\n  - HTTP (POST)\n  - AMQP\n  - Unix socket\n  - network socket (NMSG format only)\n- can tag and filter observations based on various properties\n- can store observations to one or multiple backends based on matched selectors\n- accepts various input formats\n  - JSON-based\n    - [FEVER](https://github.com/DCSO/fever)\n    - [gopassivedns](https://github.com/Phillipmartin/gopassivedns)\n    - [Packetbeat](https://www.elastic.co/guide/en/beats/packetbeat/master/packetbeat-dns-options.html) (via\n      Logstash)\n    - [Suricata EVE DNS v1 and v2](http://suricata.readthedocs.io/en/latest/output/eve/eve-json-format.html#event-type-dns)\n  - flat text file\n    - Edward Fjellskål's [PassiveDNS](https://github.com/gamelinux/passivedns) tabular format (default order `-f SMcsCQTAtn`)\n  - binary\n    - Farsight Security [NMSG format](https://www.farsightsecurity.com/txt-record/2015/01/28/nmsg-intro/) via network 
socket\n\n## Building and Installation\n\n```text\n$ go get github.com/DCSO/balboa/cmd/balboa\n...\n```\nThis will drop a `balboa` executable in your Go bin path.\n\nTo build the backends:\n\n```text\n$ cd $GOPATH/src/github.com/DCSO/balboa/backend\n$ make\n...\n```\n\nThis will create a binary executable in the `build/` subdirectory of each backend's directory.\n\n### Dependencies\n\n- Go 1.10 or later\n- For the bundled RocksDB backend: [RocksDB](https://rocksdb.org/) 5.0 or later (shared lib, with LZ4 support)\n\nOn Debian, for example, one can satisfy these dependencies with:\n\n```text\n% apt install golang-go librocksdb-dev\n...\n```\n\n## Usage\n\n### Configuring feeders\n\nFeeders are used to get observations into the database. They run concurrently\nand process inputs in the background, making results accessible via the query\ninterface as soon as the resulting upsert transactions have been completed in\nthe database. Which feeders are created is defined in a YAML configuration\nfile (to be passed via the `-f` parameter to `balboa serve`). 
Example:\n\n```yaml\nfeeder:\n    - name: AMQP Input\n      type: amqp\n      url: amqp://guest:guest@localhost:5672\n      exchange: [ tdh.pdns ]\n      input_format: fever_aggregate\n    - name: HTTP Input\n      type: http\n      listen_host: 127.0.0.1\n      listen_port: 8081\n      input_format: fever_aggregate\n    - name: Socket Input\n      type: socket\n      path: /tmp/balboa.sock\n      input_format: gopassivedns\n```\n\nA balboa instance given this feeder configuration would support the following\ninput options:\n\n- JSON in FEVER's aggregate format delivered via AMQP from a temporary queue\n  attached to the exchange `tdh.pdns` on `localhost` port 5672, authenticated\n  with user `guest` and password `guest`\n- JSON in FEVER's aggregate format parsed from HTTP POST requests on port 8081 on the local system\n- JSON in gopassivedns's format, fed into the UNIX socket `/tmp/balboa.sock` created by balboa\n\nAll of these feeders accept input simultaneously; no distinction is made\nas to where an observation has come from. 
It is possible to specify multiple\nfeeders of the same type but with different settings as long as their `name`s\nare unique.\n\n### Configuring selectors\n\nBalboa provides a selector engine which can be used to select or filter observations.\nThe selector engine is configured in a YAML file which is provided via the `-s` parameter to balboa.\n\nAvailable selector implementations:\n\n* regex: match the `RRNAME` field of the observation against one or more regular expressions\n* lua: process observations with Lua scripts; see *selector.lua* for an example\n\nExample:\n\n```yaml\nselectors:\n  - name: Filter Unwanted TLDs\n    type: regex\n    mode: filter\n    regexp:\n      - unwanted_regex.txt\n    tags:\n      - filtered_tlds\n  - name: CobaltStrike Regex\n    type: regex\n    mode: select\n    regexp:\n      - cobaltstrike_regex.txt\n    ingest:\n      - filtered_tlds\n    tags:\n      - possible_cobaltstrike\n```\n\nThis configuration will tag all observations which are **not** matched by the regular expressions in `unwanted_regex.txt` with the tag `filtered_tlds`.\nAll observations which are tagged with `filtered_tlds` and which match one or more regular expressions in `cobaltstrike_regex.txt` are tagged with `possible_cobaltstrike`.\n\n### Configuring the database backend\n\nMultiple database backends are supported to store pDNS observations\npersistently. Each database backend is provided as a self-contained binary\n(executable). The frontend connects to exactly one database backend. The\nbackend, however, supports multiple client or frontend connections.\nEach backend can be configured to receive either all observations (no `tags` parameter) or only observations carrying at least one of a list of tags (logical OR).\n\nThe backend configuration is defined in a YAML file (to be passed via the `-b` parameter to `balboa serve`). 
Example:\n\n```yaml\n- name: cobaltstrike\n  host: \"localhost:4242\"\n  tags:\n    - possible_cobaltstrike\n- name: all filtered observations\n  host: \"localhost:4343\"\n  tags:\n    - filtered_tlds\n```\n\nA balboa instance with this backend configuration will store all events tagged with `possible_cobaltstrike` in the backend\nlistening on `localhost:4242` and all events tagged with `filtered_tlds` in the backend on `localhost:4343`.\n\n### Running the backend and frontend services, consuming input\n\nAll interaction with the frontend on the command line takes place via the\n`balboa` frontend executable. The frontend depends on a backend service,\nwhich is usually its own executable.\nFor instance, the RocksDB backend can be started using:\n\n```text\n$ balboa-rocksdb -h\n`balboa-rocksdb` provides a pdns database backend for `balboa`\n\nUsage: balboa-rocksdb [options]\n\n    -h display help\n    -D daemonize (default: off)\n    -d \u003cpath\u003e path to rocksdb database (default: `/tmp/balboa-rocksdb`)\n    -l listen address (default: 127.0.0.1)\n    -p listen port (default: 4242)\n    -v increase verbosity; can be passed multiple times\n    -j thread throttle limit, maximum concurrent connections (default: 64)\n    --membudget \u003cmemory-in-bytes\u003e rocksdb membudget option (value: 134217728)\n    --parallelism \u003cnumber-of-threads\u003e rocksdb parallelism option (value: 8)\n    --max_log_file_size \u003csize\u003e rocksdb log file size option (value: 10485760)\n    --max_open_files \u003cnumber\u003e rocksdb max number of open files (value: 300)\n    --keep_log_file_num \u003cnumber\u003e rocksdb max number of log files (value: 2)\n    --database_path \u003cpath\u003e same as `-d`\n    --version show version then exit\n\n$ balboa-rocksdb --database_path /data/pdns -l 127.0.0.1 -p 4242\n```\n\nAfter starting the backend, the `balboa` frontend can be started as follows:\n\n```text\n$ balboa serve -l ''\nINFO[0000] starting feeder 
AMQPInput2\nINFO[0000] starting feeder HTTP Input\nINFO[0000] accepting submissions on port 8081\nINFO[0000] starting feeder Socket Input\nINFO[0000] starting feeder Suricata Socket Input\nINFO[0000] ConsumeFeed() starting\nINFO[0000] serving GraphQL on port 8080\n...\n```\n\nAfter startup, the feeders are free to be used for data ingest. For example,\none might do some of the following to test data consumption (assuming the\nfeeders above are used):\n\n- for AMQP:\n\n    ```text\n    $ scripts/mkjson.py | rabbitmqadmin publish routing_key=\"\" exchange=tdh.pdns\n    ...\n    ```\n\n- for HTTP:\n\n    ```text\n    $ scripts/mkjson.py | curl -d@- -qs --header \"X-Sensor-ID: abcde\" http://localhost:8081/submit\n    ...\n    ```\n\n- for socket:\n\n    ```text\n    $ sudo gopassivedns -dev eth0 | socat /tmp/balboa.sock STDIN\n    ...\n    ```\n\n### Querying the server\n\nThe intended main interface for interacting with the server is via GraphQL. For\nexample, the query\n\n```graphql\nquery {\n  entries(rrname: \"test.foobar.de\", sensor_id: \"abcde\", limit: 1) {\n    rrname\n    rrtype\n    rdata\n    time_first\n    time_last\n    sensor_id\n    count\n  }\n}\n```\n\nwould return something like\n\n```json\n{\n  \"data\": {\n    \"entries\": [\n      {\n        \"rrname\": \"test.foobar.de\",\n        \"rrtype\": \"A\",\n        \"rdata\": \"1.2.3.4\",\n        \"time_first\": 1531943211,\n        \"time_last\": 1531949570,\n        \"sensor_id\": \"abcde\",\n        \"count\": 3\n      }\n    ]\n  }\n}\n```\n\nThis also works with `rdata` as the query parameter, but at least one of\n`rrname` or `rdata` must be stated. 
If there is no `sensor_id` parameter, then\nall results will be returned regardless of where the DNS answer was observed.\nUse the `time_first_rfc3339` and `time_last_rfc3339` fields instead of `time_first`\nand `time_last`, respectively, to get human-readable timestamps.\n\nWhen multiple backends are configured, a query will be dispatched to every backend.\nAccordingly, when an observation is stored in multiple backends, the query result\nwill contain duplicates.\n\n### Aliases\n\nSometimes it is interesting to ask for all the domain names that resolve to the\nsame IP address. For this reason, the GraphQL API supports a virtual `aliases`\nfield that returns all Entries with RRType `A` or `AAAA` that share the same\naddress in the Rdata field.\n\nExample:\n\n```graphql\n{\n  entries(rrname: \"heise.de\", rrtype: A) {\n    rrname\n    rdata\n    rrtype\n    time_first_rfc3339\n    time_last_rfc3339\n    aliases {\n      rrname\n    }\n  }\n}\n```\n\n```json\n{\n  \"data\": {\n    \"entries\": [\n      {\n        \"rrname\": \"heise.de\",\n        \"rdata\": \"193.99.144.80\",\n        \"rrtype\": \"A\",\n        \"time_first_rfc3339\": \"2018-07-10T08:05:45Z\",\n        \"time_last_rfc3339\": \"2018-10-18T09:24:38Z\",\n        \"aliases\": [\n          {\n            \"rrname\": \"ct.de\"\n          },\n          {\n            \"rrname\": \"ix.de\"\n          },\n          {\n            \"rrname\": \"redirector.heise.de\"\n          },\n          {\n            \"rrname\": \"www.ix.de\"\n          }\n        ]\n      }\n    ]\n  }\n}\n```\n\n### Bulk queries\n\nThere is also a shortcut tool to make 'bulk' querying easier. 
For example, to\nget all the information on the hosts in range 1.2.0.0/16 as observed by sensor\n`abcde`, one can use:\n\n```text\n$ balboa query --sensor abcde 1.2.0.0/16\n{\"count\":6,\"time_first\":1531943211,\"time_last\":1531949570,\"rrtype\":\"A\",\"rrname\":\"test.foobar.de\",\"rdata\":\"1.2.3.4\",\"sensor_id\":\"abcde\"}\n{\"count\":1,\"time_first\":1531943215,\"time_last\":1531949530,\"rrtype\":\"A\",\"rrname\":\"baz.foobar.de\",\"rdata\":\"1.2.3.7\",\"sensor_id\":\"abcde\"}\n```\n\nNote that this tool currently only does a lot of concurrent individual queries!\nTo improve performance in these cases it might be worthwhile to allow for range\nqueries on the server side as well in the future.\n\n### Other tools\n\nRun `balboa` without arguments to list available subcommands and get a short\ndescription of what they do.\n\nSee also `README.md` in the `backend` directory.\n\n## Author/Contact\n\nSascha Steinbiss\n\n## License\n\nBSD-3-clause\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcso%2Fbalboa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdcso%2Fbalboa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcso%2Fbalboa/lists"}