{"id":13647713,"url":"https://github.com/njaard/sonnerie","last_synced_at":"2025-04-12T16:36:11.633Z","repository":{"id":40533525,"uuid":"147880351","full_name":"njaard/sonnerie","owner":"njaard","description":"A simple timeseries database","archived":false,"fork":false,"pushed_at":"2024-04-05T19:10:36.000Z","size":468,"stargazers_count":255,"open_issues_count":6,"forks_count":17,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-04-25T03:14:03.968Z","etag":null,"topics":["cli","rust","timeseries-database"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/njaard.png","metadata":{"files":{"readme":"README.md","changelog":"Changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2018-09-07T22:24:22.000Z","updated_at":"2024-04-10T11:53:54.000Z","dependencies_parsed_at":"2024-01-14T10:14:41.518Z","dependency_job_id":"48b5e1e2-7664-40cd-a32d-73c44994bedb","html_url":"https://github.com/njaard/sonnerie","commit_stats":{"total_commits":293,"total_committers":5,"mean_commits":58.6,"dds":"0.017064846416382284","last_synced_commit":"4320e44dc2df822b6403d30bc5eaccb310538dcf"},"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/njaard%2Fsonnerie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/njaard%2Fsonnerie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/njaard%2Fsonnerie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/njaard%2Fsonnerie/manifests","owner_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/njaard","download_url":"https://codeload.github.com/njaard/sonnerie/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248597227,"owners_count":21130838,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","rust","timeseries-database"],"created_at":"2024-08-02T01:03:43.991Z","updated_at":"2025-04-12T16:36:11.595Z","avatar_url":"https://github.com/njaard.png","language":"Rust","funding_links":[],"categories":["Rust","Timeseries"],"sub_categories":["Databases"],"readme":"[![GitHub license](https://img.shields.io/badge/license-BSD-blue.svg)](https://raw.githubusercontent.com/njaard/sonnerie/master/LICENSE)\n[![Crates.io](https://img.shields.io/crates/v/sonnerie.svg)](https://crates.io/crates/sonnerie)\n[![docs](https://img.shields.io/badge/docs-api-green)](https://docs.rs/sonnerie)\n\nRefer to the [Changelog](Changelog.md) for information on releases.\n\n# Introduction\n\nSonnerie is a time-series database. Map a string to a list of timestamps and values.\nStore multiple such series in a single database. Insert tens of millions\nof samples in minutes, on rotational media or solid-state.\n\nSonnerie is optimized for storing data that comes in as\nmany values over many series (insertion of millions of items takes minutes and\ndoesn't block other readers or writers), and for reading one series at a time in\ntens of milliseconds. 
It is also very good at dumping lexicographically sequential\nseries (which means: everything).\n\nSonnerie can very efficiently do random insertions and updates, and works\nwell for huge databases. Due to the compact disk format, sparse data such\nas keys with only a few timestamps can be very efficiently stored.\n\nSonnerie is mostly intended for on-disk archival, realtime updates, and realtime accesses\nof individual series. For analytical purposes, one would load the necessary data\ninto memory and process it through other means.\n\n# Features\n* A straight-forward protocol for reading and writing\n* Easy setup: insert data on the command line.\n* No query language\n* Transactional: a transaction is completely committed or not at all.\n* Isolated: A transaction doesn't see updates from other transactions or expose its changes until it has been committed.\nNote the semantics of \"last record wins\" - if two transactions write two values with same key and timestamp, then the record from\nthe last completed transaction will be the retained one.\n* Durable: committed data is resistant to loss from unexpected shutdown.\n* Nanosecond-resolution timestamps (64 bit), 1970-2554\n* No weird dependencies, no virtual machines, one single native binary for the command line tool\n* Floating point, integer, and string values, multiple columns per sample\n* Concurrent reading of ranges with Rayon - do a \"map-reduce\" style query from Rust\nin 30 seconds per billion records per core on modern hardware.\n\nSonnerie runs on Unix-like systems and is developed on Linux.\n\n# Quick Start\n\n## Install\n\nSonnerie is implemented in Rust, a systems programming language that runs\nblazingly fast. 
Installation from source therefore requires you to\n[install the Rust compiler](https://www.rust-lang.org/en-US/install.html),\nwhich is as simple as: `curl https://sh.rustup.rs -sSf | sh`.\n\nSonnerie can then be installed with Cargo: `cargo install sonnerie`.\n\nSonnerie consists of one executable, `sonnerie` (`~/.cargo/bin/sonnerie`).\n\n## Create a database\n\nCreate a database by creating a directory and an empty file named \"`main`\":\n\n\tmkdir database\n\ttouch database/main\n\n## Insert data\n\techo -e \"\\\n\tfibonacci 2020-01-01T00:00:00 1\n\tfibonacci 2020-01-02T00:00:00 1\n\tfibonacci 2020-01-03T00:00:00 2\n\tfibonacci 2020-01-04T00:00:00 3\n\tfibonacci 2020-01-05T00:00:00 5\n\tfibonacci 2020-01-06T00:00:00 8\" \\\n\t| sonnerie -d database/ add --format u --timestamp-format=%FT%T\n\nIf the \"add\" command succeeds, then the transaction is committed to disk.\n\nItems added with `sonnerie add` must be sorted lexicographically by their\nkey and then chronologically. This requirement does not exist in\n`sonnerie-serve`.\n\n## Read the data back\n\n\tsonnerie -d database/ read %\n\nThe `%` is a wildcard, as used in \"`LIKE`\" in SQL, and filters\non the key. Searching based on a prefix is very efficient:\n\n\tsonnerie -d database/ read fib%\n\nSonnerie outputs the matched values:\n\n\tfibonacci 2020-01-01 00:00:00     1\n\tfibonacci 2020-01-02 00:00:00     1\n\tfibonacci 2020-01-03 00:00:00     2\n\tfibonacci 2020-01-04 00:00:00     3\n\tfibonacci 2020-01-05 00:00:00     5\n\tfibonacci 2020-01-06 00:00:00     8\n\n## Delete records\n\n\tsonnerie -d database/ delete --after-time=2020-01-04\n\nThis instantaneously removes all values at the specified time and later. Also available\nare `--before-time` and similar options for filtering by key range.\n\nThe data is immediately removed from the database. 
A later compaction will\npurge it and recover disk space.\n\nDeletions merely suppress output; reading a sequence of records that\ncontains the deleted items will still have to iterate through the deleted records.\nYou should thus compact (with `--major`) to avoid paying that penalty.\n\n# Usage\n\n## Row format\nEach series has a **`format`**. The format is specified as a\nsequence of single-character codes, one for each value.\n\nThe character codes are:\n* `f` - a 32-bit float (f32)\n* `F` - a 64-bit float (f64)\n* `u` - a 32-bit unsigned integer (u32)\n* `U` - a 64-bit unsigned integer (u64)\n* `i` - a 32-bit signed integer (i32)\n* `I` - a 64-bit signed integer (i64)\n* `s` - a UTF-8 encoded string type. When strings are output, they are\nencoded in \"backslash escaped\" form, so all whitespace and backslashes are\npreceded by a backslash.\n\nIn the above \"fibonacci\" example, we're using the \"u\" format.\n\nMulti-column rows are permitted; for two floating point values representing\nlatitude and longitude:\n\n\toceanic-airlines 2018-01-01T00:00:00 ff 37.686751 -122.602227\n\toceanic-airlines 2018-01-01T00:00:01 ff 37.686810 -122.603713\n\toceanic-airlines 2018-01-01T00:00:02 ff 37.686873 -122.605997\n\toceanic-airlines 2018-01-01T00:00:03 ff 37.687022 -122.609997\n\toceanic-airlines 2018-01-01T00:00:04 ff 37.687364 -122.610945\n\toceanic-airlines 2018-01-01T00:00:05 ff 37.687503 -122.615211\n\n## Sonnerie allows heterogeneous formats.\nA single key may change its format, for example:\n\n\tkeyname 2020-01-01T00:00:00 u 42\n\tkeyname 2020-01-02T00:00:00 f 3.1415\n\tkeyname 2020-01-03T00:00:00 s Now\\ a\\ string\n\nWhile a key may change its format, doing so has more storage overhead,\nso it's best to not allow keys to oscillate between types.\n\nHeterogeneous formats are permitted as of version 0.6; older versions had an \"unsafe\" mode\nthat allowed the test to be bypassed for performance.\n\n## No server is necessary\n\nAll actions can be done by running `sonnerie -d /path/to/data/`. 
Furthermore,\na file (once its \".tmp\" suffix is removed) will never change, though\nfiles may sometimes get replaced. This means you can\nreplicate a database by hardlinking all the files (`ln`).\n\n## The database must be compacted\n\nOn a regular (possibly daily) basis, you must compact the database. This\nrolls a bunch of transaction files into a single large transaction file.\nThis is important for performance. By the time about 100 transaction files\nare present, performance suffers greatly. Therefore, compact the database\nat approximately the rate necessary to prevent that.\n\nThere are two types of compaction: major and minor. A major\none replaces the entire database, which requires reading\nand rewriting the entire database. A minor one replaces all of the transaction\nfiles with a single new transaction file. This is a lot faster because it\nrequires only reading and rewriting the contents of the transaction files\nand not the `main` file.\n\nA major compaction is accomplished with:\n\n    sonnerie -d /path/to/data/ compact --major\n\nAnd a minor compaction:\n\n    sonnerie -d /path/to/data/ compact\n\nCompacting doesn't block readers or writers, but only one compaction can\nhappen at any given moment, so a lock is placed to prevent multiple\nconcurrent compactions.\n\nCompactions are atomic, so you can cancel one (with `^C`) at any time.\n\n## You can compact and filter\n\nIn case some data in the database needs to be modified, you can use\n`compact` with the `--gegnum` option. Gegnum means \"through\" in Icelandic.\n\nThis command removes records that start with `bad-objects`:\n\n    compact --major --gegnum 'grep -v ^bad-objects'\n\nDo a normal compaction, but also count records:\n\n    compact --major --gegnum 'pv -l'\n\nThe `--gegnum` option runs its command inside a /bin/sh, so pipelines work. 
Filter\nout bad objects AND modify the names of other objects:\n\n    compact --major --gegnum 'grep -v ^bad-objects | sed \"s/^old-name/new-name/\"'\n\nThe output of the `--gegnum` command must be in sorted order.\n\nYou can also see a preview of its output by piping your command into `| tee /dev/stderr`.\n\nNote that the rows come as \"key\\ttimestamp\\tformat\\tvalue\".\n\nYou can also \"read | filter | add\" into a different database, but `gegnum` allows\nyou to modify an existing database, which is useful for online maintenance on a database\nthat gets concurrent updates.\n\n# sonnerie-serve\nA server is provided so that you can conveniently read and write to the database\nvia HTTP.\n\nRun `sonnerie-serve -d /path/to/database/ -l 0.0.0.0:5555` and then you may\nmake `PUT` and `GET` requests:\n\n* Read the named series:\n\n\t`curl http://localhost:5555/fibonacci`\n\n(The response is the entire series in a format similar to `sonnerie read`)\n\n* Read series by wildcard:\n\n\t`curl http://localhost:5555/fib%`\n\n(The response is each series, in alphabetical order, in a format similar to\n`sonnerie read`)\n\n* Output human-readable timestamps:\n\n\t`curl http://localhost:5555/fib%?human`\n\n(The timestamps are in ISO-8601 instead of nanoseconds)\n\n* Add more data:\n\n\t`curl -X PUT http://localhost:5555/ --data-binary 'fibonacci 1578384000000000000 u 13'`\n\n(`200 OK` means that the transaction was committed)\n\nUnlike `sonnerie add`, `sonnerie-serve` allows unsorted input.\n\nNote that because sonnerie `mmap`s its files, sonnerie-serve will show\nhuge values for its virtual memory usage (`VIRT` in top), but actual\nmemory utilization will be reasonable.\n\nYou may continue to read and modify your sonnerie database by the command\nline or even via other concurrently running `sonnerie-serve` instances.\n\nAn alternate approach is to use \"sshfs\" to mount the database remotely. 
This\napproach is very performant because only compressed data goes through the network\nand the server doesn't need to do any of the decompressing. Avoid NFS\nbecause compactions will cause files to get deleted, and then the client will get an\nIO error, as NFS cannot track files that are closed on the server.\n\n# Contributing\nBug reports and pull requests are always welcome no matter how big or small.\nDevelopment of Sonnerie is people-first and we comply with Rust's \n[Code of Conduct](https://www.rust-lang.org/policies/code-of-conduct).\n\nIf you use Sonnerie, please provide feedback!\n\n# Sonnerie is used in production\nSonnerie is used by Headline with a \u003e100GiB database and 10s\nof billions of rows.\n\n# Performance\nAn approximate average lookup latency for a random key in a large database is\naround 15ms on an SSD and much slower on a busy rotational media device. Sequential\naccess (i.e., reading the whole database in lexicographical order) is somewhere around\n2k keys/sec and 3M records/sec, very much depending on the data itself.\n\n# Copyright\n\nSonnerie was implemented by Charles Samuels at\n[Headline](https://headline.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnjaard%2Fsonnerie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnjaard%2Fsonnerie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnjaard%2Fsonnerie/lists"}