{"id":16572096,"url":"https://github.com/eddelbuettel/dtts","last_synced_at":"2025-06-24T18:35:52.518Z","repository":{"id":43294426,"uuid":"264952934","full_name":"eddelbuettel/dtts","owner":"eddelbuettel","description":" Time-series functionality based on nanotime and data.table","archived":false,"fork":false,"pushed_at":"2024-12-31T16:49:40.000Z","size":119,"stargazers_count":14,"open_issues_count":0,"forks_count":2,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-02-27T13:21:37.450Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eddelbuettel.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-18T13:47:04.000Z","updated_at":"2024-12-31T16:49:44.000Z","dependencies_parsed_at":"2024-10-27T11:25:51.143Z","dependency_job_id":null,"html_url":"https://github.com/eddelbuettel/dtts","commit_stats":{"total_commits":49,"total_committers":3,"mean_commits":"16.333333333333332","dds":"0.36734693877551017","last_synced_commit":"a3885061c6b8ba45f83b8c269876f94349d23e18"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eddelbuettel%2Fdtts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eddelbuettel%2Fdtts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eddelbuettel%2Fdtts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eddelbuettel%2Fdtts/man
ifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eddelbuettel","download_url":"https://codeload.github.com/eddelbuettel/dtts/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243830912,"owners_count":20354848,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T21:26:21.076Z","updated_at":"2025-03-16T20:31:20.413Z","avatar_url":"https://github.com/eddelbuettel.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"## dtts: Time-series functionality based on `nanotime` and `data.table`.\n\n[![CI](https://github.com/eddelbuettel/dtts/workflows/ci/badge.svg)](https://github.com/eddelbuettel/dtts/actions?query=workflow%3Aci)\n[![License](https://eddelbuettel.github.io/badges/GPL2+.svg)](https://www.gnu.org/licenses/gpl-2.0.html)\n[![CRAN](https://www.r-pkg.org/badges/version/dtts)](https://cran.r-project.org/package=dtts)\n[![r-universe](https://eddelbuettel.r-universe.dev/badges/dtts)](https://eddelbuettel.r-universe.dev/dtts)\n[![Dependencies](https://tinyverse.netlify.app/badge/dtts)](https://cran.r-project.org/package=dtts)\n[![Downloads](https://cranlogs.r-pkg.org/badges/dtts?color=brightgreen)](https://www.r-pkg.org/pkg/dtts)\n[![Code Coverage](https://codecov.io/gh/eddelbuettel/dtts/graph/badge.svg)](https://app.codecov.io/gh/eddelbuettel/dtts)\n[![Last Commit](https://img.shields.io/github/last-commit/eddelbuettel/dtts)](https://github.com/eddelbuettel/dtts)\n\n## Motivation\n\nCombining package 
[`nanotime`](https://CRAN.R-project.org/package=nanotime) for\noperating with nanosecond time-resolution with package\n[`data.table`](https://CRAN.R-project.org/package=data.table) leverages\nthe conciseness, high performance, and memory efficiency of the\nlatter to provide high-resolution, high-performance time series operations.\n\nOur time-series representation is simply a `data.table` with a first column\nof type `nanotime` and a key on it. This means all the standard `data.table`\nfunctions can be used, and this package consolidates this functionality.\n\nSpecifically, `dtts` proposes alignment functions that are particularly\nversatile, and allow working across time-zones.\n\n## Usage\n\n### Creating a `data.table`-based time-series with a `nanotime` index\n\nTwo operations are necessary to create a `data.table`-based\ntime-series for use with the functions defined in this package:\n1. Create the time index, i.e. a vector of `nanotime`\n2. Create a `data.table` with the first column being the time index \n   and specifying it as a key\n\nFor instance, this code creates a time-series of 10 rows spaced every\nhour with a data column `V1` containing random data:\n\n~~~ R\nlibrary(data.table)\nlibrary(nanotime)\nt1 \u003c- seq(as.nanotime(Sys.time()), by=as.nanoduration(\"01:00:00\"), length.out=10)\ndt1 \u003c- data.table(index=t1, V1=runif(10), key=\"index\")\n~~~\n\nproduces:\n\n~~~\n                               index        V1\n 1: 2021-11-21T06:23:12.404650+00:00 0.7206800\n 2: 2021-11-21T07:23:12.404650+00:00 0.9677868\n 3: 2021-11-21T08:23:12.404650+00:00 0.6211587\n 4: 2021-11-21T09:23:12.404650+00:00 0.7669201\n 5: 2021-11-21T10:23:12.404650+00:00 0.6426368\n 6: 2021-11-21T11:23:12.404650+00:00 0.4026811\n 7: 2021-11-21T12:23:12.404650+00:00 0.2512213\n 8: 2021-11-21T13:23:12.404650+00:00 0.3476128\n 9: 2021-11-21T14:23:12.404650+00:00 0.9663271\n10: 2021-11-21T15:23:12.404650+00:00 0.4744729\n~~~\n\nNote that we can also write this in a single 
`data.table` statement as\n\n~~~ R\ndt1 \u003c- data.table(index = seq(as.nanotime(Sys.time()), by=as.nanoduration(\"01:00:00\"), length.out=10),\n                  V1 = runif(10),\n                  key = \"index\")\n~~~\n\n### Alignment functions\n\nAlignment is the process of matching the time of the observations of\none time series to another. All alignment functions in this package\nwork in a similar way. For each point in the vector `y` onto which `x`\nis aligned, a pair of arguments named `start` and `end` define an\ninterval around this point. As an example let us take `start` equal to\n-1 hour and `end` equal to 0 hours. This means that a `y` of 2021-11-20\n11:00:00 defines an interval from 2021-11-20 10:00:00 to 2021-11-20\n11:00:00. The alignment process will then use that interval to pick\npoints in order to compute one or more statistics on that interval for\nthe corresponding point in `y`.\n\nIn addition to the arguments `start` and `end`, two other arguments,\nbooleans named `sopen` and `eopen`, define whether the start and end,\nrespectively, of the interval are open.\n\nFinally, note that when the interval is specified with a `nanoperiod`\ntype, the argument `tz` is necessary in order to give meaning to the\ninterval. With `nanoperiod`, alignments are time-zone aware and\ncorrect across daylight saving time.\n\nThis figure shows an alignment using the \"closest\"\npoint as data:\n\n\u003cimg src=\"./inst/images/align_closest.svg\"\u003e\n\n\nThis figure shows an alignment using a statistic\n(here simply counting the number of elements in the intervals):\n\n\u003cimg src=\"./inst/images/align_count.svg\"\u003e\n\n\n#### `align_idx`\n\nThis function takes two vectors of type `nanotime`. It aligns the\nfirst one onto the second one and returns the indices of the first\nvector that align with the second vector. There is no choice of\naggregation function here as this function works only on\n`nanotime` vectors. 
The algorithm selects the point in `x` that falls\nin the interval that is closest to the point of alignment in `y`. The\nindex of the point that falls in that interval is returned at the\nposition of the vector `y`. If no point exists in that interval `NaN`\nis returned.\n\n\n~~~ R\nlibrary(dtts)\n\nt1 \u003c- seq(as.nanotime(\"1970-01-01T00:00:00+00:00\"), by=as.nanoduration(\"00:00:01\"), length.out=100)\nt2 \u003c- seq(as.nanotime(\"1970-01-01T00:00:10+00:00\"), by=as.nanoduration(\"00:00:10\"), length.out=10)\n\nalign_idx(t1, t2, start=as.nanoduration(\"-00:00:10\"))\n~~~\n\nWhich produces:\n\n~~~\n [1]  10  20  30  40  50  60  70  80  90 100\n~~~\n\n#### `align`\n\nThis function takes a `data.table` and aligns it onto `y`, a vector of\n`nanotime`. Like `align_idx`, it uses the arguments `start`, `end`,\n`sopen` and `eopen` to define the intervals around the points in `y`. \n\nInstead of the result being an index, it is a new `data.table`\ntime-series with the first `nanotime` column being the vector `y`, and\nthe rows of this time-series are taken from the `data.table` `x`. If\nno function is specified (i.e. `func` is `NULL`), the function returns\nthe row of the point in `x` that is in the interval and that is\nclosest to the point in `y` on which the alignment is made. If `func`\nis defined, it receives for each point in `y` all the rows in `x` that\nare in the defined interval. So `func` must be a statistic that\nreturns one row, but it may return one or more columns. Common examples\nare means (e.g. 
using `colMeans`), counts, etc.\n\n\nIn the following example a time-series `dt1` is created with a data\ncolumn `V1` which has the integer index as value and it is aligned\nonto a `nanotime` vector `t2`\n\n~~~ R\nlibrary(dtts)\n\nt1 \u003c- seq(as.nanotime(\"1970-01-01T00:00:00+00:00\"), by=as.nanoduration(\"00:00:01\"), length.out=100)\ndt1 \u003c- data.table(index=t1, V1=0:99)\nsetkey(dt1, index)\n\nt2 \u003c- seq(as.nanotime(\"1970-01-01T00:00:10+00:00\"), by=as.nanoduration(\"00:00:10\"), length.out=10)\n\nalign(dt1, t2, start=as.nanoduration(\"-00:00:10\"), func=colMeans)\n~~~\n\nWhich produces:\n\n~~~\n                        index   V1\n 1: 1970-01-01T00:00:10+00:00  4.5\n 2: 1970-01-01T00:00:20+00:00 14.5\n 3: 1970-01-01T00:00:30+00:00 24.5\n 4: 1970-01-01T00:00:40+00:00 34.5\n 5: 1970-01-01T00:00:50+00:00 44.5\n 6: 1970-01-01T00:01:00+00:00 54.5\n 7: 1970-01-01T00:01:10+00:00 64.5\n 8: 1970-01-01T00:01:20+00:00 74.5\n 9: 1970-01-01T00:01:30+00:00 84.5\n10: 1970-01-01T00:01:40+00:00 94.5\n~~~\n\n#### `grid_align`\n\nThis function adds one more dimension to the function `align`. Instead\nof taking a vector `y`, it constructs a grid that has as interval the\nvalue supplied in the argument `by`. The interval is controllable\n(with arguments `ival_start`, `ival_end`, `ival_sopen`, `ival_eopen`)\nbut it is likely that in most cases the default will be used which is\nthe grid interval. As in the case of `align`, the caller can specify\n`func`. Finally, note that `by` can be either a `nanoduration` or a\n`nanoperiod`. 
In the latter case, as for the other functions, the\nargument `tz` must be supplied so that the `nanoperiod` interval can\nbe anchored to a specific timezone.\n\nThe following example is the same as for the `align` function, but\nshows that the vector `t2` does not need to be supplied as it is\ninstead constructed by `grid_align`:\n\n~~~ R\nlibrary(dtts)\n\nt1 \u003c- seq(as.nanotime(\"1970-01-01T00:00:00+00:00\"), by=as.nanoduration(\"00:00:01\"), length.out=100)\ndt1 \u003c- data.table(index=t1, V1=0:99)\nsetkey(dt1, index)\n\ngrid_align(dt1, as.nanoduration(\"00:00:10\"), func=colMeans)\n~~~\n\nWhich produces:\n\n~~~\n                        index   V1\n 1: 1970-01-01T00:00:10+00:00  4.5\n 2: 1970-01-01T00:00:20+00:00 14.5\n 3: 1970-01-01T00:00:30+00:00 24.5\n 4: 1970-01-01T00:00:40+00:00 34.5\n 5: 1970-01-01T00:00:50+00:00 44.5\n 6: 1970-01-01T00:01:00+00:00 54.5\n 7: 1970-01-01T00:01:10+00:00 64.5\n 8: 1970-01-01T00:01:20+00:00 74.5\n 9: 1970-01-01T00:01:30+00:00 84.5\n10: 1970-01-01T00:01:40+00:00 94.5\n~~~\n\n#### Frequency\n\nUsing `grid_align` and `nrow` it is possible to get the frequency of a\ntime-series, i.e. 
to count the number of elements in each interval of\na grid.\n\nTaking the same example as above, we see that the result is the count\nof elements of `dt1` that are in each interval:\n\n~~~ R\nlibrary(dtts)\n\nt1 \u003c- seq(as.nanotime(\"1970-01-01T00:00:00+00:00\"), by=as.nanoduration(\"00:00:01\"), length.out=100)\ndt1 \u003c- data.table(index=t1, V1=0:99)\nsetkey(dt1, index)\n\ngrid_align(dt1, as.nanoduration(\"00:00:10\"), func=nrow)\n~~~\n\nWhich produces:\n\n~~~\n                        index V1\n 1: 1970-01-01T00:00:10+00:00 10\n 2: 1970-01-01T00:00:20+00:00 10\n 3: 1970-01-01T00:00:30+00:00 10\n 4: 1970-01-01T00:00:40+00:00 10\n 5: 1970-01-01T00:00:50+00:00 10\n 6: 1970-01-01T00:01:00+00:00 10\n 7: 1970-01-01T00:01:10+00:00 10\n 8: 1970-01-01T00:01:20+00:00 10\n 9: 1970-01-01T00:01:30+00:00 10\n10: 1970-01-01T00:01:40+00:00 10\n~~~\n\n#### `ops`\n\n`ops` performs arithmetic operations between two time-series and has\nthe following signature, where `x` and `y` are time-series and `op_string` is\na string denoting an arithmetic operator.\n\n~~~ R\nops(x, y, op_string)\n~~~\n\nEach entry in the left time-series operand defines an interval from\nthe previous entry, and the value associated with this interval will\nbe applied to all the observations in the right time-series operand\nthat fall in the interval. Note that the interval is closed at the\nbeginning and open at the end. 
The available values for `op_string` are \"*\",\n\"/\", \"+\", \"-\".\n\nThis function is particularly useful to apply a multiplier or to add a\nconstant that changes over time; one example would be the adjustment\nof stock prices for splits.\n\nHere is a visualization of `ops`:\n\n\u003cimg src=\"./inst/images/ops.svg\"\u003e\n\n\nHere is an example:\n\n~~~ R\none_second_duration  \u003c- as.nanoduration(\"00:00:01\")\nt1 \u003c- nanotime(1:2 * one_second_duration * 3)\nt2 \u003c- nanotime(1:4 * one_second_duration)\ndt1 \u003c- data.table(index=t1, data1 = 1:length(t1))\nsetkey(dt1, index)\ndt2 \u003c- data.table(index=t2, data1 = 1:length(t2))\nsetkey(dt2, index)\nops(dt1, dt2, \"+\")\n~~~\n\nWhich produces:\n```\n                       index data1\n1: 1970-01-01T00:00:01+00:00     2\n2: 1970-01-01T00:00:02+00:00     3\n3: 1970-01-01T00:00:03+00:00     3\n4: 1970-01-01T00:00:04+00:00     4\n```\n\n### Time-series subsetting\n\nUsing `nanoival`, it is possible to do complex subsetting of a time-series:\n\n~~~ R\none_second \u003c- 1e9\nindex \u003c- seq(nanotime(\"2022-12-12 12:12:10+00:00\"), length.out=10, by=one_second)\ndts \u003c- data.table(index=index, data=1:length(index), key=\"index\")\nival \u003c- as.nanoival(c(\"-2022-12-12 12:12:10+00:00 -\u003e 2022-12-12 12:12:14+00:00-\",\n                      \"+2022-12-12 12:12:18+00:00 -\u003e 2022-12-12 12:12:20+00:00+\"))\ndts[index %in% ival]\n~~~\n\n## Status\n\n`dtts` currently proposes only a set of alignment functions, but it is\nlikely that other time-series functions will be implemented so that\n`nanotime`-based time-series have reasonably complete time-series\nfunctionality.\n\nSee the [issue tickets](https://github.com/eddelbuettel/dtts/issues)\nfor an up-to-date list of potentially desirable, possibly planned, or\nat least discussed items.\n\n## Installation\n\nThe package is on [CRAN](https://cran.r-project.org) and can be installed via a standard\n\n\n```r\ninstall.packages(\"dtts\")\n```\n\nand 
development versions can be installed via\n\n```r\nremotes::install_github(\"eddelbuettel/dtts\")\n```\n\n## Author\n\nDirk Eddelbuettel, Leonardo Silvestri\n\n## License\n\nGPL (\u003e= 2)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feddelbuettel%2Fdtts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feddelbuettel%2Fdtts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feddelbuettel%2Fdtts/lists"}