{"id":13857398,"url":"https://github.com/enricoschumann/tsdb","last_synced_at":"2025-04-22T18:51:30.907Z","repository":{"id":143717907,"uuid":"97119001","full_name":"enricoschumann/tsdb","owner":"enricoschumann","description":"A terribly-simple data base for time series","archived":false,"fork":false,"pushed_at":"2025-03-20T19:57:21.000Z","size":277,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-29T17:51:09.484Z","etag":null,"topics":["csv","r","time-series","tsdb"],"latest_commit_sha":null,"homepage":"http://enricoschumann.net/R/packages/tsdb/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/enricoschumann.png","metadata":{"files":{"readme":"README.org","changelog":"ChangeLog","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-13T12:13:38.000Z","updated_at":"2025-03-20T19:57:25.000Z","dependencies_parsed_at":"2024-02-09T01:47:27.309Z","dependency_job_id":"10cd1a13-454f-4d05-9b69-9e7617e268bb","html_url":"https://github.com/enricoschumann/tsdb","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/enricoschumann%2Ftsdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/enricoschumann%2Ftsdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/enricoschumann%2Ftsdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/enricoschumann%2Ftsdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/enricoschumann","download_url":"https://codeload.github.com/enricoschumann/tsdb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250303692,"owners_count":21408652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","r","time-series","tsdb"],"created_at":"2024-08-05T03:01:35.560Z","updated_at":"2025-04-22T18:51:30.885Z","avatar_url":"https://github.com/enricoschumann.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"#+TITLE: tsdb: a terribly-simple database for time series\n#+AUTHOR: Enrico Schumann\n#+OPTIONS: toc:nil\n#+BIND: org-latex-default-packages-alist nil\n#+BIND: org-use-sub-superscripts {}\n#+PROPERTY: tangle yes\n#+PROPERTY: header-args :comments link\n#+PROPERTY: header-args:R :session *R*\n#+PROPERTY: header-args :eval never-export\n# ------------------ LATEX ------------------\n#+LATEX_CLASS: scrartcl\n#+LATEX_CLASS_OPTIONS: [a4paper,fontsize=11pt]\n#+LATEX_HEADER: \\addtokomafont{disposition}{\\rmfamily}\n#+LATEX_HEADER: \\addtokomafont{descriptionlabel}{\\rmfamily}\n#+LATEX_HEADER: \\setlength{\\parindent}{0em}\n#+LATEX_HEADER: \\setlength{\\parskip}{2ex plus0.5ex minus0.5ex}\n#+LATEX_HEADER: \\newcommand{\\pmwr}{\\textsc{pm}w\\textsc{r}}\n#+LATEX_HEADER: \\newcommand{\\pl}{\\textsc{pl}}\n#+LATEX_HEADER: \\newcommand{\\R}{\\textsf{R}}\n#+LATEX_HEADER: \\usepackage[backend=bibtex,citestyle=authoryear]{biblatex}\n#+LATEX_HEADER: \\addbibresource{Library.bib}\n#+LATEX_HEADER: 
\\usepackage[left=3cm,right=5cm,top=2cm,bottom=4cm,twoside]{geometry}\n#+LATEX_HEADER: \\usepackage[libertine]{newtxmath}\n#+LATEX_HEADER: \\usepackage{fontspec}\n#+LATEX_HEADER: \\setmainfont{Linux Libertine O}\n#+LATEX_HEADER: \\setmonofont[Scale=0.91]{inconsolata}\n#+LATEX_HEADER: \\usepackage{graphicx}\n#+LATEX_HEADER: \\usepackage[dvipsnames]{xcolor}\n#+LATEX_HEADER: \\definecolor{grey20}{gray}{0.20}\n#+LATEX_HEADER: \\definecolor{grey30}{gray}{0.30}\n#+LATEX_HEADER: \\definecolor{grey40}{gray}{0.40}\n#+LATEX_HEADER: \\definecolor{grey90}{gray}{0.90}\n#+LATEX_HEADER: \\definecolor{grey96}{gray}{0.96}\n#+LATEX_HEADER: \\usepackage{listings}\n#+LATEX_HEADER: \\lstset{language=R,basicstyle=\\ttfamily,frame=single,commentstyle=\\ttfamily\\color{OliveGreen},\n#+LATEX_HEADER:         numberstyle=\\ttfamily\\footnotesize\\color{gray},stringstyle=\\ttfamily\\color{blue},\n#+LATEX_HEADER:         backgroundcolor=\\color{grey96},rulecolor=\\color{grey90},showstringspaces=false,\n#+LATEX_HEADER:         }\n#+LATEX_HEADER: \\lstnewenvironment{results}\n#+LATEX_HEADER:   {\\lstset{basicstyle=\\ttfamily\\color{grey30},backgroundcolor={},frame=single,numbers=none,showstringspaces=false,rulecolor=\\color{grey96}}}{}\n#+LATEX_HEADER: \\usepackage{mdframed}\n#+LATEX_HEADER: \\newenvironment{FAQ}\n#+LATEX_HEADER:  {\\begin{mdframed}}{\\end{mdframed}}\n#+LATEX_HEADER: \\newenvironment{FAA}\n#+LATEX_HEADER:  {\\begin{mdframed}}{\\end{mdframed}}\n#+LATEX_HEADER: \\usepackage{makeidx}\\makeindex\n#+LATEX_HEADER: \\usepackage[hidelinks]{hyperref}\n# ------------------ HTML ------------------\n#+HTML_HEAD: \u003cmeta name = \"viewport\" content=\"width=device-width\"\u003e\n#+HTML_HEAD: \u003cstyle\u003e\n#+HTML_HEAD:  html,body {\n#+HTML_HEAD:    font-family: sans-serif;\n#+HTML_HEAD:    padding: 0;\n#+HTML_HEAD:    margin: 0;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  body {\n#+HTML_HEAD:      line-height: 1.45;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  #content {\n#+HTML_HEAD:    font-family: 
serif;\n#+HTML_HEAD:    border: 1px solid #eeeeee;\n#+HTML_HEAD:    border-radius: 3px;\n#+HTML_HEAD:    color: #222222; width: 100%;\n#+HTML_HEAD:    width: 700px;\n#+HTML_HEAD:    padding-top: 2ex;\n#+HTML_HEAD:    padding: 1em;\n#+HTML_HEAD:    margin: 0.5em;\n#+HTML_HEAD:    margin-left: auto;margin-right: auto;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  @media (max-width: 700px) {\n#+HTML_HEAD:    html,body,#content {\n#+HTML_HEAD:      width: 95%;\n#+HTML_HEAD:    }\n#+HTML_HEAD:  }\n#+HTML_HEAD:  .example {\n#+HTML_HEAD:    border: 1px solid rgb(240,240,240);\n#+HTML_HEAD:    padding: 4px;\n#+HTML_HEAD:    color: rgb(110,110,110);\n#+HTML_HEAD:    overflow: auto;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  .src {\n#+HTML_HEAD:    border: 1px solid rgb(240,240,240);\n#+HTML_HEAD:    color: rgb(30,30,30);\n#+HTML_HEAD:    background-color: rgb(230,230,230);\n#+HTML_HEAD:    padding: 4px;\n#+HTML_HEAD:    overflow: auto;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  .src:hover {\n#+HTML_HEAD:    background-color: rgb(240,240,240);\n#+HTML_HEAD:    padding: 4px;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  dt {\n#+HTML_HEAD:    font-weight: bold;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  li {\n#+HTML_HEAD:    margin-bottom: 0.5ex;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  code {\n#+HTML_HEAD:    font-size: 115%;\n#+HTML_HEAD:    color: rgb(60,60,60);\n#+HTML_HEAD:  }\n#+HTML_HEAD:  .org-right {\n#+HTML_HEAD:    text-align: right;\n#+HTML_HEAD:  }\n#+HTML_HEAD:  nav ul {\n#+HTML_HEAD:    list-style-type: none;\n#+HTML_HEAD:  }\n#+HTML_HEAD: \u003c/style\u003e\n\n#+BEGIN_SRC R :results none :exports none\n  options(useFancyQuotes=FALSE)\n#+END_SRC\n\n\n* About tsdb\n\nA terribly-simple database for numeric time series,\nwritten purely in R, so no external database-software\nis needed. Series are stored in plain-text files (the\nmost-portable and enduring file type) in CSV\nformat. 
Timestamps are encoded using R's native numeric\nrepresentation for 'Date'/'POSIXct', which makes them\nfast to parse, but keeps them accessible with other\nsoftware. The package provides tools for saving and\nupdating series in this standardised format, for\nretrieving and joining data, for summarising files and\ndirectories, and for coercing series from and to other\ndata types (such as 'zoo' series).\n\n\n** Good things about tsdb\n\n- no setup needed, no system dependencies\n  (i.e. external software, such as a database)\n- completely portable; moving from one computer to\n  another requires no effort other than copying the\n  files (the only thing to take care of is file\n  encoding if non-ASCII column names are used)\n- data usable by other software\n\n\n** When you need another database\n\n- tsdb is potentially slow\n- no multi-user support; no access-rights management\n  (other than that provided by the OS)\n- no network protocols\n\n\n* Using tsdb\n\n** Writing data\n\nWe first load the package.\n\n#+BEGIN_SRC R :session *R* :results none :exports code\n  library(\"tsdb\")\n#+END_SRC\n\nStart by creating time-series data.\n#+BEGIN_SRC R :session *R* :results output :exports both\n  library(\"zoo\")\n  z \u003c- zoo(1:5, as.Date(\"2016-1-1\") + 0:4)\n  z\n#+END_SRC\n\n#+RESULTS:\n: 2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05\n:          1          2          3          4          5\n\n\nTo store these data, we need to enforce a consistent\nformat, which the functions =ts_table= and\n=as.ts_table= do.\n\n#+BEGIN_SRC R :session *R* :results output :exports both\nts \u003c- as.ts_table(z, columns = \"A\")\nts\n#+END_SRC\n\n#+RESULTS:\n: 5 rows [2016-01-01 -\u003e 2016-01-05]: A\n\nNote that we had to provide a column name (=A=) for the\ndata. This is not optional. It is one of the things\nthat =ts_table= enforces. Another is that timestamps\nneed to be of class =Date= or =POSIXct=.\n\nTo store the data to a file, use =write_ts_table=. 
The\nfunction will take a directory and file name as\narguments, which mimics the hierarchy of databases and\ntables in a classical database.\n#+BEGIN_SRC R :session *R* :results none :exports code\n  write_ts_table(ts, dir = \"~/tsdb/daily\", file = \"example1\")\n#+END_SRC\n\nThe written file will look like this:\n# +INCLUDE: ~/tsdb/daily/example1 example\n\n#+BEGIN_EXAMPLE\n\"timestamp\",\"A\"\n16801,1\n16802,2\n16803,3\n16804,4\n16805,5\n#+END_EXAMPLE\n\nYou may notice that the dates have been replaced by\nnumbers. The mapping between these numbers and calendar\ntimes is described later, when we discuss the\nrepresentation of timestamps. (But if you can't wait:\nit is the number of days since 1 January 1970.)\n\nLet us write a second file. This time, we use\n=ts_table= directly.\n\n#+BEGIN_SRC R :session *R* :results output :exports both\nx \u003c- array(1:20, dim = c(10, 2))\ncolnames(x) \u003c- c(\"A\", \"B\")\nx\n#+END_SRC\n\n#+RESULTS:\n#+begin_example\n       A  B\n [1,]  1 11\n [2,]  2 12\n [3,]  3 13\n [4,]  4 14\n [5,]  5 15\n [6,]  6 16\n [7,]  7 17\n [8,]  8 18\n [9,]  9 19\n[10,] 10 20\n#+end_example\n\n\n#+BEGIN_SRC R :session *R* :results output :exports both\n  ts_table(x, timestamp = as.Date(\"2016-1-1\") + 0:9)\n#+END_SRC\n\n#+RESULTS:\n: 10 rows [2016-01-01 -\u003e 2016-01-10]: A, B\n\nWe can also explicitly specify the column names, which\nwill override the column names of the data. 
In fact,\nthis is the preferred way, since it makes things more\nexplicit (which usually means safer).\n#+BEGIN_SRC R :session *R* :results output :exports both\n  ts \u003c- ts_table(x, timestamp = as.Date(\"2016-1-1\") + 0:9,\n\t\t columns = c(\"B\", \"A\"))\n  ts\n#+END_SRC\n\n#+RESULTS:\n: 10 rows [2016-01-01 -\u003e 2016-01-10]: B, A\n\nWe write the data to a file =example2=.\n#+BEGIN_SRC R :session *R* :results none :exports code\n  write_ts_table(ts, dir = \"~/tsdb/daily\", file = \"example2\")\n#+END_SRC\n\nThe written file looks like this:\n# +INCLUDE: ~/tsdb/daily/example2 example\n\n#+BEGIN_EXAMPLE\n\"timestamp\",\"B\",\"A\"\n16801,1,11\n16802,2,12\n16803,3,13\n16804,4,14\n16805,5,15\n16806,6,16\n16807,7,17\n16808,8,18\n16809,9,19\n16810,10,20\n#+END_EXAMPLE\n\n\n** TODO Reading data\n\nUse the function =read_ts_tables=.\n\n#+name: read1\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(\"example1\", dir = \"~/tsdb/daily\", columns = \"A\")\n#+END_SRC\n\nThe default return value is a list with components\n=data=, =timestamp=, =columns= and =file.path=.\n#+RESULTS: read1\n#+begin_example\n$data\n     A\n[1,] 1\n[2,] 2\n[3,] 3\n[4,] 4\n[5,] 5\n\n$timestamp\n[1] \"2016-01-01\" \"2016-01-02\" \"2016-01-03\" \"2016-01-04\" \"2016-01-05\"\n\n$columns\n[1] \"A\"\n\n$file.path\n[1] \"~/tsdb/daily/example1::A\"\n#+end_example\n\n\nMore convenient may be to specify a =return.class=.\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(\"example1\", dir = \"~/tsdb/daily\", columns = \"A\",\n\t\t return.class = \"zoo\")\n#+END_SRC\n\n#+RESULTS:\n:            ~/tsdb/daily/example1::A\n: 2016-01-01                        1\n: 2016-01-02                        2\n: 2016-01-03                        3\n: 2016-01-04                        4\n: 2016-01-05                        5\n\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(\"example1\", dir = \"~/tsdb/daily\", columns = \"A\",\n\t\t 
return.class = \"data.frame\")\n#+END_SRC\n\n#+RESULTS:\n:    timestamp ~/tsdb/daily/example1::A\n: 1 2016-01-01                        1\n: 2 2016-01-02                        2\n: 3 2016-01-03                        3\n: 4 2016-01-04                        4\n: 5 2016-01-05                        5\n\n\nBefore version 0.7, =read_ts_tables= by default\nreturned values only for non-weekend days.\n(=tsdb= was written with financial data in mind,\nand on weekends there are no prices.) This behaviour is\nnow controlled by the argument =drop.weekends=, which\ndefaults to =FALSE=, so weekend observations are kept.\n\n#+BEGIN_SRC R :session *R* :results output :exports both\nweekdays(as.Date(\"2016-1-1\")+0:4)\n#+END_SRC\n\n#+RESULTS:\n: [1] \"Friday\"   \"Saturday\" \"Sunday\"   \"Monday\"   \"Tuesday\"\n\n\nTo drop the observations that fall on weekends, set the\nargument =drop.weekends= to =TRUE=.\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(\"example1\", dir = \"~/tsdb/daily\",\n\t\t columns = \"A\",\n\t\t return.class = \"data.frame\",\n\t\t drop.weekends = TRUE)\n#+END_SRC\n\n#+RESULTS:\n:    timestamp ~/tsdb/daily/example1::A\n: 1 2016-01-01                        1\n: 2 2016-01-04                        4\n: 3 2016-01-05                        5\n\n\nYou may have noticed a small difference in the names of\nthe functions for reading and writing. 
We always write\na single table, but we read tables.\n\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(c(\"example1\", \"example2\"),\n\t\t dir = \"~/tsdb/daily\",\n\t\t columns = \"A\",\n\t\t return.class = \"data.frame\")\n#+END_SRC\n\n#+RESULTS:\n#+begin_example\n    timestamp ~/tsdb/daily/example1::A ~/tsdb/daily/example2::A\n1  2016-01-01                        1                       11\n2  2016-01-02                        2                       12\n3  2016-01-03                        3                       13\n4  2016-01-04                        4                       14\n5  2016-01-05                        5                       15\n6  2016-01-06                       NA                       16\n7  2016-01-07                       NA                       17\n8  2016-01-08                       NA                       18\n9  2016-01-09                       NA                       19\n10 2016-01-10                       NA                       20\n#+end_example\n\nThe column names of the returned object consist of the\nfilepaths and the column, which may be more information\nthan we actually want. 
The argument =column.name=\nspecifies the format; its default is\n=%dir%/%file%::%column%=.\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(c(\"example1\", \"example2\"),\n\t\t dir = \"~/tsdb/daily\",\n\t\t columns = \"A\",\n\t\t return.class = \"data.frame\",\n                 column.name = \"%file%/%column%\")\n#+END_SRC\n\n#+RESULTS:\n#+begin_example\n    timestamp example1/A example2/A\n1  2016-01-01          1         11\n2  2016-01-02          2         12\n3  2016-01-03          3         13\n4  2016-01-04          4         14\n5  2016-01-05          5         15\n6  2016-01-06         NA         16\n7  2016-01-07         NA         17\n8  2016-01-08         NA         18\n9  2016-01-09         NA         19\n10 2016-01-10         NA         20\n#+end_example\n\n\nMissing values are by default set to =NA=. That happens\neven for missing columns, with a warning though.\n#+BEGIN_SRC R :session *R* :results output :exports both\n  read_ts_tables(c(\"example1\", \"example2\"),\n\t\t dir = \"~/tsdb/daily\",\n\t\t columns = c(\"A\", \"B\"),\n\t\t return.class = \"data.frame\",\n                 column.name = \"%file%/%column%\")\n#+END_SRC\n\n#+RESULTS:\n#+begin_example\n    timestamp example1/A example1/B example2/A example2/B\n1  2016-01-01          1         NA         11          1\n2  2016-01-02          2         NA         12          2\n3  2016-01-03          3         NA         13          3\n4  2016-01-04          4         NA         14          4\n5  2016-01-05          5         NA         15          5\n6  2016-01-06         NA         NA         16          6\n7  2016-01-07         NA         NA         17          7\n8  2016-01-08         NA         NA         18          8\n9  2016-01-09         NA         NA         19          9\n10 2016-01-10         NA         NA         20         10\nWarning message:\nIn read_ts_tables(c(\"example1\", \"example2\"), dir = \"~/tsdb/daily\",  :\n  columns 
missing\n#+end_example\n\n\n\n* How tsdb works\n\n** ts_tables\n\ntsdb works with /time-series tables/ (objects of\nclass =ts_table=). A =ts_table= is a numeric matrix,\nso there is always a =dim= attribute. For a\ntime-series table =x=, you get the number of\nobservations with =dim(x)[1L]=.\n\nAttached to this matrix are several attributes:\n\n- timestamp :: a vector: the numeric representation of\n               the timestamp\n- t.type :: character: the class of the original\n            timestamp, either =Date= or =POSIXct=\n- columns :: a character vector that provides the\n             column names\n\nThere may be other attributes as well, but these three\nare always present.\n\nA =ts_table= is not meant as a time-series class. For\nmost computations (plotting, calculation of statistics,\netc.), the =ts_table= must first be coerced to =zoo=,\n=xts=, a data frame or a similar data\nstructure. Methods that perform such coercions are\nresponsible for converting the numeric timestamp vector\nto an actual timestamp. For this, they may use the\nfunction =ttime=, whose pronunciation may remind you\nof a hot beverage, but whose name really stands for\n=translate time=.\n\n\n** The file format\n  :PROPERTIES:\n  :CUSTOM_ID: file-format\n  :END:\n\n=tsdb= can store and load time-series data. The format\nit uses is plain CSV. A sample file may look as\nfollows:\n\n#+BEGIN_EXAMPLE\n  \"timestamp\",\"close\"\n  17131,11\n  17132,12\n  17133,13\n  17134,14\n  17135,15\n#+END_EXAMPLE\n\nThus, the file has a header line that gives the\nnames of the columns, with the first column always\nbeing named =timestamp=.\n\nThe advantage of this plain format is that the data\nare in no way dependent on =tsdb=. The files can be\nused and manipulated by other software as well.\n\n\n** Timestamps\n  :PROPERTIES:\n  :CUSTOM_ID: timestamps\n  :END:\n\n  Two types of timestamps are supported: =Date= and\n  =POSIXct=. 
As part of a =ts_table=, timestamps are\n  always stored in their numeric representation: daily\n  timestamps are represented as the number of days\n  since 1 Jan 1970; intraday timestamps are the number\n  of seconds since 1 Jan 1970.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fenricoschumann%2Ftsdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fenricoschumann%2Ftsdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fenricoschumann%2Ftsdb/lists"}