{"id":17717643,"url":"https://github.com/davzim/ritch","last_synced_at":"2025-08-23T14:13:09.597Z","repository":{"id":116208750,"uuid":"112514746","full_name":"DavZim/RITCH","owner":"DavZim","description":"An R interface to the ITCH Protocol","archived":false,"fork":false,"pushed_at":"2024-08-25T09:56:48.000Z","size":11362,"stargazers_count":18,"open_issues_count":2,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-15T15:02:15.543Z","etag":null,"topics":["cpp","itch","r","rdatatable"],"latest_commit_sha":null,"homepage":"https://davzim.github.io/RITCH/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DavZim.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-11-29T18:53:05.000Z","updated_at":"2025-03-01T07:52:48.000Z","dependencies_parsed_at":"2024-06-11T17:14:50.119Z","dependency_job_id":"f2c5493f-7b41-4a64-8854-f27754664c44","html_url":"https://github.com/DavZim/RITCH","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DavZim/RITCH","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavZim%2FRITCH","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavZim%2FRITCH/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavZim%2FRITCH/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavZim%2FRITCH/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DavZim","download_url":"h
ttps://codeload.github.com/DavZim/RITCH/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DavZim%2FRITCH/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271751925,"owners_count":24814707,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","itch","r","rdatatable"],"created_at":"2024-10-25T14:27:27.507Z","updated_at":"2025-08-23T14:13:09.570Z","avatar_url":"https://github.com/DavZim.png","language":"R","readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\noptions(width = 120)\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# RITCH - an R interface to the ITCH Protocol\n\n\u003c!-- badges: start --\u003e\n[![CRAN status](https://www.r-pkg.org/badges/version/RITCH)](https://CRAN.R-project.org/package=RITCH) [![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/RITCH)](https://www.r-pkg.org/pkg/RITCH) [![R-CMD-check](https://github.com/DavZim/RITCH/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/DavZim/RITCH/actions/workflows/R-CMD-check.yaml)\n\u003c!-- badges: end --\u003e\n\nThe `RITCH` library provides an `R` interface to NASDAQ's ITCH protocol, which is used to distribute financial messages to participants.\nMessages include orders, trades, market status, and other financial information.\nA full list of messages is shown later.\nThe main purpose of this package is to parse binary ITCH files into a [`data.table`](https://CRAN.R-project.org/package=data.table) in `R`.\n\nThe package leverages [`Rcpp`](https://CRAN.R-project.org/package=Rcpp) and `C++` for efficient message parsing.\n\nNote that the package provides a small simulated sample dataset in the `ITCH_50` format for testing and example purposes.\nHelper functions are provided to list and download sample files from NASDAQ's official server.\n\n## Install\n\nTo install `RITCH` you can use the following:\n\n```R\n# stable version:\ninstall.packages(\"RITCH\")\n\n# development version:\n# install.packages(\"remotes\")\nremotes::install_github(\"DavZim/RITCH\")\n```\n\n## Quick Overview\n\nThe main functions of `RITCH` are read-related and are easily identified by their `read_` prefix.\n\nDue to the inherent structural differences between message classes, each class has its own read function.\nA list of message types and their respective classes is provided later in this 
Readme.\n\nThe message classes used in this example are *orders* and *trades*.\nFirst we define the file to load and count the messages, then we read in the orders and the first 100 trades:\n\n```{r}\nlibrary(RITCH)\n# use built-in example dataset\nfile \u003c- system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\n\n# count the number of messages in the file\nmsg_count \u003c- count_messages(file)\ndim(msg_count)\nnames(msg_count)\n\n# read the orders into a data.table\norders \u003c- read_orders(file)\ndim(orders)\nnames(orders)\n\n# read the first 100 trades\ntrades \u003c- read_trades(file, n_max = 100)\ndim(trades)\nnames(trades)\n```\nNote that the file can be a plain `ITCH_50` file or a gzipped `ITCH_50.gz` file, which will be decompressed to the current directory.\nYou may also note that the output reports a rather low read speed in `MB/s`.\nThe number is low because the reported speed includes the parsing process.\nBecause of the fixed overhead of setup code, the reported speed is higher on larger files.\n\nIf you want to know more about the functions of the package, read on.\n\n## Main Functions\n\n`RITCH` provides the following main functions:\n\n- `read_itch(file, ...)` to read an ITCH file.\nConvenient wrappers for different message classes such as `orders`, `trades`, etc. are also provided as `read_orders()`, `read_trades()`, ...\n- `filter_itch(infile, outfile, ...)` to filter an ITCH file and write directly to another file without loading the data into R\n- `write_itch(data, file, ...)` to write a dataset to an ITCH file\n\nSome helper functions are also provided, including:\n\n- `download_sample_file(choice)` to download a sample file from the NASDAQ server and `list_sample_files()` to get a list of all available sample files\n- `download_stock_directory(exchange, date)` to download the stock locate information for a given exchange and date\n- `open_itch_sample_server()` to open the official NASDAQ server in your browser, which hosts 
among other things example data files\n- `gzip_file(infile, outfile)` and `gunzip_file(infile, outfile)` for gzip functionality\n- `open_itch_specification()` to open the official NASDAQ ITCH specification PDF in your browser\n\n## Writing ITCH Files\n\n`RITCH` also provides functionality for writing ITCH files.\nAlthough the data could be stored in other file formats (for example a database or a [`qs`](https://CRAN.R-project.org/package=qs) file), ITCH files are quite optimized regarding size as well as write/read speeds.\nThus the `write_itch()` function allows you to write one or more message types to an `ITCH_50` file.\nNote, however, that only the standard columns are supported.\nAdditional columns will not be written to file!\n\nAdditional information can be saved in the filename.\nBy default the date, exchange, and file format information is added to the filename unless you specify `add_meta = FALSE`, in which case the given name is used.\n\nAs a last note: if you write your data to an ITCH file and want to filter for stocks later on, make sure to save the stock directory of that day/exchange, either externally or in the ITCH file directly (see example below).\n\n### Simple Write Example\n\nA simple write example would be to read all modifications from an ITCH file and save them to a separate file to save space, reduce read times later on, etc.\n\n```{r}\nfile \u003c- system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\nmd \u003c- read_modifications(file, quiet = TRUE)\ndim(md)\nnames(md)\n\noutfile \u003c- write_itch(md, \"modifications\", compress = TRUE)\n\n# compare file sizes\nfiles \u003c- c(full_file = file, subset_file = outfile)\nformat_bytes(sapply(files, file.size))\n```\n```{r, include = FALSE}\nunlink(outfile)\n```\n\n\n### Comprehensive Write Example\n\nA typical workflow would look like this:\n\n- read in some message classes from file and filter for certain stocks\n- save the results for later analysis, also compress 
to save disk space\n\n```{r}\n## Read in the different message classes\nfile \u003c- system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\n\n# read in the different message types\ndata \u003c- read_itch(file,\n                  c(\"system_events\", \"stock_directory\", \"orders\"),\n                  filter_stock_locate = c(1, 3),\n                  quiet = TRUE)\n\nstr(data, max.level = 1)\n\n\n## Write the different message classes\noutfile \u003c- write_itch(data,\n                      \"alc_char_subset\",\n                      compress = TRUE)\noutfile\n\n# compare file sizes\nformat_bytes(\n  sapply(c(full_file = file, subset_file = outfile),\n         file.size)\n)\n\n\n## Lastly, compare the two datasets to see if they are identical\ndata2 \u003c- read_itch(outfile, quiet = TRUE)\nall.equal(data, data2)\n```\n```{r, include=FALSE}\n# remove files from write_itch again...\nunlink(outfile)\noutfile_unz \u003c- gsub(\"\\\\.gz$\", \"\", outfile)\nunlink(outfile_unz)\n```\n\nFor comparison, the same data stored in the [`qs`](https://CRAN.R-project.org/package=qs) format results in `44788` bytes.\n\u003c!---qs::qsave(data, \"data.qs\", preset = \"archive\");file.info(\"data.qs\")[[\"size\"]];unlink(\"data.qs\")--\u003e\n\n## ITCH Messages\n\nThere are a total of 22 different message types which are grouped into 13 classes by `RITCH`.\n\nThe messages and their respective classes are:\n```{r, echo=FALSE}\nd \u003c- get_msg_classes()\nd$msg_type \u003c- paste0(\"\u003ccode\u003e\", d$msg_type, \"\u003c/code\u003e\")\nd$read_function \u003c- paste0(\"\u003ccode\u003e\", \"read_\", d$msg_class, \"()\", \"\u003c/code\u003e\")\n\ndata.table::setcolorder(d, c(\"msg_type\", \"msg_class\", \"read_function\",\n                             \"msg_name\", \"doc_nr\"))\ndata.table::setnames(d, c(\"Type\", \"\u003ccode\u003eRITCH\u003c/code\u003e Class\",\n                          \"\u003ccode\u003eRITCH\u003c/code\u003e Read Function\", \"ITCH Name\",\n       
                   \"ITCH Spec Section\"))\n\nknitr::kable(d, escape = FALSE)\n```\n\nNote that if you are interested in the exact definition of the messages and their components, you should look into the [official ITCH specification](https://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHspecification.pdf), which can also be opened by calling `open_itch_specification()`.\n\n\n## Data\n\nThe `RITCH` package provides a small, artificial dataset in the ITCH format for example and test purposes.\nTo learn more about the dataset, check `?ex20101224.TEST_ITCH_50`.\n\nTo access the dataset, use:\n```{r}\nfile \u003c- system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\ncount_messages(file, add_meta_data = TRUE, quiet = TRUE)\n```\nNote that the example dataset does not contain messages from all classes but is limited to 6 system messages, 3 stock directory, 3 stock trading action, 5000 trade, 5000 order, and 2000 order modification messages.\nAs the 3 stock directory messages indicate, the file contains data about 3 made-up stocks (see also the plot later in the Readme).\n\nNASDAQ provides sample ITCH files on their official server at \u003chttps://emi.nasdaq.com/ITCH/Nasdaq%20ITCH/\u003e (or in R use `open_itch_sample_server()`), which can be used to test code on larger datasets.\nNote that the sample files are up to 5GB compressed and inflate to about 13GB.\nTo interact with the sample files, use `list_sample_files()` and `download_sample_file()`.\n\n\n## Notes on Memory and Speed\n\nThere are some tweaks available to deal with memory and speed issues.\nFor faster reading speeds, you can increase the buffer size of the `read_` functions to something around 1 GB or more (`buffer_size = 1e9`).\n\n### Provide Message Counts\n\nIf you have to read from a single file multiple times, for example because you want to extract orders and trades, you can count the messages beforehand and provide the counts to each read's `n_max` 
argument, which avoids a separate pass over the file to count the messages.\n```{r}\n# count messages once\nn_msgs \u003c- count_messages(file, quiet = TRUE)\n\n# use counted messages multiple times, saving file passes\norders \u003c- read_orders(file, quiet = TRUE, n_max = n_msgs)\ntrades \u003c- read_trades(file, quiet = TRUE, n_max = n_msgs)\n```\n\n### Batch Read\n\nIf the dataset does not fit entirely into RAM, you can do a partial read specifying `skip` and `n_max`, similar to this:\n\n```{r}\nfile \u003c- system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\n\nn_messages \u003c- count_orders(count_messages(file, quiet = TRUE))\nn_messages\n\n# read 1000 messages at a time\nn_batch \u003c- 1000\nn_parsed \u003c- 0\n\nwhile (n_parsed \u003c n_messages) {\n  cat(sprintf(\"Parsing Batch %04i - %04i\", n_parsed, n_parsed + n_batch))\n  # read in a batch\n  df \u003c- read_orders(file, quiet = TRUE, skip = n_parsed, n_max = n_batch)\n  cat(sprintf(\": with %04i orders\\n\", nrow(df)))\n  # use the data\n  # ...\n  n_parsed \u003c- n_parsed + n_batch\n}\n```\n\n### Filter when Reading Data\n\nYou can also filter a dataset directly while reading messages by `msg_type`, `stock_locate`, `timestamp` range, as well as `stock`.\nNote that filtering for a specific stock is just a shorthand lookup for the stock's `stock_locate` code. Therefore a `stock_directory` needs to be supplied (either the output from `read_stock_directory()` or from `download_stock_directory()`); otherwise the function will try to extract the stock directory from the file, which might take some time depending on the size of the file.\n\n```{r}\n# read in the stock directory as we filter for stock names later on\nsdir \u003c- read_stock_directory(file, quiet = TRUE)\n\nod \u003c- read_orders(\n  file,\n  filter_msg_type = \"A\",          # take only 'No MPID add orders'\n  min_timestamp = 43200000000000, # start at 12:00:00.000000\n  max_timestamp = 55800000000000, # end at 
15:30:00.000000\n  filter_stock_locate = 1,        # take only stock with code 1\n  filter_stock = \"CHAR\",          # but also take stock CHAR\n  stock_directory = sdir          # provide the stock_directory to match stock names to stock_locates\n)\n\n# count the different message types\nod[, .(n = .N), by = msg_type]\n# see if the timestamp is in the specified range\nrange(od$timestamp)\n# count the stock/stock-locate codes\nod[, .(n = .N), by = .(stock_locate, stock)]\n```\n\n### Filter Data to File\n\nOn larger files, reading the data into memory might not be the best idea, especially if only a small subset is actually needed.\nIn this case, the `filter_itch` function will come in handy.\n\nThe basic design is identical to the `read_itch` function, but instead of reading the messages into memory, it writes them immediately to a file.\n\nTaking the filter data example from above, we can do the following:\n\n```{r}\n# the function returns the final name of the output file\noutfile \u003c- filter_itch(\n  infile = file,\n  outfile = \"filtered\",\n  filter_msg_type = \"A\",          # take only 'No MPID add orders'\n  min_timestamp = 43200000000000, # start at 12:00:00.000000\n  max_timestamp = 55800000000000, # end at 15:30:00.000000\n  filter_stock_locate = 1,        # take only stock with code 1\n  filter_stock = \"CHAR\",          # but also take stock CHAR\n  stock_directory = sdir          # provide the stock_directory to match stock names to stock_locates\n)\n\nformat_bytes(file.size(outfile))\n\n# read in the orders from the filtered file\nod2 \u003c- read_orders(outfile)\n\n# check that the filtered dataset contains the same information as in the example above\nall.equal(od, od2)\n```\n```{r, include=FALSE}\n# remove files from filter_itch again...\nunlink(outfile)\n```\n\n\n## Create a Plot with Trades and Orders of the largest ETFs\n\nAs a last step, here is a quick visualization of the example dataset:\n\n```{r ETF_plot}\nlibrary(ggplot2)\n\nfile \u003c- 
system.file(\"extdata\", \"ex20101224.TEST_ITCH_50\", package = \"RITCH\")\n\n# load the data\norders \u003c- read_orders(file, quiet = TRUE)\ntrades \u003c- read_trades(file, quiet = TRUE)\n\n# replace the buy-factor with something more useful\norders[, buy := ifelse(buy, \"Bid\", \"Ask\")]\n\nggplot() +\n  geom_point(data = orders,\n             aes(x = as.POSIXct(datetime), y = price, color = buy), alpha = 0.2) +\n  geom_step(data = trades, aes(x = as.POSIXct(datetime), y = price), size = 0.2) +\n  facet_grid(stock~., scales = \"free_y\") +\n  theme_light() +\n  labs(title = \"Orders and Trades of Three Simulated Stocks\",\n       subtitle = \"Date: 2010-12-24 | Exchange: TEST\",\n       caption = \"Source: RITCH package\", x = \"Time\", y = \"Price\", color = \"Side\") +\n  scale_y_continuous(labels = scales::dollar) +\n  scale_color_brewer(palette = \"Set1\")\n```\n\n\n## Other Notes\n\nIf you find this package useful or have any other kind of feedback, I'd be happy if you let me know. Otherwise, if you need more functionality, please feel free to create an issue or a pull request.\n\nCitation and CRAN release are WIP.\n\nIf you are interested in gaining a better understanding of the internal data structures, converting data to and from binary, have a look at the `debug` folder and its contents (only available on the [RITCH's Github page](https://github.com/DavZim/RITCH/)).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavzim%2Fritch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavzim%2Fritch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavzim%2Fritch/lists"}