{"id":14069017,"url":"https://github.com/fstpackage/fsttable","last_synced_at":"2026-01-17T17:56:56.057Z","repository":{"id":224199929,"uuid":"117897410","full_name":"fstpackage/fsttable","owner":"fstpackage","description":"An interface to fast on-disk data tables stored with the fst format","archived":false,"fork":false,"pushed_at":"2019-09-10T08:39:21.000Z","size":89,"stargazers_count":27,"open_issues_count":22,"forks_count":4,"subscribers_count":11,"default_branch":"develop","last_synced_at":"2024-08-13T07:15:36.335Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fstpackage.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2018-01-17T21:55:50.000Z","updated_at":"2024-07-17T03:13:10.000Z","dependencies_parsed_at":"2024-02-24T13:54:31.833Z","dependency_job_id":null,"html_url":"https://github.com/fstpackage/fsttable","commit_stats":null,"previous_names":["fstpackage/fsttable"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fstpackage%2Ffsttable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fstpackage%2Ffsttable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fstpackage%2Ffsttable/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fstpackage%2Ffsttable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fstpackage","download_url":"https://codeload.github.com/fstpack
age/fsttable/tar.gz/refs/heads/develop","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228092113,"owners_count":17868138,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-13T07:06:32.809Z","updated_at":"2026-01-17T17:56:56.016Z","avatar_url":"https://github.com/fstpackage.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\neditor_options: \n  chunk_output_type: console\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# fsttable\n\n\u003c!-- badges: start --\u003e\n[![Linux/OSX Build Status](https://travis-ci.org/fstpackage/fsttable.svg?branch=develop)](https://travis-ci.org/fstpackage/fsttable)\n[![Windows Build status](https://ci.appveyor.com/api/projects/status/nrjyuihxtx9amgpl/branch/develop?svg=true\n)](https://ci.appveyor.com/project/fstpackage/fsttable)\n[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)\n[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)\n\u003c!-- badges: end --\u003e\n\nR package `fsttable` aims to provide a fully functional `data.table` interface to on-disk `fst` files. 
The focus of the package is on keeping memory usage as low as possible without sacrificing features of in-memory `data.table` operations.\n\n## Installation\n\nYou can install the latest package version with:\n\n``` r\ndevtools::install_github(\"fstpackage/fsttable\")\n```\n\n## Example\n\nFirst, we create an on-disk _fst_ file containing a medium-sized dataset:\n\n```{r data}\nlibrary(fsttable)\n\n# write some sample data to disk\nnr_of_rows \u003c- 1e6\nx \u003c- data.table::data.table(X = 1:nr_of_rows, Y = LETTERS[1 + (1:nr_of_rows) %% 26])\nfst::write_fst(x, \"1.fst\")\n```\n\nThen we define our _fst\\_table_ by using:\n\n```{r proxy}\nft \u003c- fst_table(\"1.fst\")\n```\n\nThis _fst\\_table_ can be used as a regular _data.table_ object. For example, we can print:\n\n```{r print}\nft\n```\n\nwe can select columns:\n\n```{r columns}\nft[, .(Y)]\n```\n\nand rows:\n\n```{r rows}\nft[1:4,]\n```\n\nor both at the same time:\n\n```{r cols_and_rows}\nft[1:4, .(X)]\n```\n\n# Memory\n\nDuring the operations shown above, the actual data was never fully loaded from the file. That's because of `fsttable`'s philosophy of keeping RAM usage as low as possible. Printing a few lines of a table doesn't require knowledge of the remaining lines, so `fsttable` will never actually load them.\n\nEven when you create a new set:\n\n```{r copy}\nft2 \u003c- ft[1:4, .(X)]\n```\n\nno actual data is loaded into RAM. The copy still uses the original _fst_ file to keep the data on-disk:\n\n```{r internals}\n# small size because actual data is still on disk\nobject.size(ft2)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffstpackage%2Ffsttable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffstpackage%2Ffsttable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffstpackage%2Ffsttable/lists"}