https://github.com/hughparsonage/factor256
- Host: GitHub
- URL: https://github.com/hughparsonage/factor256
- Owner: HughParsonage
- Created: 2021-08-27T12:20:52.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-11-17T10:02:03.000Z (about 1 year ago)
- Last Synced: 2024-03-24T11:00:32.042Z (9 months ago)
- Language: R
- Size: 65.4 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# factor256

The goal of factor256 is to minimize the memory footprint of data analysis that
uses categorical variables with fewer than 256 unique values.

## Installation

You can install the development version of factor256 from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("HughParsonage/factor256")
```

## Example

This basic example encodes a character vector with `factor256()` and recovers the original values with `recompose256()`:
```{r basic-example}
library(factor256)
x <- factor256(LETTERS)
typeof(x)
identical(recompose256(x), LETTERS)
```

```{r data-example}
library(data.table)
DT <-
  CJ(Year = 2000:2020,
     State = rep_len(c("WA", "SA", "NSW", "NT", "TAS", "VIC", "QLD"), 1000),
     Age = rep_len(0:100, 10000))
# pryr::object_size(DT)
# 3.36 GB

# Replace each column in place with its factor256 encoding
for (j in seq_along(DT)) {
  set(DT, j = j, value = factor256(.subset2(DT, j)))
}
# pryr::object_size(DT)
# 630 MB
```
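
The sizes quoted in the comments above can be checked by hand, and the idea of storing a category in a single byte can be sketched in base R. The block below is only an illustration under stated assumptions: `encode_raw()` and `decode_raw()` are hypothetical helpers written for this sketch, not functions exported by factor256, and they are not a description of how the package is implemented. The sketch ignores missing values and assumes a 64-bit build of R, where each element of a character vector is an 8-byte pointer.

```r
# Hypothetical helpers for illustration only (not part of factor256).
encode_raw <- function(x) {
  lev <- sort(unique(x))
  stopifnot(length(lev) <= 256L)
  # match() returns 1-based positions; shift to 0..255 so each fits in one byte
  structure(as.raw(match(x, lev) - 1L), lev = lev)
}

decode_raw <- function(r) {
  # Map each byte back to the value it stands for
  attr(r, "lev")[as.integer(r) + 1L]
}

states <- rep_len(c("WA", "SA", "NSW", "NT", "TAS", "VIC", "QLD"), 1e6)
enc <- encode_raw(states)

typeof(enc)                          # storage type of the encoded vector
identical(decode_raw(enc), states)   # round trip recovers the input

# Bytes per element, including the small fixed header of each vector:
# roughly 4 for integer and roughly 1 for raw
as.numeric(object.size(integer(1e6))) / 1e6
as.numeric(object.size(raw(1e6))) / 1e6

# Arithmetic behind the sizes quoted for DT
n_rows <- 21 * 1000 * 10000          # CJ() keeps duplicates by default
gb_before <- n_rows * (4 + 8 + 4) / 1e9  # integer + character pointer + integer
mb_after  <- n_rows * 3 / 1e6            # three one-byte columns
```

With these assumptions, `gb_before` works out to 3.36 and `mb_after` to 630, matching the figures in the example.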