https://github.com/janmarvin/readsas

Read the SAS file formats
https://github.com/janmarvin/readsas

r r-package rcpp reader sas sas7bdat

Host: GitHub
URL: https://github.com/janmarvin/readsas
Owner: JanMarvin
License: gpl-2.0
Created: 2019-10-05T09:43:57.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2024-09-17T15:22:40.000Z (4 months ago)
Last Synced: 2024-09-18T11:59:02.347Z (4 months ago)
Topics: r, r-package, rcpp, reader, sas, sas7bdat
Language: C++
Homepage: https://janmarvin.github.io/readsas
Size: 403 KB
Stars: 3
Watchers: 3
Forks: 0
Open Issues: 5
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

README

---
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  out.width = "100%"
)
library(readsas)
```

# readsas 

![R-CMD-check](https://github.com/JanMarvin/readsas/workflows/R-CMD-check/badge.svg)

[![Codecov test coverage](https://codecov.io/gh/JanMarvin/readsas/branch/main/graph/badge.svg)](https://app.codecov.io/gh/JanMarvin/readsas?branch=main) [![readsas status badge](https://janmarvin.r-universe.dev/badges/readsas)](https://janmarvin.r-universe.dev/readsas)

An R package that uses Rcpp to parse a SAS file into a `data.frame`. Currently, `read.sas` is the main function and feature of this package.

The package allows (experimental) reading of sas7bdat files that are

* (un)compressed

As with other releases of the `read` series, the focus is again on being as accurate as possible. Speed is welcome, but it is a secondary goal.

## Installation

With `remotes`:

``` r
remotes::install_github("JanMarvin/readsas")
```

With `r-universe`:

``` r
options(repos = c(
  janmarvin = 'https://janmarvin.r-universe.dev',
  CRAN = 'https://cloud.r-project.org'))
install.packages('readsas')
```

## Basic usage

```{r}
fl <- system.file("extdata", "cars.sas7bdat", package = "readsas")
dd <- read.sas(fl)
head(dd)
```

## Select columns or rows

Reading only selected columns or rows should be much faster than reading the whole file, since unselected cells are skipped while reading, and it is also more memory efficient. However, the file header is always read in its entirety, so a file with a large header will still take some time to read.

```{r}
fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")
dd <- read.sas(fl, select.cols = c("VAR1", "mpg", "hp"),
               select.rows = c(2:5), rownames = TRUE)
head(dd)
```
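To check whether selective reading pays off for a given file, one option is to compare a full read with a restricted read. The following is only a rough sketch: it reuses the bundled `mtcars.sas7bdat` file and the `select.cols` and `select.rows` arguments shown above, and the variable names `full`, `part`, `t_full` and `t_part` are made up for illustration. The bundled file is tiny, so any difference here will be negligible. The comparison is more informative when `fl` points to a large file of your own.

``` r
library(readsas)

# Example file shipped with the package (same file as in the chunk above).
fl <- system.file("extdata", "mtcars.sas7bdat", package = "readsas")

# Time a full read and a read restricted to two columns and four rows.
t_full <- system.time(full <- read.sas(fl))
t_part <- system.time(part <- read.sas(fl, select.cols = c("mpg", "hp"),
                                       select.rows = 2:5))

# Elapsed seconds for both reads.
c(full = t_full[["elapsed"]], part = t_part[["elapsed"]])

# Dimensions and approximate in-memory size of both results.
dim(full)
dim(part)
object.size(full)
object.size(part)
```

Because the header is always parsed completely, the elapsed time of the restricted read is not expected to drop to zero even when very few cells are selected.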

## Thanks

The documentation of the sas7bdat file format by Matt Shotwell and Clint Cummins in their R package [`sas7bdat`](https://github.com/BioStatMatt/sas7bdat), by Jared Hobbs in the Python library [`sas7bdat`](https://bitbucket.org/jaredhobbs/sas7bdat/src/master/), and by EPAM in the Java library [`parso`](https://github.com/epam/parso) was crucial. Without their work deciphering the SAS format, this package would not have been possible.

Further testing was done using the R package [`haven`](https://github.com/tidyverse/haven) by Hadley Wickham and Evan Miller.