https://github.com/nbbrd/sasquatch
SAS dataset library for Java
- Host: GitHub
- URL: https://github.com/nbbrd/sasquatch
- Owner: nbbrd
- License: eupl-1.2
- Created: 2020-03-24T13:57:15.000Z (over 4 years ago)
- Default Branch: develop
- Last Pushed: 2024-09-07T12:36:52.000Z (4 months ago)
- Last Synced: 2024-09-07T13:54:58.720Z (4 months ago)
- Topics: command-line-tool, desktop-application, java8, library, sas7bdat
- Language: Java
- Homepage:
- Size: 732 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Sasquatch - SAS dataset library for Java
[![Download](https://img.shields.io/github/release/nbbrd/sasquatch.svg)](https://github.com/nbbrd/sasquatch/releases/latest)
[![Changes](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2Fnbbrd%2Fsasquatch%2Fbadges%2Funreleased-changes.json)](https://github.com/nbbrd/sasquatch/blob/develop/CHANGELOG.md)

This [Java library](#java-library) provides a reader for SAS datasets.
It also provides a [command-line tool](#command-line-tool) and a [desktop application](#desktop-application).

Key points:
- lightweight library designed as a [facade](https://en.wikipedia.org/wiki/Facade_pattern)
- Java 8 minimum requirement
- has a module-info that makes it compatible with [JPMS](https://www.baeldung.com/java-9-modularity)

Features:
- reads metadata and data from SAS datasets (`*.sas7bdat`)
- browses data with 3 types of cursor: forward-only, scrollable and splittable
- is compatible with the Java Stream API
- provides a simple facade that lets you plug in any implementation at deployment time
- requires only a single mandatory dependency

## Java library
### API overview
Sasquatch is instantiated by a factory:
```java
Sasquatch sasquatch = Sasquatch.ofServiceLoader();
```

It provides 3 ways of browsing the data:
- forward-only: row by row from the first to the last
- scrollable: any row by its position
- splittable: rows as a (parallel) stream

```java
Path file = ...;

// forward-only cursor
try (SasForwardCursor cursor = sasquatch.readForward(file)) {
  while (cursor.next()) {
    // process the current row
  }
}

// scrollable cursor
try (SasScrollableCursor cursor = sasquatch.readScrollable(file)) {
  for (int i = 0; i < cursor.getRowCount(); i++) {
    cursor.moveTo(i);
    // process the row at position i
  }
}

// splittable cursor
try (SasSplittableCursor cursor = sasquatch.readSplittable(file)) {
  Stream<SasRow> stream = StreamSupport.stream(cursor.getSpliterator(), false);
  // consume the stream before the cursor is closed
}
```
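To make the difference between the three browsing styles concrete, here is a minimal JDK-only sketch. It does not use Sasquatch at all: the `CursorStylesSketch` class and its in-memory `ROWS` list are hypothetical stand-ins for a dataset, chosen only to contrast sequential iteration, positional access and `Spliterator`-based streaming.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for a dataset (not a Sasquatch type): each row is a String[].
final class CursorStylesSketch {

    static final List<String[]> ROWS = Arrays.asList(
            new String[]{"a", "1"},
            new String[]{"b", "2"},
            new String[]{"c", "3"});

    // forward-only: visit every row once, from the first to the last
    static String forward() {
        StringBuilder firstFields = new StringBuilder();
        Iterator<String[]> iterator = ROWS.iterator();
        while (iterator.hasNext()) {
            firstFields.append(iterator.next()[0]);
        }
        return firstFields.toString();
    }

    // scrollable: jump directly to any row by its position
    static String scrollable(int rowIndex) {
        return ROWS.get(rowIndex)[0];
    }

    // splittable: expose a Spliterator so rows can be consumed as a (parallel) stream
    static List<String> splittable(boolean parallel) {
        Spliterator<String[]> spliterator = ROWS.spliterator();
        return StreamSupport.stream(spliterator, parallel)
                .map(row -> row[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(forward());
        System.out.println(scrollable(2));
        System.out.println(splittable(true));
    }
}
```

The second argument of `StreamSupport.stream` selects sequential or parallel processing, which is why a splittable source is the natural fit for the Stream API.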
Some shortcuts are also available:

```java
// sample factory that extracts the first field as a string
SasRow.Factory<String> factory = cursor -> row -> row.getString(0);

// stream shortcut
try (Stream<String> stream = sasquatch.rows(file, factory)) {
  // consume the stream
}

// list shortcut
List<String> rows = sasquatch.getAllRows(file, factory);
```
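The `cursor -> row -> ...` shape of the factory above is a two-stage function: the outer stage runs once per cursor, the inner stage runs once per row. The following JDK-only sketch illustrates why that shape can be useful, for example to resolve a column index once and reuse it for every row. All types here (`Row`, `Cursor`, `RowFactory`, `readAll`) are hypothetical stand-ins written for this illustration and are not the Sasquatch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class RowFactorySketch {

    // Hypothetical stand-ins; these are not Sasquatch types.
    interface Row { String getString(int columnIndex); }
    interface Cursor { List<String> getColumnNames(); }

    // Two-stage factory: cursor -> (row -> value)
    interface RowFactory<T> extends Function<Cursor, Function<Row, T>> { }

    // Builds an in-memory row from its field values
    static Row rowOf(String... values) {
        return columnIndex -> values[columnIndex];
    }

    static <T> List<T> readAll(Cursor cursor, List<Row> rows, RowFactory<T> factory) {
        // Stage one: executed once, with access to the cursor
        Function<Row, T> mapper = factory.apply(cursor);
        List<T> result = new ArrayList<>();
        for (Row row : rows) {
            // Stage two: executed for every row
            result.add(mapper.apply(row));
        }
        return result;
    }

    static List<String> demo() {
        Cursor cursor = () -> Arrays.asList("ID", "NAME");
        List<Row> rows = Arrays.asList(rowOf("1", "alice"), rowOf("2", "bob"));
        RowFactory<String> nameOnly = c -> {
            // Column lookup happens once, not once per row
            int nameIndex = c.getColumnNames().indexOf("NAME");
            return row -> row.getString(nameIndex);
        };
        return readAll(cursor, rows, nameOnly);
    }
}
```

In this sketch, `RowFactorySketch.demo()` collects the `NAME` field of each row into a list.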
Metadata can be retrieved directly or through a cursor:
```java
// direct
SasMetaData meta = sasquatch.readMetaData(file);

// through a cursor
try (SasCursor cursor = sasquatch.read...(file)) {
  cursor.getMetaData();
}
```

### Implementations
At least one implementation must be available at runtime (on the classpath or module path) in order to read datasets. If no implementation is available, read operations throw an `IOException`.
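As an illustration of this runtime lookup, here is a minimal sketch of a facade that discovers implementations with the JDK's `java.util.ServiceLoader` and fails with an `IOException` when none is found. The names `FacadeSketch`, `DatasetReader` and `countRows` are assumptions made for this example; the actual Sasquatch service interface and its internals may differ.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.ServiceLoader;

final class FacadeSketch {

    // Hypothetical service-provider interface; not the actual Sasquatch SPI.
    public interface DatasetReader {
        int countRows(Path file) throws IOException;
    }

    private final Iterable<DatasetReader> readers;

    FacadeSketch(Iterable<DatasetReader> readers) {
        this.readers = readers;
    }

    // Finds providers declared in META-INF/services or in a module-info "provides" clause
    static FacadeSketch ofServiceLoader() {
        return new FacadeSketch(ServiceLoader.load(DatasetReader.class));
    }

    int countRows(Path file) throws IOException {
        Iterator<DatasetReader> iterator = readers.iterator();
        if (!iterator.hasNext()) {
            // No implementation on the classpath or module path
            throw new IOException("No implementation available to read " + file);
        }
        // Delegate to the first implementation found
        return iterator.next().countRows(file);
    }
}
```

Because the facade depends only on the interface, an implementation can be swapped at deployment time by changing which artifact is present at runtime, without recompiling the calling code.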
Sasquatch supports the following implementations:
| artifactId | description | support |
| --- | --- | :---: |
| `sasquatch-ri` | native reference implementation | advanced |
| `sasquatch-parso` | wrapper around parso library | advanced |
| `sasquatch-sassy` | wrapper around sassy library | basic |
| `sasquatch-biostatmatt` | Java version of biostatmatt R code | basic |

Feature matrix:

| | `ri` | `parso` | `sassy` | `biostatmatt` |
| ---: | :---: | :---: | :---: | :---: |
| `BIG_ENDIAN_32` | x | x | - | - |
| `LITTLE_ENDIAN_32` | x | x | x | x |
| `BIG_ENDIAN_64` | x | x | - | - |
| `LITTLE_ENDIAN_64` | x | x | - | x |
| `ATTRIBUTES` | x | x | - | x |
| `LABEL_META` | x | x | - | - |
| `FIELD_ENCODING` | x | x | - | - |
| `COLUMN_ENCODING` | x | x | - | - |
| `CHAR_COMP` | x | x | - | - |
| `BIN_COMP` | x | x | - | - |
| `DATE_TYPE` | x | x | - | - |
| `DATE_TIME_TYPE` | x | x | - | - |
| `TIME_TYPE` | x | x | - | - |
| `CUSTOM_NUMERIC` | x | x | x | - |
| `COLUMN_FORMAT` | x | x | - | - |

### Dependencies setup
```xml
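<dependencies>
  <dependency>
    <groupId>com.github.nbbrd.sasquatch</groupId>
    <artifactId>sasquatch-api</artifactId>
    <version>LATEST_VERSION</version>
  </dependency>
  <dependency>
    <groupId>com.github.nbbrd.sasquatch</groupId>
    <artifactId>sasquatch-ri</artifactId>
    <version>LATEST_VERSION</version>
    <scope>runtime</scope>
  </dependency>
</dependencies>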
```
## Command-line tool
The command-line tool (`sasquatch` in the `sasquatch-cli` project) exports a SAS dataset to a CSV or SQL file.
```bash
$ sasquatch csv somedata.sas7bdat -o somedata.csv
$ sasquatch sql somedata.sas7bdat -o somedata.sql
```

## Desktop application
The desktop application (`sasquatchw` in `sasquatch-desktop` project) is a basic dataset viewer.