Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/3p3r/csv-query-stream
Query large compressed CSV documents using NodeJS streams.
- Host: GitHub
- URL: https://github.com/3p3r/csv-query-stream
- Owner: 3p3r
- License: mit
- Created: 2023-02-21T22:22:26.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-24T03:48:59.000Z (almost 2 years ago)
- Last Synced: 2024-12-21T08:42:31.873Z (about 2 months ago)
- Topics: csv, fast, query, stream, streaming, zip
- Language: TypeScript
- Homepage:
- Size: 4.18 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# csv-query-stream
Query large compressed CSV documents using NodeJS streams.
## Use Case
```sh
$ npm install csv-query-stream
```
In mission-critical applications, sometimes even the extra overhead of SQLite
indices can be too much. In that case, data can be saved directly to text files
and queried with this module while remaining inside a Zip archive that is never
unpacked. This module uses two streams to achieve this:
1. a stream that reads the Zip file and seeks to the CSV file's position in it
2. a stream that reads the CSV file and runs the query inside it

Using streams keeps memory overhead low and processing fast.
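The module's own two streams are internal, but the idea can be illustrated with off-the-shelf pieces. The sketch below is not this module's code: it shows the same two-stream approach using `yauzl` to stream a CSV entry out of a Zip archive and `readline` to scan it line by line. The archive name, CSV name, and `targetId` are made-up example values.

```ts
// Conceptual sketch only -- not csv-query-stream's actual implementation.
// Stream 1 comes from yauzl (seeks to the CSV entry inside the Zip),
// stream 2 is a readline scan over the decompressed CSV contents.
import * as yauzl from "yauzl";
import * as readline from "node:readline";

const targetId = 42; // row ID to look up; ID = line number minus 1

yauzl.open("archive.zip", { lazyEntries: true }, (err, zipfile) => {
  if (err) throw err;
  zipfile.readEntry();
  zipfile.on("entry", (entry) => {
    if (entry.fileName !== "records.csv") {
      zipfile.readEntry(); // not our CSV; move on to the next entry
      return;
    }
    // Stream 1: decompressed bytes of the CSV entry, read straight out of the Zip.
    zipfile.openReadStream(entry, (streamErr, csvStream) => {
      if (streamErr) throw streamErr;
      // Stream 2: line-oriented scan over the CSV contents.
      const rl = readline.createInterface({ input: csvStream });
      let lineNumber = 0;
      rl.on("line", (line) => {
        lineNumber++; // 1-based line number
        if (lineNumber - 1 === targetId) {
          console.log(line.split(",")); // the queried row
          rl.close(); // a real program would also clean up the underlying streams
        }
      });
    });
  });
});
```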
## Data Format
The following assumptions are made about your data when using this module:
- Your data is in CSV or TSV file(s)
- Every row of data is unique in its own file
- Your data file(s) are inside a Zip archive at the root level
- Row IDs are monotonic: a row's ID is its line number minus 1
- First row of data is a header row

Sample data is checked in under the `test/` directory.
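For illustration, a file satisfying these assumptions might look like this (the column names are made up):
- line 1: `name,score` (header row)
- line 2: `alice,10` (row ID 1)
- line 3: `bob,20` (row ID 2)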
API usage is pretty straightforward. See `test/` for examples.
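As a rough idea of what a query might look like, here is a hypothetical usage sketch; the `CsvQueryStream` name, its options, and its events are placeholders rather than the module's documented interface, which lives in the sources and `test/`.

```ts
// Hypothetical sketch: the class name, options, and events below are
// placeholders, not csv-query-stream's documented API -- see test/ for
// the real interface.
import { CsvQueryStream } from "csv-query-stream";

const query = new CsvQueryStream({
  archive: "data.zip",   // Zip archive with the CSV at its root
  file: "records.csv",   // CSV entry inside the archive
});

query.on("row", (row: string[]) => console.log(row));   // matched row
query.on("error", (err: Error) => console.error(err));
query.find(42); // look up the row whose ID (line number minus 1) is 42
```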