{"id":20964542,"url":"https://github.com/hyparam/hyparquet-compressors","last_synced_at":"2025-05-14T09:32:41.937Z","repository":{"id":240636803,"uuid":"798086315","full_name":"hyparam/hyparquet-compressors","owner":"hyparam","description":"Decompressors for hyparquet","archived":false,"fork":false,"pushed_at":"2025-03-20T09:51:34.000Z","size":840,"stargazers_count":13,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-10T03:01:32.393Z","etag":null,"topics":["brotli","decompress","decompression","decompressor","gzip","hyperparam","javascript","js","lz4","parquet","zstd"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hyparam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-05-09T04:18:52.000Z","updated_at":"2025-05-06T10:41:03.000Z","dependencies_parsed_at":"2025-01-11T06:31:20.626Z","dependency_job_id":"12a2261b-6fc6-4828-930e-a93bf22b5461","html_url":"https://github.com/hyparam/hyparquet-compressors","commit_stats":null,"previous_names":["hyparam/hyparquet-compressors"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyparam%2Fhyparquet-compressors","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyparam%2Fhyparquet-compressors/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyparam%2Fhyparquet-compressors/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/hyparam%2Fhyparquet-compressors/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hyparam","download_url":"https://codeload.github.com/hyparam/hyparquet-compressors/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254112389,"owners_count":22016750,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["brotli","decompress","decompression","decompressor","gzip","hyperparam","javascript","js","lz4","parquet","zstd"],"created_at":"2024-11-19T02:56:00.551Z","updated_at":"2025-05-14T09:32:36.920Z","avatar_url":"https://github.com/hyparam.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hyparquet decompressors\n\n![hyparquet parakeets](hyparquet-compressors.jpg)\n\n[![npm](https://img.shields.io/npm/v/hyparquet-compressors)](https://www.npmjs.com/package/hyparquet-compressors)\n[![workflow status](https://github.com/hyparam/hyparquet-compressors/actions/workflows/ci.yml/badge.svg)](https://github.com/hyparam/hyparquet-compressors/actions)\n[![mit license](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n![coverage](https://img.shields.io/badge/Coverage-86-darkred)\n\nThis package exports a `compressors` object intended to be passed into [hyparquet](https://github.com/hyparam/hyparquet).\n\n[Apache Parquet](https://parquet.apache.org) is a popular columnar storage format that is widely used in data engineering, data science, and machine learning applications for 
efficiently storing and processing large datasets. It supports a number of different compression formats, but most parquet files use snappy compression.\n\nThe hyparquet library by default only supports `uncompressed` and `snappy` compressed files. The `hyparquet-compressors` package extends support to the other parquet compression formats (all except LZO).\n\nThe `hyparquet-compressors` package works in both node.js and the browser. It uses js and wasm packages, with no system dependencies.\n\n## Usage\n\n```js\nimport { parquetRead } from 'hyparquet'\nimport { compressors } from 'hyparquet-compressors'\n\nawait parquetRead({ file, compressors, onComplete: console.log })\n```\n\nSee the [hyparquet](https://github.com/hyparam/hyparquet) repo for further info.\n\n# Compression formats\n\nParquet compression types supported with `hyparquet-compressors`:\n - [X] Uncompressed\n - [X] Snappy\n - [X] Gzip\n - [ ] LZO\n - [X] Brotli\n - [X] LZ4\n - [X] ZSTD\n - [X] LZ4_RAW\n\n## Snappy\n\nSnappy compression uses [hysnappy](https://github.com/hyparam/hysnappy) for fast snappy decompression using minimal wasm.\n\n## Gzip\n\nNew gzip implementation adapted from [fflate](https://github.com/101arrowz/fflate).\nIncludes modifications to handle repeated back-to-back gzip streams, which sometimes occur in parquet files but were not supported by fflate.\n\n## Brotli\n\nIncludes a minimal port of [brotli.js](https://github.com/foliojs/brotli.js), which pre-compresses the brotli dictionary using gzip to minimize the distribution bundle size.\n\n## LZ4\n\nNew LZ4 implementation includes support for the legacy hadoop LZ4 frame format used in some old parquet files.\n\n## Zstd\n\nUses [fzstd](https://github.com/101arrowz/fzstd) for Zstandard decompression.\n\n# Bundle size\n\n| File | Size |\n| --- | --- |\n| hyparquet-compressors.min.js | 116.1kb |\n| hyparquet-compressors.min.js.gz | 75.2kb |\n\n# References\n\n - https://parquet.apache.org/docs/file-format/data-pages/compression/\n - 
https://en.wikipedia.org/wiki/Brotli\n - https://en.wikipedia.org/wiki/Gzip\n - https://en.wikipedia.org/wiki/LZ4_(compression_algorithm)\n - https://en.wikipedia.org/wiki/Snappy_(compression)\n - https://en.wikipedia.org/wiki/Zstd\n - https://github.com/101arrowz/fflate\n - https://github.com/101arrowz/fzstd\n - https://github.com/foliojs/brotli.js\n - https://github.com/hyparam/hysnappy\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyparam%2Fhyparquet-compressors","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhyparam%2Fhyparquet-compressors","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyparam%2Fhyparquet-compressors/lists"}