{"id":13456412,"url":"https://github.com/uhop/stream-json","last_synced_at":"2025-05-13T18:06:59.223Z","repository":{"id":9997120,"uuid":"12029945","full_name":"uhop/stream-json","owner":"uhop","description":"The micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory streaming individual primitives using a SAX-inspired API.","archived":false,"fork":false,"pushed_at":"2025-02-03T06:26:21.000Z","size":921,"stargazers_count":1047,"open_issues_count":8,"forks_count":48,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-05-03T21:01:51.529Z","etag":null,"topics":["javascript-objects","parse-json-files","parser","stream-components","stream-processing","streaming-json"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/uhop.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"uhop","buy_me_a_coffee":"uhop"}},"created_at":"2013-08-11T02:17:41.000Z","updated_at":"2025-05-02T22:40:51.000Z","dependencies_parsed_at":"2024-05-04T16:33:06.626Z","dependency_job_id":"4d93e11a-bc28-4c04-a8a0-303b78f999ff","html_url":"https://github.com/uhop/stream-json","commit_stats":{"total_commits":345,"total_committers":13,"mean_commits":26.53846153846154,"dds":"0.24927536231884062","last_synced_commit":"6b8ac553be30fa9ffaeb24256f728903b8c69a11"},"previous_names":[],"tags_count":48,"template":false,"template_full_name":null,"repository_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhop%2Fstream-json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhop%2Fstream-json/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhop%2Fstream-json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/uhop%2Fstream-json/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/uhop","download_url":"https://codeload.github.com/uhop/stream-json/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000848,"owners_count":21997441,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["javascript-objects","parse-json-files","parser","stream-components","stream-processing","streaming-json"],"created_at":"2024-07-31T08:01:21.442Z","updated_at":"2025-05-13T18:06:59.200Z","avatar_url":"https://github.com/uhop.png","language":"JavaScript","readme":"# stream-json [![NPM version][npm-image]][npm-url]\n\n[npm-image]:      https://img.shields.io/npm/v/stream-json.svg\n[npm-url]:        https://npmjs.org/package/stream-json\n\n`stream-json` is a micro-library of node.js stream components with minimal dependencies for creating custom data processors oriented on processing huge JSON files while requiring a minimal memory footprint. It can parse JSON files far exceeding available memory. Even individual primitive data items (keys, strings, and numbers) can be streamed piece-wise. 
A streaming SAX-inspired event-based API is included as well.\n\nAvailable components:\n\n* Streaming JSON [Parser](https://github.com/uhop/stream-json/wiki/Parser).\n  * It produces a SAX-like token stream.\n  * Optionally it can pack keys, strings, and numbers (controlled separately).\n  * The [main module](https://github.com/uhop/stream-json/wiki/Main-module) provides helpers to create a parser.\n* Filters to edit a token stream:\n  * [Pick](https://github.com/uhop/stream-json/wiki/Pick) selects desired objects.\n    * It can produce multiple top-level objects just like in the [JSON Streaming](https://en.wikipedia.org/wiki/JSON_Streaming) protocol.\n    * Don't forget to use [StreamValues](https://github.com/uhop/stream-json/wiki/StreamValues) when picking several subobjects!\n  * [Replace](https://github.com/uhop/stream-json/wiki/Replace) substitutes objects with a replacement.\n  * [Ignore](https://github.com/uhop/stream-json/wiki/Ignore) removes objects.\n  * [Filter](https://github.com/uhop/stream-json/wiki/Filter) filters tokens while maintaining the stream's validity.\n* Streamers to produce a stream of JavaScript objects.\n  * [StreamValues](https://github.com/uhop/stream-json/wiki/StreamValues) can handle a stream of JSON objects.\n    * Useful to stream objects selected by `Pick`, or generated by other means.\n    * It supports the [JSON Streaming](https://en.wikipedia.org/wiki/JSON_Streaming) protocol, where individual values are separated semantically (like in `\"{}[]\"`), or with whitespace (like in `\"true 1 null\"`).\n  * [StreamArray](https://github.com/uhop/stream-json/wiki/StreamArray) takes an array of objects and produces a stream of its components.\n    * It streams array components individually, taking care of assembling them automatically.\n    * Created initially to deal with JSON files similar to [Django](https://www.djangoproject.com/)-produced database dumps.\n    * Only one top-level array per stream is valid!\n  * 
[StreamObject](https://github.com/uhop/stream-json/wiki/StreamObject) takes an object and produces a stream of its top-level properties.\n    * Only one top-level object per stream is valid!\n* Essentials:\n  * [Assembler](https://github.com/uhop/stream-json/wiki/Assembler) interprets a token stream creating JavaScript objects.\n  * [Disassembler](https://github.com/uhop/stream-json/wiki/Disassembler) produces a token stream from JavaScript objects.\n  * [Stringer](https://github.com/uhop/stream-json/wiki/Stringer) converts a token stream back into a JSON text stream.\n  * [Emitter](https://github.com/uhop/stream-json/wiki/Emitter) reads a token stream and emits each token as an event.\n    * It can greatly simplify data processing.\n* Utilities:\n  * [emit()](https://github.com/uhop/stream-json/wiki/emit()) makes any stream component emit tokens as events.\n  * [withParser()](https://github.com/uhop/stream-json/wiki/withParser()) helps to create stream components with a parser.\n  * [Batch](https://github.com/uhop/stream-json/wiki/Batch) batches items into arrays to simplify their processing.\n  * [Verifier](https://github.com/uhop/stream-json/wiki/Verifier) reads a stream and verifies that it is valid JSON.\n  * [Utf8Stream](https://github.com/uhop/stream-json/wiki/Utf8Stream) sanitizes multibyte `utf8` text input.\n* Special helpers:\n  * JSONL AKA [JSON Lines](http://jsonlines.org/) AKA [NDJSON](http://ndjson.org/):\n    * [jsonl/Parser](https://github.com/uhop/stream-json/wiki/jsonl-Parser) parses a JSONL file producing objects similar to `StreamValues`.\n      * Useful when we know that individual items can fit in memory.\n      * Generally it is faster than the equivalent combination of `Parser({jsonStreaming: true})` + `StreamValues`.\n    * [jsonl/Stringer](https://github.com/uhop/stream-json/wiki/jsonl-Stringer) produces a JSONL file from a stream of JavaScript objects.\n      * Generally it is faster than the equivalent combination of `Disassembler` 
+ `Stringer`.\n\nAll components are meant to be building blocks to create flexible custom data processing pipelines. They can be extended and/or combined with custom code. They can be used together with [stream-chain](https://www.npmjs.com/package/stream-chain) to simplify data processing.\n\nThis toolkit is distributed under the New BSD license.\n\n## Introduction\n\n```js\nconst {chain}  = require('stream-chain');\n\nconst {parser} = require('stream-json');\nconst {pick}   = require('stream-json/filters/Pick');\nconst {ignore} = require('stream-json/filters/Ignore');\nconst {streamValues} = require('stream-json/streamers/StreamValues');\n\nconst fs   = require('fs');\nconst zlib = require('zlib');\n\nconst pipeline = chain([\n  fs.createReadStream('sample.json.gz'),\n  zlib.createGunzip(),\n  parser(),\n  pick({filter: 'data'}),\n  ignore({filter: /\\b_meta\\b/i}),\n  streamValues(),\n  data =\u003e {\n    const value = data.value;\n    // keep data only for the accounting department\n    return value \u0026\u0026 value.department === 'accounting' ? data : null;\n  }\n]);\n\nlet counter = 0;\npipeline.on('data', () =\u003e ++counter);\npipeline.on('end', () =\u003e\n  console.log(`The accounting department has ${counter} employees.`));\n```\n\nSee the full documentation in [Wiki](https://github.com/uhop/stream-json/wiki).\n\nCompanion projects:\n\n* [stream-csv-as-json](https://www.npmjs.com/package/stream-csv-as-json) streams huge CSV files in a format compatible with `stream-json`:\n  rows as arrays of string values. If a header row is used, it can stream rows as objects with named fields.\n\n## Installation\n\n```bash\nnpm install --save stream-json\n# or: yarn add stream-json\n```\n\n## Use\n\nThe whole library is organized as a set of small components, which can be combined to produce the most effective pipeline. All components are based on node.js\n[streams](http://nodejs.org/api/stream.html), and [events](http://nodejs.org/api/events.html). 
They implement all required standard APIs. It is easy to add your\nown components to solve your unique tasks.\n\nThe code of all components is compact and simple. Please take a look at their source code to see how things are implemented, so you can produce your own components\nin no time.\n\nIf you find a bug or a way to simplify existing components, or create new generic components that can be reused in a variety of projects,\ndon't hesitate to open a ticket and/or create a pull request.\n\n## Release History\n\n* 1.9.0 *fixed a slight deviation from the JSON standard. Thx [Peter Burns](https://github.com/rictic).*\n* 1.8.0 *added an option to indicate/ignore JSONL errors. Thx, [AK](https://github.com/ak--47).*\n* 1.7.5 *fixed a stringer bug with ASCII control symbols. Thx, [Kraicheck](https://github.com/Kraicheck).*\n* 1.7.4 *updated dependency (`stream-chain`), bugfix: inconsistent object/array braces. Thx [Xiao Li](https://github.com/xli1000).*\n* 1.7.3 *added an assembler option to treat numbers as strings.*\n* 1.7.2 *added an error check for JSONL parsing. Thx [Marc-Andre Boily](https://github.com/maboily).*\n* 1.7.1 *minor bugfix and improved error reporting.*\n* 1.7.0 *added `utils/Utf8Stream` to sanitize `utf8` input, all parsers support it automatically. 
Thx [john30](https://github.com/john30) for the suggestion.*\n* 1.6.1 *the technical release, no need to upgrade.*\n* 1.6.0 *added `jsonl/Parser` and `jsonl/Stringer`.*\n\nThe rest can be consulted in the project's wiki [Release history](https://github.com/uhop/stream-json/wiki/Release-history).\n","funding_links":["https://github.com/sponsors/uhop","https://buymeacoffee.com/uhop"],"categories":["Repository","JavaScript","others","Modules"],"sub_categories":["Streams"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuhop%2Fstream-json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fuhop%2Fstream-json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fuhop%2Fstream-json/lists"}