{"id":15578375,"url":"https://github.com/nelson-perez/large-sort","last_synced_at":"2025-06-15T08:36:52.607Z","repository":{"id":63392712,"uuid":"561189671","full_name":"nelson-perez/large-sort","owner":"nelson-perez","description":"Fast sorting library to parse and sort large amount of data from a file.","archived":false,"fork":false,"pushed_at":"2024-04-09T18:10:04.000Z","size":150,"stargazers_count":1,"open_issues_count":1,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-27T09:14:11.656Z","etag":null,"topics":["external","external-sorting","files","javascript","node","node-js","nodejs","sort","sorting","typescript"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nelson-perez.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-03T06:29:45.000Z","updated_at":"2023-11-24T08:59:54.000Z","dependencies_parsed_at":"2024-10-02T19:10:09.288Z","dependency_job_id":"dfc425ab-8bb9-4007-afce-98cc0a243e2e","html_url":"https://github.com/nelson-perez/large-sort","commit_stats":{"total_commits":30,"total_committers":1,"mean_commits":30.0,"dds":0.0,"last_synced_commit":"0cd19c1d77ad6d93bfb2914c5bdf53958a0e1cea"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nelson-perez%2Flarge-sort","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nelson-perez%2Flarge-sort/tags","releases_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/nelson-perez%2Flarge-sort/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nelson-perez%2Flarge-sort/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nelson-perez","download_url":"https://codeload.github.com/nelson-perez/large-sort/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250541501,"owners_count":21447528,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["external","external-sorting","files","javascript","node","node-js","nodejs","sort","sorting","typescript"],"created_at":"2024-10-02T19:09:53.414Z","updated_at":"2025-04-24T01:20:33.286Z","avatar_url":"https://github.com/nelson-perez.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Large Sort JS](img/large_sort_js.png)\n\n[![Node.js CI](https://github.com/nelson-perez/large-sort/actions/workflows/node.js.yml/badge.svg)](https://github.com/nelson-perez/large-sort/actions/workflows/node.js.yml)\n[![Total Downloads](https://img.shields.io/npm/dt/large-sort.svg)](https://www.npmjs.com/package/large-sort)\n[![Start](https://img.shields.io/github/stars/nelson-perez/large-sort?style=flat-square)](https://github.com/nelson-perez/large-sort/stargazers)\n[![MIT Licence](https://badges.frapsoft.com/os/mit/mit.svg?v=103)](https://opensource.org/licenses/mit-license.php)\n[![Open Source Love](https://badges.frapsoft.com/os/v1/open-source.svg?v=103)](https://opensource.org/)\n\n\u003c!--- 
[![large-sort](/advisor/npm-package/large-sort/badge.svg)](/advisor/npm-package/large-sort) --\u003e\n\n[![NPM Package](https://nodei.co/npm/large-sort.png)](https://www.npmjs.com/package/large-sort)\n\n\n\u003c!-- [![npm large-sort](img/npm_large-sort.png)](https://www.npmjs.com/package/large-sort) --\u003e\n\n# Overview\nFast sorting library for NodeJS that parses, sorts and serializes the content of large files using [external merge sort](https://en.wikipedia.org/wiki/External_sorting). Currently there are two functions: `sortFile()` to sort the content of large files and `sortStream()` to generically sort a `Stream`.\n\nI've also enabled merging of multiple sorted streams by exposing the internal __`merge()`__, which takes a list of filenames or `Readable` streams, as well as the specific functions __`mergeSortedFiles()`__, which takes a list of filenames, and __`mergeSortedStreams()`__, which takes a list of `Readable` streams. I wanted to expose this because I saw that some folks need it, so I exposed a _high performance_ version similar to the npm package [multi-sort-stream](https://www.npmjs.com/package/multi-sort-stream?activeTab=readme).\n\n\n### Additional planned features:\n- [DONE] Enable ***custom delimiters*** for the data via `string` or `regex`.\n- [DONE] Load the input from a `ReadStream` and output the sorted data into a `WriteStream` instead of file to file.\n- *[**exploring**]* - Create an API to build the sort scenario based on a property/field name or an `extract property function` instead of a comparer function.\n  - This is an area of exploration to see if there could be performance advantages in utilizing `number` and `string` specific sorting algorithms instead of relying on the comparer.\n- *[**exploring**]* - I've been experimenting a bit with `worker_threads` to help sort during the split process, and although I did see great performance, it comes with the disadvantage that the comparer has to be passed as serializable JSON, and a function cannot be passed that way, 
so it will require some refactoring like I mentioned above, where instead of providing the compareFn you need to provide the property/field you would like to sort by. I think I'll borrow some inspiration from [fast-sort](https://www.npmjs.com/package/fast-sort), which uses a similar builder approach to build the sorter before doing the actual sort, but without the lambda capability when using `worker_threads`. I'll probably switch the logic depending on whether the caller provides a property or provides a function to either compare or resolve a property.\n\n## Jump to examples links\n### [sortFile() examples](#usage-example-of-sortfile)\n### [sortStream() examples](#usage-example-of-sortstream)\n### [More examples](#additional-sort-examples)\n\n# Installation\nInstall to your NodeJS project using [npm](https://npmjs.org/large-sort).\n```bash\nnpm install large-sort --save\n```\n# Available functions\n## `sortFile()`\nThis method provides the necessary functionality to parse the input file line by line, deserializing each `string` into an object or primitive that can be **compared**, **sorted** and **serialized** back into an output file. 
It sorts the data using an [external merge sort](https://en.wikipedia.org/wiki/External_sorting) algorithm which splits the file into multiple sorted temporary *k-files* and then merges the split *k-files* into a single output file.\n\nThe size of each split file is capped by the maximum number of lines per file (`linesPerFile`) parameter, or by memory usage exceeding ***1GB*** (with a minimum of 1,000 lines), whichever happens first.\n\n\n### Parameters of `sortFile()`\n|Name                   | Description   |\n|           -           |      -        |\n|***TValue***           | Type of the parsed value from the input file|\n|__inputFile__          | File path of the file that contains the data to be sorted, delimited by the `inputDelimeter`.|\n|__outputFile__         | File path of the output sorted data delimited by the `outputDelimeter`.|\n|__inputMapFn__         | Function that maps/parses/deserializes a delimited `string` from the input file into a **TValue** type. _default_: `x =\u003e x`|\n|__outputMapFn__        | Function that maps/serializes each **TValue** into a single line `string` for the output file. _default_: `x =\u003e String(x)`|\n|__compareFn__          | Comparer function of **TValue** types to define the sorting order. _default_: `(a, b) =\u003e a \u003e b? 1 : -1`|\n|__inputDelimeter__     | String or Regex that delimits each input string before being mapped by the `inputMapFn` function. _default_: `'\\n'` |\n|__outputDelimeter__    | String delimiter to separate each output string after being mapped to string using the `outputMapFn` function. _default_: `'\\n'` |\n|__linesPerFile__   | Max number of lines processed for each file split. 
_`It's recommended to keep the default value for performance.`_|\n\n\n### Function definition of `sortFile()`\n```typescript\n/**\n * The `sortFile()` method sorts the content of an input file and writes the results into an output file.\n * It's designed to handled large files that would not fit into memory by using an external merge sort algorithm.\n * (see: {@link https://en.wikipedia.org/wiki/External_sorting})\n * \n * This method parses each line of the input file into {@link TValue} instances, sorts them and finally\n * serializes and writes these {@link TValue} instances into lines of the output file via the parameters\n * {@link inputMapFn}, {@link compareFn} and {@link outputMapFn} funtions respectively.\n * \n * \n * The sort order is determined by the {@link compareFn} which specifies the precedence of the {@link TValue} instances.\n * @examples\n * - increasing order sort compareFn: (a, b) =\u003e a \u003e b? 1 : -1\n * - decreasing order sort compareFn: (a, b) =\u003e a \u003c b? 1 : -1\n * \n * Note:\n * It is recommended to don't specify the {@link linesPerFile} parameter to keep the default value of 100,000.\n * As `sortFile()` has been tested/benchmarked for the best sorting/io performance. 
It can be specified only \n * for special scenarios to overcome `too many files` error when other options are not possible or to tune\n * performance for larger `TValue` instances or slow file IO \n * \n * When sorting tremendously large files the following error could occur:\n *  ---------------------------------------\n * | `Error: EMFILE, too many open files`  |\n *  ---------------------------------------\n * Which occurs when there input has been splited in more than ~1,024 files and all those files are opened during\n * the k-file merging process.\n * To overcome this the error you'll need to increase the maximum number of concurrent open stream/files limit by\n * using the `$ ulimit -n \u003cmax open files (default: 1024)\u003e` command or update the `/etc/security/limit.conf` file.\n * \n * If above is not possible then you could overcome it by specifying the {@link linesPerFile} parameter above 100,000\n * which could result less split files to merge.\n * \n * \n * @template TValue                         - Specifies type of a parsed instance to sort from the input file.\n * \n * \n * @param {string}          inputFile       - Location of the input file to sort with data delimited by the\n *                                            {@link inputDelimeter}.\n * @param {string}          outputFile      - Location of output file to write the sorted data delimited by the\n *                                            {@link outputDelimeter}.\n * @param {Function}        inputMapFn      - Function that parses/deserializes an input file line `string` into a\n *                                            {@link TValue} instance.\n * @param {Function}        outputMapFn     - Function that serializes each {@link TValue} instance into a single\n *                                            line `string` of the ouput file.\n * @param {Function}        compareFn       - Function that compares {@link TValue} instances to determine their\n *                            
                sort order.\n *                                            See: {@link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#parameters}\n * @param {string | RegExp} inputDelimeter  - String or Regex that delimits each input string before been mapped\n *                                            using the {@link inputMapFn} function.\n * @param {string}          outputDelimeter - String delimeter to separate each output string after been mapped to\n *                                            string using the {@link outputMapFn} function.\n * @param {number}          linesPerFile    - Maximum number of lines per temporary split file. Keep default value \n *                                            of 100K.\n * \n * \n * @return {Promise\u003cvoid\u003e}                  - Promise that once resolved the output sorted file has been completely \n *                                            created and the temporary files has been cleaned up.\n */\nexport async function sortFile\u003cTValue\u003e(\n    inputFile: string,\n    outputFile: string,\n    inputMapFn: (x: string) =\u003e TValue = x =\u003e x as TValue,\n    outputMapFn: (x:TValue) =\u003e string = x =\u003e String(x),\n    compareFn: (a:TValue, b:TValue) =\u003e number = (a, b) =\u003e a \u003e b? 
1 : -1,\n    inputDelimeter: string | RegExp = '\\n',\n    outputDelimeter: string = '\\n',\n    linesPerFile: number = 100_000): Promise\u003cvoid\u003e\n```\n\n### Usage example of `sortFile()`\n#### Sorting numbers\nHere is an example that explains each of the parameters and how to use them to sort a file of `Numbers` and output the numbers as strings.\n\n```typescript\nimport { sortFile } from 'large-sort';\n\n// Function that transforms a line from the input file into a number to use for comparison.\nconst inputMapFunction = (input: string) =\u003e Number(input);\n\n// Function that transforms a parsed number back into a string as a line for the output file.\nconst outputMapFunction = (output: number) =\u003e output.toString();\n\n// Function that compares two numbers to define their sort order.\nconst compareFunction = (a: number, b: number) =\u003e a \u003e b? 1 : -1;\n\n\n// Sorts the lines of the file \"input_file.txt\" as numbers and outputs them to the \"output_sorted_file.txt\" file\nawait sortFile\u003cnumber\u003e(\n    'input_file.txt',\n    'output_sorted_file.txt',\n    inputMapFunction,\n    outputMapFunction,\n    compareFunction);\n\n// OR for the ones that like oneliners-ish\nawait sortFile\u003cnumber\u003e(\n    'input_file.txt',\n    'output_sorted_file.txt',\n    (x) =\u003e Number(x),\n    (output) =\u003e String(output),\n    (a, b) =\u003e a \u003e b? 1 : -1);\n```\n\n## `sortStream()`\nThis method provides the necessary functionality to read and parse data from a stream using the provided delimiter. It deserializes each input `string` into an object or primitive that can be **compared**, **sorted** and **serialized** back to be written into the output stream. 
It sorts the data using an [external merge sort](https://en.wikipedia.org/wiki/External_sorting) algorithm which splits the input into multiple sorted temporary *k-files* and then merges the split *k-files* into the output stream.\n\nThe size of each split file is capped by the maximum number of lines per file (`linesPerFile`) parameter, or by memory usage exceeding ***1GB*** (with a minimum of 1,000 lines), whichever happens first.\n\n\u003e Note: It is recommended to use the `sortFile()` method when sorting files as it is quite efficient and tuned to perform at its best.\n\n### Parameters of `sortStream()`\n|Name                   | Description |\n|         -             |       -     |\n|***TValue***           | Type of the parsed value by the `inputMapFn` function|\n|__inputStream__        | `Readable` stream that contains the input data delimited by the `inputDelimeter`.|\n|__outputStream__       | `Writable` stream that receives the sorted data delimited by the `outputDelimeter`.|\n|__inputMapFn__         | Function that maps/parses/deserializes a delimited `string` from the input stream into a **TValue** type. _default_: `x =\u003e x`|\n|__outputMapFn__        | Function that maps/serializes each **TValue** into a single line `string` for the output stream. _default_: `x =\u003e String(x)`|\n|__compareFn__          | Comparer function of **TValue** types to define the sorting order. _default_: `(a, b) =\u003e a \u003e b? 1 : -1`|\n|__inputDelimeter__     | String or Regex that delimits each input string before being mapped by the `inputMapFn` function. _default_: `'\\n'` |\n|__outputDelimeter__    | String delimiter to separate each output string after being mapped to string using the `outputMapFn` function. _default_: `'\\n'` |\n|__linesPerFile__   | Max number of lines processed for each file split. 
_`It's recommended to keep the default value for performance.`_|\n\n\n### Function definition of `sortStream()`\n```typescript\n/**\n * The `sortStream()` method sorts the content from an input Readable stream and writes the results into an \n * output Writable stream.\n * It's designed to handled large files that would not fit into memory by using an external merge sort algorithm.\n * (see: {@link https://en.wikipedia.org/wiki/External_sorting})\n * \n * This method parses each line of the input file into {@link TValue} instances, sorts them and finally\n * serializes and writes these {@link TValue} instances into lines of the output file via the parameters\n * {@link inputMapFn}, {@link compareFn} and {@link outputMapFn} funtions respectively.\n * \n * \n * The sort order is determined by the {@link compareFn} which specifies the precedence of the {@link TValue} instances.\n * @examples\n * - increasing order sort compareFn: (a, b) =\u003e a \u003e b? 1 : -1\n * - decreasing order sort compareFn: (a, b) =\u003e a \u003c b? 1 : -1\n * \n * Note:\n * It is recommended to don't specify the {@link linesPerFile} parameter to keep the default value of 100,000.\n * As `sortStream()` has been tested/benchmarked for the best sorting/io performance. 
It can be specified only \n * for special scenarios to overcome `too many files` error when other options are not possible or to tune\n * performance for larger `TValue` instances or slow file IO \n * \n * When sorting tremendously large files the following error could occur:\n *  ---------------------------------------\n * | `Error: EMFILE, too many open files`  |\n *  ---------------------------------------\n * Which occurs when there input has been splited in more than ~1,024 files and all those files are opened during\n * the k-file merging process.\n * To overcome this the error you'll need to increase the maximum number of concurrent open stream/files limit by\n * using the `$ ulimit -n \u003cmax open files (default: 1024)\u003e` command or update the `/etc/security/limit.conf` file.\n * \n * If above is not possible then you could overcome it by specifying the {@link linesPerFile} parameter above 100,000\n * which could result less split files to merge.\n * \n * \n * @template TValue                         - Specifies type of a parsed instance to sort from the input file.\n * \n * \n * @param {Readable}        inputStream     - Input stream to read the data from.\n * @param {Writable}        outputStream    - Writeable stream to output the data.\n * @param {Function}        inputMapFn      - Function that parses/deserializes an input file line `string` into a\n *                                            {@link TValue} instance.\n * @param {Function}        outputMapFn     - Function that serializes each {@link TValue} instance into a single\n *                                            line `string` of the ouput file.\n * @param {Function}        compareFn       - Function that compares {@link TValue} instances to determine their\n *                                            sort order.\n *                                            See: {@link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#parameters}\n * 
@param {string | RegExp} inputDelimeter  - String or Regex that delimits each input string before been mapped\n *                                            using the {@link inputMapFn} function.\n * @param {string}          outputDelimeter - String delimeter to separate each output string after been mapped to\n *                                            string using the {@link outputMapFn}.\n * @param {number}          linesPerFile    - Maximum number of lines per temporary split file. Keep default value \n *                                            of 100K.\n * \n * \n * @return {Promise\u003cvoid\u003e}                  - Promise that once resolved the output sorted stream has been completely \n *                                            created and temporary files had been cleaned up.\n */\nexport async function sortStream\u003cTValue\u003e(\n    inputStream: Readable,\n    outputStream: Writable,\n    inputMapFn: (x: string) =\u003e TValue = x =\u003e x as TValue,\n    outputMapFn: (x:TValue) =\u003e string = x =\u003e String(x),\n    compareFn: (a:TValue, b:TValue) =\u003e number = (a, b) =\u003e a \u003e b? 1 : -1,\n    inputDelimeter: string | RegExp = '\\n',\n    outputDelimeter: string = '\\n',\n    linesPerFile: number = 100_000): Promise\u003cvoid\u003e\n```\n\n## Usage example of `sortStream()`\nSimilar to the `sortFile()` method, `sortStream()` it offers the same capabilities with the nuance of you'll be using Streams instead of files. 
But keep in mind that if you want to do file to file sorting, it's best to use the `sortFile()` function instead of creating the streams yourself.\n\nBelow is an example showing how to use it.\n\n### Example: Sorting numbers file and output to terminal\nHere is an example that explains each of the parameters and how to use them to sort a stream of `Numbers` and output the numbers as strings to the terminal.\n\n```typescript\nimport { Readable } from 'stream';\nimport { sortStream } from 'large-sort';\n\n// Function that transforms a line from the input stream into a number to use for comparison.\nconst inputMapFunction = (input: string) =\u003e Number(input);\n\n// Function that transforms a parsed number back into a string as a line for the output stream.\nconst outputMapFunction = (output: number) =\u003e output.toString();\n\n// Function that compares two numbers to define their sort order.\nconst compareFunction = (a: number, b: number) =\u003e a \u003e b? 1 : -1;\n\n// Readable created from a newline-delimited string\nconst inputStream = Readable.from('5\\n10\\n15');\n\n// Output stream to the terminal\nconst outputStream = process.stdout;\n\n// Sorts the lines of the inputStream as numbers and outputs the results to the outputStream (terminal stdout)\nawait sortStream\u003cnumber\u003e(\n        inputStream,\n        outputStream,\n        inputMapFunction,\n        outputMapFunction,\n        compareFunction);\n\n\n// OR for those who prefer the oneliners/ish\nawait sortStream\u003cnumber\u003e(\n        Readable.from('5\\n10\\n15'),                 // Input stream\n        process.stdout,                             // Output stream\n        (input: string) =\u003e Number(input),           // Input map/parsing/deserializing function\n        (output: number) =\u003e String(output),         // Output map/toString/serializing function\n        (a: number, b: number) =\u003e a \u003e b? 
1 : -1);   // Compare function\n```\n\n## `merge()`\nMerge function that takes either a list of sorted files or sorted `Readable` streams and outputs to a `Writeable` output stream.\n\n### Parameters of `merge()`\n|Name                   | Description |\n|         -             |       -     |\n|***TValue***           | Type of the parsed value by the `inputMapFn` function|\n|__inputs__             | List of filenames or streams with sorted data to merge.|\n|__inputMapFn__         | Function that maps/parses/deserializes a delimited `string` from the input file into a **TValue** type. _default_: `x =\u003e x`|\n|__outputMapFn__        | Function maps/serializes each **TValue** into a single line `string` for the output file. _default_: `x =\u003e String(x)`|\n|__compareFn__          | Comparer function of **TValue** types to define the sorting order. _default_: `(a, b) =\u003e a \u003e b? 1 : -1`|\n|__outputDelimeter__    | String delimeter to separate each output string after been mapped to string using the `outputMapFn` function. 
_default_: `'\\n'` |\n\n### Function definition of `merge()`\n```typescript\n/**\n * Merges multiple sorted files or sorted Readable streams with data separated by a new line into an output \n * Writeable stream.\n *\n * @param {Readable[] | string[]}   inputs          - List of filenames or Readable streams to merge\n * @param {Writable}                outputStream    - Writeable stream to output the data.\n * @param {Function}                inputMapFn      - Function that parses/deserializes an input file line `string` into a\n *                                                    {@link TValue} instance.\n * @param {Function}                outputMapFn     - Function that serializes each {@link TValue} instance into a single\n *                                                    line `string` of the ouput file.\n * @param {Function}                compareFn       - Function that compares {@link TValue} instances to determine their\n *                                                    sort order.\n *                                                    See: {@link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#parameters}\n * @param {string}                  outputDelimeter - String delimeter to separate each output string after been mapped to\n */\nexport async function merge\u003cTValue\u003e(\n    inputs: Readable[] | string[],\n    outputStream: Writable,\n    inputMapFn: (x: string) =\u003e TValue = x =\u003e x as TValue,\n    outputMapFn: (x:TValue) =\u003e string = x =\u003e String(x),\n    compareFn: (a:TValue, b:TValue) =\u003e number = (a, b) =\u003e a \u003e b? 
1 : -1,\n    outputDelimeter: string = '\\n'): Promise\u003cvoid\u003e\n```\n\n## `mergeSortedFiles()`\nMerge function that takes a list of sorted files and outputs to a `Writeable` output stream.\n\n### Parameters of `mergeSortedFiles()`\n|Name                   | Description |\n|         -             |       -     |\n|***TValue***           | Type of the parsed value by the `inputMapFn` function|\n|__files__              | List of filenames with sorted data to merge.|\n|__inputMapFn__         | Function that maps/parses/deserializes a delimited `string` from the input file into a **TValue** type. _default_: `x =\u003e x`|\n|__outputMapFn__        | Function maps/serializes each **TValue** into a single line `string` for the output file. _default_: `x =\u003e String(x)`|\n|__compareFn__          | Comparer function of **TValue** types to define the sorting order. _default_: `(a, b) =\u003e a \u003e b? 1 : -1`|\n|__outputDelimeter__    | String delimeter to separate each output string after been mapped to string using the `outputMapFn` function. 
_default_: `'\\n'` |\n\n### Function definition of `mergeSortedFiles()`\n```typescript\n/**\n * Merges multiple sorted files with data separated by a new line into an output Writeable stream.\n * \n * @template TValue                         - Specifies type of a parsed instance to sort from the input file.\n * \n * @param {string[]}        files           - List of filenames to merge\n * @param {Writable}        outputStream    - Writeable stream to output the data.\n * @param {Function}        inputMapFn      - Function that parses/deserializes an input file line `string` into a\n *                                            {@link TValue} instance.\n * @param {Function}        outputMapFn     - Function that serializes each {@link TValue} instance into a single\n *                                            line `string` of the ouput file.\n * @param {Function}        compareFn       - Function that compares {@link TValue} instances to determine their\n *                                            sort order.\n *                                            See: {@link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#parameters}\n * @param {string | RegExp} inputDelimeter  - String or Regex that delimits each input string before been mapped\n *                                            using the {@link inputMapFn} function.\n * @param {string}          outputDelimeter - String delimeter to separate each output string after been mapped to\n */\nexport async function mergeSortedFiles\u003cTValue\u003e(\n    files: string[],\n    outputStream: Writable,\n    inputMapFn: (x: string) =\u003e TValue,\n    outputMapFn: (x:TValue) =\u003e string,\n    compareFn: (a:TValue, b:TValue) =\u003e number,\n    outputDelimeter: string): Promise\u003cvoid\u003e\n```\n\n## mergeSortedStreams()\nMerge function that takes a list sorted `Readable` streams and outputs to a `Writeable` output stream.\n\n### Parameters of 
`mergeSortedStreams()`\n|Name                   | Description |\n|         -             |       -     |\n|***TValue***           | Type of the parsed value by the `inputMapFn` function|\n|__streams__            | List of `Readable` streams with sorted data to merge.|\n|__inputMapFn__         | Function that maps/parses/deserializes a delimited `string` from the input stream into a **TValue** type. _default_: `x =\u003e x`|\n|__outputMapFn__        | Function that maps/serializes each **TValue** into a single line `string` for the output stream. _default_: `x =\u003e String(x)`|\n|__compareFn__          | Comparer function of **TValue** types to define the sorting order. _default_: `(a, b) =\u003e a \u003e b? 1 : -1`|\n|__outputDelimeter__    | String delimiter to separate each output string after being mapped to string using the `outputMapFn` function. _default_: `'\\n'` |\n\n\n### Function definition of `mergeSortedStreams()`\n```typescript\n/**\n * Merges multiple sorted streams with data separated by a new line into an output Writable stream.\n * \n * @template TValue                         - Specifies type of a parsed instance to sort from the input streams.\n * \n * @param {Readable[]}      streams         - List of streams to merge\n * @param {Writable}        outputStream    - Writable stream to output the sorted data.\n * @param {Function}        inputMapFn      - Function that parses/deserializes an input line `string` into a\n *                                            {@link TValue} instance.\n * @param {Function}        outputMapFn     - Function that serializes each {@link TValue} instance into a single\n *                                            line `string` of the output stream.\n * @param {Function}        compareFn       - Function that compares {@link TValue} instances to determine their\n *                                            sort order.\n *                                            See: {@link 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#parameters}\n * @param {string}          outputDelimeter - String delimiter to separate each output string after being mapped to\n *                                            string using the {@link outputMapFn} function.\n */\nexport async function mergeSortedStreams\u003cTValue\u003e(\n    streams: Readable[],\n    outputStream: Writable,\n    inputMapFn: (x: string) =\u003e TValue,\n    outputMapFn: (x:TValue) =\u003e string,\n    compareFn: (a:TValue, b:TValue) =\u003e number,\n    outputDelimeter: string): Promise\u003cvoid\u003e\n```\n\n\n# Additional sort examples\nHere are examples showing scenarios where this library is useful.\n\n### Sort CSV file by the second column\n This example shows how to sort a csv file based on the value of the second column by parsing each csv row into an array, sorting the rows by the second column and writing each array back in csv format.\n\n ```typescript\n\n // Function that transforms the input csv row into an array with the values.\nfunction parseCsv(inputLine: string): string[] {\n    let array = inputLine.split(',');\n    return array;\n}\n\n// Function that transforms the csv value array into a csv row `string` line for output.\nfunction outputCsv(array: string[]): string {\n    let outputLine = array.join(',');\n    return outputLine;\n}\n\n// Sorts the file based on the second column of the csv file\nawait sortFile\u003cstring[]\u003e(\n    'input.csv',                    // inputFile    - input csv file\n    'sorted_output.csv',            // outputFile   - sorted output csv file\n    parseCsv,                       // inputMapFn   - maps the input csv row into an array of column values\n    outputCsv,                      // outputMapFn  - maps the array of values into a csv row to output\n    (a, b) =\u003e a[1] \u003e b[1]? 
1 : -1); // compareFn    - compares the second column to sort in ascending order\n\n ```\n\n### Sort CSV input and output lines of JSON\n This example shows how to sort a csv file based on the value of the second column and output parsed JSON. It does this by parsing each csv row into an object with fields `col1` and `col2`, sorting these objects by the `col2` field, and writing each object to an output line as JSON.\n\n ```typescript\n\n // Function that transforms the input csv row into an object.\nfunction parseCsv(inputLine: string): {col1: string, col2: string} {\n    let array = inputLine.split(',');\n    return {\n        col1: array[0],\n        col2: array[1]\n    };\n}\n\n// Function that transforms the parsed object into a JSON string\nfunction outputJSON(obj: {col1: string, col2: string}): string {\n    let outputLine = JSON.stringify(obj);\n    return outputLine;\n}\n\n// Sorts the file based on the second column of the csv file\nawait sortFile\u003c{col1: string, col2: string}\u003e(\n    'input.csv',                        // inputFile    - input csv file\n    'sorted_output.txt',                // outputFile   - sorted output file (one JSON object per line)\n    parseCsv,                           // inputMapFn   - maps the input line `string` to an object\n    outputJSON,                         // outputMapFn  - maps the object into a json string [JSON.stringify]\n    (a, b) =\u003e a.col2 \u003c b.col2? 
1 : -1); // compareFn    - compares the `col2` field to sort in descending order\n\n ```\n\n ### Sort CSV by the combination of two columns\n This example shows how to sort a csv file based on the values of columns 1 and 2 by parsing each csv row into an object containing a `sortBy` field and the array of values, sorting the objects by the `sortBy` field, and writing the array of values back out in csv format.\n\n *Note:*\n The sort key is computed once per row during input parsing in the `parseCsv()` call, rather than on every `compareFn` call, for performance reasons.\n\n ```typescript\n\n // Function that transforms the input csv row into an object with a `sortBy` field\nfunction parseCsv(inputLine: string): {sortBy: string, array: string[]} {\n    let array = inputLine.split(',');\n    // Generate the sort-by value once, ahead of time, instead of on each `compareFn` call.\n    return {\n        sortBy: array[0] + array[1],\n        array: array\n    };\n}\n\n// Function that transforms the object into a csv row string\nfunction outputCSV(obj: {sortBy: string, array: string[]}): string {\n    let outputLine = obj.array.join(',');\n    return outputLine;\n}\n\n// Sorts the file based on the combination of columns 1 and 2 from the csv file\nawait sortFile\u003c{sortBy: string, array: string[]}\u003e(\n    'input.csv',                            // inputFile    - input csv file\n    'sorted_output.txt',                    // outputFile   - sorted output csv file\n    parseCsv,                               // inputMapFn   - maps the input line `string` to an object\n    outputCSV,                              // outputMapFn  - maps the object into a csv row line to output\n    (a, b) =\u003e a.sortBy \u003e b.sortBy? 
1 : -1); // compareFn    - compares using `sortBy` field to sort in ascending order\n\n ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnelson-perez%2Flarge-sort","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnelson-perez%2Flarge-sort","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnelson-perez%2Flarge-sort/lists"}