{"id":13859221,"url":"https://github.com/Claviz/bellboy","last_synced_at":"2025-07-14T01:33:30.500Z","repository":{"id":34039719,"uuid":"166376248","full_name":"Claviz/bellboy","owner":"Claviz","description":"Highly performant JavaScript data stream ETL engine.","archived":false,"fork":false,"pushed_at":"2024-10-25T12:07:52.000Z","size":1598,"stargazers_count":96,"open_issues_count":0,"forks_count":14,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-10-30T08:19:05.142Z","etag":null,"topics":["csv","etl","excel","mssql","mysql","nodejs","postgres","rest-api","streaming"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Claviz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-18T09:05:54.000Z","updated_at":"2024-10-25T12:07:56.000Z","dependencies_parsed_at":"2024-02-13T10:29:23.769Z","dependency_job_id":"78bbec90-a0db-4fcc-b96e-734386492d14","html_url":"https://github.com/Claviz/bellboy","commit_stats":{"total_commits":241,"total_committers":6,"mean_commits":"40.166666666666664","dds":0.2365145228215768,"last_synced_commit":"cd6e9639d0468f22e830a3bb7821d41a2d067e50"},"previous_names":[],"tags_count":45,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Claviz%2Fbellboy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Claviz%2Fbellboy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Claviz%2Fbellboy/releases","manifests_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Claviz%2Fbellboy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Claviz","download_url":"https://codeload.github.com/Claviz/bellboy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225893458,"owners_count":17540916,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","etl","excel","mssql","mysql","nodejs","postgres","rest-api","streaming"],"created_at":"2024-08-05T03:02:37.685Z","updated_at":"2024-11-22T17:30:49.146Z","avatar_url":"https://github.com/Claviz.png","language":"TypeScript","funding_links":[],"categories":["TypeScript"],"sub_categories":[],"readme":"# bellboy ![gh workflow](https://github.com/claviz/bellboy/actions/workflows/config.yml/badge.svg) [![codecov](https://codecov.io/gh/Claviz/bellboy/branch/master/graph/badge.svg)](https://codecov.io/gh/Claviz/bellboy) ![npm](https://img.shields.io/npm/v/bellboy.svg)\n\nHighly performant JavaScript data stream ETL engine.\n\n## How it works?\n\nBellboy streams input data row by row. Every row, in turn, goes through user-defined function where it can be transformed. 
When enough data is collected in a batch, it is loaded to the destination.\n\n## Installation\n\nBefore installing, make sure you are using the [latest](https://nodejs.org/en/download/current/) version of Node.js.\n\n```\nnpm install bellboy\n```\n\nIf you will be using `bellboy` with the native [msnodesqlv8][msnodesqlv8-url] driver, add it as a dependency.\n\n```\nnpm install msnodesqlv8\n```\n\n## Example\n\nThis example shows how `bellboy` can extract rows from an [Excel file](#excel-processor), modify them on the fly, load them into a [Postgres database](#postgres-destination), move the processed file to another folder and process the remaining files.\n\nAll in six simple steps.\n\n```javascript\nconst bellboy = require(\"bellboy\");\nconst fs = require(\"fs\");\nconst path = require(\"path\");\n\n(async () =\u003e {\n  const srcPath = `C:/source`;\n\n  // 1. create a processor which will process\n  // Excel files in the folder one by one\n  const processor = new bellboy.ExcelProcessor({\n    path: srcPath,\n    hasHeader: true,\n  });\n\n  // 2. create a destination which will add a new 'status'\n  // field to each row and load processed data into a Postgres database\n  const destination = new bellboy.PostgresDestination({\n    connection: {\n      user: \"user\",\n      password: \"password\",\n      host: \"localhost\",\n      database: \"bellboy\",\n    },\n    table: \"stats\",\n    recordGenerator: async function* (record) {\n      yield {\n        ...record.raw.obj,\n        status: \"done\",\n      };\n    },\n  });\n\n  // 3. create a job which will glue the processor and the destination together\n  const job = new bellboy.Job(processor, [destination]);\n\n  // 4. tell bellboy to move the file away as soon as it was processed\n  job.on(\"endProcessingStream\", async (file) =\u003e {\n    const filePath = path.join(srcPath, file);\n    const newFilePath = path.join(`./destination`, file);\n    await fs.promises.rename(filePath, newFilePath);\n  });\n\n  // 5. 
log all error events\n  job.onAny(async (eventName, ...args) =\u003e {\n    if (eventName.includes(\"Error\")) {\n      console.log(args);\n    }\n  });\n\n  // 6. run your job\n  await job.run();\n})();\n```\n\n## Jobs \u003cdiv id='job'/\u003e\n\nA job in `bellboy` is a link between a [processor](#processors) and its [destinations](#destinations). When the job is run, the data processing and loading mechanism is started.\n\n#### Initialization\n\nTo initialize a Job instance, pass a [processor](#processors) and one or more [destinations](#destinations).\n\n\u003c!-- and [options](#job-options) if needed. --\u003e\n\n```javascript\nconst job = new bellboy.Job(\n  processor_instance,\n  [destination_instance],\n  (job_options = {})\n);\n```\n\n#### Options\n\n- **reporters** `Reporter[]`\\\n  Array of [reporters](#reporters).\n- **jobName** `string`\\\n  Optional user-defined name of the job. Can come in handy when used in combination with [extended events](#extended-event) to distinguish events from different jobs.\n\n#### Instance methods\n\n- **run** `async function()`\\\n  Starts processing data.\n- **on** `function(event, async function listener)`\\\n  Adds a specific [event listener](#events).\n- **onAny** `function(async function listener)`\\\n  Adds an [any event listener](#any-event).\n- **stop** `function(errorMessage?)`\\\n  Stops job execution. If `errorMessage` is passed, the job will throw an error with this message.\n\n#### Events and event listeners \u003cdiv id='events'/\u003e\n\nEvent listeners, which can be registered with the `job.on` or `job.onAny` methods, allow you to listen to specific events in the job lifecycle and to interact with them.\n\n- When multiple listeners are registered for the same event, those added using `.on` will always be executed first, regardless of the order in which they were added compared to `.onAny`. 
This ensures that specific event listeners have priority over generic ones.\n- When multiple listeners are registered for a single event, those added by [reporters](#reporters) will be executed first, followed by the remaining listeners in order of registration.\n- The job always waits for the code inside a listener to complete.\n- Any error thrown inside a listener is ignored and a warning message is printed.\n- The `job.stop()` method can be used inside a listener to stop job execution and throw an error if needed.\n\n```ts\njob.on(\n  \"startProcessing\",\n  async (processor: IProcessor, destinations: IDestination[]) =\u003e {\n    // Job has started execution.\n  }\n);\n```\n\n```ts\njob.on(\"startProcessingStream\", async (...args: any) =\u003e {\n  // Stream processing has been started.\n  // Passed parameters may vary based on specific processor.\n});\n```\n\n```ts\njob.on(\"startProcessingRow\", async (row: any) =\u003e {\n  // Row has been received and is about to be processed inside `recordGenerator` method.\n});\n```\n\n```ts\njob.on(\"rowGenerated\", async (destinationIndex: number, generatedRow: any) =\u003e {\n  // Row has been generated using `recordGenerator` method.\n});\n```\n\n```ts\njob.on(\n  \"rowGenerationError\",\n  async (destinationIndex: number, row: any, error: any) =\u003e {\n    // Record generation (`recordGenerator` method) has thrown an error.\n  }\n);\n```\n\n```ts\njob.on(\"endProcessingRow\", async () =\u003e {\n  // Row has been processed.\n});\n```\n\n```ts\njob.on(\"transformingBatch\", async (destinationIndex: number, rows: any[]) =\u003e {\n  // Batch is about to be transformed inside `batchTransformer` method.\n});\n```\n\n```ts\njob.on(\n  \"transformedBatch\",\n  async (destinationIndex: number, transformedRows: any) =\u003e {\n    // Batch has been transformed using `batchTransformer` method.\n  }\n);\n```\n\n```ts\njob.on(\n  \"transformingBatchError\",\n  async (destinationIndex: number, rows: any[], error: 
any) =\u003e {\n    // Batch transformation (`batchTransformer` method) has thrown an error.\n  }\n);\n```\n\n```ts\njob.on(\"endTransformingBatch\", async (destinationIndex: number) =\u003e {\n  // Batch has been transformed.\n});\n```\n\n```ts\njob.on(\"loadingBatch\", async (destinationIndex: number, data: any[]) =\u003e {\n  // Batch is about to be loaded into destination.\n});\n```\n\n```ts\njob.on(\n  \"loadedBatch\",\n  async (destinationIndex: number, data: any[], result: any) =\u003e {\n    // Batch has been loaded into destination.\n  }\n);\n```\n\n```ts\njob.on(\n  \"loadingBatchError\",\n  async (destinationIndex: number, data: any[], error: any) =\u003e {\n    // Batch load has failed.\n  }\n);\n```\n\n```ts\njob.on(\"endLoadingBatch\", async (destinationIndex: number) =\u003e {\n  // Batch load has finished.\n});\n```\n\n```ts\njob.on(\"endProcessingStream\", async (...args: any) =\u003e {\n  // Stream processing has finished.\n  // Passed parameters may vary based on specific processor.\n});\n```\n\n```ts\njob.on(\"processingError\", async (error: any) =\u003e {\n  // An unexpected error has occurred.\n});\n```\n\n```ts\njob.on(\"endProcessing\", async () =\u003e {\n  // Job has finished execution.\n});\n```\n\n##### Listening for any event \u003cdiv id='any-event'/\u003e\n\nA special listener can be registered with the `job.onAny` method; it listens for every event listed above.\n\n```ts\njob.onAny(async (eventName: string, ...args: any) =\u003e {\n  // An event has been fired.\n});\n```\n\n##### Extended information from event \u003cdiv id='extended-event'/\u003e\n\nSometimes more information about an event is needed, especially if you are building a custom [reporter](#reporters) to log or trace fired events.\n\nThis information can be obtained by registering an async function as the third parameter of the `job.on` method or as the second parameter of the `job.onAny` method.\n\nFor example,\n\n```ts\njob.on(\"rowGenerated\", undefined, async (event: 
IBellboyEvent) =\u003e {\n  // Row has been generated using `recordGenerator` method.\n  console.log(\n    `${event.jobName} has generated row for #${event.eventArguments.destinationIndex} destination`\n  );\n});\n```\n\nor\n\n```ts\njob.onAny(undefined, async (event: IBellboyEvent) =\u003e {\n  console.log(`${event.jobName} has fired ${event.eventName}`);\n});\n```\n\n#### Extended event (IBellboyEvent) fields\n\n- **eventName** `string`\\\n  Name of the event.\n- **eventArguments** `any`\\\n  Arguments of the event.\n- **jobName** `string?`\\\n  User-defined name of the job.\n- **jobId** `string`\\\n  Unique ID of the job.\n- **eventId** `string`\\\n  Unique ID of the event.\n- **timestamp** `number`\\\n  High resolution timestamp of the event.\n- **jobStopped** `boolean`\\\n  Whether the job is stopped or not.\n\n## Processors \u003cdiv id='processors'/\u003e\n\nEach processor in `bellboy` is a class whose single responsibility is to process data of a specific type:\n\n- [MqttProcessor](#mqtt-processor) processes **MQTT** protocol messages.\n- [HttpProcessor](#http-processor) processes data received from an **HTTP** call.\n- [ExcelProcessor](#excel-processor) processes **XLSX** file data from the file system.\n- [JsonProcessor](#json-processor) processes **JSON** file data from the file system.\n- [DelimitedProcessor](#delimited-processor) processes files with **delimited data** from the file system.\n- [PostgresProcessor](#postgres-processor) processes data received from a **PostgreSQL** SELECT.\n- [MySqlProcessor](#mysql-processor) processes data received from a **MySQL** SELECT.\n- [MssqlProcessor](#mssql-processor) processes data received from an **MSSQL** SELECT.\n- [FirebirdProcessor](#firebird-processor) processes data received from a **Firebird** SELECT.\n- [DynamicProcessor](#dynamic-processor) processes **dynamically generated** data.\n- [TailProcessor](#tail-processor) processes **new lines** added to a file.\n\n### Options \u003cdiv 
id='processor-options'/\u003e\n\n- **rowLimit** `number`\\\n  Number of records to be processed before stopping processor. If not specified or `0` is passed, all records will be processed.\n\n### MqttProcessor \u003cdiv id='mqtt-processor'/\u003e\n\n[Usage examples](tests/mqtt-source.spec.ts)\n\nListens for messages and processes them one by one. It also handles backpressure by queuing messages, so all messages can be eventually processed.\n\n#### Options\n\n- [Processor options](#processor-options)\n- **url** `string` `required`\n- **topics** `string[]` `required`\n\n### HttpProcessor \u003cdiv id='http-processor'/\u003e\n\n[Usage examples](tests/http-source.spec.ts)\n\nProcesses data received from a HTTP call. Can process `json`, `xml` as well as `delimited` data. Can handle pagination by using `nextRequest` function.\n\nFor delimited data produces rows described [here](#delimited-produced-row).\n\n#### Options\n\n- [Processor options](#processor-options)\n- **connection** `object` `required`\\\n  Options from [axios](https://github.com/axios/axios) library.\n- **dataFormat** `delimited | json | xml` `required`\n- **rowSeparator** `string` `required for delimited`\n- **delimiter** `string` `only for delimited`\\\n  A symbol separating fields of the row.\n- **hasHeader** `boolean` `only for delimited`\\\n  If `true`, first row will be processed as a header.\n- **qualifier** `string` `only for delimited`\\\n  Symbol placed around a field to signify that it is the same field.\n- **encoding** `string` `only for delimited`\n- **jsonPath** `RegExp | string`\\\n  Path to the array to be streamed. 
This option is described in detail inside [JsonProcessor](#json-processor) section.\n- **saxOptions** `object` `only for xml`\\\n  Options for XML streaming as described in [sax-stream](https://github.com/melitele/sax-stream#api) library.\n- **authorizationRequest** `object`\n  - **connection**\\\n    Options from [axios](https://github.com/axios/axios) library.\n  - **applyTo**\\\n    Where extracted field should be applied. Whether `header` or `query`.\n  - **sourceField**\\\n    Name of the field from which value of authorization token will be extracted.\n  - **destinationField**\\\n    Name of the field which will be applied to `header` or `query` using `applyTo` option.\n  - **prefix**\\\n    Custom prefix to apply to the token.\n- **nextRequest** `async function(header)`\\\n  Function which must return `connection` for the next request or `null` if the next request is not needed.\n\n```javascript\nconst processor = new bellboy.HttpProcessor({\n  nextRequest: async function () {\n    if (currentPage \u003c pageCount) {\n      return {\n        ...connection,\n        url: `${url}\u0026current_page=${currentPage + 1}`,\n      };\n    }\n    return null;\n  },\n  // ...\n});\n```\n\n### Directory processors \u003cdiv id='directory-processors'/\u003e\n\nUsed for streaming text data from files in directory. There are currently four types of directory processors - `ExcelProcessor`, `JsonProcessor`, `DelimitedProcessor` and `TailProcessor`. Such processors search for the files in the source directory and process them one by one.\n\nFile name (`file`) and full file path (`filePath`) parameters will be passed to `startProcessingStream` event.\n\n#### Options \u003cdiv id='directory-processor-options'/\u003e\n\n- [Processor options](#processor-options)\n- **path** `string`\\\n  Path to the directory where files are located. Current directory by default.\n- **filePattern** `RegExp`\\\n  Regex pattern for the files to be processed. 
If not specified, all files in the directory will be matched.\n- **files** `string[]`\\\n  Array of file names. If not specified, all files in the directory will be matched against `filePattern` regex and processed in alphabetical order.\n\n### ExcelProcessor \u003cdiv id='excel-processor'/\u003e\n\n[Usage examples](tests/excel-source.spec.ts)\n\nProcesses `XLSX` files in the directory.\n\n#### Options\n\n- [Directory processor options](#directory-processor-options)\n- **hasHeader** `boolean | number`\\\n  Whether the worksheet has a header or not, `false` by default. A 0-based row index can be passed instead if the header is not located on the first row.\n- **fillMergedCells** `boolean`\\\n  If `true`, merged cells will have the same value (by default, only the first cell of merged cells is filled with value). \\\n  **Warning!** Enabling this feature may increase streaming time because the file must be scanned for merged cells before the actual streaming starts. `false` by default.\n- **ignoreEmpty** `boolean`\\\n  Whether to ignore empty rows or not, `true` by default.\n- **sheets** `(string | number)[] | async function(sheets)`\\\n  Array of sheet names and/or sheet indexes, or an async function which accepts an array of all sheets and must return an array of the sheet names that need to be processed. 
If not specified, first sheet will be processed.\n- **encoding** `string`\\\n  XLSX file encoding.\n\n```javascript\nconst processor = new bellboy.ExcelProcessor({\n  // process last sheet\n  sheets: async (sheets) =\u003e {\n    const sheet = sheets[sheets.length - 1];\n    return [sheet.name];\n  },\n  // ...\n});\n```\n\n\u003c!-- * **sheetName** `string`\n* **sheetIndex** `number`\\\nStarts from `0`.\n* **sheetGetter** `async function(sheets)`\\\nFunction which has array of `sheets` as a parameter and must return required name of the sheet.\n```javascript\nconst processor = new bellboy.ExcelProcessor({\n    // returns last sheet name\n    sheetGetter: async (sheets) =\u003e {\n        return sheets[sheets.length - 1];\n    },\n    // ...\n});\n```\nIf no `sheetName` specified, value of the `sheetIndex` will be used. If it isn't specified either, `sheetGetter` function will be called. If none options are specified, first sheet will be processed. --\u003e\n\n#### Produced row\n\nTo see how processed row will look like, proceed to [xlstream](https://github.com/Claviz/xlstream) library documentation which is used for Excel processing.\n\n### JsonProcessor \u003cdiv id='json-processor'/\u003e\n\nProcesses `JSON` files in the directory.\n\n#### Options\n\n- [Directory processor options](#directory-processor-options)\n- **jsonPath** `RegExp | string`\\\n  Path to the array to be streamed. Internally when JSON is streamed, current path is joined together using `.` as separator and then tested against provided regular expression. If not specified, a root array will be streamed. 
As an example, if you have this JSON object:\\\n  `{ \"animals\": { \"dogs\": [ \"pug\", \"bulldog\", \"poodle\" ] } }`\\\n  and want to stream the `dogs` array, the path to use is `/animals.dogs.(\\d+)/` if using RegExp as `jsonPath` and `animals.dogs.(\\\\d+)` if a string is used.\\\n  `(\\d+)` is used here because each index of the array is a number.\n\n### DelimitedProcessor \u003cdiv id='delimited-processor'/\u003e\n\n[Usage examples](tests/delimited-source.spec.ts)\n\nProcesses files with delimited data in the directory.\n\n#### Options\n\n- [Directory processor options](#directory-processor-options)\n- **rowSeparator** `string` `required`\n- **delimiter** `string`\\\n  A symbol separating fields of the row.\n- **hasHeader** `boolean`\\\n  If `true`, first row will be processed as a header.\n- **qualifier** `string` \\\n  Symbol placed around a field to signify that it is the same field.\n- **encoding** `string`\n\n#### Produced row \u003cdiv id='delimited-produced-row'/\u003e\n\n- **header** `string[]`\\\n  If `hasHeader` is `true`, first row will appear here.\n- **arr** `string[]`\\\n  Row split by `delimiter` and `qualifier`.\n- **obj** `object`\\\n  If `hasHeader` is `true`, object with header elements as keys will appear here.\n- **row** `string`\\\n  Received raw row.\n\n### TailProcessor \u003cdiv id='tail-processor'/\u003e\n\n[Usage examples](tests/tail-source.spec.ts)\n\nWatches for file changes and outputs the last part of the file as soon as new lines are added to it.\n\n#### Options\n\n- [Directory processor options](#directory-processor-options)\n- **fromBeginning** `boolean`\\\n  In addition to emitting new lines, emits lines from the beginning of the file, `false` by default.\n\n#### Produced row\n\n- **file** `string`\\\n  Name of the file the data came from.\n- **data** `string`\n\n### PostgresProcessor \u003cdiv id='postgres-processor'/\u003e\n\nProcesses a PostgreSQL `SELECT` query row by row.\n\n#### Options\n\n- 
[Processor options](#processor-options)\n- **query** `string` `required`\\\n  Query to execute.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **host**\n  - **port**\n  - **database**\n  - **schema**\n\n### MySqlProcessor \u003cdiv id='mysql-processor'/\u003e\n\nProcesses a MySQL `SELECT` query row by row.\n\n#### Options\n\n- [Processor options](#processor-options)\n- **query** `string` `required`\\\n  Query to execute.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **host**\n  - **port**\n  - **database**\n\n### FirebirdProcessor \u003cdiv id='firebird-processor'/\u003e\n\nProcesses a Firebird `SELECT` query row by row.\n\n#### Options\n\n- [Processor options](#processor-options)\n- **query** `string` `required`\\\n  Query to execute.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **host**\n  - **database**\n\n### MssqlProcessor \u003cdiv id='mssql-processor'/\u003e\n\nProcesses an MSSQL `SELECT` query row by row.\n\n#### Options\n\n- [Processor options](#processor-options)\n- **query** `string` `required`\\\n  Query to execute.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **server**\n  - **port**\n  - **database**\n  - **driver**\\\n     Optional [mssql][mssql-url] TDS driver; defaults to the pure JavaScript [Tedious][tedious-url] driver.\n\n#### Usage\n\nHere is an example of how to configure `MssqlProcessor` with a native TDS driver instead of the default pure JavaScript Tedious driver.\n\n```ts\nconst nativeDriver: ITdsDriver = await import(\"mssql/msnodesqlV8\");\nconst connection: IMssqlDbConnection = {\n  user: \"user\",\n  password: \"password\",\n  server: \"server\",\n  database: \"database\",\n  driver: nativeDriver,\n};\nconst source = new MssqlProcessor({\n  connection,\n  query: \"select * from orders\",\n});\n```\n\nIn previous versions of `bellboy`, `connection.driver` was a `string` parameter.\n\n[More usage 
examples](tests/mssql-source.spec.ts)\n\n### DynamicProcessor \u003cdiv id='dynamic-processor'/\u003e\n\nProcessor which generates records on the fly. Can be used to define custom data processors.\n\n#### Options\n\n- [Processor options](#processor-options)\n- **generator** `async generator function` `required`\\\n  [Generator](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*) function which must yield records to process.\n\n```javascript\n// processor which generates 10 records dynamically\nconst processor = new bellboy.DynamicProcessor({\n  generator: async function* () {\n    for (let i = 0; i \u003c 10; i++) {\n      yield i;\n    }\n  },\n});\n```\n\n## Destinations \u003cdiv id='destinations'/\u003e\n\nEvery [job](#job) can have as many destinations (outputs) as needed. For example, one job can load processed data into a [database](#postgres-destination), log this data to [stdout](#stdout-destination) and post it by [HTTP](#http-destination) simultaneously.\n\n- [StdoutDestination](#stdout-destination) logs data to **console**.\n- [HttpDestination](#http-destination) executes **HTTP** request calls.\n- [PostgresDestination](#postgres-destination) inserts/upserts data to **PostgreSQL** database.\n- [MySqlDestination](#mysql-destination) inserts/upserts data to **MySQL** database.\n- [MssqlDestination](#mssql-destination) inserts data to **MSSQL** database.\n\n### Options \u003cdiv id='destination-options'/\u003e\n\n- **disableLoad** `boolean`\\\n  If `true`, no data will be loaded to the destination. In combination with [reporters](#reporters), this option can become handy during testing process.\n- **batchSize** `number`\\\n  Number of records to be processed before loading them to the destination. 
If not specified or `0` is passed, all records will be processed.\n- **recordGenerator** `async generator function(row)`\\\n  Function which receives produced row by processor and can apply transformations to it.\n- **batchTransformer** `async function(rows)`\\\n  Function which receives whole batch of rows. This function is being called after row count reaches `batchSize`. Data is being loaded to destination immediately after this function has been executed.\n\n### StdoutDestination \u003cdiv id='stdout-destination'/\u003e\n\nLogs out all data to stdout (console).\n\n#### Options\n\n- [General destination options](#destination-options)\n- **asTable** `boolean`\\\n  If set to `true`, data will be printed as table.\n\n### HttpDestination \u003cdiv id='http-destination'/\u003e\n\n[Usage examples](tests/http-destination.spec.ts)\n\nPuts processed data one by one in `body` and executes specified HTTP request.\n\n#### Options\n\n- [General destination options](#destination-options)\n- **request** `required`\\\n  Options from [axios](https://github.com/axios/axios) library.\n- **authorizationRequest** `object`\n  - **connection**\\\n    Options from [axios](https://github.com/axios/axios) library.\n  - **applyTo**\\\n    Where extracted field should be applied. 
Whether `header` or `query`.\n  - **sourceField**\\\n    Name of the field from which value of authorization token will be extracted.\n  - **destinationField**\\\n    Name of the field which will be applied to `header` or `query` using `applyTo` option.\n  - **prefix**\\\n    Custom prefix to apply to the token.\n\n### PostgresDestination \u003cdiv id='postgres-destination'/\u003e\n\n[Usage examples](tests/postgres-destination.spec.ts)\n\nInserts data to PostgreSQL.\n\n#### Options\n\n- [General destination options](#destination-options)\n- **table** `string` `required`\\\n  Table name.\n- **upsertConstraints** `string[]`\\\n  If specified, `UPSERT` command will be executed based on provided constraints.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **host**\n  - **database**\n  - **schema**\n\n### MySqlDestination \u003cdiv id='mysql-destination'/\u003e\n\n[Usage examples](tests/mysql-destination.spec.ts)\n\nInserts data to MySQL.\n\n#### Options\n\n- [General destination options](#destination-options)\n- **table** `string` `required`\\\n  Table name.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **host**\n  - **database**\n- **useSourceColumns** `boolean` \\\n   If `true`, only the columns in the source data will be used for data load. Default is `false`, using all destination table columns.\n- **postLoadQuery** `string` \\\n   A query which will be executed after the data load and before connection is closed. 
Result will be available in the `result` object of the `loadedBatch` event.\n\n### MssqlDestination \u003cdiv id='mssql-destination'/\u003e\n\n[Usage examples](tests/mssql-destination.spec.ts)\n\nInserts data to MSSQL.\n\n#### Options\n\n- [General destination options](#destination-options)\n- **table** `string` `required`\\\n  Table name.\n- **connection** `object` `required`\n  - **user**\n  - **password**\n  - **server**\n  - **database**\n  - **driver** \\\n     Optional [mssql][mssql-url] TDS driver; defaults to the pure JavaScript [Tedious][tedious-url] driver.\n\n#### Usage\n\nHere is an example of how to configure `MssqlDestination` with a native TDS driver instead of the default pure JavaScript Tedious driver.\n\n```ts\nconst nativeDriver: ITdsDriver = await import(\"mssql/msnodesqlV8\");\nconst connection: IMssqlDbConnection = {\n  user: \"user\",\n  password: \"password\",\n  server: \"server\",\n  database: \"database\",\n  driver: nativeDriver,\n};\nconst sink = new MssqlDestination({\n  connection,\n  table: \"orders\",\n  batchSize: 1000,\n});\n```\n\n[More usage examples](tests/mssql-destination.spec.ts)\n\n## Extendability\n\nNew [processors](#processors) and [destinations](#destinations) can be made by extending existing ones. Feel free to make a pull request if you create something interesting.\n\n### Creating a new processor\n\n[Processor class examples](src/processors)\n\nTo create a new processor, you must extend the `Processor` class and implement the async `process` function. This function accepts one parameter:\n\n- **processStream** `async function(readStream, ...args)` `required`\\\n  Callback function which accepts a [Readable stream](https://nodejs.org/api/stream.html#stream_class_stream_readable). After this function is called, the `job` instance handles the passed stream internally. 
Passed parameters (`args`) will be emitted with `startProcessingStream` event during job execution.\n\n```javascript\nclass CustomProcessor extends bellboy.Processor {\n  async process(processStream) {\n    // await processStream(readStream, 'hello', 'world');\n  }\n}\n```\n\n### Creating a new destination\n\n[Destination class examples](src/destinations)\n\nTo create a new destination, you must extend `Destination` class and implement async `loadBatch` function. This function accepts one parameter:\n\n- **data** `any[]` `required`\\\n  Array of some processed data that needs to be loaded.\n\n```javascript\nclass CustomDestination extends bellboy.Destination {\n  async loadBatch(data) {\n    console.log(data);\n  }\n}\n```\n\n### Creating a new reporter \u003cdiv id='reporters'/\u003e\n\n[Official stdout reporter](https://github.com/claviz/bellboy-stdout-reporter)\n\nReporter is a job wrapper which can operate with [job instance](#job) (for example, listen to events using job `on` method). 
To create a new reporter, you must extend `Reporter` class and implement `report` function, which will be executed during job instance initialization.\nReporter event listeners (`on`, `onAny`) are added before any other user-defined listeners.\nThis function accepts one parameter:\n\n- **job** `Job` `required`\\\n  [Job](#job) instance\n\n```javascript\nclass CustomReporter extends bellboy.Reporter {\n  report(job) {\n    job.on(\"startProcessing\", undefined, async ({ jobName }) =\u003e {\n      console.log(`Job ${jobName} has been started.`);\n    });\n  }\n}\n```\n\n## Testing\n\nTests can be run by using `docker compose up --abort-on-container-exit --exit-code-from test --build test` command.\n\n[mssql-url]: https://github.com/tediousjs/node-mssql\n[tedious-url]: https://www.npmjs.com/package/tedious\n[msnodesqlv8-url]: https://www.npmjs.com/package/msnodesqlv8\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FClaviz%2Fbellboy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FClaviz%2Fbellboy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FClaviz%2Fbellboy/lists"}