{"id":18474831,"url":"https://github.com/mhkeller/joiner","last_synced_at":"2025-04-08T13:31:54.731Z","repository":{"id":14164729,"uuid":"16870702","full_name":"mhkeller/joiner","owner":"mhkeller","description":"A simple utility for SQL-like joins with Json, GeoJson or dbf data in Node, the browser and on the command line. Also creates join reports so you can know how successful a given join was. Try it in the browser --\u003e","archived":false,"fork":false,"pushed_at":"2023-01-06T11:37:16.000Z","size":905,"stargazers_count":51,"open_issues_count":5,"forks_count":6,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-23T13:04:46.628Z","etag":null,"topics":["csv","dbf","geojson","geojson-data","join","joiner","joins","sql"],"latest_commit_sha":null,"homepage":"https://mhkeller.github.io/join.report","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mhkeller.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-02-15T20:08:17.000Z","updated_at":"2024-08-12T07:17:04.000Z","dependencies_parsed_at":"2023-01-13T17:49:29.840Z","dependency_job_id":null,"html_url":"https://github.com/mhkeller/joiner","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhkeller%2Fjoiner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhkeller%2Fjoiner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhkeller%2Fjoiner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhkeller%2Fjoiner/man
ifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mhkeller","download_url":"https://codeload.github.com/mhkeller/joiner/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247851762,"owners_count":21006815,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","dbf","geojson","geojson-data","join","joiner","joins","sql"],"created_at":"2024-11-06T10:31:10.186Z","updated_at":"2025-04-08T13:31:49.704Z","avatar_url":"https://github.com/mhkeller.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"Joiner\n======\n\n[![Build Status](https://secure.travis-ci.org/mhkeller/joiner.png?branch=master\u0026style=flat-square)](http://travis-ci.org/mhkeller/joiner) [![NPM version](https://badge.fury.io/js/joiner.png?style=flat)](http://badge.fury.io/js/joiner) [![npm](https://img.shields.io/npm/dm/joiner.svg)](https://www.npmjs.com/package/joiner)\n[![js-standard-style](https://img.shields.io/badge/code%20style-standard-brightgreen.svg?style=flat)](https://github.com/feross/standard)\n\nA simple utility for SQL-like joins with Json or geoJson data in Node, the browser and on the command line. 
Also creates join reports so you can know how successful a given join is.\n\nTry it in the browser --\u003e https://mhkeller.github.io/join.report/\n\n```js\nvar data = [\n  { \"id\": \"1\", \"name\": \"UT\" },\n  { \"id\": \"4\", \"name\": \"NM\" }\n]\n\nvar newData = [\n  { \"state_name\": \"NM\", \"avg_temp\": 45 }\n]\n\nvar joinedData = joiner({\n  leftData: data,\n  leftDataKey: 'name',\n  rightData: newData,\n  rightDataKey: 'state_name'\n})\n\nconsole.log(joinedData)\n/*\n{ data:\n  [ { id: '1', name: 'UT', avg_temp: null },\n    { id: '4', name: 'NM', avg_temp: 45 }\n  ],\n  report:\n    { diff:\n      { a: [ 'NM', 'UT' ],\n        b: [ 'NM' ],\n        a_and_b: [ 'NM' ],\n        a_not_in_b: [ 'UT' ],\n        b_not_in_a: []\n      },\n     prose:\n      { summary: '1 row matched in A and B. 1 row in A not in B. All 1 row in B in A.',\n        full: 'Matches in A and B: NM. A not in B: UT.' } } }\n*/\n\n```\n## Examples\n\nSee the **[`examples`](https://github.com/mhkeller/joiner/tree/master/examples)** folder for different file formats and options. Joiner is useful to verify whether all of your joins were successful and to spot any patterns among fields that didn't join properly. 
For example, you can see that the `county_01` row in dataset A didn't match with the `county_1` in dataset B and that you have a zero-padding issue going on.\n\n## Installation\n\nTo install as a Node.js module:\n\n````\nnpm install --save joiner\n````\n\nOr to install as a command-line utility:\n\n````\nnpm install joiner -g\n````\n\nTo use as both, run both commands.\n\n## Methods\n\nAll joins return an object with the following structure:\n\n````\ndata: [data object],\nreport: {\n\tdiff: {\n\t\ta: [data in A],\n\t\tb: [data in B],\n\t\ta_and_b: [data in A and B],\n\t\ta_not_in_b: [data in A not in B],\n\t\tb_not_in_a: [data in B not in A]\n\t},\n\tprose: {\n\t\tsummary: [summary description of join result, number of matches in A and B, A not in B, B not in A.],\n\t\tfull:    [full list of which rows were joined in each of the above categories]\n\t}\n}\n````\n\n### _joiner(config)_\n\nPerforms a left join on two json datasets, each an array of objects. It deep-clones the objects you pass in using [lodash.clonedeep](https://www.npmjs.com/package/lodash.clonedeep) and returns the new object.\n\nOptionally, you can pass in a key name under `nestKey` in case the left data's attribute dictionary is nested under another key, such as in geoJson when it's under the `properties` object. 
More on that below.\n\n| parameter    | type     | description    |\n| :------------|:-------- |:---------------|\n| leftData     | Array    | existing data  |\n| leftDataKey  | [String] | key to join on, defaults to `\"id\"` if not set and `geoJson: true` |\n| rightData    | Array    | new data       |\n| rightDataKey | String   | key to join on |\n| geoJson      | [Boolean] default=false | optional, set to `true` if the left data is geoJson |\n| nestKey      | [String] | optional, key name holding attribute, defaults to `\"properties\"` if not set and `geoJson: true` |\n\n#### Joining to geoJson\n\nIf `geoJson` is true, Joiner performs a left join onto the `properties` object of each feature in a geoJson array.\n\nIf you want to join on the `\"id\"` property, omit `leftDataKey`. If you want to join on a value in the `properties` object, set `leftDataKey` to `'properties.\u003cdesired-key-name\u003e'` and set `nestKey` to `'properties'`. See examples for more.\n\n## Command line interface\n\n````\nUsage: joiner -a DATASET_A_PATH -k DATASET_A_KEY -b DATASET_B_PATH -j DATASET_B_KEY -o OUT_FILE_PATH [-r (summary|full) -n NEST_KEY --geojson]\n\nOptions:\n  -h, --help     Display help           [default: false]\n  -a, --apath    Dataset A path\n  -k, --akey     Dataset A key\n  -b, --bpath    Dataset B path\n  -j, --bkey     Dataset B key\n  -g, --geojson  Is dataset A geojson?  [default: false]\n  -n, --nestkey  Nested key name\n  -o, --out      Out path\n  -r, --report   Report format          [default: \"summary\"]\n\n````\n\nIn most cases, the first four parameters (`--apath`, `--akey`, `--bpath` and `--bkey`) are required. `--akey` is not required if you have set geojson to true by using `-g` or `--geojson` since it will join on the `\"id\"` field. If you want to join on a property field in geojson, then set that using `--akey`.\n\nIf you specify an output file, it will write the join result to the specified file and the report to the same directory. 
Intermediate directories will be created if they don't already exist. For example, `-o path/to/output.csv` will also write `path/to/output-report.json` and create the `to/` folder if it isn't already there. If you don't specify an output file, it will print the results to the console.\n\nIf you don't specify an output file with `-o`, Joiner will print the join report to the console. By default, it prints only the summary report. To print the full report, specify `-r full`.\n\nSetting `-g` or `--geojson` is the equivalent of setting `geoJson: true` in the config above.\n\nIt converts the specified input file into json and writes the joined dataset to file using [indian ocean](https://github.com/mhkeller/indian-ocean), which currently supports the following formats: `json`, `geojson`, `csv`, `psv`, `tsv` and `dbf`. The format is inferred from the file extension of the input and output file paths. For example, `-a path/to/input/file.csv` will read in a csv and `-o path/to/output/file.csv` will write a csv.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmhkeller%2Fjoiner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmhkeller%2Fjoiner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmhkeller%2Fjoiner/lists"}