{"id":13406243,"url":"https://github.com/pbeshai/tidy","last_synced_at":"2025-05-15T09:08:47.011Z","repository":{"id":37902411,"uuid":"335350698","full_name":"pbeshai/tidy","owner":"pbeshai","description":"Tidy up your data with JavaScript, inspired by dplyr and the tidyverse","archived":false,"fork":false,"pushed_at":"2024-05-17T23:51:10.000Z","size":1358,"stargazers_count":749,"open_issues_count":8,"forks_count":20,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-05-15T09:08:39.306Z","etag":null,"topics":["data","dplyr","tidyverse","wrangling"],"latest_commit_sha":null,"homepage":"https://pbeshai.github.io/tidy","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pbeshai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-02T16:22:39.000Z","updated_at":"2025-05-04T13:51:31.000Z","dependencies_parsed_at":"2022-07-07T23:12:58.844Z","dependency_job_id":"97fdb2fe-487e-4dda-8fce-3c42b0f29109","html_url":"https://github.com/pbeshai/tidy","commit_stats":{"total_commits":90,"total_committers":9,"mean_commits":10.0,"dds":0.2666666666666667,"last_synced_commit":"1355f2e02a94ac58b1bd6c8e2c548ede55abfce0"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbeshai%2Ftidy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbeshai%2Ftidy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbeshai%2Ftidy/releases","manifests_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pbeshai%2Ftidy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pbeshai","download_url":"https://codeload.github.com/pbeshai/tidy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254310520,"owners_count":22049470,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","dplyr","tidyverse","wrangling"],"created_at":"2024-07-30T19:02:25.057Z","updated_at":"2025-05-15T09:08:42.000Z","avatar_url":"https://github.com/pbeshai.png","language":"TypeScript","funding_links":[],"categories":["TypeScript","JavaScript Libraries","data","Libraries"],"sub_categories":[],"readme":"# tidy.js\n\n\n[![CircleCI](https://img.shields.io/circleci/build/gh/pbeshai/tidy)](https://app.circleci.com/pipelines/github/pbeshai/tidy)\n[![npm](https://img.shields.io/npm/v/@tidyjs/tidy)](https://www.npmjs.com/package/@tidyjs/tidy)\n\n**Tidy up your data with JavaScript!** Inspired by [dplyr](https://dplyr.tidyverse.org/) and the [tidyverse](https://www.tidyverse.org/), tidy.js attempts to bring the ergonomics of data manipulation from R to javascript (and typescript). The primary goals of the project are:\n\n* **Readable code**. Tidy.js prioritizes making your data transformations readable, so future you and your teammates can get up and running quickly.\n\n* **Standard transformation verbs**. Tidy.js is built using battle-tested verbs from the R community that can handle any data wrangling need.\n\n* **Work with plain JS objects**. 
No wrapper classes needed — all tidy.js needs is an array of plain old-fashioned JS objects to get started. Simple in, simple out.\n\nSecondarily, this project aims to provide acceptable types for the functions provided.\n\n\n#### Quick Links\n\n* [GitHub repo](https://github.com/pbeshai/tidy)\n* [Project homepage](https://pbeshai.github.io/tidy)\n* [API reference documentation](https://pbeshai.github.io/tidy/docs/api/tidy)\n* [Playground](https://pbeshai.github.io/tidy/playground)\n* [Observable Intro](https://observablehq.com/@pbeshai/tidy-js-intro-demo)\n* [Observable Examples Collection](https://observablehq.com/collection/@pbeshai/tidy-js)\n* [GitHub Discussions for Q\u0026A](https://github.com/pbeshai/tidy/discussions)\n* [CodeSandbox showing basic HTML usage (UMD)](https://codesandbox.io/s/tidyjs-umd-example-n1g4r?file=/index.html)\n\n#### Related work\n\nBe sure to check out a very similar project, [Arquero](https://github.com/uwdata/arquero), from [UW Data](https://idl.cs.washington.edu/). 
\n\n\n## Getting started\n\nTo start using tidy, your best bet is to install from npm:\n\n```shell\nnpm install @tidyjs/tidy\n# or\nyarn add @tidyjs/tidy\n```\n\nThen import the functions you need:\n\n```js\nimport { tidy, mutate, arrange, desc } from '@tidyjs/tidy'\n```\n\n**Note** if you're just trying tidy in a browser, you can use the UMD version hosted on jsdelivr ([codesandbox example](https://codesandbox.io/s/tidyjs-umd-example-n1g4r?file=/index.html)):\n\n```html\n\u003cscript src=\"https://d3js.org/d3-array.v2.min.js\"\u003e\u003c/script\u003e\n\u003cscript src=\"https://cdn.jsdelivr.net/npm/@tidyjs/tidy/dist/umd/tidy.min.js\"\u003e\u003c/script\u003e\n\u003cscript\u003e\n  const { tidy, mutate, arrange, desc } = Tidy;\n  // ...\n\u003c/script\u003e  \n```\n\n\nAnd use them on an array of objects:\n\n```js\nconst data = [\n  { a: 1, b: 10 }, \n  { a: 3, b: 12 }, \n  { a: 2, b: 10 }\n]\n\nconst results = tidy(\n  data, \n  mutate({ ab: d =\u003e d.a * d.b }),\n  arrange(desc('ab'))\n)\n```\n\nThe output is:\n\n```js\n[\n  { a: 3, b: 12, ab: 36},\n  { a: 2, b: 10, ab: 20},\n  { a: 1, b: 10, ab: 10}\n]\n```\n\nAll tidy.js code is wrapped in a **tidy flow** via the `tidy()` function. The first argument is the array of data, followed by the transformation verbs to run on the data. 
The actual functions passed to `tidy()` can be anything so long as they fit the form:\n\n```\n(items: object[]) =\u003e object[]\n```\n\nFor example, the following is valid:\n\n```js\ntidy(\n  data, \n  items =\u003e items.filter((d, i) =\u003e i % 2 === 0),\n  arrange(desc('value'))\n)\n```\n\nAll tidy verbs fit this style, with the exception of exports from groupBy, discussed below.\n\n### Grouping data with groupBy\n\nBesides manipulating flat lists of data, tidy provides facilities for wrangling grouped data via the `groupBy()` function.\n\n```js\nimport { tidy, summarize, sum, groupBy } from '@tidyjs/tidy'\n\nconst data = [\n  { key: 'group1', value: 10 }, \n  { key: 'group2', value: 9 }, \n  { key: 'group1', value: 7 }\n]\n\ntidy(\n  data,\n  groupBy('key', [\n    summarize({ total: sum('value') })\n  ])\n)\n\n```\n\nThe output is:\n```js\n[\n  { \"key\": \"group1\", \"total\": 17 },\n  { \"key\": \"group2\", \"total\": 9 },\n]\n```\n\nThe `groupBy()` function works similarly to `tidy()` in that it takes a flow of functions as its second argument (wrapped in an array). Things get really fun when you use groupBy's *third* argument for exporting the grouped data into different shapes. 
\n\nFor example, to export data as a nested object, pass `groupBy.object()` as the third argument to `groupBy()`.\n \n```js\nconst data = [\n  { g: 'a', h: 'x', value: 5 },\n  { g: 'a', h: 'y', value: 15 },\n  { g: 'b', h: 'x', value: 10 },\n  { g: 'b', h: 'x', value: 20 },\n  { g: 'b', h: 'y', value: 30 },\n]\n\ntidy(\n  data,\n  groupBy(\n    ['g', 'h'], \n    [\n      mutate({ key: d =\u003e `${d.g}${d.h}`})\n    ], \n    groupBy.object() // \u003c-- specify the export\n  )\n);\n\n```\n\nThe output is:\n\n```js\n{\n  \"a\": {\n    \"x\": [{\"g\": \"a\", \"h\": \"x\", \"value\": 5, \"key\": \"ax\"}],\n    \"y\": [{\"g\": \"a\", \"h\": \"y\", \"value\": 15, \"key\": \"ay\"}]\n  },\n  \"b\": {\n    \"x\": [\n      {\"g\": \"b\", \"h\": \"x\", \"value\": 10, \"key\": \"bx\"},\n      {\"g\": \"b\", \"h\": \"x\", \"value\": 20, \"key\": \"bx\"}\n    ],\n    \"y\": [{\"g\": \"b\", \"h\": \"y\", \"value\": 30, \"key\": \"by\"}]\n  }\n}\n```\n\nOr alternatively as `{ key, values }` entries-objects via `groupBy.entriesObject()`:\n\n```js\ntidy(data,\n  groupBy(\n    ['g', 'h'], \n    [\n      mutate({ key: d =\u003e `${d.g}${d.h}`})\n    ], \n    groupBy.entriesObject() // \u003c-- specify the export\n  )\n);\n```\n\nThe output is:\n\n```js\n[\n  {\n    \"key\": \"a\",\n    \"values\": [\n      {\"key\": \"x\", \"values\": [{\"g\": \"a\", \"h\": \"x\", \"value\": 5, \"key\": \"ax\"}]},\n      {\"key\": \"y\", \"values\": [{\"g\": \"a\", \"h\": \"y\", \"value\": 15, \"key\": \"ay\"}]}\n    ]\n  },\n  {\n    \"key\": \"b\",\n    \"values\": [\n      {\n        \"key\": \"x\",\n        \"values\": [\n          {\"g\": \"b\", \"h\": \"x\", \"value\": 10, \"key\": \"bx\"},\n          {\"g\": \"b\", \"h\": \"x\", \"value\": 20, \"key\": \"bx\"}\n        ]\n      },\n      {\"key\": \"y\", \"values\": [{\"g\": \"b\", \"h\": \"y\", \"value\": 30, \"key\": \"by\"}]}\n    ]\n  }\n]\n```\n\nIt's common to be left with a single leaf in a groupBy set, especially after 
running summarize(). To keep the exported values from being wrapped in an array, pass the `single` option to the export.\n\n```js\ntidy(data,\n  groupBy(['g', 'h'], [\n    summarize({ total: sum('value') })\n  ], groupBy.object({ single: true }))\n);\n```\n\nThe output is:\n\n```js\n{\n  \"a\": {\n    \"x\": {\"total\": 5, \"g\": \"a\", \"h\": \"x\"},\n    \"y\": {\"total\": 15, \"g\": \"a\", \"h\": \"y\"}\n  },\n  \"b\": {\n    \"x\": {\"total\": 30, \"g\": \"b\", \"h\": \"x\"},\n    \"y\": {\"total\": 30, \"g\": \"b\", \"h\": \"y\"}\n  }\n}\n```\n\nVisit the [API reference docs](https://pbeshai.github.io/tidy/docs/api/tidy) to learn more about how each function works and all the options they take. Be sure to check out the `levels` export, which lets you mix and match different export types based on the depth of the data. For quick reference, the available groupBy exports are: \n\n* groupBy.entries()\n* groupBy.entriesObject()\n* groupBy.grouped()\n* groupBy.levels()\n* groupBy.object()\n* groupBy.keys()\n* groupBy.map()\n* groupBy.values()\n\n---\n\n## Developing\n\nclone the repo:\n\n```\ngit clone git@github.com:pbeshai/tidy.git\n```\n\ninstall dependencies:\n\n```\nyarn\n```\n\ninitialize lerna:\n\n```\nlerna bootstrap\n```\n\nbuild tidy:\n\n```\nyarn run build\n```\n\ntest all of tidy:\n\n```\nyarn run test\n```\n\nrun test:watch for a single package:\n\n```\nyarn workspace @tidyjs/tidy test:watch\n```\n\n### Conventional commits\n\nThis library uses [conventional commits](https://www.conventionalcommits.org/), following the Angular convention. Prefixes are:\n\n- **build**: Changes that affect the build system or external dependencies (example scopes: yarn, npm)\n- **ci**: Changes to our CI configuration files and scripts (e.g. 
CircleCI)\n- **chore**\n- **docs**: Documentation-only changes\n- **feat**: A new feature\n- **fix**: A bug fix\n- **perf**: A code change that improves performance\n- **refactor**: A code change that neither fixes a bug nor adds a feature\n- **revert**\n- **style**: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)\n- **test**: Adding missing tests or correcting existing tests\n\n\n### Docs website\n\nstart the local site:\n\n```\nyarn start:web\n```\n\nbuild the site:\n\n```\nyarn build:web\n```\n\ndeploy the site via GitHub Pages:\n```\nUSE_SSH=true GIT_USER=pbeshai yarn workspace @tidyjs/tidy-website deploy\n```\n\nIdeally we can automate this via GitHub Actions one day!\n\n\n---\n\n\n#### Shout out to Netflix\n\nI want to give a big shout out to [Netflix](https://research.netflix.com/), my current employer, for giving me the opportunity to work on this project and to open source it. It's a great place to work and if you enjoy tinkering with data-related things, I'd strongly recommend checking out [our analytics department](https://research.netflix.com/research-area/analytics).\n– [Peter Beshai](https://peterbeshai.com/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpbeshai%2Ftidy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpbeshai%2Ftidy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpbeshai%2Ftidy/lists"}