{"id":13508053,"url":"https://paulfitz.github.io/daff/","last_synced_at":"2025-03-30T09:33:19.748Z","repository":{"id":6332891,"uuid":"7568434","full_name":"paulfitz/daff","owner":"paulfitz","description":"align and compare tables","archived":false,"fork":false,"pushed_at":"2024-08-09T13:17:36.000Z","size":1627,"stargazers_count":800,"open_issues_count":44,"forks_count":67,"subscribers_count":25,"default_branch":"master","last_synced_at":"2024-10-29T15:48:06.377Z","etag":null,"topics":["comparing-tables","csv","csv-diffs","diff","sqlite"],"latest_commit_sha":null,"homepage":"https://paulfitz.github.io/daff","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paulfitz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["paulfitz"]}},"created_at":"2013-01-11T22:44:40.000Z","updated_at":"2024-10-24T14:02:17.000Z","dependencies_parsed_at":"2022-09-02T21:43:53.593Z","dependency_job_id":"497214ef-6cc7-4854-a6e8-848cbc3f8889","html_url":"https://github.com/paulfitz/daff","commit_stats":{"total_commits":526,"total_committers":21,"mean_commits":"25.047619047619047","dds":0.0684410646387833,"last_synced_commit":"dfbb38cbfe027a34e9e9120e132b357ba3a85e7c"},"previous_names":["paulfitz/coopyhx"],"tags_count":104,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulfitz%2Fdaff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulfitz%2Fdaff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/paulfitz%2Fdaff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paulfitz%2Fdaff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paulfitz","download_url":"https://codeload.github.com/paulfitz/daff/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245761292,"owners_count":20667895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["comparing-tables","csv","csv-diffs","diff","sqlite"],"created_at":"2024-08-01T02:00:46.917Z","updated_at":"2025-03-30T09:33:19.430Z","avatar_url":"https://github.com/paulfitz.png","language":"Java","funding_links":["https://github.com/sponsors/paulfitz"],"categories":["visualizing diffs of data:"],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/paulfitz/daff.svg?branch=master)](https://travis-ci.org/paulfitz/daff)\n[![NPM version](https://badge.fury.io/js/daff.svg)](http://badge.fury.io/js/daff)\n[![Gem Version](https://badge.fury.io/rb/daff.svg)](http://badge.fury.io/rb/daff)\n[![PyPI version](https://badge.fury.io/py/daff.svg)](http://badge.fury.io/py/daff)\n[![PHP version](https://badge.fury.io/ph/paulfitz%2Fdaff-php.svg)](http://badge.fury.io/ph/paulfitz%2Fdaff-php)\n[![Bower version](https://badge.fury.io/bo/daff.svg)](http://badge.fury.io/bo/daff)\n![Badge count](http://img.shields.io/:badges-7/7-33aa33.svg)\n\ndaff: data diff\n===============\n\nThis is a library for comparing tables, producing a summary of their\ndifferences, and using such a summary as a patch file.  
It is\noptimized for comparing tables that share a common origin, in other\nwords multiple versions of the \"same\" table.\n\nFor a live demo, see:\n\u003e https://paulfitz.github.io/daff/\n\nInstall the library for your favorite language:\n````sh\nnpm install daff -g  # node/javascript\npip install daff     # python\ngem install daff     # ruby\ncomposer require paulfitz/daff-php  # php\ninstall.packages('daff') # R wrapper by Edwin de Jonge\nbower install daff   # web/javascript\n````\n\nOther translations are available here:\n\u003e https://github.com/paulfitz/daff/releases\n\nOr use the library to view csv diffs on github via a chrome extension:\n\u003e https://github.com/theodi/csvhub\n\nThe diff format used by `daff` is specified here:\n\u003e http://paulfitz.github.io/daff-doc/spec.html\n\nThis library is a stripped down version of the coopy toolbox (see\nhttp://share.find.coop).  To compare tables from different origins,\nor with automatically generated IDs, or other complications, check out\nthe coopy toolbox.\n\nThe program\n-----------\n\nYou can run `daff`/`daff.py`/`daff.rb` as a utility program:\n````\n$ daff\ndaff can produce and apply tabular diffs.\nCall as:\n  daff a.csv b.csv\n  daff [--color] [--no-color] [--output OUTPUT.csv] a.csv b.csv\n  daff [--output OUTPUT.html] a.csv b.csv\n  daff [--www] a.csv b.csv\n  daff parent.csv a.csv b.csv\n  daff --input-format sqlite a.db b.db\n  daff patch [--inplace] a.csv patch.csv\n  daff merge [--inplace] parent.csv a.csv b.csv\n  daff trim [--output OUTPUT.csv] source.csv\n  daff render [--output OUTPUT.html] diff.csv\n  daff copy in.csv out.tsv\n  daff in.csv\n  daff git\n  daff version\n\nThe --inplace option to patch and merge will result in modification of a.csv.\n\nIf you need more control, here is the full list of flags:\n  daff diff [--output OUTPUT.csv] [--context NUM] [--all] [--act ACT] a.csv b.csv\n     --act ACT:     show only a certain kind of change (update, insert, delete, column)\n     
--all:         do not prune unchanged rows or columns\n     --all-rows:    do not prune unchanged rows\n     --all-columns: do not prune unchanged columns\n     --color:       highlight changes with terminal colors (default in terminals)\n     --context NUM: show NUM rows of context (0=none)\n     --context-columns NUM: show NUM columns of context (0=none)\n     --fail-if-diff: return status is 0 if equal, 1 if different, 2 if problem\n     --id:          specify column to use as primary key (repeat for multi-column key)\n     --ignore:      specify column to ignore completely (can repeat)\n     --index:       include row/columns numbers from original tables\n     --input-format [csv|tsv|ssv|psv|json|sqlite]: set format to expect for input\n     --eol [crlf|lf|cr|auto]: separator between rows of csv output.\n     --no-color:    make sure terminal colors are not used\n     --ordered:     assume row order is meaningful (default for CSV)\n     --output-format [csv|tsv|ssv|psv|json|copy|html]: set format for output\n     --padding [dense|sparse|smart]: set padding method for aligning columns\n     --table NAME:  compare the named table, used with SQL sources. 
If name changes, use 'n1:n2'\n     --unordered:   assume row order is meaningless (default for json formats)\n     -w / --ignore-whitespace: ignore changes in leading/trailing whitespace\n     -i / --ignore-case: ignore differences in case\n\n  daff render [--output OUTPUT.html] [--css CSS.css] [--fragment] [--plain] diff.csv\n     --css CSS.css: generate a suitable css file to go with the html\n     --fragment:    generate just a html fragment rather than a page\n     --plain:       do not use fancy utf8 characters to make arrows prettier\n     --unquote:     do not quote html characters in html diffs\n     --www:         send output to a browser\n````\n\nFormats supported are CSV, TSV, Sqlite (with `--input-format sqlite` or\nthe `.sqlite` extension), and ndjson.\n\nUsing with git\n--------------\n\nRun `daff git csv` to install daff as a diff and merge handler\nfor `*.csv` files in your repository.  Run `daff git` for instructions\non doing this manually. Your CSV diffs and merges will get smarter,\nsince git will suddenly understand about rows and columns, not just lines:\n\n![Example CSV diff](http://paulfitz.github.io/daff-doc/images/daff_vs_diff.png)\n\nThe library\n-----------\n\nYou can use `daff` as a library from any supported language.  We take\nhere the example of Javascript.  
To use `daff` on a webpage,\nfirst include `daff.js`:\n```html\n\u003cscript src=\"daff.js\"\u003e\u003c/script\u003e\n```\nOr, if using Node outside the browser:\n```js\nvar daff = require('daff');\n```\n\nFor concreteness, assume we have two versions of a table,\n`data1` and `data2`:\n```js\nvar data1 = [\n    ['Country','Capital'],\n    ['Ireland','Dublin'],\n    ['France','Paris'],\n    ['Spain','Barcelona']\n];\nvar data2 = [\n    ['Country','Code','Capital'],\n    ['Ireland','ie','Dublin'],\n    ['France','fr','Paris'],\n    ['Spain','es','Madrid'],\n    ['Germany','de','Berlin']\n];\n```\n\nTo make those tables accessible to the library, we wrap them\nin `daff.TableView`:\n```js\nvar table1 = new daff.TableView(data1);\nvar table2 = new daff.TableView(data2);\n```\n\nWe can now compute the alignment between the rows and columns\nin the two tables:\n```js\nvar alignment = daff.compareTables(table1,table2).align();\n```\n\nTo produce a diff from the alignment, we first need a table\nfor the output:\n```js\nvar data_diff = [];\nvar table_diff = new daff.TableView(data_diff);\n```\n\nWe then compute the diff using default options:\n```js\nvar flags = new daff.CompareFlags();\nvar highlighter = new daff.TableDiff(alignment,flags);\nhighlighter.hilite(table_diff);\n```\n\nThe diff is now in `data_diff` in highlighter format; see the\nspecification here:\n\u003e http://paulfitz.github.io/daff-doc/spec.html\n\n```js\n[ [ '!', '', '+++', '' ],\n  [ '@@', 'Country', 'Code', 'Capital' ],\n  [ '+', 'Ireland', 'ie', 'Dublin' ],\n  [ '+', 'France', 'fr', 'Paris' ],\n  [ '-\u003e', 'Spain', 'es', 'Barcelona-\u003eMadrid' ],\n  [ '+++', 'Germany', 'de', 'Berlin' ] ]\n```\n\nFor visualization, you may want to convert this to an HTML table\nwith appropriate classes on cells so you can color-code inserts,\ndeletes, updates, etc.  
You can do this with:\n```js\nvar diff2html = new daff.DiffRender();\ndiff2html.render(table_diff);\nvar table_diff_html = diff2html.html();\n```\n\nFor 3-way differences (that is, comparing two tables given knowledge\nof a common ancestor) use `daff.compareTables3` (give ancestor\ntable as the first argument).\n\nHere is how to apply the diff in `table_diff` to `table1` as a patch:\n```js\nvar patcher = new daff.HighlightPatch(table1,table_diff);\npatcher.apply();\n// table1 should now equal table2\n```\n\nFor other languages, you should find sample code in\nthe packages on the [Releases](https://github.com/paulfitz/daff/releases) page.\n\nSupported languages\n-------------------\n\nThe `daff` library is written in [Haxe](http://haxe.org/), which\ncan be translated reasonably well into at least the following languages:\n\n * Javascript\n * Python\n * Java\n * C#\n * C++\n * Ruby (using an [unofficial haxe target](https://github.com/paulfitz/haxe) developed for `daff`)\n * PHP\n\nSome translations are done for you on the\n[Releases](https://github.com/paulfitz/daff/releases) page.\nTo make another translation, or to compile from source,\nfirst follow the [Haxe language introduction](https://haxe.org/documentation/introduction/language-introduction.html) for the\nlanguage you care about.  At the time of writing, if you are on OSX, you should\ninstall haxe using `brew install haxe`.  Then do one of:\n\n```\nmake js\nmake php\nmake py\nmake java\nmake cs\nmake cpp\n```\n\nFor each language, the `daff` library expects to be handed an interface to tables you create, rather than creating them\nitself.  This is to avoid inefficient copies from one format to another.  
You'll find a `SimpleTable` class you can use if\nyou find this awkward.\n\nOther possibilities:\n\n * There's a daff wrapper for R written by [Edwin de Jonge](https://github.com/edwindj), see https://github.com/edwindj/daff and http://cran.r-project.org/web/packages/daff\n * There's a hand-written ruby port by [James Smith](https://github.com/Floppy), see https://github.com/theodi/coopy-ruby\n\nAPI documentation\n-----------------\n\n * You can browse the `daff` classes at http://paulfitz.github.io/daff-doc/\n\nSponsors\n--------\n\n\u003cimg src=\"http://datacommons.coop/images/the_zen_of_venn.png\" alt=\"the zen of venn\" height=\"100\"\u003e\nThe \u003ca href=\"https://datacommons.coop\"\u003eData Commons Co-op\u003c/a\u003e,  \"perhaps the geekiest of all cooperative organizations on the planet,\" has given great moral support during the development of `daff`.\nDonate a multiple of `42.42` in your currency to let them know you care: \u003ca href=\"https://datacommons.coop/donate/\"\u003ehttps://datacommons.coop/donate/\u003c/a\u003e.\n\nReading material\n----------------\n\n * https://specs.frictionlessdata.io/tabular-diff : a specification of the diff format we use.\n * http://theodi.org/blog/csvhub-github-diffs-for-csv-files : using this library with github.\n * https://github.com/ropensci/unconf/issues/19 : a thread about diffing data in which daff shows up in at least four guises (see if you can spot them all).\n * http://theodi.org/blog/adapting-git-simple-data : using this library with gitlab.\n * http://okfnlabs.org/blog/2013/08/08/diffing-and-patching-data.html : a summary of where the library came from.\n * http://blog.okfn.org/2013/07/02/git-and-github-for-data/ : a post about storing small data in git/github.\n * http://blog.ouseful.info/2013/08/27/diff-or-chop-github-csv-data-files-and-openrefine/ : counterpoint - a post discussing tracked-changes rather than diffs.\n * http://blog.byronjsmith.com/makefile-shortcuts.html : a tutorial on using 
`make` for data, with daff in the mix. \"Since git considers changes on a per-line basis,\n   looking at diffs of comma-delimited and tab-delimited files can get obnoxious. The program daff fixes this problem.\"\n\n## License\n\ndaff is distributed under the MIT License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/paulfitz.github.io%2Fdaff%2F","html_url":"https://awesome.ecosyste.ms/projects/paulfitz.github.io%2Fdaff%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/paulfitz.github.io%2Fdaff%2F/lists"}