{"id":20046018,"url":"https://github.com/tulz-app/stringdiff","last_synced_at":"2025-05-05T09:31:18.375Z","repository":{"id":57714930,"uuid":"322976005","full_name":"tulz-app/stringdiff","owner":"tulz-app","description":"Myers diff algorithm in Scala","archived":false,"fork":false,"pushed_at":"2021-05-23T03:32:05.000Z","size":111,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-08T20:48:08.429Z","etag":null,"topics":["algorithms","diff","myers","myers-algorithm","scala"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tulz-app.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-20T02:10:56.000Z","updated_at":"2024-12-09T19:53:38.000Z","dependencies_parsed_at":"2022-09-12T03:10:16.769Z","dependency_job_id":null,"html_url":"https://github.com/tulz-app/stringdiff","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tulz-app%2Fstringdiff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tulz-app%2Fstringdiff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tulz-app%2Fstringdiff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tulz-app%2Fstringdiff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tulz-app","download_url":"https://codeload.github.com/tulz-app/stringdiff/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://githu
b.com","kind":"github","repositories_count":252471417,"owners_count":21753174,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithms","diff","myers","myers-algorithm","scala"],"created_at":"2024-11-13T11:20:21.145Z","updated_at":"2025-05-05T09:31:18.009Z","avatar_url":"https://github.com/tulz-app.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Maven Central](https://img.shields.io/maven-central/v/app.tulz/stringdiff_sjs1_2.13.svg)\n\n## app.tulz.diff\n\nMyers diff algorithm in Scala. \n\n\n```scala\n\"app.tulz\" %%% \"stringdiff\" % \"0.3.4\" \n```\n\n### Overview\n\nThe core algorithm is a Scala translation of the Python implementation described here:\nhttps://blog.robertelder.org/diff-algorithm/\n\nAdditionally, the following is implemented on top of it:\n\n* interpretation of the algorithm's output\n* customizable formatting of the interpreted result into a string (with a few provided out of the box formatters)  \n* additional transformations for when the input is expected to be a sequence of tokens (`TokenDiff`)\n\n\n### Interpretation\n\nThe core algorithm outputs a set of instructions to get from `SeqA` to `SeqB`, something like this:\n\n```\nin order to get from s1=\"bcdefgzio\" to s2=\"abcxyfgi\":\n\nInsert a from s2 before position 0 into s1.\nDelete d from s1 at position 2 in s1.\nInsert x from s2 before position 3 into s1.\nInsert y from s2 before position 3 into s1.\nDelete e from s1 at position 3 in s1.\nDelete z from s1 at position 6 in s1.\nDelete o from s1 at position 8 in s1.\n```\n\nIt's hard to work with it. 
Also, it is somewhat tricky at first to understand how to follow these instructions (try it! :) ).\n\nThe `MyersInterpret` object parses the raw output into a `List[DiffElement]`:\n\n```scala\ntrait DiffElement[Repr]\n\nobject DiffElement {\n  final case class InBoth[Repr](v: Repr)        extends DiffElement[Repr]\n  final case class InFirst[Repr](v: Repr)       extends DiffElement[Repr]\n  final case class InSecond[Repr](v: Repr)      extends DiffElement[Repr]\n  final case class Diff[Repr](x: Repr, y: Repr) extends DiffElement[Repr]\n}\n```\n\nThe result of the interpretation (without any transformations applied) for the same example:\n\n```scala\nStringDiff\n  .raw(\n    \"bcdefgzio\",\n    \"abcxyfgi\",\n    collapse = false\n  ).mkString(\"[\\n  \", \"\\n  \", \"\\n]\")\n```\n\n```\n[\n  InSecond(a)\n  InBoth(bc)\n  InFirst(d)\n  InSecond(x)\n  InSecond(y)\n  InFirst(e)\n  InBoth(fg)\n  InFirst(z)\n  InBoth(i)\n  InFirst(o)\n]\n```\n\n### Collapsing\n\nBy default, diff functions will collapse the diff:\n\n```scala\nStringDiff\n  .raw(\n    \"bcdefgzio\",\n    \"abcxyfgi\",\n    collapse = true // default is true\n  ).mkString(\"[\\n  \", \"\\n  \", \"\\n]\")\n```\n\n```\n[\n  InSecond(a)\n  InBoth(bc)\n  Diff(de,xy)\n  InBoth(fg)\n  InFirst(z)\n  InBoth(i)\n  InFirst(o)\n]\n```\n\nHere, the following list of `DiffElement`s:\n\n```\n[\n  InFirst(d)\n  InSecond(x)\n  InSecond(y)\n  InFirst(e)\n]\n```\n\ngot collapsed into a single one:\n\n```\n[\n  Diff(de,xy)\n]\n```\n\nIn a nutshell, collapsing removes empty elements and joins consecutive `DiffElement`s that are of the same kind or otherwise \"join-able\".\n\nExamples:\n* any `InFirst`, `InSecond`, `Diff` or `InBoth` gets removed if the element is empty\n* `Diff` becomes `InFirst` or `InSecond` if one of the elements is empty\n* `InFirst(a)+InFirst(b) -\u003e InFirst(ab)`\n* `InFirst(a)+InSecond(b) -\u003e Diff(a,b)`\n* `InSecond(a)+Diff(b,c) -\u003e Diff(b,ac)`\n* etc\n\n### Diff'ing sequences:\n\n```scala\nSeqDiff\n  .seq(\n    Seq(1, 
2, 3, 4, 5).toIndexedSeq,\n    Seq(1, 2, 8, 3, 8, 4, 5, 0).toIndexedSeq\n  ).mkString(\"[\\n  \", \"\\n  \", \"\\n]\")\n```\n\n```\n[\n  InBoth(Vector(1, 2))\n  InSecond(Vector(8))\n  InBoth(Vector(3))\n  InSecond(Vector(8))\n  InBoth(Vector(4, 5))\n  InSecond(Vector(0))\n]\n```\n\n### Diff'ing strings:\n\n##### Raw diff AST:\n\n```scala\n  println(\n    StringDiff.diff(\n      \"bcdefgzio\",\n      \"abcxyfgi\"\n    )\n  )\n```\n\n```  \n[\n  InSecond(a)\n  InBoth(bc)\n  Diff(de,xy)\n  InBoth(fg)\n  InFirst(z)\n  InBoth(i)\n  InFirst(o)\n]  \n```\n\n##### Text output:\n\n```scala\n  println(\n    StringDiff.text(\n      \"bcdefgzio\",\n      \"abcxyfgi\"\n    )\n  )\n```\n\n```\n[∅|a]]bc][de|xy]]fg][z|∅]]i][o|∅]\n```\n\n##### ANSI color output:\n\n```scala\n  println(\n    StringDiff.ansi(\n      \"bcdefgzio\",\n      \"abcxyfgi\"\n    )\n  )\n```\n\n![screenshot1](doc/images/screenshot1.png)\n\n(default formatters highlight missing text with yellow, extraneous text — with red, and matching text is underlined)\n\n### Diff'ing tokens\n\nWhen the inputs are strings that are expected to contain whitespace-separated tokens, `TokenDiff` \nwill try to make the diff more comprehensible in terms of tokens (while preserving the accuracy).\n\n```scala\n  println(\n    TokenDiff.ansi(\n      \"match-1 match-2 diff-1 diff-2 match-3 match-4 diff-1 diff-2 match-1 match-2 diff-1 match-3 match-4 diff-1 match-1 match-2 diff-1 match-3 match-4 suffix-1\",\n      \"prefix-1 match-1 match-2 diff-3 match-3 match-4 match-1 match-2 diff-2 diff-3 match-3 match-4 diff-2 match-1 match-2 match-3 match-4\"\n    )    \n  )\n```\n\n![screenshot3](doc/images/screenshot3.png)\n\nWith a `StringDiff` the output would look like the following:\n\n```scala\n  println(\n    StringDiff.ansi(\n      \"match-1 match-2 diff-1 diff-2 match-3 match-4 diff-1 diff-2 match-1 match-2 diff-1 match-3 match-4 diff-1 match-1 match-2 diff-1 match-3 match-4 suffix-1\",\n      \"prefix-1 match-1 match-2 diff-3 match-3 
match-4 match-1 match-2 diff-2 diff-3 match-3 match-4 diff-2 match-1 match-2 match-3 match-4\"\n    )    \n  )\n```\n![screenshot4](doc/images/screenshot3.png)\n\n##### Inline diffs for both strings \n\n```scala\n  println(\n    TokenDiff.ansiBoth(\n      \"match-1 match-2 diff-1 diff-2 match-3 match-4 diff-1 diff-2 match-1 match-2 diff-1 match-3 match-4 diff-1 match-1 match-2 diff-1 match-3 match-4 suffix-1\",\n      \"prefix-1 match-1 match-2 diff-3 match-3 match-4 match-1 match-2 diff-2 diff-3 match-3 match-4 diff-2 match-1 match-2 match-3 match-4\"\n    )    \n  )\n```\n\n![screenshot5](doc/images/screenshot5.png)\n\n\n### Usage\n\n##### Diff'ing `Seq`s:\n\n```scala\nSeqDiff.seq(\n  Seq(1, 2, 3),\n  Seq(2, 3, 4)\n)\n```\n\n##### Diff'ing `String`s:\n\n```scala\nStringDiff.ansi(\"abc\", \"acb\")\n// OR\nStringDiff(\"abc\", \"acb\")\n\nStringDiff.ansiBoth(\"abc\", \"acb\")\nStringDiff.text(\"abc\", \"acb\")\nStringDiff.diff(\"abc\", \"acb\")\nStringDiff.raw(\"abc\", \"acb\")\n```\n\n\n##### Diff'ing `String`s with tokens:\n\n```scala\nTokenDiff.ansi(\"abc\", \"acb\")\n// OR\nTokenDiff(\"abc\", \"acb\")\n\nTokenDiff.ansiBoth(\"abc\", \"acb\")\nTokenDiff.text(\"abc\", \"acb\")\nTokenDiff.diff(\"abc\", \"acb\")\nTokenDiff.raw(\"abc\", \"acb\")\n```\n\n\n### Custom formatters\n\nFormatters are instances of the `DiffFormat[Out]` trait:\n\n```scala\ntrait DiffFormat[Out] {\n\n  def apply(diff: List[DiffElement[String]]): Out\n\n}\n```\n\n```scala\nobject MyFormat extends DiffFormat[MyDiffOutput] { ... 
}\n\nval diff: MyDiffOutput = MyFormat(StringDiff(\"abc\", \"acb\"))\n```\n\nFor example, `TextDiffFormat` is implemented like this:\n\n```scala\nobject TextDiffFormat extends DiffFormat[String] {\n\n  import DiffElement._\n\n  def apply(diff: List[DiffElement[String]]): String = {\n    val sb = new StringBuilder\n    diff.foreach {\n      case InBoth(both) =\u003e\n        sb.append(\"]\")\n        sb.appendAll(both)\n        sb.append(\"]\")\n      case InSecond(second) =\u003e\n        sb.append(\"[∅|\")\n        sb.appendAll(second)\n        sb.append(\"]\")\n      case InFirst(first) =\u003e\n        sb.append(\"[\")\n        sb.appendAll(first)\n        sb.append(\"|∅]\")\n      case Diff(first, second) =\u003e\n        sb.append(\"[\")\n        sb.appendAll(first)\n        sb.append(\"|\")\n        sb.appendAll(second)\n        sb.append(\"]\")\n      case _ =\u003e\n    }\n    sb.toString()\n  }\n}\n```\n\n## Author\n\nIurii Malchenko – [@yurique](https://twitter.com/yurique)\n\n\n## License\n\n`stringdiff` is provided under the [MIT license](https://github.com/tulz-app/stringdiff/blob/main/LICENSE.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftulz-app%2Fstringdiff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftulz-app%2Fstringdiff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftulz-app%2Fstringdiff/lists"}