{"id":17790928,"url":"https://github.com/cubicdaiya/dtl","last_synced_at":"2025-04-06T02:08:55.052Z","repository":{"id":40313624,"uuid":"54257521","full_name":"cubicdaiya/dtl","owner":"cubicdaiya","description":"diff template library written by C++","archived":false,"fork":false,"pushed_at":"2024-07-11T12:58:24.000Z","size":71,"stargazers_count":294,"open_issues_count":9,"forks_count":53,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-03-30T00:09:50.571Z","etag":null,"topics":["algorithm","diff","library"],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cubicdaiya.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-19T09:17:34.000Z","updated_at":"2025-03-21T16:35:51.000Z","dependencies_parsed_at":"2024-10-27T11:10:35.918Z","dependency_job_id":null,"html_url":"https://github.com/cubicdaiya/dtl","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubicdaiya%2Fdtl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubicdaiya%2Fdtl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubicdaiya%2Fdtl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubicdaiya%2Fdtl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cubicdaiya","download_url":"https://codeload.github.com/cubicdaiya/dtl/tar.gz/refs/heads/master","
host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247423515,"owners_count":20936626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","diff","library"],"created_at":"2024-10-27T10:48:30.589Z","updated_at":"2025-04-06T02:08:55.031Z","avatar_url":"https://github.com/cubicdaiya.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# dtl\n\n`dtl` is the diff template library written in C++. The word template in its name is derived from C++ templates.\n\n# Table of contents\n\n * [Features](#features)\n * [Getting started](#getting-started)\n     * [Compare two strings](#compare-two-strings)\n     * [Compare two data has arbitrary type](#compare-two-data-has-arbitrary-type)\n     * [Merge three sequences](#merge-three-sequences)\n     * [Patch function](#patch-function)\n     * [Difference as Unified Format](#difference-as-unified-format)\n     * [Compare large sequences](#compare-large-sequences)\n     * [Unserious difference](#unserious-difference)\n     * [Calculate only Edit Distance](#calculate-only-edit-distance)\n * [Algorithm](#algorithm)\n     * [Computational complexity](#computational-complexity)\n     * [Comparison when difference between two sequences is very large](#comparison-when-difference-between-two-sequences-is-very-large)\n     * [Implementations with various programming languages](#implementations-with-various-programming-languages)\n * [Examples](#examples)\n     * [strdiff](#strdiff)\n     * [intdiff](#intdiff)\n     * [unidiff](#unidiff)\n     * [unistrdiff](#unistrdiff)\n     * 
[strdiff3](#strdiff3)\n     * [intdiff3](#intdiff3)\n     * [patch](#patch)\n     * [fpatch](#fpatch)\n * [Running tests](#running-tests)\n     * [Building test programs](#building-test-programs)\n     * [Running test programs](#running-test-programs)\n * [Old commit histories](#old-commit-histories)\n * [License](#license)\n\n# Features\n\n`dtl` provides functions for comparing two sequences of arbitrary type. The sequences must support a random access iterator.\n\n# Getting started\n\nTo start using this library, all you need to do is include `dtl.hpp`.\n\n```c++\n#include \"dtl/dtl.hpp\"\n```\n\n## Compare two strings\n\nFirst of all, calculate the difference between two strings.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"abc\");\nsequence B(\"abd\");\ndtl::Diff\u003c elem, sequence \u003e d(A, B);\nd.compose();\n```\n\nWhen the above code is run, `dtl` calculates the difference between A and B as the Edit Distance, LCS, and SES.\n\nThe meaning of these three terms is given below.\n\n| Edit Distance | Edit Distance is a numerical value that expresses the difference between two sequences. |\n|:--------------|:-----------------------------------------------------------------------------------|\n| LCS           | LCS stands for Longest Common Subsequence.                                         |\n| SES           | SES stands for Shortest Edit Script. 
That is, SES is the shortest sequence of operations for translating one sequence into another.|\n\nIf one sequence is \"abc\" and the other is \"abd\", the Edit Distance, LCS, and SES are as follows.\n\n| Edit Distance | 2               |\n|:--------------|:----------------|\n| LCS           | ab              |\n| SES           | C a C b D c A d |\n\n * 「C」：Common\n * 「D」：Delete\n * 「A」：Add\n\nFor more detail, please see [examples/strdiff.cpp](https://github.com/cubicdaiya/dtl/blob/master/examples/strdiff.cpp).\n\nThis example calculates the Edit Distance, LCS, and SES of two strings received as command line arguments and prints each.\n\nWhen one string is \"abc\" and the other is \"abd\", the output of `strdiff` is below.\n\n```bash\n$ ./strdiff abc abd\neditDistance:2\nLCS:ab\nSES\n a\n b\n-c\n+d\n$\n```\n\n## Compare two data has arbitrary type\n\n`dtl` can compare data of arbitrary type thanks to C++ templates.\n\nHowever, the compared data type must support a random access iterator.\n\nIn the previous example, string data was compared.\n\n`dtl` can also compare two int vectors, as in the example below.\n\n```c++\nint a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};\nint b[10] = {3, 5, 1, 4, 5, 1, 7, 9, 6, 10};\nstd::vector\u003cint\u003e A(\u0026a[0], \u0026a[10]);\nstd::vector\u003cint\u003e B(\u0026b[0], \u0026b[10]);\ndtl::Diff\u003c int \u003e d(A, B);\nd.compose();\n```\n\nFor more detail, please see [examples/intdiff.cpp](https://github.com/cubicdaiya/dtl/blob/master/examples/intdiff.cpp).\n\n## Merge three sequences\n\n`dtl` has a diff3 function.\n\nThis function merges three sequences.\n\nAdditionally, `dtl` detects conflicts.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"qqqabc\");\nsequence B(\"abc\");\nsequence C(\"abcdef\");\ndtl::Diff3\u003celem, sequence\u003e diff3(A, B, C);\ndiff3.compose();\nif (!diff3.merge()) {\n  std::cerr \u003c\u003c \"conflict.\" \u003c\u003c 
std::endl;\n  return -1;\n}\nstd::cout \u003c\u003c \"result:\" \u003c\u003c diff3.getMergedSequence() \u003c\u003c std::endl;\n```\n\nWhen the above code is run, the output is below.\n\n```console\nresult:qqqabcdef\n```\n\nFor more detail, please see [examples/strdiff3.cpp](https://github.com/cubicdaiya/dtl/blob/master/examples/strdiff3.cpp).\n\n## Patch function\n\n`dtl` can also translate one sequence to another sequence with SES.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"abc\");\nsequence B(\"abd\");\ndtl::Diff\u003celem, sequence\u003e d(A, B);\nd.compose();\nsequence s1(A);\nsequence s2 = d.patch(s1);\n```\n\nWhen the above code is run, s2 becomes \"abd\".\nThe SES of A(\"abc\") and B(\"abd\") is below.\n\n```console\nCommon a\nCommon b\nDelete c\nAdd    d\n```\n\nThe patch function applies the SES to the sequence passed as its argument.\nIn this example, \"abc\" is translated to \"abd\" with the above SES.\n\nFor the data structure of SES, please see dtl's header files.\n\n## Difference as Unified Format\n\n`dtl` can also express a difference in Unified Format. 
See the example below.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"acbdeaqqqqqqqcbed\");\nsequence B(\"acebdabbqqqqqqqabed\");\ndtl::Diff\u003celem, sequence \u003e d(A, B);\nd.compose();             // construct an edit distance and LCS and SES\nd.composeUnifiedHunks(); // construct a difference as Unified Format with SES.\nd.printUnifiedFormat();  // print a difference as Unified Format.\n```\n\nThe difference as Unified Format of \"acbdeaqqqqqqqcbed\" and \"acebdabbqqqqqqqabed\" is below.\n\n```diff\n@@ -1,9 +1,11 @@\n a\n c\n+e\n b\n d\n-e\n a\n+b\n+b\n q\n q\n q\n@@ -11,7 +13,7 @@\n q\n q\n q\n-c\n+a\n b\n e\n d\n```\n\nThe data structure of a Unified Format hunk is below.\n\n```c++\n/**\n * Structure of Unified Format Hunk\n */\ntemplate \u003ctypename sesElem\u003e\nstruct uniHunk {\n  int a, b, c, d;                   // @@ -a,b +c,d @@\n  std::vector\u003csesElem\u003e common[2];   // common elements before and after the changes\n  std::vector\u003csesElem\u003e change;      // changes\n  int inc_dec_count;                // count of increase and decrease\n};\n```\n\nThe actual Unified Format blocks are stored as a vector of this structure.\n\nFor more detail, please see [examples/unistrdiff.cpp](https://github.com/cubicdaiya/dtl/blob/master/examples/unistrdiff.cpp)\nand [examples/unidiff.cpp](https://github.com/cubicdaiya/dtl/blob/master/examples/unidiff.cpp) and dtl's header files.\n\nIn addition, `dtl` has a function that translates one sequence to another sequence with Unified Format.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"abc\");\nsequence B(\"abd\");\ndtl::Diff\u003celem, sequence\u003e d(A, B);\nd.compose();\nd.composeUnifiedHunks();\nsequence s1(A);\nsequence s2 = d.uniPatch(s1);\n```\n\nWhen the above code is run, s2 becomes \"abd\".\nThe uniPatch function applies the Unified Format blocks to the sequence passed as its argument.\n\nIn this example, \"abc\" is translated to \"abd\" with the Unified Format block 
below.\n\n```diff\n@@ -1,3 +1,3 @@\n a\n b\n-c\n+d\n```\n\n## Compare large sequences\n\nWhen comparing two large sequences, `dtl` can optimize the calculation of the difference with the onHuge function.\n\nThis function is available when the compared data type is std::vector.\n\nTo use this function, call it before calling the compose function.\n\n```c++\ntypedef char elem;\ntypedef std::vector\u003celem\u003e sequence;\nsequence A;\nsequence B;\n/* ... */\ndtl::Diff\u003c elem, sequence \u003e d(A, B);\nd.onHuge();\nd.compose();\n```\n\n## Unserious difference\n\nCalculating a difference is computationally heavy.\n`dtl` uses \"An O(NP) Sequence Comparison Algorithm\".\n\nAlthough this algorithm is sufficiently fast, the calculation of LCS and SES needs massive amounts of memory when the difference between two sequences is very large.\n\n`dtl` avoids this problem by dividing each sequence into multiple subsequences\nand joining the differences of the subsequences at the end.\n\nBecause this approach repeatedly allocates massive amounts of memory,\n`dtl` also provides another way. 
It is to calculate an unserious difference.\n\nFor example, the normal SES of \"abc\" and \"abd\" is below.\n\n```console\nCommon a\nCommon b\nDelete c\nAdd    d\n```\n\nThe unserious SES of \"abc\" and \"abd\" is below.\n\n```console\nDelete a\nDelete b\nDelete c\nAdd    a\nAdd    b\nAdd    d\n```\n\nOf course, when \"abc\" and \"abd\" are actually compared with `dtl`, the above difference is not produced.\n\n`dtl` calculates the unserious difference only when it judges that the calculation of LCS and SES\nneeds massive amounts of memory and the unserious difference function is ON.\n\nFinally, `dtl` joins the difference calculated before that judgment with the unserious difference.\n\nAs a result, not all of the difference is unserious even when the unserious difference function is ON.\n\nTo use this function, call it before calling the compose function.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"abc\");\nsequence B(\"abd\");\ndtl::Diff\u003c elem, sequence \u003e d(A, B);\nd.onUnserious();\nd.compose();\n```\n\n## Calculate only Edit Distance\n\nWith onOnlyEditDistance, `dtl` calculates only the edit distance.\n\nIf you need only the edit distance, you may use this function,\nbecause the calculation of the edit distance is lighter than the calculation of LCS and SES.\n\nTo use this function, call it before calling the compose function.\n\n```c++\ntypedef char elem;\ntypedef std::string sequence;\nsequence A(\"abc\");\nsequence B(\"abd\");\ndtl::Diff\u003c elem, sequence \u003e d(A, B);\nd.onOnlyEditDistance();\nd.compose();\n```\n\n# Algorithm\n\nThe algorithm `dtl` uses is based on \"An O(NP) Sequence Comparison Algorithm\" described by Sun Wu, Udi Manber and Gene Myers.\n\nAn O(NP) Sequence Comparison Algorithm (hereafter, Wu's O(NP) Algorithm) is an efficient algorithm for comparing two sequences.\n\n## Computational complexity\n\nThe computational complexity of Wu's O(NP) Algorithm is O(N+PD) on average and, in the 
worst case, O(NP).\n\n## Comparison when difference between two sequences is very large\n\nCalculating LCS and SES efficiently in every case is a little difficult,\nbecause the calculation of LCS and SES needs massive amounts of memory when the difference between two sequences is very large.\n\nA program that uses this algorithm without taking this into account can run out of memory in the worst case.\n\n`dtl` avoids this problem by dividing each sequence into multiple subsequences and joining the differences of the subsequences at the end. (This feature is supported after version 0.04)\n\n## Implementations with various programming languages\n\nImplementations of Wu's O(NP) Algorithm in various programming languages are available below.\n\nhttps://github.com/cubicdaiya/onp\n\n# Examples\n\nThere are examples in [dtl/examples](https://github.com/cubicdaiya/dtl/tree/master/examples).\n`dtl` uses [SCons](http://scons.org/) for building examples and tests. To build and run the examples and tests, install SCons.\n\n## strdiff\n\n`strdiff` calculates a difference between two string sequences, but multibyte characters are not supported.\n\n```bash\n$ cd dtl/examples\n$ scons strdiff\n$ ./strdiff acbdeacbed acebdabbabed\neditDistance:6\nLCS:acbdabed\nSES\n  a\n  c\n+ e\n  b\n  d\n- e\n  a\n- c\n  b\n+ b\n+ a\n+ b\n  e\n  d\n$\n```\n\n## intdiff\n\n`intdiff` calculates a difference between two int array sequences.\n\n```bash\n$ cd dtl/examples\n$ scons intdiff\n$ ./intdiff # There are data in intdiff.cpp\n1 2 3 4 5 6 7 8 9 10 \n3 5 1 4 5 1 7 9 6 10 \neditDistance:8\nLCS: 3 4 5 7 9 10 \nSES\n- 1\n- 2\n  3\n+ 5\n+ 1\n  4\n  5\n- 6\n+ 1\n  7\n- 8\n  9\n+ 6\n  10\n$\n```\n\n## unidiff\n\n`unidiff` calculates a difference between two text file sequences\nand outputs the difference between the files in unified format.\n\n```bash\n$ cd dtl/examples\n$ scons unidiff\n$ cat a.txt\na\ne\nc\nz\nz\nd\ne\nf\na\nb\nc\nd\ne\nf\ng\nh\ni\n$ cat b.txt\na\nd\ne\nc\nf\ne\na\nb\nc\nd\ne\nf\ng\nh\ni\n$ ./unidiff a.txt 
b.txt\n--- a.txt       2008-08-26 07:03:28 +0900\n+++ b.txt       2008-08-26 03:02:42 +0900\n@@ -1,11 +1,9 @@\n a\n-e\n-c\n-z\n-z\n d\n e\n+c\n f\n+e\n a\n b\n c\n$\n```\n\n## unistrdiff\n\n`unistrdiff` calculates a difference between two string sequences\nand outputs the difference between the strings in unified format.\n\n```bash\n$ cd dtl/examples\n$ scons unistrdiff\n$ ./unistrdiff acbdeacbed acebdabbabed\neditDistance:6\nLCS:acbdabed\n@@ -1,10 +1,12 @@\n a\n c\n+e\n b\n d\n-e\n a\n-c\n b\n+b\n+a\n+b\n e\n d\n$\n```\n\n## strdiff3\n\n`strdiff3` merges three string sequences and outputs the merged sequence.\nWhen a conflict occurs, it outputs the string \"conflict.\".\n\n```bash\n$ cd dtl/examples\n$ scons strdiff3\n$ ./strdiff3 qabc abc abcdef\nresult:qabcdef\n$\n```\n\nThe output below is produced when a conflict occurs.\n\n```bash\n$ ./strdiff3 adc abc aec\nconflict.\n$\n```\n\n## intdiff3\n\n`intdiff3` merges three integer sequences (vectors) and outputs the merged sequence.\n\n```bash\n$ cd dtl/examples\n$ scons intdiff3\n$ ./intdiff3\na:1 2 3 4 5 6 7 3 9 10\nb:1 2 3 4 5 6 7 8 9 10\nc:1 2 3 9 5 6 7 8 9 10\ns:1 2 3 9 5 6 7 3 9 10\nintdiff3 OK\n$\n```\n\n## patch\n\n`patch` is a test program. Given two strings A and B,\n`patch` translates A to B with the Shortest Edit Script or a unified format difference.\n\n```bash\n$ cd dtl/examples\n$ scons patch\n$ ./patch abc abd\nbefore:abc\nafter :abd\npatch successed\nbefore:abc\nafter :abd\nunipatch successed\n$\n```\n\n## fpatch\n\n`fpatch` is a test program. 
Given two files A and B,\n`fpatch` translates A to B with the Shortest Edit Script or a unified format difference.\n\n```bash\n$ cd dtl/examples\n$ scons fpatch\n$ cat a.txt\na\ne\nc\nz\nz\nd\ne\nf\na\nb\nc\nd\ne\nf\ng\nh\ni\n$ cat b.txt\na\nd\ne\nc\nf\ne\na\nb\nc\nd\ne\nf\ng\nh\ni\n$ ./fpatch a.txt b.txt\nfpatch successed\nunipatch successed\n$\n```\n\n# Running tests\n\n`dtl` uses [googletest](https://github.com/google/googletest) and [SCons](http://www.scons.org/) for testing itself.\n\n## Building test programs\n\nTo build the test programs for `dtl`, run `scons` in the test directory.\n\n```bash\n$ scons\n```\n\n## Running test programs\n\nTo run all tests for `dtl`, run `scons check` in the test directory. (gtest must be compiled beforehand.)\n\n```bash\n$ scons check\n```\n\nTo run only some of the tests, execute `dtl_test` directly after running `scons`.\nThe following command is an example that runs only Strdifftest.\n\n```bash\n$ ./dtl_test --gtest_filter='Strdifftest.*'\n```\n\n`--gtest_filter` is an option of googletest. googletest has many useful options for testing software flexibly.\nTo learn about the other options of googletest, run `./dtl_test --help`.\n\n# Old commit histories\n\nPlease see [cubicdaiya/dtl-legacy](https://github.com/cubicdaiya/dtl-legacy).\n\n# License\n\nPlease read the file [COPYING](https://github.com/cubicdaiya/dtl/blob/master/COPYING).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcubicdaiya%2Fdtl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcubicdaiya%2Fdtl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcubicdaiya%2Fdtl/lists"}