{"id":22646034,"url":"https://github.com/koknat/dif","last_synced_at":"2026-04-02T01:03:55.452Z","repository":{"id":56505059,"uuid":"278921489","full_name":"koknat/dif","owner":"koknat","description":"'dif' is a Linux preprocessing front end to gvimdiff/meld/kompare","archived":false,"fork":false,"pushed_at":"2022-11-04T17:23:15.000Z","size":3643,"stargazers_count":26,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-25T20:21:24.107Z","etag":null,"topics":["diff","gvim","json","kdiff3","meld","pdf","perl","text-processing","tkdiff","xls","yaml"],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/koknat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-07-11T18:40:34.000Z","updated_at":"2024-08-13T04:07:07.000Z","dependencies_parsed_at":"2023-01-21T01:17:02.425Z","dependency_job_id":null,"html_url":"https://github.com/koknat/dif","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koknat%2Fdif","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koknat%2Fdif/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koknat%2Fdif/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/koknat%2Fdif/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/koknat","download_url":"https://codeload.github.com/koknat/dif/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"githu
b","repositories_count":248501700,"owners_count":21114676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diff","gvim","json","kdiff3","meld","pdf","perl","text-processing","tkdiff","xls","yaml"],"created_at":"2024-12-09T06:08:34.240Z","updated_at":"2026-04-02T01:03:50.405Z","avatar_url":"https://github.com/koknat.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"## dif - a preprocessing front end to meld / gvimdiff / kdiff3 / tkdiff / kompare\n\n'dif' compares files after it preprocesses them.  \nPreprocessing options include:\n* remove comments, whitespace, timestamps\n* search/replace\n* keep/ignore certain lines\n* json/yaml reformatting\n* parse values from xls spreadsheets\n* *many other options (see far below)*  \n  \n  \n![\"Screenshot of  meld  vs  dif with option -comments\"](screenshots/dif_before_after.png)\n  \n  \n  \n  \n'dif' can also be used to compare directories recursively, after optionally preprocessing each file\n  \n  \n![\"Screenshot of  dif comparing two directories\"](screenshots/dif_before_after_directory_meld.png)\n  \n  \n  \n  \n## Overview\nThe graphical compare tools **meld**, **gvimdiff**, **kdiff3**, **tkdiff**, and **kompare** are used to compare text files on Linux\n\nIn many cases, it is difficult and time-consuming to visually compare large files because of the large number of differences\n\nFor example:\n* different versions of code may differ only in comments or whitespace\n* log files are often many MB of text, with some \"don't care\" information such as timestamps or temporary filenames\n* 
json or yaml files may have ordering differences due to the library used to write the file\n* xls spreadsheets cannot be compared easily because of the file format\n\n\n## Purpose\n\n'dif' preprocesses input text files with a wide variety of options\n\nAfterwards, it runs the Linux tools meld, gvimdiff, kdiff3, tkdiff, or kompare on these intermediate files\n\nThis allows you to concentrate on the important differences, and ignore the rest\n\n\n## Solutions\n\n#### Problem: differences in whitespace or comments or case cause mismatches\nSolution:  Use options -white or -nowhite or -comments or -case\n\n#### Problem: files both need to be filtered using regexes, to strip out certain characters or sequences\nSolution 1:  Use -grep \u003cregex\u003e or -ignore \u003cregex\u003e to filter in or out\n\nSolution 2:  Use -search \u003cregex\u003e -replace \u003cregex\u003e to supply one instance of substitution and replacement\n\nSolution 3:  Use -replaceTable \u003cfile\u003e to supply a file with many substitution/replacement regexes\n       \nSolution 4:  Use -replaceDates to remove dates and timestamps\n       \n#### Problem: need to view your changes to a file on Perforce or SVN or GIT\nSolution:  'dif file' will show the differences between the head revision and the local file\n\n#### Problem: need to recursively compare directories\nSolution 1:  'dif dir1 dir2' will iteratively compare pairs of files\n\nSolution 2:  'dif dir1 dir2 -report' will open a GUI to compare the directories\n\nAny preprocessing option (-comments, -white, -sort, -grep, etc) can be used when comparing directories\n\n\n## Usage examples\n* dif file1 file2\n* dif file1 file2 -white -case\n* dif file1 file2 file3 -comments\n* dif file1 file2 -search 'foo' -replace 'bar'\n* dif file1.xls file2.xls\n* dif dir1 dir2 -report\n\n\n## Options\n    Filtering options:    \n       -comments          Remove any comments such as // or # or single-line */ /*.  
Also removes trailing whitespace\n\n                          To remove comments in other languages, use the search/replace options:\n                          For example, to replace comments (marked with ';') in assembly language:\n                              -search '\\s*(;.*)?$' -replace ''\n\n       -white             Remove blank lines and leading/trailing whitespace\n                          Condense multiple whitespace to a single space\n                          Remove any non-printable characters\n       \n       -noWhite           Remove all whitespace and non-printable characters\n\n       -case              Convert files to lowercase before comparing\n       \n       -grep 'regex'      Only display lines which match the user-specified Perl regex\n                          Multiple regexs can be specified, for example:  -grep '(regexA|regexB)'\n                          To display lines above/below matches, see the help text for option -externalPreprocessScript\n\n       -ignore 'regex'    Ignore any lines which match the user-specified regex\n                          This is the opposite of the -grep function\n\n       -search 'regex'    On each line, do a global regex search and replace\n       -replace 'regex'   \n                          For example, to replace temporary filenames such as '/tmp/foo123456/bar.log' with '/tmp/file':\n                              -search '/tmp/\\S+' -replace '/tmp/file'\n\n                          Since the search/replace terms are interpreted as regex,\n                          remember to escape any parentheses\n                              Exception:  if you are using regex grouping, \n                                          do not escape the parentheses\n                              For example:\n                                  -search '(A|B|C)'  -replace 'D'\n\n                          Since the replace term is run through 'eval', make sure to escape any $ dollar signs\n                          
Make sure to use 'single-quotes' instead of double-quotes\n                          For example, to convert all spaces to newlines, use:\n                              -search '\\s+'  -replace '\\n'\n\n                          If case-insensitive search is needed, also use option -case\n\n       -replaceTable file     Specify a two-column file which will be used for search/replace\n                              The delimiter is any amount of spaces\n                              Terms in the file are treated as regular expressions\n                              The replace term is run through eval\n\n       -replaceDates      Remove dates and times, for example:\n                               17:36:34\n                               Monday July 20 17:36:34 PDT 2020\n                               Dec  3  2019\n                               Jul 10 17:42\n                               1970.01.01\n                               1/1/1970\n\n       -fields N          Compare only field(s) N\n                          Multiple fields may be given, separated by commas (-fields N,M)\n                          Field numbers start at 0\n                          Fields in the input files are assumed to be separated by spaces,\n                              unless the filename ends with .csv (separated by commas)\n                          Example:  -fields 2\n                          Example:  -fields 0,2      (fields 0 and 2)\n                          Example:  -fields -1       (last field)\n                          Example:  -fields 2+       (field 2 and above)\n                          Example:  -fields not2+    (ignore fields 2 and above)\n                          Example:  -fields not0,5+  (ignore fields 0, 5, and above)\n\n       -fieldSeparator regex    Only needed if default field separators above are not sufficient\n                                Example:  -fieldSeparator ':'\n                                Example:  -fieldSeparator '[,=]' \n       \n    
   -fieldJustify      Make all fields the same width, right-justified\n\n       -split             Splits each line on whitespace\n       \n       -splitChar 'char'  Splits each line on 'char'\n                          For example:  -splitChar ',' to split on comma\n\n       -splitWords        Splits on whitespace.  Each word will be on its own line.\n                          Identical to -splitChar '\\s+'\n\n       -trim              Trims each line to 105 characters, discarding the overflow\n                          Useful when lines are very long, and the important information is near the beginning\n       \n       -trimChars N       Trims each line to N characters, instead of 105\n       \n       -head              Compare only the first 10% of the file,\n                            with a minimum of 50, and a maximum of 10000 lines\n       \n       -headLines N       Compare only the first N lines\n                          If a negative number is used, ignore the first -N lines\n\n       -tail              Compare only the last 10% of the file,\n                            with a minimum of 50, and a maximum of 10000 lines\n       \n       -tailLines N       Compare only the last N lines\n                          If a negative number is used, ignore the last -N lines\n       \n       -yaml              Compare two yaml files, sorting the keys\n       \n       -json              Compare two json files, sorting the keys\n\n       -removeDictKeys 'regex'\n                          For use with -yaml or -json\n                          Removes all dictionary keys matching the regex\n\n       -flatten           For use with -yaml or -json\n                          Flatten nested dictionary and array structures\n  \n       -basenames         Convert path/file to file\n                          This can be useful when comparing log files which contain temporary directories\n\n    
   -extensions        Convert path/file.extension to .extension\n       \n       -removeExtensions  Convert path/file.extension to path/file\n\n       -lineWordSort      Sort the words in each line (space delimited)\n       \n       -round 'string'    Round all numbers according to the sprintf string\n                          For example -round '%0.2f'\n       \n       -dos2unix          Run all files through dos2unix\n\n       -lsl               Useful when comparing previously captured output of 'ls -l'\n                          Compares only names and file sizes\n\n       -tartv             Compare tarfiles using tar -tv, and compare the names and file sizes\n                          If file sizes are not desired in the comparison (names only), also use -fields 1\n          \n       -perlEval          The input file is a perl hashref\n                          Print the keys in alphabetical order\n\n       -perlDump          Useful when comparing previously captured output of Data::Dumper\n                          filter out all SCALAR/HASH/ARRAY/REF/GLOB/CODE addresses from output of Dumpvalue,\n                          since they change on every execution\n                              'SPECS' =\u003e HASH(0x9880110)    becomes    'SPECS' =\u003e HASH()\n                          Also works on Python object dumps:\n                              \u003c_sre.SRE_Pattern object at 0x216e600\u003e\n\n      \n    Filtering options to target a section of the file:    \n\n       -start 'regex'     Start comparing file when line matches 'regex'\n\n                          If multiple lines matching regexes should be required to start capturing,\n                          Separate the regexes with ^^\n                          For example, to start capture after line matching 'abc' and then line matching 'def':\n                          -start 'abc^^def'\n\n                          By default, only the first occurrence of the start/stop sequence will be 
captured,\n                          if multiple occurrences exist within the file\n\n       -stop 'regex'      Stop comparing file when line matches regex\n                          The last matching line will be captured, unless specified otherwise\n\n       -startIgnoreFirstLine    This modifies the 'start' operation, so that\n                                the first matching line will not be captured\n       \n       -stopIgnoreLastLine      This modifies the 'stop' operation, so that\n                                the last matching line will not be captured\n       \n       -startMultiple     This modifies the 'start' operation, so that\n                          multiple occurrences of the same start/stop sequence may be captured\n\n       -start1 -stop1 -start2 -stop2\n                          Similar to -start and -stop\n                          The '1' and '2' refer to the first and second files\n                          Enables comparing different sections within the same file,\n                          or different sections within different files\n                          \n                          For example, to compare functions 'add' and 'subtract' within a single file:\n                              dif a.pm -start1 'sub add' -stop1 '^}' -start2 'sub subtract' -stop2 '^}'\n\n       -function 'function_name'\n                          Compare the same Python def / Perl sub / TCL proc / JavaScript function from two source files\n                          Internally, this leverages the -start -stop functionality\n                          This feature will also work for some C source files\n\n       -functionSort\n                          Useful when Python/Perl/TCL/JavaScript functions have been moved within a file\n                          This option preprocesses each file, so that the function definitions\n                          appear in alphabetical order\n                          This feature will also work for some C source files\n\n       -language 
\u003clang\u003e   For use with -function and -functionSort\n                          The language is automatically determined by inspecting the file extension and shebang\n                          Use this option if those clues are not present\n                          Languages are specified as extensions such as: js pl py tcl\n\n\n    Preprocessing options (before filtering):\n       -externalPreprocessScript \u003cscript\u003e          \n                          Run each input file through your custom preprocessing script\n                          It must take input from STDIN and send output to STDOUT, similar to unix 'sort'\n                          \n                          Trivial example:\n                              -externalPreprocessScript 'sort'\n\n                          Example using grep to show 2 lines above and below lines matching the regex 'foo'\n                              -ext 'grep -C 2 foo'\n                          \n                          Examples for comparing binary files:\n                              -ext '/usr/bin/xxd'\n                              -ext '/usr/bin/xxd -c1 -p'\n                              -ext '/usr/bin/hexdump -c'\n                          However, a standalone diff tool may be preferable for comparing binary files\n                          For example:\n                              'qdiff' by Johannes Overmann and Tong Sun\n                              'colorbindiff' by Jerome Lelasseux \n                              'VBinDiff' by Christopher J. 
Madsen\n                              'dhex'\n                         \n       -bin               Compare binary files\n                          This is a shortcut for running -ext '/usr/bin/xxd'\n       \n       -strings           Run equivalent of Linux 'strings' command on each input file to remove binary characters\n\n       -bcpp              Run each cpp input file through bcpp linting tool with options:  /home/ckoknat/cs2/linux/bcpp -s -bcl -tbcl -ylcnc\n\n       -perltidy          Run each Perl input file through perltidy linting tool with options:  /home/utils/perl5/perlbrew/perls/5.26.2-060/bin/perltidy -l=110 -ce\n\n\n    Postprocessing options (after filtering):\n       -sort              Run Linux 'sort' on each input file\n\n       -uniq              Run Linux 'uniq' on each input file to eliminate duplicated adjacent lines\n                          Use with -sort to eliminate all duplicates\n       \n       -fold              Run 'fold' on each input file with default of 105 characters per column\n                          Useful for comparing long lines, so that scrolling right is not needed within the GUI\n\n       -foldChars N       Run 'fold' on each input file with N characters per column\n\n       -ppOnly            Stop after creating preprocessed files\n\n\n    Viewing options:\n       -quiet             Do not print to screen\n\n       -verbose           Print names and file sizes of preprocessed temporary files, before comparing\n\n       -gui cmd           Instead of using kompare to graphically compare the files, use a different tool\n                          This supports any tool which has command line usage similar to gvimdiff\n                          i.e. 
'gvimdiff file1 file2'.\n                          This has been tested on meld, gvimdiff, kdiff3, tkdiff, and kompare, and likely works\n                          with diffmerge, diffuse, kdiff, wdiff, xxdiff, colordiff, beyond compare, etc\n                          Examples:\n\n                          -gui gvimdiff\n                              Uses gvimdiff as a GUI\n                          \n                          -gui kdiff3\n                              Uses kdiff3 as a GUI\n\n                          -gui tkdiff\n                              Uses tkdiff as a GUI\n\n                          -gui kompare\n                              Uses kompare as a GUI\n\n                          -gui meld\n                              Uses meld as a GUI\n                              Note that meld does not display line numbers by default on some OS\n                                  Meld / Preferences / Editor / Display / Show line numbers\n                                  If the box is greyed out, install python-gtksourceview2\n                          \n                          -gui opendiff\n                              Use the macOS FileMerge tool (requires Xcode)\n\n                          -gui none\n                              This is useful when comparing from a script\n                              in an automated process such as regression testing\n                              After running dif, the return status will be:\n                                  0 = files are equal\n                                  1 = files are different\n                                  dif a.yml b.yml -gui none -quiet ; echo $?\n                           \n                          -gui diff\n                              Prints diff to stdout instead of to a GUI\n\n                          -gui 'diff -C 1' | grep -v '^[*-]'\n                              Use diff, with the options:\n                                  one line of Context above and below 
the diff\n                                  remove the line numbers of the diffs\n\n       -diff              Shortcut for '-gui diff'\n\n\n    Options to compare a large set of files:\n       \u003cdirA\u003e \u003cdirB\u003e           If dif is run against two directories,\n                               will open GUI for each pair of mismatching files\n                               For example:\n                                   dif dirA dirB\n                          \n                               Any of the preprocessing options may be used\n     \n      -report                  When used with two directories  or  -dir2 \u003cdir\u003e  or  -gold\n                               Instead of opening GUIs for each file pair,\n                               generate report of mismatching or missing files\n                               For example:\n                                   dif dirA dirB -report\n                               Any of the preprocessing options may be used\n\n                               It can also be used to print a simple report of\n                               file sizes, number of lines, and md5sums (not a comparison)\n                               For example:\n                                   dif * -report\n                                       or\n                                   dif */file -report\n                                       or\n                                   dif dir -report\n\n      -filePairs               Similar to -report, but only displays the files which are found in both directories, and mismatch\n\n      -filePairsWithOptions    Similar to -filePairs, but also lists the dif command and options\n         \n      -intersection            When used with -report, only list files which exist in both directories\n\n      -fast                    When used with -report, use only the file size to compare, instead of md5sum\n                               This is much faster, but could miss cases 
where bits are flipped\n\n      -includeFiles \u003cregex\u003e  \n      -excludeFiles \u003cregex\u003e    Both options are for use with two directories  or  -dir2 \u003cdir\u003e  or  -gold\n                               For example:\n                                   dif -includeFiles '*log' dirA dirB\n                               Will open GUI for each pair of mismatching files\n\n                               When used with -dir2 or -gold,\n                               finds files in the current directory matching the Perl regex\n                               For example:\n                                   dif -includeFiles '*log' -dir2 ../old\n\n                               Any of the preprocessing options may be used\n\n       -dir2 \u003cdir\u003e             For each input file specified, run 'dif'\n                                   on the file in the current directory\n                                   against the file in the specified directory\n                               For example:\n                                   cd to the directory containing the files\n                                   dif file1 file2 file3 -dir2 ../old\n                               will run:\n                                   dif file1 ../old/file1\n                                   dif file2 ../old/file2\n                                   dif file3 ../old/file3\n                               Any of the preprocessing options may be used\n\n       -gold                   When used with one filename (file or file.extension),\n                               assumes that 1st file will be (file.golden or file.golden.extension)\n                             \n                               For example:\n                                   dif file1 -gold\n                               will run:\n                                   dif file1.golden file1\n                    \n                               For example:\n                                   dif 
file1.csv -gold\n                               will run:\n                                   dif file1.csv.golden file1.csv\n                    \n                               When used with multiple filenames\n                               it runs dif multiple times, once for each of the pairs\n                               This option is useful when doing regressions against golden files\n                             \n                               For example:\n                                   dif file1 file2.csv -gold\n                               will run:\n                                   dif file1.golden file1\n                                   dif file2.csv.golden file2.csv\n                             \n                               Any of the preprocessing options may be used\n       \n      -tree \u003cdir1\u003e \u003cdir2\u003e      Special case.  Run unix 'tree' on each of the directories.  Does not preprocess files\n    \n    Other options:\n       -stdin             Parse input from stdin and send output to stdout\n                          For example:\n                              grep foo bar | dif -stdin \u003coptions\u003e | script2 | script3\n\n       -stdout            Cat all preprocessed files to stdout\n                          In this use case, dif could be called on only one file\n                          This allows dif to be part of a pipeline\n                          For example:\n                              dif file -stdout \u003coptions\u003e | another_script\n                          If -stdin is given, then -stdout is assumed\n\n       -out \u003cfile\u003e        Similar to -stdout, but send output to file\n                          This can be useful if dif is used as a preprocessing engine\n       \n       -filename          Intended for use with option -stdout or -out\n                          At the beginning of each line, prepend the filename\n                          This is similar to the grep 
--with-filename option\n                          Useful when searching through a large set of files\n       \n       -keeptmp           Default behavior is to remove the tmp directory containing preprocessed files\n                          This option keeps it\n\n\n    Other features:\n        Automatically uncompresses files from these formats into intermediate files:\n            .gz\n            .bz2\n            .xz\n            .Z\n            .zip  (single files only)\n        \n        Compares values inside .xls|.xlsm|.xlsx files\n            requires the Perl Spreadsheet::BasicRead, Spreadsheet::ParseExcel, and Spreadsheet::XLSX modules to be installed\n        \n        Compares values inside .ods OpenOffice spreadsheet files  \n            requires the Perl Spreadsheet::Read and Spreadsheet::ParseODS modules to be installed\n        \n        Attempts to compare text inside .pdf files\n            requires the Perl CAM::PDF module to be installed\n\n  \n  Default compare tool:\n        The default compare GUI is meld\n        To change this, create the text file ~/.dif.defaults with one of these content lines:\n            gui: gvimdiff\n            gui: tkdiff\n            gui: kdiff3\n            gui: kompare\n            gui: meld\n        You may also want to change the default (uncompressed) file size limit, before gvimdiff takes over from kompare/meld\n        The default is 2000000 bytes.  To change it, add a line such as:\n            meldSizeLimit: 1000000\n\n\n    For convenience, link to this code from ~/bin\n        ln -s /path/dif ~/bin/dif\n\n    \n\n    Perforce or SVN version control support:\n            Perforce uses '#' to signify version numbers.  
dif borrows the same notation for SVN\n    Perforce or SVN examples:\n            dif file              compares head version with local version (shortcut)\n            dif file#h            compares head version with local version (shortcut)\n            dif file file#head    compares head version with local version\n            dif file#head #-      compares head version with previous version (shortcut)\n            dif file#7            compares version 7 with local version (shortcut)\n            dif file#6 file#7     compares version 6 with version 7\n            dif file#6 file#+     compares version 6 with version 7\n            dif file#6 file#-     compares version 6 with version 5\n            dif file#6..#9        compares version 6 with version 7, and then compares 7 with 8, then 8 with 9\n    Git example:\n            dif file              compares committed version to local version\n\n\n\n## Installation\n\nNo installation is needed, just copy the 'dif' executable\n\nTo see usage:\n* cd ..  (back into dif main directory)\n* ./dif\n\nTo run dif:\n* ./dif file1 file2 \u003coptions\u003e\n       \nTo run the tests (optional):\n* download dif from GitHub and uncompress it\n* cd dif/test\n* ./dif.t\n\nThis will run dif on the example* unit tests\nIt should return with 'all tests passed'\n\nPerl versions 5.6.1 through 5.30 have been tested\n\n\nFor convenience, copy the dif executable to your ~/bin directory, or create an alias:\n\n    alias dif /path/dif/dif\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkoknat%2Fdif","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkoknat%2Fdif","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkoknat%2Fdif/lists"}