{"id":13692544,"url":"https://github.com/juhakivekas/multidiff","last_synced_at":"2025-05-02T19:32:19.503Z","repository":{"id":45738722,"uuid":"113815834","full_name":"juhakivekas/multidiff","owner":"juhakivekas","description":"Binary data diffing for multiple objects or streams of data","archived":false,"fork":false,"pushed_at":"2023-02-12T08:06:13.000Z","size":92,"stargazers_count":304,"open_issues_count":5,"forks_count":27,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-03-16T23:42:30.409Z","etag":null,"topics":["diff","diffing","hexdump","packet-analyser","packet-analysis","packet-analyzer","visualizer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/juhakivekas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2017-12-11T05:23:53.000Z","updated_at":"2024-10-27T16:03:47.000Z","dependencies_parsed_at":"2024-01-31T10:04:59.241Z","dependency_job_id":"99f9db3d-eeb5-4775-90ed-7a13d1ad0471","html_url":"https://github.com/juhakivekas/multidiff","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juhakivekas%2Fmultidiff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juhakivekas%2Fmultidiff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juhakivekas%2Fmultidiff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juhakivekas%2Fmultidiff/manifests","owner_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/owners/juhakivekas","download_url":"https://codeload.github.com/juhakivekas/multidiff/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252095384,"owners_count":21693908,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diff","diffing","hexdump","packet-analyser","packet-analysis","packet-analyzer","visualizer"],"created_at":"2024-08-02T17:00:59.315Z","updated_at":"2025-05-02T19:32:19.086Z","avatar_url":"https://github.com/juhakivekas.png","language":"Python","funding_links":[],"categories":["Python","Binary Data"],"sub_categories":["Diff Enhancers"],"readme":"`M U L T I D I F F`\n===================\n\n\u003e Multidiff is a sensory augmentation apparatus\n\nIts purpose is to make machine friendly data easier to understand by humans that are looking at it.\nSpecifically multidiff helps in viewing the differences within a large set of objects by doing diffs between relevant objects and displaying them in a sensible manner.\nThis kind of visualization is handy when looking for patterns and structure in proprietary protocols or weird file formats.\nThe obvious use-cases are reverse engineering and binary data analysis.\n\n![multidiff -p 8000 -i json -o hexdump](./hexdump_stream_mode.png)\n\nAt the core of multidiff is the python difflib library and multidiff wraps it in data providing mechanisms and visualization code.\nThe visualization is the most important part of the project and everything else is just utilities to make it easier to feed data for the visualizer.\nAt this time the 
tool can do basic format parsing such as hex decoding, hexdumping, and handling data as utf8 strings, as well as read from files, stdin, and sockets.\nAny preprocessing such as cropping, indenting, decompression, etc. will have to be done by the user before the objects are provided to multidiff.\n\nCommand-line interface\n----------------------\nThe command line interface is the easiest way to use multidiff. It supports a few common use-cases and is installed by the setup script.\n\n\tpython3 setup.py install\n\tmultidiff -h\n\n### --mode\nThis selects the diffing strategy; currently `sequence` and `baseline` are supported.\nSequence mode diffs every object with the object added just before it while baseline mode always diffs the most recent object with the first object.\n\n### --informat \u0026 --outformat\nThe `informat` argument controls what kind of transformations should be done to the data before it gets diffed. `outformat` controls the view of the output data.\n`informat` should mostly be selected based on what is the easiest way to provide data to multidiff while `outformat` should be selected based on how the content of the data is most pleasantly viewed.\n\n### --port\nThere is an embedded tcp socket server that will listen to any packets coming to the specified port and print the diffs as more objects are sent to it.\nThe server supports a json mode in which objects are passed as json objects that may include metadata. This is useful if the client has done some analysis on the data and one would like to show those results in the view stream. 
The schema is pretty simple:\n\n\t{\n\t\t\"data\":\"[data encoded as base64]\",\n\t\t\"info\":\"some useful note\"\n\t}\n\nExample object providers are in the `examples` directory.\nThese are specific use-cases where it has been helpful to have a stream of diffs visible when inspecting traffic.\n\nExamples\n--------\n\nCheck how much your shell history repeats:\n\n\thistory | multidiff -s -o utf8\n\t\nDiff a bunch of files and scroll through the results:\n\n\tmultidiff interesting_file.bin folder_with_similar_files/ | less -r\n\nStart a multidiff server, then send objects to it:\n\n\tmultidiff -p 8000\n\techo \"interesting\" | nc 127.0.0.1 8000\n\techo \"intersectional\" | nc 127.0.0.1 8000\n\nContributions\n-------------\nPull requests are welcome, and please raise an issue if something is broken or if you can think of a cool feature. I can be reached as \"stilla\" on Protonmail.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuhakivekas%2Fmultidiff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuhakivekas%2Fmultidiff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuhakivekas%2Fmultidiff/lists"}