{"id":13813688,"url":"https://github.com/Codepoints/unidump","last_synced_at":"2025-05-15T00:34:05.592Z","repository":{"id":57490005,"uuid":"85202414","full_name":"Codepoints/unidump","owner":"Codepoints","description":"hexdump(1) for Unicode data","archived":false,"fork":false,"pushed_at":"2024-09-03T19:28:08.000Z","size":37,"stargazers_count":38,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-04T00:51:58.716Z","etag":null,"topics":["cli","console","hexdump","python3","unicode","utility"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Codepoints.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-16T14:02:28.000Z","updated_at":"2024-09-03T19:28:12.000Z","dependencies_parsed_at":"2024-08-04T04:14:34.279Z","dependency_job_id":null,"html_url":"https://github.com/Codepoints/unidump","commit_stats":{"total_commits":28,"total_committers":2,"mean_commits":14.0,"dds":0.5,"last_synced_commit":"0626d6c04dc6781dd4cbbb65d585183531b9d366"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Codepoints%2Funidump","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Codepoints%2Funidump/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Codepoints%2Funidump/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Codepoints%2Funidump/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Codepoints","download_url":"https://codeload.github.com/Codepoints/unidump/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225319305,"owners_count":17455743,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","console","hexdump","python3","unicode","utility"],"created_at":"2024-08-04T04:01:25.860Z","updated_at":"2024-11-19T08:30:54.919Z","avatar_url":"https://github.com/Codepoints.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# `unidump`\n\n## `hexdump` for your Unicode data\n\n## Installation\n\nInstall via `pip`:\n\n    # you need Python 3 for unidump\n    pip3 install unidump\n\n## Usage\n\nWithout further ado, here is the usage message of `unidump`:\n\n```\n$ unidump --help\nusage: unidump [-h] [-n LENGTH] [-c ENC] [-e FORMAT] [-v] [FILE [FILE ...]]\n\n  A Unicode code point dump.\n\n  Think of it as  hexdump(1)  for Unicode.  The command analyses  the input and\n  then prints three columns: the raw byte index of the first code point in this\n  row, code points in their hex notation,  and finally the raw input characters\n  with control and whitespace replaced by a dot.\n\n  Invalid byte sequences are represented with an “X” and with the hex value en-\n  closed in question marks, e.g., “?F5?”.\n\n  You can pipe in  data from stdin,  select several files at once,  or even mix\n  all those input methods together.\n\npositional arguments:\n  FILE                  input files. Use `-' or keep empty for stdin.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -n LENGTH, --length LENGTH\n                        format output using this much input characters.\n                        Default is 16 characters.\n  -c ENC, --encoding ENC\n                        interpret input in this encoding. Default is utf-8.\n                        You can choose any encoding that Python supports, e.g.\n                        “latin-1”.\n  -e FORMAT, --format FORMAT\n                        specify a custom format in Python’s {} notation.\n                        Default is “{byte:\u003e7} {repr} {data} ”.\n  -v, --version         show program's version number and exit\n\nExamples:\n\n* Basic usage with stdin:\n\n      echo -n 'ABCDEFGHIJKLMNOP' | unidump -n 4\n            0    0041 0042 0043 0044    ABCD\n            4    0045 0046 0047 0048    EFGH\n            8    0049 004A 004B 004C    IJKL\n           12    004D 004E 004F 0050    MNOP\n\n* Dump the code points translated from another encoding:\n\n      unidump -c latin-1 some-legacy-file\n\n* Dump many files at the same time:\n\n      unidump foo-*.txt\n\n* Control characters and whitespace are safely rendered:\n\n      echo -n -e '\\x01' | unidump -n 1\n           0    0001    .\n\n* Finally learn what your favorite Emoji is composed of:\n\n      ( echo -n -e '\\xf0\\x9f\\xa7\\x9d\\xf0\\x9f\\x8f\\xbd\\xe2' ; \\\n        echo -n -e '\\x80\\x8d\\xe2\\x99\\x82\\xef\\xb8\\x8f' ; ) | \\\n      unidump -n 5\n           0    1F9DD 1F3FD 200D 2642 FE0F    .🏽.♂️\n\n  See  \u003chttp://emojipedia.org/man-elf-medium-skin-tone/\u003e for images.  The “elf”\n  emoji (the first character) is replaced with a dot here,  because the current\n  version of Python’s unicodedata doesn’t know of this character yet.\n\n* Use it like strings(1):\n\n      unidump -e '{data}' some-file.bin\n\n  This will replace  every unknown byte from the input file  with “X” and every\n  control and whitespace character with “.”.\n\n* Only print the code points of the input:\n\n      unidump -e '{repr}'$'\\n' -n 1 some-file.txt\n\n  This results in a stream of code points in hex notation,  each on a new line,\n  without byte counter  or rendering of actual data.  You can use this to count\n  the total amount of characters  (as opposed to raw bytes)  in a file,  if you\n  pipe it through `wc -l`.\n```\n\n## License\n\nMIT-licensed. See [license file](LICENSE.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCodepoints%2Funidump","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FCodepoints%2Funidump","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCodepoints%2Funidump/lists"}