{"id":30190282,"url":"https://github.com/sampersand/mviz","last_synced_at":"2025-08-12T19:40:24.635Z","repository":{"id":284844672,"uuid":"956239175","full_name":"sampersand/mviz","owner":"sampersand","description":"A script to print out invisible charcaters in strings or files","archived":false,"fork":false,"pushed_at":"2025-07-22T22:33:26.000Z","size":5738,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-23T00:18:27.886Z","etag":null,"topics":["dump","inspect","ruby","viz"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sampersand.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-27T23:30:32.000Z","updated_at":"2025-07-22T22:33:29.000Z","dependencies_parsed_at":"2025-06-19T20:30:37.297Z","dependency_job_id":"fc322098-e645-4f3a-a6c0-972bd1eac5c2","html_url":"https://github.com/sampersand/mviz","commit_stats":null,"previous_names":["sampersand/p","sampersand/inspect","sampersand/mviz"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sampersand/mviz","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sampersand%2Fmviz","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sampersand%2Fmviz/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sampersand%2Fmviz/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sampersand%2Fmviz/manifes
ts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sampersand","download_url":"https://codeload.github.com/sampersand/mviz/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sampersand%2Fmviz/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270126123,"owners_count":24531763,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dump","inspect","ruby","viz"],"created_at":"2025-08-12T19:40:20.869Z","updated_at":"2025-08-12T19:40:24.595Z","avatar_url":"https://github.com/sampersand.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# The `mviz` command\nA modern alternative to `vis`: a way to visualize invisible and invalid bytes in different encodings.\n\n`mviz` is essentially a replacement for interactive use of `echo` or `cat`: Instead of `echo \"$variable\"` or `cat file.txt`, which (on most terminals) hide invisible characters (like `\\x01`), you instead run `mviz \"$variable\"` or `mviz -f file.txt`.\n\n# Examples\n`mviz` is designed with sensible defaults in mind; its default behaviour is what you want most of the time, but it can easily (and sensibly) be changed with options.\n\nHm... 
let's try inspecting a variable\n![basic usages of mviz](imgs/intro1.png)\n\nWhat else can `mviz` do?\n![basic usages of mviz](imgs/intro2.png)\n\nIt's even usable with pipes!\n![basic usages of mviz](imgs/intro3.png)\n\n\u003c!--\n```sh\n$ mviz \"$variable\"        # See the contents of a shell variable\n$ mviz -d \"$variable\"     # Delete weird characters from the variable\n$ mviz -f file.txt        # Print file.txt, escaping \"weird\" characters\n$ mviz -fw file.txt       # Like the previous line, but newlines and tabs aren't escaped.\n$ some_command | mviz     # Visualize weird characters of `some_command`\n$ some_command | mviz -l  # Like the previous one, but don't escape newlines.\n$ some_command | mviz -b  # Interpret input data as binary, not UTF-8 (the default)\n```\ngreeting=$'\\rHello\\x01, world 🌍! '\nsome_command () { print $'Not\\u00A0much.\\r\\ncool!' }\nPROMPT_EOL_MARK=\n\necho \"$greeting\" # Everything _seems_ to be in order...\nmviz \"$greeting\"    # But it's not!\n\nmviz -r \"$greeting\" # Replace with �\nmviz -C \"$greeting\" # Use control pictures!\nmviz -m \"$greeting\" # Escape UTF-8!\n\nsome_command | mviz      # Pipe stuff in!\nsome_command | mviz -l   # Don't escape newlines!\nsome_command | mviz -dD  # Delete invalid characters!\nsome_command | mviz -b   # Interpret the input as binary data!\nsome_command | mviz -abx # Show the hex of _all_ bytes!\n--\u003e\n\nIt's also quite useful when you're learning how shells work:\n```bash\n# See what files are expanded by a glob\n$ mviz [A-Z]*\n    1: LICENSE\n    2: README.md\n# See how `$variable` word splits\n$ variable='hello    world,   :-)'\n$ mviz $variable\n    1: hello\n    2: world,\n    3: :-)\n# See how `$IFS` affects it\n$ IFS=o\n$ mviz $variable\n    1: hell\n    2:     w\n    3: rld,   :-)\n```\n\nTry `mviz -h` for short usage, and `mviz --help` for the longer one.\n\n# Why not use tool X (`xxd`, `hexdump`, `vis`, `od`, etc)?\nThe biggest difference between `mviz` and other tools 
is that `mviz` is intended for looking at mostly-normal text by default, and optimizes for that. It doesn't change the output _unless_ weird characters exist. For example:\n\n![comparisons of mviz to xxd, hexdump -C, od -c, vis, and cat -v](imgs/comparisons.png)\n\u003c!-- ```bash\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | mviz\nhello\\x04world, how are you? \\xC3👍\\n\n\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | xxd\n00000000: 6865 6c6c 6f04 776f 726c 642c 2068 6f77  hello.world, how\n00000010: 2061 7265 2079 6f75 3f20 c3f0 9f91 8d0a   are you? ......\n\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | hexdump -C\n00000000  68 65 6c 6c 6f 04 77 6f  72 6c 64 2c 20 68 6f 77  |hello.world, how|\n00000010  20 61 72 65 20 79 6f 75  3f 20 c3 f0 9f 91 8d 0a  | are you? ......|\n00000020\n\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | od -c\n0000000    h   e   l   l   o 004   w   o   r   l   d   ,       h   o   w\n0000020        a   r   e       y   o   u   ?     303  👍  **  **  **  \\n\n0000040\n\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | vis\nhello\\^Dworld, how are you? \\M-C\\M-p\\M^_\\M^Q\\M^M\n\n% printf 'hello\\x04world, how are you? \\xC3👍\\n' | cat -v\nhello^Dworld, how are you? ??M-^_M-^QM-^M\n```\n --\u003e\nIn addition, `mviz` by default adds a \"standout marker\" to escaped characters (by default, it inverts the foreground and background colours), so they're more easily distinguished at a glance.\n\n# How it works\nThe way `mviz` works at a high-level is pretty easy: Every character in an input is checked against the list of patterns, and the first one that matches is used. If no patterns match, the character is checked against the \"default pattern,\" and if that doesn't match, the character is printed verbatim.\n\nTo simplify the most common use-case of `mviz`, where only the \"escaping mechanism\" (called an \"Action\"; see below) is changed, a lot of short-hand flags (such as `-x`, `-d`, etc.) 
are provided to just change the default action.\n\n`mviz` is broken into three configurable parts: The encoding of the input data, the \"patterns\" to match against the input data, and the action to take when a pattern matches. They're described in more detail below:\n\n## Encodings\nThe encoding (which can be specified via `--encoding`; encoding names are case-insensitive) is used to determine which input bytes are valid, and which are invalid.\n\nValid bytes (which differ between encodings, see below) are then matched against patterns as described in `How it works`. However, \"invalid bytes\" (for example `\\xC3` in UTF-8) are handled specially:\n\nBy default, these bytes have their hex values printed out (but this can be changed, e.g. with `--invalid-action=delete`), along with a different \"standout\" pattern than normal escapes (by default, a red background). If any invalid bytes are encountered during an execution, and `--malformed-error` is set (which it is by default), the program will exit with a non-zero exit code at the end.\n\nYou can get a list of all the supported encodings via `--list-encodings`. Non-ASCII-compliant encodings, such as `UTF-16`, aren't supported (as they drastically complicate character matching logic).\n\nThe \"binary\" encoding (which can be specified either with `--encoding=binary` or the `-b` / `--binary` / `--bytes` shorthands) is unique in that it doesn't have any \"invalid bytes.\"\n\nUnless explicitly specified (either via `--encoding`, or one of the shorthands like `-b`), the encoding normally defaults to `UTF-8`. However, if the environment variable `POSIXLY_CORRECT` is set, it defaults to the \"locale\" encoding, which relies on the environment variables `LC_ALL`, `LC_CTYPE`, and `LANG` (in that order) to specify it.\n\n## Patterns\nPatterns are sets of characters (internally using regular expression character classes—`[a-z]`) that are used to match against input characters. 
In addition to specifying \"normal\" escape sequences (eg `\\n` for newlines, `\\xHH` for hex escapes, `\\u{HHHH}` for Unicode codepoints, `\\w` for \"word characters\", etc), patterns also support the following custom escape sequences:\n\n- `\\A` matches all characters\n- `\\N` matches no characters\n- `\\m` matches multibyte characters (and is only useful if the input encoding is multibyte, like UTF-8)\n- `\\M` matches single-byte characters (ie anything `\\m` doesn't match)\n- `\\@` matches the \"default pattern\" (see below)\n\nPatterns are normally used when specifying actions directly (eg `mviz --delete=^a-z` will only output lower-case letters).\n\n### Default Pattern\nThe default pattern is the pattern that is checked _after_ all \"user-specified patterns.\" If it matches, the \"default action\" (which is controlled by shorthands like `-x`, `-o`, etc.) takes place.\n\nNormally, the default pattern is just `\\x00-\\x1F\\x7F`—that is, all of the control bytes in ASCII. However, there are a few ways it can be changed:\n1. It can be explicitly set via `--default-pattern=PATTERN`, at which point that's exactly what'll be used.\n2. If the encoding is `BINARY`, the bytes `\\x80-\\xFF` are also added, as the binary encoding considers all bytes to be valid.\n3. If the encoding is `UTF-8` (the default, unless `POSIXLY_CORRECT` is set), then the codepoints `\\u0080-\\u009F` are added.\n4. If the default action is unchanged, and visual effects aren't being used, then backslash (`\\`) is added to the default pattern. This way, it'll be escaped when \"standout features\" aren't in use.\n\n## Actions\nActions are how characters are escaped. There's a lot of them, and they can be used either as arguments to flags (eg `--invalid-action=octal`) or specified explicitly (eg via `--highlight=a-z`):\n\n| Name | Description |\n|------|-------------|\n| `print`     | Print characters, unchanged, without escaping them. 
Unlike the other actions, using `print` will not mark values as \"escaped\" for the purposes of `--check-escapes` |\n| `delete`    | Delete characters from the output by not printing anything. Deleted characters are considered \"escaped\" for the purposes of `--check-escapes` |\n| `dot`       | Replaces characters by simply printing a single period (`.`). *Note*: Multibyte characters are still represented by a single period. |\n| `replace`   | Identical to `dot`, except that the replacement character, � (`\\uFFFD`), is printed instead of a period. |\n| `hex`       | Replaces characters with their hex value (`\\xHH`). Multibyte characters will have each of their bytes printed. |\n| `octal`     | Like `hex`, except octal escapes (`\\###`) are used instead. The output is always padded to three digits (so NUL is `\\000`, not `\\0`) |\n| `picture`   | Print out \"[control pictures](https://en.wikipedia.org/wiki/Control_Pictures)\" (`U+240x`-`U+242x`) corresponding to the character. *Note*: Only `\\x00`–`\\x20` and `\\x7F` have control pictures assigned to them, and using `picture` with any other characters will yield a warning (and fall back to `hex`). |\n| `codepoint` | Replaces characters with their Unicode codepoints (`\\u{HHHH}`). This only works if the encoding is UTF-8. *Note:* This cannot be used with `--invalid-action`, as invalid bytes don't have a codepoint. |\n| `highlight` | Prints the character unchanged, but considers it \"escaped\". (Thus, visual effects are added to it like any other escape, and `--check-escapes` considers it an escaped character.) |\n| `c-escape`  | Print out C-style escapes for the following characters: `0x07` (`\\a`), `0x08` (`\\b`), `0x09` (`\\t`), `0x0a` (`\\n`), `0x0b` (`\\v`), `0x0c` (`\\f`), `0x0d` (`\\r`), `0x1b` (`\\e`), `0x5c` (`\\\\`). *Note*: Using `c-escape` with any other character will yield a warning (and fall back to `hex`). 
|\n| `default`   | Use the default pattern: All valid `c-escape` characters have their escape printed (with the sole exception that a backslash is printed as-is if visual effects are enabled), all other characters in `\\x00-\\x1F`, `\\x7F` (and `\\x80-\\xFF` if the encoding is binary) are printed in hex, in `UTF-8` the codepoints `\\u0080-\\u009F` have their codepoints printed, and all other characters are printed as-is.|\n\n# Environment Variables\nThe `mviz` command relies on the following environment variables:\n\n| Variable | Description |\n|----------|-------------|\n| `FORCE_COLOR`, `NO_COLOR` | Controls `--color=auto`. If `FORCE_COLOR` is set and nonempty, acts like `--color=always`. Else, if `NO_COLOR` is set and nonempty, acts like `--color=never`. If neither is set to a non-empty value, `--color=auto` defaults to `--color=always` when stdout is a tty. |\n| `POSIXLY_CORRECT` | If present, changes the default `--encoding` to be `locale` (cf. locale(1)), and also disables parsing switches after arguments (e.g. passing in `foo -x` as arguments will not interpret `-x` as a switch). |\n| `P_STANDOUT_BEGIN`, `P_STANDOUT_END` | Beginning and ending escape sequences for --colour; Usually don't need to be set, as they have sane defaults. |\n| `P_STANDOUT_ERR_BEGIN`, `P_STANDOUT_ERR_END` | Like `P_STANDOUT_BEGIN`/`P_STANDOUT_END`, except for invalid bytes (eg `\\xC3` in UTF-8) |\n| `LC_ALL`, `LC_CTYPE`, `LANG` | Checked (in that order) for the encoding when `--encoding=locale` is used. |\n\n# Contributions\nBugs are welcome!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsampersand%2Fmviz","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsampersand%2Fmviz","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsampersand%2Fmviz/lists"}