{"id":13586398,"url":"https://github.com/wodny/ncdu-export","last_synced_at":"2025-04-07T15:31:48.318Z","repository":{"id":91325281,"uuid":"49611526","full_name":"wodny/ncdu-export","owner":"wodny","description":"Standalone ncdu export feature","archived":false,"fork":false,"pushed_at":"2024-12-01T09:36:02.000Z","size":39,"stargazers_count":35,"open_issues_count":1,"forks_count":10,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-03T18:49:53.559Z","etag":null,"topics":["find","ijson","jq","json","linux","ncdu","sax","yajl"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wodny.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-01-14T00:32:00.000Z","updated_at":"2024-12-01T09:36:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"90c10d62-e7e2-43a1-be06-b1f21a367fd7","html_url":"https://github.com/wodny/ncdu-export","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wodny%2Fncdu-export","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wodny%2Fncdu-export/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wodny%2Fncdu-export/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wodny%2Fncdu-export/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wodny","download_url":"https://codeload.github.com/wodny/nc
du-export/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247679544,"owners_count":20978070,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["find","ijson","jq","json","linux","ncdu","sax","yajl"],"created_at":"2024-08-01T15:05:32.534Z","updated_at":"2025-04-07T15:31:48.312Z","avatar_url":"https://github.com/wodny.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Standalone ncdu export feature and some other tools\n\n[ncdu][1] (NCurses Disk Usage) is a great utility with an ncurses \ninterface that allows browsing through directories and check their disk \nusage (like the `du` command). It first walks through a directory and \nthen allows browsing the cached result.\n\nNewer versions (≥1.9) of [ncdu][1] have a feature allowing you to make \na [JSON export file][2] on a remote machine (`-o` option) and then \nbrowse directories locally (`-f` option).\n\nOn some old machines there may be an old version of [ncdu][1] without \nthat option or there may be no [ncdu][1] at all and it may be expensive \nto build [ncdu][1] for every one of them or for some reason you cannot \nget static binaries for a specific platform. 
The **ncdu-export** tool is \na workaround - it generates an export file compatible with [ncdu][1] and \n**requires only Python 2.6 or Python 3.2** (or newer) on the remote \nmachine.\n\nBelow, there is also a script based on the `find` command only (without \nusing Python).\n\nNote, however, that there are **static binaries** for x86, x86\\_64 and \nARM available directly from the [ncdu's][1] homepage, so tools from this \nrepository may be useful only if you cannot use those static binaries.\n\nCurrently the scripts' output is not identical to [ncdu][1]'s output but \nshould work well enough.\n\nExample:\n\n    1. Copy the script to the remote host\n    $ scp ncdu-export remote-host:\n\n    2A. Pipe meta-data via ssh to a local file:\n    $ ssh remote-host ./ncdu-export -p / \u003e files.json\n\n    2B. Collect meta-data on the remote host and then download it:\n    $ ssh remote-host\n    $ ./ncdu-export -p / \u003e files.json\n    ^D\n    $ scp remote-host:files.json .\n\n    3. Analyze the data\n    $ ncdu -f files.json\n\n## Remarks on usage\n\n### Pointing to the interpreter\n\nIf you get the `/usr/bin/env: ‘python’: No such file or directory` \nmessage you need to call a chosen version of python explicitly in one of \nthe following ways:\n\n- `python2 ncdu-export` (if using Python 2),\n- `python3 ncdu-export` (if using Python 3),\n- fix the first line (hashbang) of the script according to your \n  environment.\n\n### Names encoded using UTF-8 or ASCII\n\nSince version 0.8.0 `ncdu-export` encodes names using UTF-8 by default.\nUse the `-a` switch to output ASCII. The default changed because at the \ntime of writing this the original `ncdu` ≥ 2.5 based on Zig does not \naccept JSON with names encoded in ASCII. See #6 for details.\n\n## Other tools\n\nTools described below are prepared for filenames containing unusual characters \nlike newlines. 
They accept `-` as the FILE name, so you can use them in \npipes.\n\n### Flatten/unflatten\n\nSometimes you need to automatically filter the meta-data dumped by \nthe [ncdu][1] or `ncdu-export` tools. Those dumps can be quite big (hundreds of \nmegabytes). One can process those dumps with [jq][3], but:\n\n- using [jq][3] in non-stream mode can consume a lot of RAM,\n- getting the directory name from this kind of dump may be quite complicated (I \n  don't like my own example with `walk`),\n- I didn't find a way to process ncdu's output in jq's [stream mode][4] \n  using methods like `fromstream(1|truncate_stream(inputs))`; I suppose it's \n  because, contrary to most formats used in jq's usage examples, ncdu's format is \n  not flat (it's an array of arrays of maps).\n\nThis set of tools can be used to flatten ncdu's output, make it easy to process \nusing [jq][3] and then optionally unflatten it again. These tools depend on \nthe [ijson][5] Python library using the [YAJL2][6] library underneath. Those \nlibraries work on streams and parse JSON incrementally so it's possible to \nconvert huge dumps without consuming all the RAM.\n\nThe `yajl2_cffi` backend is chosen automatically (if available). It's faster \nthan the pure Python backend. 
During experiments it reduced the conversion time \nby as much as 40%.\n\nExample of filtering files modified before 2018-01-01:\n\n    $ ./ncdu-export -mp a-directory \u003e files.json\n    $ ./flatten.py files.json \u003e files-flat.json\n    $ export ts=$(date -d 2018-01-01 +%s)\n\n    Rebrowsing in ncdu:\n    $ jq -c 'select(.mtime \u003c (env.ts | tonumber))' \u003c files-flat.json \u003e files-flat-before2018.json\n    $ ./unflatten.py files-flat-before2018.json \u003e files-before2018.json\n    $ ncdu -f files-before2018.json\n\n    Putting files in an archive and removing them:\n    $ jq -j 'select(.mtime \u003c (env.ts | tonumber) and .type == \"file\") | .dirs + \"/\" + .name + \"\\u0000\"' \u003c files-flat.json \u003e files-flat-before2018.txt\n    $ tar cvzf archive.tgz --null -T files-flat-before2018.txt --remove-files\n\n### Find export\n\nThere is also a script that allows you to produce a meta-data dump just using \nthe `find` command on a remote host (without using [ncdu][1] nor Python at all) \nand then process it locally to regenerate the ncdu-compatible JSON format. It \nworks thanks to find's `printf` action (available in the Linux version, not the \nBusyBox one).\n\n    $ ./find.sh a-directory \u003e find-export.txt\n    $ ./find2flat.py find-export.txt \u003e find-flat-export.json\n    $ ./unflatten.py find-flat-export.json \u003e find-export.json\n    $ ncdu -f find-export.json\n\nor\n\n    $ ./find.sh ~/projects/ | ./find2flat.py - | ./unflatten.py - | ncdu -f -\n\n## Graph of tools\n\n\n                         .------------.\n         .---------------| filesystem |\n         |               '------------'\n         |                      |\n         |                      | ncdu -o / ncdu-export\n         |                      v\n         |                  .------.         
.---------.\n         | find.sh          | ncdu | ncdu -f |  ncdu   |\n         |                  | JSON |--------\u003e| preview |\n         |                  '------'         '---------'\n         |                    |  ^\n         |         flatten.py |  | unflatten.py\n         v                    v  |\n    .--------.              .------.\n    |  find  | find2flat.py | flat |\u003c---. jq filtering\n    | output |-------------\u003e| JSON |----'\n    '--------'              '------'\n                                |\n                                | jq\n                                v\n                          .-----------.        .---------.\n                          |    tar    | tar -T |   tar   |\n                          | file list |-------\u003e| archive |\n                          '-----------'        '---------'\n\n\n\n[1]: https://dev.yorhel.nl/ncdu\n[2]: https://dev.yorhel.nl/ncdu/jsonfmt\n[3]: https://stedolan.github.io/jq\n[4]: https://stedolan.github.io/jq/manual/#Streaming\n[5]: https://pypi.org/project/ijson/\n[6]: http://lloyd.github.io/yajl/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwodny%2Fncdu-export","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwodny%2Fncdu-export","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwodny%2Fncdu-export/lists"}