{"id":28902443,"url":"https://github.com/khaledsaeed18/dir-analysis-tool","last_synced_at":"2026-04-28T03:08:50.867Z","repository":{"id":300288207,"uuid":"1005775400","full_name":"KhaledSaeed18/dir-analysis-tool","owner":"KhaledSaeed18","description":"📂 A CLI tool for advanced directory analysis with file classification, duplicate detection, large file identification, interactive mode, HTML reports, and multiple export formats. Perfect for disk cleanup, storage audits, and project analysis.","archived":false,"fork":false,"pushed_at":"2025-07-12T16:50:22.000Z","size":1411,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-23T04:29:36.303Z","etag":null,"topics":["chalk","cli","command-line-tool","commander","csv-export","directory","filesystem","html-report","inquirerjs","nodejs","npm-package","pretty-bytes","storage","typescript"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/dir-analysis-tool","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KhaledSaeed18.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-20T19:50:15.000Z","updated_at":"2025-07-12T16:50:25.000Z","dependencies_parsed_at":"2025-06-20T21:45:43.597Z","dependency_job_id":null,"html_url":"https://github.com/KhaledSaeed18/dir-analysis-tool","commit_stats":null,"previous_names":["khaledsaeed18/dir-analysis-tool"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/KhaledSaeed18/dir-analysis-tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KhaledSaeed18%2Fdir-analysis-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KhaledSaeed18%2Fdir-analysis-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KhaledSaeed18%2Fdir-analysis-tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KhaledSaeed18%2Fdir-analysis-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KhaledSaeed18","download_url":"https://codeload.github.com/KhaledSaeed18/dir-analysis-tool/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KhaledSaeed18%2Fdir-analysis-tool/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266761837,"owners_count":23980302,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chalk","cli","command-line-tool","commander","csv-export","directory","filesystem","html-report","inquirerjs","nodejs","npm-package","pretty-bytes","storage","typescript"],"created_at":"2025-06-21T11:08:57.233Z","updated_at":"2026-04-28T03:08:50.861Z","avatar_url":"https://github.com/KhaledSaeed18.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# dir-analysis-tool\n\n[![npm version](https://img.shields.io/npm/v/dir-analysis-tool.svg)](https://www.npmjs.com/package/dir-analysis-tool)\n[![CI](https://github.com/KhaledSaeed18/dir-analysis-tool/actions/workflows/ci.yml/badge.svg)](https://github.com/KhaledSaeed18/dir-analysis-tool/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Node.js Version](https://img.shields.io/node/v/dir-analysis-tool.svg)](https://nodejs.org/)\n\nA fast, memory-safe CLI for directory analysis — file classification, duplicate detection, large-file identification, HTML reports, and multiple export formats.\n\n## Quick Start\n\n```bash\nnpm install -g dir-analysis-tool   # install globally\n\ndat                                 # analyze current directory\ndat /path/to/project                # analyze any path\ndat analyze . --json | jq .files   # pipe-safe JSON output\n```\n\n## Features\n\n| Feature | Description |\n|---|---|\n| Single-pass streaming walk | No double I/O — files are stat'd once |\n| Streaming MD5 hashing | Duplicate detection without loading files into RAM |\n| TTY-aware progress | Progress bar only in interactive terminals; silent in pipes |\n| File type classification | Images, videos, audio, documents, code, archives, other |\n| Duplicate detection | Groups identical files by content hash; shows wasted space |\n| Large-file detection | Configurable threshold in MB |\n| Empty-file detection | Find all zero-byte files |\n| Top-N largest files | Quick disk usage overview |\n| HTML reports | Charts and tables, matrix/terminal aesthetic |\n| CSV export | Full analysis, large files, or duplicates |\n| Tree view | Visual directory structure (up to 1 000 files) |\n| Watch mode | Debounced re-analysis on every filesystem change |\n| Directory comparison | Side-by-side stats for two directories |\n| Config file | `.dir-analyzer.json` for default settings |\n| Interactive init | `dat init` creates the config file with prompts |\n\n## CLI Reference\n\n### `dat [directory]` / `dat analyze [directory]`\n\nAnalyze a directory. Defaults to the current working directory.\n\n```bash\ndat                              # analyze cwd\ndat /path/to/project             # explicit path\ndat analyze . --json             # JSON output (no ANSI, pipe-safe)\ndat analyze . --tree             # directory tree view\ndat analyze . --duplicates       # detect duplicate files\ndat analyze . --large-files      # files \u003e 100 MB\ndat analyze . --large-files 50   # files \u003e 50 MB\ndat analyze . --empty-files      # detect zero-byte files\ndat analyze . --top-n 20         # top 20 largest files\ndat analyze . --html             # generate HTML report\ndat analyze . --csv              # export to CSV\ndat analyze . --exclude node_modules dist coverage\ndat analyze . --max-depth 3      # limit scan depth\ndat analyze . --min-size 1000    # files \u003e= 1 KB only\ndat analyze . --date-from 2024-01-01 --date-to 2024-12-31\n```\n\n**Options**\n\n| Flag | Description |\n|---|---|\n| `--no-recursive` | Disable recursive scan |\n| `-j, --json` | JSON output (suppresses progress, safe to pipe) |\n| `--tree` | Show directory tree |\n| `--no-types` | Hide file-type breakdown |\n| `-e, --exclude \u003cpatterns...\u003e` | Exclude dirs/files by name or glob |\n| `-l, --large-files [mb]` | Detect large files (default threshold: 100 MB) |\n| `-d, --duplicates` | Enable duplicate detection |\n| `--empty-files` | Detect zero-byte files |\n| `--top-n \u003cn\u003e` | Show top N largest files |\n| `--max-depth \u003cdepth\u003e` | Limit directory depth |\n| `--min-size \u003cbytes\u003e` | Minimum file size filter |\n| `--max-size \u003cbytes\u003e` | Maximum file size filter |\n| `--date-from \u003cYYYY-MM-DD\u003e` | Files modified after this date |\n| `--date-to \u003cYYYY-MM-DD\u003e` | Files modified before this date |\n| `--csv [filename]` | Export full analysis to CSV |\n| `--csv-large [filename]` | Export large-file list to CSV |\n| `--csv-duplicates [filename]` | Export duplicate groups to CSV |\n| `--html [filename]` | Generate HTML report with charts |\n| `-c, --config [path]` | Path to config file |\n\n---\n\n### `dat watch [directory]`\n\nWatch a directory and re-analyze automatically (debounced 2 s) after changes.\n\n```bash\ndat watch .\ndat watch /path/to/project --duplicates --top-n 10\n```\n\n---\n\n### `dat compare \u003cdir1\u003e \u003cdir2\u003e`\n\nCompare two directories side by side.\n\n```bash\ndat compare src/ dist/\ndat compare /project-v1 /project-v2 --json\n```\n\n---\n\n### `dat init`\n\nInteractively create a `.dir-analyzer.json` config file in the current directory.\n\n```bash\ndat init\n```\n\n---\n\n## Configuration\n\nCreate `.dir-analyzer.json` in your project root (or run `dat init`):\n\n```json\n{\n  \"excludePatterns\": [\"coverage\", \"tmp\", \"__pycache__\"],\n  \"clearDefaultExclusions\": false,\n  \"largeSizeThresholdMB\": 100,\n  \"enableDuplicateDetection\": false,\n  \"maxDepth\": -1,\n  \"topN\": 10,\n  \"showEmptyFiles\": false\n}\n```\n\n`dat` searches for this file starting in the current directory and walking up the tree. CLI flags always override config values.\n\n**Default excluded directories** (unless `clearDefaultExclusions: true`): `node_modules`, `.git`, `.svn`, `.hg`, `dist`, `build`, `.cache`\n\n---\n\n## JSON Output\n\nAll fields from a `dat analyze . --json` call:\n\n```jsonc\n{\n  \"path\": \"/absolute/path\",\n  \"totalSizeBytes\": 12345678,\n  \"totalSizeFormatted\": \"12.3 MB\",\n  \"totalSizeMB\": 12.3,\n  \"folders\": 42,\n  \"files\": 512,\n  \"types\": {\n    \"images\": 10, \"videos\": 0, \"documents\": 5,\n    \"audio\": 0, \"code\": 480, \"archives\": 2, \"other\": 15\n  },\n  \"largeFiles\": [{ \"path\": \"...\", \"size\": 104857600, \"sizeFormatted\": \"105 MB\" }],\n  \"duplicateGroups\": [{ \"hash\": \"...\", \"size\": 4096, \"files\": [\"...\", \"...\"], \"wastedSpace\": 4096 }],\n  \"duplicateStats\": { \"totalGroups\": 3, \"wastedSpace\": 12288, \"wastedSpaceFormatted\": \"12.3 kB\" },\n  \"emptyFiles\": [{ \"path\": \"...\", \"mtime\": \"2024-01-01T00:00:00.000Z\" }],\n  \"topLargestFiles\": [{ \"path\": \"...\", \"size\": 1048576, \"sizeFormatted\": \"1.05 MB\" }],\n  \"treeView\": \"📁 project\\n└── ...\"\n}\n```\n\n---\n\n## Migrating from v1\n\n| v1 | v2 |\n|---|---|\n| `dir-analysis-tool` | `dat` (or still `dir-analysis-tool`) |\n| `--path \u003cdir\u003e` | positional argument: `dat \u003cdir\u003e` |\n| `--interactive` | `dat init` (creates config) |\n| `--progress` / `--no-progress` | auto-detected from TTY |\n| `--large-files \u003cbytes\u003e` | `--large-files \u003cmb\u003e` (value now in MB) |\n| `bin/` build output | `dist/` build output |\n\n---\n\n## Requirements\n\n- Node.js \u003e= 18.0.0\n- pnpm (for development)\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhaledsaeed18%2Fdir-analysis-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkhaledsaeed18%2Fdir-analysis-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhaledsaeed18%2Fdir-analysis-tool/lists"}