{"id":51015053,"url":"https://github.com/securityronin/vmdk-forensic","last_synced_at":"2026-06-21T09:02:37.761Z","repository":{"id":362409091,"uuid":"1256798247","full_name":"SecurityRonin/vmdk-forensic","owner":"SecurityRonin","description":"Pure-Rust VMware VMDK toolkit: vmdk-core reader (imported as vmdk; recovers damaged disks via the redundant grain directory) + vmdk-forensic analyzer (RGD adjudication, dangling-pointer \u0026 provenance findings)","archived":false,"fork":false,"pushed_at":"2026-06-16T01:47:12.000Z","size":592,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-16T03:21:03.994Z","etag":null,"topics":["container","data-recovery","dfir","disk-image","forensic-analysis","forensics","incident-response","rust","rust-crate","virtual-disk","vmdk","vmware"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SecurityRonin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-02T05:17:53.000Z","updated_at":"2026-06-16T01:47:16.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/SecurityRonin/vmdk-forensic","commit_stats":null,"previous_names":["securityronin/vmdk","securityronin/vmdk-forensic"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/SecurityRonin/vmdk-forensic","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SecurityRonin%2Fvmdk-forensic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SecurityRonin%2Fvmdk-forensic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SecurityRonin%2Fvmdk-forensic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SecurityRonin%2Fvmdk-forensic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SecurityRonin","download_url":"https://codeload.github.com/SecurityRonin/vmdk-forensic/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SecurityRonin%2Fvmdk-forensic/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34603636,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-21T02:00:05.568Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["container","data-recovery","dfir","disk-image","forensic-analysis","forensics","incident-response","rust","rust-crate","virtual-disk","vmdk","vmware"],"created_at":"2026-06-21T09:02:32.931Z","updated_at":"2026-06-21T09:02:37.749Z","avatar_url":"https://github.com/SecurityRonin.png","language":"Rust","funding_links":["https://github.com/sponsors/h4x0r"],"categories":[],"sub_categories":[],"readme":"# vmdk-forensic\n\n[![vmdk-core](https://img.shields.io/crates/v/vmdk-core.svg?label=vmdk-core)](https://crates.io/crates/vmdk-core)\n[![vmdk-forensic](https://img.shields.io/crates/v/vmdk-forensic.svg?label=vmdk-forensic)](https://crates.io/crates/vmdk-forensic)\n[![docs.rs](https://img.shields.io/docsrs/vmdk-core)](https://docs.rs/vmdk-core)\n[![License: Apache-2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)\n[![CI](https://github.com/SecurityRonin/vmdk-forensic/actions/workflows/ci.yml/badge.svg)](https://github.com/SecurityRonin/vmdk-forensic/actions)\n[![Sponsor](https://img.shields.io/badge/sponsor-h4x0r-ea4aaa?logo=github-sponsors)](https://github.com/sponsors/h4x0r)\n\n**Read VMware VMDK disk images others give up on, then audit them for tampering.** Two crates in one workspace: the [`vmdk-core`](https://crates.io/crates/vmdk-core) reader (imported as `vmdk`) presents the virtual disk as a plain `Read + Seek` byte stream — and **recovers data from a damaged disk through the redundant grain directory that `qemu-img` and `libvmdk` throw away** — while the [`vmdk-forensic`](https://crates.io/crates/vmdk-forensic) analyzer turns the structure into severity-graded, evidence-grade findings.\n\n## The two crates\n\n| Crate | Role | Import | `cargo add` |\n|---|---|---|---|\n| [`vmdk-core`](https://crates.io/crates/vmdk-core) | Read-only VMDK reader: decoded virtual-sector `Read + Seek`, RGD-fallback recovery, `ddb` provenance | `use vmdk::…` | `cargo add vmdk-core` |\n| [`vmdk-forensic`](https://crates.io/crates/vmdk-forensic) | Integrity analyzer → canonical `forensicnomicon::report::Finding` (re-exports the reader) | `use vmdk_forensic::…` | `cargo add vmdk-forensic` |\n\n## Command-line tool\n\nOne command — no subcommand — tells you what a VMDK is and whether it's sound:\n\n```console\n$ vmdk disk.vmdk\n```\n\n```text\nFile:              disk.vmdk\nFormat:            VMDK v1 (monolithicSparse)\nVirtual disk size: 4,194,304 bytes (4.00 MiB)\nSectors:           8,192\nCID:               dc80b6c7\n\nProvenance:\n  Content ID:       dc80b6c7\n  Adapter:          ide\n  Geometry (C/H/S): 8/16/63\n  HW version:       4\n  Clean shutdown:   yes\n  Redundant GD:     present\n\nIntegrity: OK (no anomalies detected)\n```\n\nThe default `examine` view folds identity, `ddb` provenance, companion-file\ndiscovery, and the canonical forensic findings — RGD mismatch, dangling grain\ntables, unclean shutdown, FTP-mangled headers — into one answer, and **exits\nnon-zero on any High-severity finding**, so it drops straight into a triage\npipeline. Three flags extend it:\n\n- `--fingerprint` (`--fp`) — append a labelled virtual-disk SHA-256 + MD5\n- `--json` — machine-readable output for a SIEM / pipeline\n- `--recover` — read through a damaged primary grain directory (RGD fallback)\n\n```console\n$ vmdk --fingerprint --json disk.vmdk | jq '.findings, .fingerprint'\n```\n\nThree more verbs cover extraction and comparison:\n\n- `dump` — raw virtual-disk bytes to stdout or a file (`-o`), a byte range\n  (`--offset` / `--length`), a hex view (`--hex`), or a sparse-only\n  reconstruction (`--allocated-only`). `dump --recover` resolves reads through\n  the redundant grain directory, extracting data behind corruption that\n  `qemu-img` fails on — pipe it into a filesystem tool (NTFS, ext4, …) to read\n  the guest's files.\n- `map` — allocated (non-sparse) grain ranges as `start_lba,sector_count`\n- `diff` — byte-by-byte comparison of two VMDK virtual disks\n\n## Rust library\n\nThe reader is published as `vmdk-core` and imported as `vmdk`; the analyzer is `vmdk-forensic`:\n\n```bash\ncargo add vmdk-core       # the reader (imported as `vmdk`)\ncargo add vmdk-forensic   # the analyzer (re-exports the reader)\n```\n\n## Quick start\n\n```rust\nuse vmdk::VmdkReader;\nuse std::io::{Read, Seek, SeekFrom};\n\n// Open any `Read + Seek` source — a File, a Cursor, another container reader.\nlet mut disk = VmdkReader::open(std::fs::File::open(\"disk.vmdk\")?)?;\n\nprintln!(\"virtual size: {} bytes\", disk.virtual_disk_size());\n\n// Read decoded virtual sectors like any byte stream — sparse/compressed grains\n// are decompressed and zero-filled transparently.\nlet mut first_mib = vec![0u8; 1 \u003c\u003c 20];\ndisk.seek(SeekFrom::Start(0))?;\ndisk.read_exact(\u0026mut first_mib)?;\n# Ok::\u003c(), Box\u003cdyn std::error::Error\u003e\u003e(())\n```\n\nFor path-based images with companion files — `monolithicFlat`, the\n`twoGbMaxExtent*` split sets, raw-device maps — use `VmdkFileReader::open_path`,\nwhich locates and opens the extent files for you. For snapshot/delta trees, use\n`VmdkChainReader::open`, which layers a delta on its parent chain.\n\n## What makes this different from `qemu-img` and `libvmdk`\n\nMost VMDK readers answer one question: \"give me the bytes.\" `vmdk` answers the\nquestions a digital forensics examiner actually needs — and reads disks the\nothers give up on:\n\n| Capability | qemu-img / libvmdk | vmdk |\n|---|---|---|\n| Sparse / streamOptimized / flat read | ✅ | ✅ |\n| COWD (`vmfsSparse`/`vmfsThin`) + seSparse (VMFS6) | partial | ✅ |\n| Snapshot / delta chain traversal | ✅ | ✅ |\n| **Recover data behind a damaged primary GD** (redundant-GD fallback) | ✗ | ✅ |\n| **Recover an individual lost grain-table entry** from the redundant copy | ✗ | ✅ |\n| Redundant-GD validation (grain-table *contents*, not pointers) | ✗ | ✅ via `vmdk-forensic` |\n| Structural integrity scan (dangling GD/GT/grain pointers) | ✗ | ✅ via `vmdk-forensic` |\n| `ddb.*` disk database (adapter, geometry, UUID, tools/HW version) | discarded | ✅ |\n| Header provenance — unclean-shutdown flag, FTP-ASCII-mangling check | ✗ | ✅ via `vmdk-forensic` |\n| Change Block Tracking (`-ctk`) reference | ✗ | ✅ |\n| `longContentID` resolution (the `CID == 0xFFFFFFFE` sentinel) | ✗ | ✅ |\n| Raw Device Mapping (`VMFSRDM`) extent enumeration | ✗ | ✅ |\n| Streaming SHA-256 + MD5 of the virtual disk | ✗ | ✅ |\n| Adversarial-input hardening + fuzz testing | ✗ | ✅ |\n| Pure Rust, zero `unsafe`, no C library | ✗ | ✅ |\n\n## Formats\n\nEvery VMDK `createType` and extent type in the VMware Virtual Disk Format spec\n(cross-checked against QEMU `block/vmdk.c` and `libvmdk`):\n\n| `createType` | Notes |\n|---|---|\n| `monolithicSparse`, `streamOptimized` | header v1/v2/v3; DEFLATE grains; `GD_AT_END` footer |\n| `monolithicFlat`, `vmfs`, `vmfsPreallocated`, `vmfsEagerZeroedThick` | preallocated flat extents |\n| `twoGbMaxExtentSparse`, `twoGbMaxExtentFlat` | split 2 GB extent sets |\n| `vmfsSparse`, `vmfsThin` | ESXi COWD copy-on-write sparse |\n| `seSparse` | vSphere 6.5+ space-efficient sparse (nibble-typed, bit-rotated grains) |\n| `vmfsRaw`, `vmfsRawDeviceMap`, `vmfsPassthroughRawDeviceMap`, `fullDevice`, `partitionedDevice` | device / raw-LUN maps |\n| `custom` | arbitrary extent mix, routed by extent type |\n\nExtent types: `FLAT`, `VMFS`, `VMFSRAW`, `VMFSRDM`, `ZERO`, `SPARSE`,\n`VMFSSPARSE`, `SESPARSE`; access `RW` / `RDONLY` / `NOACCESS`. `ZERO` and\n`NOACCESS` regions read as zeros without touching disk.\n\n## Forensic recovery\n\nVMware writes the grain tables **twice** — the grain directory (GD) and a\nredundant copy (RGD) point to separate physical copies. `qemu-img` and `libvmdk`\nread only the primary and fail when it is damaged. `vmdk` uses the redundant copy\nto keep reading:\n\n```rust\nuse vmdk::VmdkReader;\nuse std::io::Read;\n\nlet mut disk = VmdkReader::open(std::fs::File::open(\"damaged.vmdk\")?)?;\n\n// Opt in to recovery, then read normally — damaged pointers resolve through the RGD.\ndisk.enable_rgd_fallback();\nlet mut buf = vec![0u8; 1 \u003c\u003c 20];\nlet _ = disk.read(\u0026mut buf);\nprintln!(\"recovered {} grain(s) via the RGD\", disk.rgd_recovery_count());\n# Ok::\u003c(), Box\u003cdyn std::error::Error\u003e\u003e(())\n```\n\nRecovery is opt-in and never changes a healthy read; without it a dangling pointer\nsimply errors (the safe default). To *triage* a damaged image first — how much of\nthe primary grain directory the RGD can recover, plus tamper/anomaly detection — use\nthe companion [`vmdk-forensic`](https://crates.io/crates/vmdk-forensic) crate.\n\n## Forensic metadata\n\nThe text descriptor carries provenance that other readers parse and then throw\naway. `vmdk` surfaces all of it:\n\n```rust\nuse vmdk::VmdkReader;\n\nlet mut disk = VmdkReader::open(std::fs::File::open(\"disk.vmdk\")?)?;\n\nlet ddb = disk.disk_database();                 // ddb.* disk database\nprintln!(\"adapter:   {:?}\", ddb.adapter_type);  // ide / lsilogic / pvscsi …\nprintln!(\"geometry:  {:?}\", ddb.geometry);      // CHS cylinders/heads/sectors\nprintln!(\"disk UUID: {:?}\", ddb.uuid);\nprintln!(\"HW / tools: {:?} / {:?}\", ddb.virtual_hw_version, ddb.tools_version);\n\nprintln!(\"CBT file:   {:?}\", disk.change_track_path());       // -ctk.vmdk reference\nprintln!(\"content ID: {}\",  disk.effective_content_id());     // resolves longContentID\n# Ok::\u003c(), Box\u003cdyn std::error::Error\u003e\u003e(())\n```\n\nHeader provenance (unclean-shutdown flag, FTP-ASCII-mangling check) and the integrity\n/ anomaly analysis live in the [`vmdk-forensic`](https://crates.io/crates/vmdk-forensic)\ncompanion crate — see [Related](#related).\n\n## API highlights\n\n| Method | Purpose |\n|---|---|\n| `VmdkReader::open(reader)` | open any `Read + Seek` source |\n| `VmdkFileReader::open_path(path)` | open path-based images (flat / multi-extent / device maps) |\n| `VmdkChainReader::open(path)` | layer a delta on its parent snapshot chain |\n| `read` / `seek` (`std::io`) | decoded virtual-sector byte stream |\n| `info()` → `VmdkInfo` | version, CID, geometry, compression, descriptor, disk database |\n| `is_allocated(lba)` / `iter_allocated_grains()` | sparse-map queries |\n| `hash()` → `VmdkDigest` | streaming SHA-256 + MD5 of the virtual disk |\n| `validate_rgd()` / `check_integrity()` | redundant-GD + structural integrity |\n| `grain_directory_recovery()` / `enable_rgd_fallback()` / `rgd_recovery_count()` | RGD recovery |\n| `disk_database()` / `header_provenance()` / `change_track_path()` / `effective_content_id()` | forensic metadata |\n\n`serde` derives on the public report types are available behind the `serde` feature.\n\n## Security\n\n`vmdk` is built to run on untrusted, potentially crafted disk images:\n\n- **No panics on malicious input** — every allocation derived from header fields\n  is bounds-checked; reads are clamped; compressed-grain sizes are capped.\n- **Allocation-amplification hardened** — `numGTEsPerGT` is capped at the spec\n  value (512), matching QEMU, so a crafted header can't drive a multi-gigabyte\n  grain-table allocation.\n- **Zero `unsafe`** — `unsafe_code = \"forbid\"` workspace-wide; no C dependency.\n- **Fuzz-tested** — four `cargo fuzz` targets cover the open path, the full\n  read surface, the RGD recovery paths, and the forensic analysis pipeline; run\n  in CI on every change and deeper on a schedule.\n\nHardened further in **0.6.0** (all on the untrusted-input path):\n\n- **Descriptor path-traversal sandboxed** — extent and `parentFileNameHint`\n  paths are confined to the image directory; an absolute or `..`-climbing path\n  is refused, so a crafted descriptor can't read arbitrary host files.\n- **Decompression-bomb bounded** — a compressed grain is decoded only up to its\n  grain size and refused if it expands further, so a few-KB payload can't\n  amplify into a multi-megabyte allocation.\n- **Snapshot-chain reads grain-clamped** — a sparse grain can no longer\n  zero-mask an allocated grain that follows it within the same read.\n- **Mixed-extent descriptors rejected** — a `custom` image listing both flat and\n  sparse extents fails loud instead of silently dropping the sparse ones.\n\n```bash\n# Requires nightly Rust and cargo-fuzz\nrustup install nightly\ncargo install cargo-fuzz\n\ncargo +nightly fuzz run fuzz_open\ncargo +nightly fuzz run fuzz_read\ncargo +nightly fuzz run fuzz_recover\ncargo +nightly fuzz run fuzz_forensic\n```\n\n## Testing\n\n280+ tests (unit + integration) covering every public API, every format branch,\nthe recovery paths, and adversarial inputs. The real VMware-written\n`monolithicSparse` corpus image (`dfvfs_ext2.vmdk`, from\n[log2timeline/dfvfs](https://github.com/log2timeline/dfvfs), Apache-2.0) is read\n**byte-for-byte against `qemu-img convert -O raw`**, and COWD / seSparse output is\ncross-validated against `qemu-img`'s independent reader — the synthetic fixtures\nand the reader cannot share a blind spot. Coverage is enforced in CI. Which oracle\nand corpus back each capability — with evidence tiers — is documented in\n[the validation page](https://securityronin.github.io/vmdk-forensic/validation/).\n\n```bash\ncargo test\ncargo +stable llvm-cov --workspace --all-features --summary-only\n```\n\n## Related\n\n**vmdk** gives you the virtual disk as bytes. These crates read other container\nformats the same way — a pure `Read + Seek` over the decoded sector stream:\n\n| Crate | Format |\n|---|---|\n| [`ewf`](https://github.com/SecurityRonin/ewf) | E01 / Expert Witness Format (EnCase, FTK Imager) |\n| [`vhdx`](https://github.com/SecurityRonin/vhdx) | Microsoft VHDX (Hyper-V, Azure) |\n| [`vhd`](https://github.com/SecurityRonin/vhd) | Legacy VHD (Virtual PC / Hyper-V Gen-1) |\n| [`qcow2`](https://github.com/SecurityRonin/qcow2) | QEMU / KVM QCOW2 |\n| [`dd`](https://github.com/SecurityRonin/dd) | Raw / flat / dd images |\n\nAudit a VMDK for tampering, corruption, and recoverability with its forensic sibling:\n\n| Crate | Role |\n|---|---|\n| [`vmdk-forensic`](https://github.com/SecurityRonin/vmdk-forensic) | VMDK integrity analysis — RGD adjudication, dangling-pointer scan, recovery triage, header provenance, graded anomalies |\n\nOnce you have the bytes, these parsers analyse the partition layout inside:\n\n| Crate | Scheme |\n|---|---|\n| [`mbr-forensic`](https://github.com/SecurityRonin/mbr-forensic) | Master Boot Record — anomalies, slack carving, boot-code ID |\n| [`gpt-forensic`](https://github.com/SecurityRonin/gpt-forensic) | GUID Partition Table — backup-header reconciliation, CRC32 |\n| [`disk-forensic`](https://github.com/SecurityRonin/disk-forensic) | **Orchestrator** — auto-detects MBR/GPT/APM and dispatches |\n\nContainer-format knowledge (magic numbers, header layouts, encoding rules) lives\nin [`forensicnomicon`](https://github.com/SecurityRonin/forensicnomicon).\n\n---\n\n[Privacy Policy](https://securityronin.github.io/vmdk-forensic/privacy/) · [Terms of Service](https://securityronin.github.io/vmdk-forensic/terms/) · © 2026 Security Ronin Ltd\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecurityronin%2Fvmdk-forensic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsecurityronin%2Fvmdk-forensic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecurityronin%2Fvmdk-forensic/lists"}