https://github.com/joeavanzato/differ
An easy-to-use, cross-platform utility for capturing and diffing file system metadata snapshots.
- Host: GitHub
- URL: https://github.com/joeavanzato/differ
- Owner: joeavanzato
- License: mit
- Created: 2024-09-25T00:33:28.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-08T17:36:52.000Z (about 1 year ago)
- Last Synced: 2024-11-16T06:47:27.752Z (about 1 year ago)
- Topics: analysis, changes, diff, diff-analysis, filesystem, snapshot
- Language: Go
- Homepage:
- Size: 52.7 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# differ
## File System Metadata Snapshots made Easy
### What is it?
differ is a purpose-built tool for generating and comparing ('diffing') metadata snapshots of logical drives. Typical tasks include determining the changes made by a specific piece of software, identifying changes between patches, malware analysis/sandboxing, and integrity checks.
differ works cross-platform on Linux and Windows (and probably macOS as well, but I don't have a test machine for that).
### Why?
differ was created because I had a need to perform a configurable file system metadata snapshot and subsequent comparison and I could not identify a simple and flexible open-source tool for this task.
Example use cases include:
* Baselining the contents of a logical drive to identify all changes following a system/software change
* Establishing a baseline for use in Incident Response processes and to identify changes in system files or created/deleted files following a breach
* Identifying differences in pre- and post- metadata snapshots during dynamic malware analysis (files created, files modified, files deleted)
* Quickly hashing files in any number of directories based on extension allow or block lists to identify any unwanted software
* Feeding data into allow/block lists to further DFIR processes/investigations
* Hunting for specific file-types across a system or specific directories
### How to use?
differ can be run either with command-line arguments or with a configuration file. The easiest way to use it is to download the most recent build, which includes differ.exe and differ_config.json.
### Configuration File
To launch differ using a configuration file, pass the file's path with the -config argument, as shown below:
```
differ.exe -config "configs\full_system_snapshot.json"
differ.exe -config "configs\full_scan_common_malware_extensions.json"
differ.exe -config "some\\path\\to\\config.json"
```
The full_system_snapshot configuration file is shown below. It tells differ to recursively snapshot the metadata for all files starting at C:\ with no restrictions on extensions, computing the SHA1 hash of each encountered file. CSV export is disabled by default.
On a common personal system using a nearly full 2 TB M.2 SSD, this type of scan takes approximately 15-30 minutes depending on CPU availability. The type of disk drive and its connection mechanism greatly influence the speed of the snapshot because of their effect on read times. I recommend snapshotting only the required directories and extensions when possible.
```json
{
"directories": [
"C:\\"
],
"use_extension_allowlist": false,
"extension_allowlist": [
".exe"
],
"use_extension_blocklist": false,
"extension_blocklist": [
".txt"
],
"hash_enabled": true,
"hash_algorithm": "sha1",
"do_csv_export": false
}
```
* directories - a list of directories to walk recursively for snapshot generation
* use_extension_allowlist - if true, will skip all files that do not possess an extension present in the allowlist
* extension_allowlist - the list of extensions consulted when use_extension_allowlist is true
* use_extension_blocklist - if true, will skip all files that have an extension present in the blocklist
* extension_blocklist - the list of extensions consulted when use_extension_blocklist is true
* hash_enabled - if true, will hash all included files
* hash_algorithm - can be sha1/sha256/md5
* do_csv_export - if true, will generate a CSV output in addition to parquet
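
The options above can be illustrated with a minimal Go sketch that decodes such a configuration and applies the allowlist/blocklist rules to a path. This is not differ's actual source: the `Config` struct, `parseConfig`, `contains`, and `shouldInclude` are hypothetical names, and the case-insensitive extension match is an assumption. Only the JSON keys come from this README.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// Config mirrors the JSON keys documented above. The Go field names are
// hypothetical; only the JSON keys are taken from the README.
type Config struct {
	Directories           []string `json:"directories"`
	UseExtensionAllowlist bool     `json:"use_extension_allowlist"`
	ExtensionAllowlist    []string `json:"extension_allowlist"`
	UseExtensionBlocklist bool     `json:"use_extension_blocklist"`
	ExtensionBlocklist    []string `json:"extension_blocklist"`
	HashEnabled           bool     `json:"hash_enabled"`
	HashAlgorithm         string   `json:"hash_algorithm"`
	DoCSVExport           bool     `json:"do_csv_export"`
}

// parseConfig decodes a configuration document from raw JSON bytes.
func parseConfig(raw []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// contains reports whether ext appears in list. Matching ignores case,
// which is an assumption made for this sketch.
func contains(list []string, ext string) bool {
	for _, e := range list {
		if strings.EqualFold(e, ext) {
			return true
		}
	}
	return false
}

// shouldInclude applies the allowlist and blocklist rules to one path.
func shouldInclude(cfg Config, path string) bool {
	ext := filepath.Ext(path)
	if cfg.UseExtensionAllowlist && !contains(cfg.ExtensionAllowlist, ext) {
		return false
	}
	if cfg.UseExtensionBlocklist && contains(cfg.ExtensionBlocklist, ext) {
		return false
	}
	return true
}

func main() {
	raw := []byte(`{
		"directories": ["C:\\"],
		"use_extension_allowlist": true,
		"extension_allowlist": [".exe"],
		"use_extension_blocklist": false,
		"extension_blocklist": [".txt"],
		"hash_enabled": true,
		"hash_algorithm": "sha1",
		"do_csv_export": false
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(shouldInclude(cfg, "tool.exe"), shouldInclude(cfg, "notes.txt"))
}
```

With the allowlist enabled as in this example, only paths ending in `.exe` pass the filter; the blocklist is ignored because its switch is false.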
By default, differ will store a *.parquet file in the current working directory whose name contains the UNIX timestamp (in nanoseconds) and the hostname of the snapshot, such as '1727226208164680600_DESKTOP-KH2I9H2_differ_snapshot'.
Enabling CSV export produces an immediately human-readable file for users who do not want to convert the Parquet output to another format. Parquet remains the default mainly for storage purposes.
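
As a rough illustration of the naming scheme, the Go sketch below splits a snapshot file name into its timestamp and hostname, which is one way two snapshots could be ordered by age. It assumes the layout `<unix-nanoseconds>_<hostname>_differ_snapshot` inferred from the example name; `snapshotInfo` and `parseSnapshotName` are hypothetical names, not part of differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
	"time"
)

// snapshotInfo holds the two values encoded in a snapshot file name.
type snapshotInfo struct {
	Taken    time.Time
	Hostname string
}

// parseSnapshotName splits "<unix-nanoseconds>_<hostname>_differ_snapshot",
// with an optional .parquet or .csv extension. The layout is inferred from
// the example file names in this README.
func parseSnapshotName(name string) (snapshotInfo, error) {
	base := filepath.Base(name)
	for _, ext := range []string{".parquet", ".csv"} {
		base = strings.TrimSuffix(base, ext)
	}
	const suffix = "_differ_snapshot"
	if !strings.HasSuffix(base, suffix) {
		return snapshotInfo{}, fmt.Errorf("%q does not end in %q", name, suffix)
	}
	base = strings.TrimSuffix(base, suffix)

	// Everything before the first underscore is the timestamp; the rest is
	// treated as the hostname, so hostnames containing underscores survive.
	stamp, host, found := strings.Cut(base, "_")
	if !found || host == "" {
		return snapshotInfo{}, fmt.Errorf("%q has no hostname part", name)
	}
	nanos, err := strconv.ParseInt(stamp, 10, 64)
	if err != nil {
		return snapshotInfo{}, fmt.Errorf("bad timestamp in %q: %w", name, err)
	}
	return snapshotInfo{Taken: time.Unix(0, nanos).UTC(), Hostname: host}, nil
}

func main() {
	info, err := parseSnapshotName("1727226208164680600_DESKTOP-KH2I9H2_differ_snapshot.parquet")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(info.Hostname, info.Taken.Format(time.RFC3339))
}
```

Comparing the `Taken` values of two parsed names with `Before` gives the older/newer ordering without opening either file.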
### Command-Line Arguments
```
-config some_file.json : When specified, differ will ignore all other command-line arguments and rely solely on the data contained within the configuration file for execution.
-directory "C:\\" : Tells differ the directory to use as the starting point for a recursive file-walk snapshot
-csv : Tells differ to also produce CSV output in addition to the default Parquet
-hash md5 / -hash sha1 / -hash sha256 : Tells differ to also compute the hash of all scanned files using one of the specified algorithms
-compare file1,file2 : Tells differ to 'diff' the two provided files - differ will automatically attempt to determine which one is older/newer based on the file naming format
```
### Comparing Snapshots
To compare two separate snapshots, use the '-compare' argument as follows:
```
differ.exe -compare 1727205513801559400_DESKTOP-KH2I9H2_differ_snapshot.parquet,1727224094973553500_DESKTOP-KH2I9H2_differ_snapshot.parquet
```
differ performs several checks when looking for changes:
* Files with the same path, name and extension that:
    * Have different hashes (modification)
    * Have different modification times (modification)
    * Have different file sizes (modification)
* Files that do not appear in the older snapshot but do appear in the newer one (creation)
* Files that do not appear in the newer snapshot but do appear in the older one (deletion)
All differences are written to a CSV output file (snapshot_diff.csv) in the current working directory.
Be aware there are caveats here - if a file is moved between two directories, we will count that as both a deletion and creation since we are not doing 'hash-scanning' across the entire snapshot at this time.
### Common Extension Lists
For convenience, a few configuration files are provided inside the configs directory for common use cases. They are detailed below:
* full_system_snapshot_(win|linux).json
    * Recursively snapshots an entire drive starting at C:\ (or / on Linux) with no restrictions on extension, also computing the SHA1 hash of each file.
* quick_common_malware_hashscan.json
    * Contains common directories where malware often lives and an extension allow-list for the most common file types encountered during incidents.
* full_scan_common_malware_extensions.json
    * Same as above, but scans for common malware extensions across the entire logical drive starting at C:\.