https://github.com/crev-dev/recursive-digest
A recursive file-system digest (hash)
- Host: GitHub
- URL: https://github.com/crev-dev/recursive-digest
- Owner: crev-dev
- License: apache-2.0
- Created: 2020-02-01T03:40:53.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-04-04T20:35:12.000Z (almost 3 years ago)
- Last Synced: 2025-03-21T03:51:15.877Z (10 months ago)
- Language: Rust
- Size: 29.3 KB
- Stars: 13
- Watchers: 2
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
README
# Recursive file-system digest
This library implements a simple but efficient recursive file-system digest
algorithm. You have a directory with some content in it, and you'd like
a cryptographic digest (hash) of its content.
It was created for the purpose of checksumming source code packages
in `crev`, but it is generic and can be used for any other purpose.
## Algorithm
Given any digest algorithm `H` (a hash function),
`RecursiveDigest(H, path)` is:
* for a file: `H("F" || file_content)`
* for a symlink: `H("L" || symlink_content)`
* for a directory: `H("D" || directory_content)`
As you can see, a one-letter ASCII prefix is used so that, short of a
collision in `H`, a file cannot have the same digest as a directory,
a symlink cannot have the same digest as a file, and so on.
The drawback of this approach is that `RecursiveDigest(H, path)` of
a simple file is not the same as a plain digest of it (`H(file_content)`).
`file_content` is just the byte content of a file.
`symlink_content` is just the path the symlink is pointing to, as bytes.
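The file and symlink cases can be sketched in a few lines. This is only an illustration, not this library's API: `h`, `prefixed_digest`, `file_digest` and `symlink_digest` are hypothetical names, and 64-bit FNV-1a stands in for `H` purely to keep the example dependency-free (it is not a cryptographic hash).

```rust
/// Stand-in for the digest algorithm `H`: 64-bit FNV-1a.
/// ASSUMPTION: FNV-1a is NOT cryptographic; it is used only so the sketch
/// needs no external crates. A real setup would plug in a cryptographic hash.
fn h(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    hash
}

/// Computes `H(prefix || content)` for a one-letter ASCII prefix.
fn prefixed_digest(prefix: u8, content: &[u8]) -> u64 {
    let mut input = Vec::with_capacity(1 + content.len());
    input.push(prefix);
    input.extend_from_slice(content);
    h(&input)
}

/// `H("F" || file_content)`
fn file_digest(file_content: &[u8]) -> u64 {
    prefixed_digest(b'F', file_content)
}

/// `H("L" || symlink_content)`, where the content is the link target as bytes.
fn symlink_digest(symlink_content: &[u8]) -> u64 {
    prefixed_digest(b'L', symlink_content)
}
```

With these helpers, a file containing `hello` and a symlink pointing to `hello` get different digests, and `file_digest(b"hello")` differs from the plain `h(b"hello")`, which is the drawback mentioned above.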
`directory_content` is created by:
* sorting all entries of a directory by name, in ascending order,
  using a simple byte-sequence comparison
* for all entries concatenating pairs of:
    * `H(entry_name)`
    * `RecursiveDigest(H, entry_path)`
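The directory rule can be sketched over an in-memory tree instead of a real file system. Everything here is an assumption made for illustration: the `Node` type and `recursive_digest` function are hypothetical, 64-bit FNV-1a stands in for `H` (it is not cryptographic), and the digest is serialized as 8 big-endian bytes. A `BTreeMap` keyed by `Vec<u8>` iterates in ascending byte-sequence order, which matches the sorting step.

```rust
use std::collections::BTreeMap;

/// Stand-in for `H`: 64-bit FNV-1a, returned as 8 big-endian bytes.
/// ASSUMPTION: not cryptographic; chosen only to avoid external crates.
fn h(data: &[u8]) -> [u8; 8] {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    hash.to_be_bytes()
}

/// Hypothetical in-memory model of a file-system tree.
enum Node {
    /// Raw file bytes.
    File(Vec<u8>),
    /// Symlink target path, as bytes.
    Symlink(Vec<u8>),
    /// Entries keyed by name; `BTreeMap` keeps them sorted by byte sequence.
    Dir(BTreeMap<Vec<u8>, Node>),
}

/// Sketch of `RecursiveDigest(H, path)` for the model above.
fn recursive_digest(node: &Node) -> [u8; 8] {
    let mut input = Vec::new();
    match node {
        Node::File(content) => {
            input.push(b'F');
            input.extend_from_slice(content);
        }
        Node::Symlink(target) => {
            input.push(b'L');
            input.extend_from_slice(target);
        }
        Node::Dir(entries) => {
            input.push(b'D');
            // Concatenate (H(entry_name), RecursiveDigest(entry)) pairs
            // in ascending name order.
            for (name, child) in entries {
                input.extend_from_slice(&h(name));
                input.extend_from_slice(&recursive_digest(child));
            }
        }
    }
    h(&input)
}
```

Because entries are visited in sorted order, the digest does not depend on the order in which a directory listing happens to be returned. In this model an empty directory hashes to `h(b"D")`.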
If the optional additional-data extension is used, the `H(entry_name)` above becomes
`H(entry_name || 0 || additional data)`. The format and meaning of the additional
data are unspecified, but it was intended for file-system metadata such as
permissions and ownership.
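A minimal sketch of the entry-name hashing with and without additional data follows. It rests on several assumptions: the `0` separator is read as a single zero byte, `entry_name_digest` is a hypothetical helper rather than part of this library's API, and 64-bit FNV-1a again stands in for `H` (it is not cryptographic).

```rust
/// Stand-in for `H`: 64-bit FNV-1a, returned as 8 big-endian bytes.
/// ASSUMPTION: not cryptographic; chosen only to avoid external crates.
fn h(data: &[u8]) -> [u8; 8] {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
    }
    hash.to_be_bytes()
}

/// Hashes a directory entry name.
/// Without additional data this is `H(entry_name)`; with it, this is
/// `H(entry_name || 0 || additional_data)`.
/// ASSUMPTION: the separator `0` is interpreted as one zero byte (0x00).
fn entry_name_digest(entry_name: &[u8], additional_data: Option<&[u8]>) -> [u8; 8] {
    let mut input = entry_name.to_vec();
    if let Some(extra) = additional_data {
        input.push(0);
        input.extend_from_slice(extra);
    }
    h(&input)
}
```

Since the additional data feeds into the hash of the entry name, two directories whose entries differ only in that data (for example, in permissions) end up with different directory digests.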