https://github.com/tweedegolf/lzjd-rs
Rust implementation of the LZJD algorithm (https://github.com/EdwardRaff/jLZJD)
- Host: GitHub
- URL: https://github.com/tweedegolf/lzjd-rs
- Owner: tweedegolf
- License: gpl-3.0
- Created: 2019-03-20T07:27:21.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-12-24T00:12:29.000Z (about 1 year ago)
- Last Synced: 2024-11-20T15:43:06.035Z (about 1 month ago)
- Language: Rust
- Size: 56.6 KB
- Stars: 17
- Watchers: 5
- Forks: 3
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
# LZJD
[Documentation](https://docs.rs/lzjd)
Rust implementation of the Lempel-Ziv Jaccard Distance (LZJD) algorithm, based on [jLZJD](https://github.com/EdwardRaff/jLZJD).
Main differences from jLZJD:
- Written in Rust instead of Java
- Can use any hasher (the executable uses CRC32) instead of only Murmur3
- Does not allocate memory for every unique hash; instead keeps only the k=1024 smallest hashes
- Based on `Vec` instead of `IntSetNoRemove`, which behaves more like a `HashMap`
- Hash files are considerably smaller if small sequences have been digested

```
USAGE:
    lzjd [FLAGS] [OPTIONS] ...

FLAGS:
    -c, --compare        compare SDBFs in file, or two SDBF files
    -r, --deep           generate SDBFs from directories and files
    -g, --gen-compare    compare all pairs in source data
    -h, --help           Prints help information
    -V, --version        Prints version information

OPTIONS:
    -o, --output         send output to files
    -t, --threshold      only show results >= threshold [default: 1]

ARGS:
    ...                  Sets the input file to use
```

See also:
- [Original paper](http://www.edwardraff.com/publications/alternative-ncd-lzjd.pdf)
- [Follow-up paper](https://arxiv.org/abs/1708.03346)
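
To illustrate the "any hasher" and "keeps k=1024 smallest" points from the list of differences, here is a minimal sketch using only the standard library. It is not this crate's code or API: `lz_digest`, `insert_bounded`, `jaccard_similarity`, and `digest_with_std_hasher` are hypothetical names, std's `DefaultHasher` stands in for the CRC32 hasher the executable uses, and the phrase set `seen` is a simplification that still grows with the input (only the digest is bounded).

```rust
use std::cmp::Ordering;
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{BuildHasher, BuildHasherDefault, Hasher};

/// Digest size mentioned in the README.
const K: usize = 1024;

/// Insert `hash` into the sorted `digest`, keeping at most `k` smallest values.
fn insert_bounded(digest: &mut Vec<u64>, hash: u64, k: usize) {
    // `Err(pos)` means the hash is not present yet; `pos` is its sorted position.
    if let Err(pos) = digest.binary_search(&hash) {
        if pos < k {
            digest.insert(pos, hash);
            digest.truncate(k); // drop the largest value if we grew past k
        }
    }
}

/// Hypothetical helper: split `data` into Lempel-Ziv phrases (each phrase is the
/// shortest substring not seen before), hash every phrase with the supplied
/// hasher, and return the `k` smallest hashes in ascending order.
fn lz_digest<S: BuildHasher>(data: &[u8], build: &S, k: usize) -> Vec<u64> {
    let mut seen: HashSet<&[u8]> = HashSet::new();
    let mut digest: Vec<u64> = Vec::new();
    let mut start = 0;
    for end in 1..=data.len() {
        let phrase = &data[start..end];
        // `insert` returns true only for a phrase that was not in the set.
        if seen.insert(phrase) {
            let mut hasher = build.build_hasher();
            hasher.write(phrase);
            insert_bounded(&mut digest, hasher.finish(), k);
            start = end; // begin the next phrase
        }
    }
    digest
}

/// Plain Jaccard similarity |A ∩ B| / |A ∪ B| of two sorted, duplicate-free digests.
fn jaccard_similarity(a: &[u64], b: &[u64]) -> f64 {
    let (mut i, mut j, mut common) = (0usize, 0usize, 0usize);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                common += 1;
                i += 1;
                j += 1;
            }
        }
    }
    let union = a.len() + b.len() - common;
    if union == 0 {
        1.0 // two empty digests are treated as identical
    } else {
        common as f64 / union as f64
    }
}

/// Convenience wrapper (assumption: std's DefaultHasher instead of CRC32).
fn digest_with_std_hasher(data: &[u8]) -> Vec<u64> {
    lz_digest(data, &BuildHasherDefault::<DefaultHasher>::default(), K)
}
```

Under these assumptions, the distance between two inputs would be `1.0 - jaccard_similarity(&digest_with_std_hasher(x), &digest_with_std_hasher(y))`. Because the digest is a sorted `Vec` capped at `k` entries, short inputs produce short digests, which is consistent with the note above about smaller hash files.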