https://github.com/adamliesko/tlsh
TLSH (Trend Micro Locality Sensitive Hash) library for Ruby
- Host: GitHub
- URL: https://github.com/adamliesko/tlsh
- Owner: adamliesko
- License: other
- Created: 2017-08-17T23:12:39.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-08-28T17:21:14.000Z (over 8 years ago)
- Last Synced: 2025-08-01T11:52:58.130Z (9 months ago)
- Topics: fuzzy, gem, hashing, locality-sensitive-hashing, ruby, similarity-measurement, tlsh
- Language: Ruby
- Size: 549 KB
- Stars: 25
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
[Gem Version](https://badge.fury.io/rb/tlsh)
[Build Status](https://travis-ci.org/adamliesko/tlsh)
[Coverage Status](https://coveralls.io/github/adamliesko/tlsh?branch=master)
# TLSH - Trend Micro Locality Sensitive Hash
TLSH is a fuzzy matching library. Given a byte stream with a minimum length of 256 bytes, TLSH generates a hash value which can be used for similarity comparisons. Similar objects produce similar hash values, so they can be detected by comparing their hashes. The byte stream also needs enough variation: for example, a byte stream of identical bytes will not generate a hash value.
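The two input requirements above can be checked before hashing. The following is a minimal sketch, not part of the gem: `plausible_tlsh_input?` and `MIN_TLSH_INPUT_BYTES` are hypothetical names, and the "more than one distinct byte" test only rules out the identical-bytes case mentioned above. Passing the check does not guarantee that a hash will be generated.

```ruby
# Hypothetical pre-check for illustration; it is not part of the tlsh gem.
MIN_TLSH_INPUT_BYTES = 256

# Returns true when the data meets the documented minimum length and is not
# just one byte value repeated. This is a necessary condition only.
def plausible_tlsh_input?(data)
  bytes = data.bytes
  return false if bytes.length < MIN_TLSH_INPUT_BYTES

  bytes.uniq.length > 1
end

plausible_tlsh_input?('a' * 1000)               # => false (identical bytes)
plausible_tlsh_input?('too short')              # => false (under 256 bytes)
plausible_tlsh_input?((0..255).to_a.pack('C*')) # => true
```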
The computed hash is 35 bytes long (output as 70 hexadecimal characters). The first 3 bytes are used to capture the information about the file as a whole (length, ...), while the last 32 bytes are used to capture information about incremental parts of the file.
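The layout described above can be illustrated by splitting a digest into its two parts. This is a sketch under the assumption of a full 70-character hexadecimal digest. `split_tlsh_digest` is a hypothetical helper, not part of the gem, and the digest used is a made-up placeholder rather than a real TLSH value.

```ruby
# Hypothetical helper for illustration; it is not part of the tlsh gem.
HEADER_HEX_CHARS = 3 * 2   # first 3 bytes: information about the file as a whole
BODY_HEX_CHARS   = 32 * 2  # last 32 bytes: information about parts of the file

# Splits a 70-character hex digest into its header and body portions.
def split_tlsh_digest(hex)
  expected = HEADER_HEX_CHARS + BODY_HEX_CHARS # 70 characters in total
  unless hex.match?(/\A\h{#{expected}}\z/)
    raise ArgumentError, "expected #{expected} hexadecimal characters"
  end

  { header: hex[0, HEADER_HEX_CHARS], body: hex[HEADER_HEX_CHARS, BODY_HEX_CHARS] }
end

# Placeholder digest made up for the example, not a real TLSH value.
placeholder = '0a1b2c' + 'f' * 64
parts = split_tlsh_digest(placeholder)
puts parts[:header]      # prints 0a1b2c
puts parts[:body].length # prints 64
```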
DISCLAIMER: This gem is based on [Trend Micro's TLSH](https://github.com/trendmicro/tlsh) and on the Go port [tlsh](https://github.com/glaslos/tlsh) by [glaslos](https://github.com/glaslos).
## Installation
    $ gem install tlsh
## Usage
Computing a diff between two files:
```ruby
require 'tlsh'

# Distance score between the two files; lower means more similar.
Tlsh.diff_files('./../fixtures/test_file_1', './../fixtures/test_file_2')
# => 501
```
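Getting a hash of a file:
```ruby
require 'tlsh'

# hash_file returns a hash object (its inspect output is omitted here).
t1 = Tlsh.hash_file('./../fixtures/test_file_1')

# The hexadecimal digest is available through #string.
t1.string.to_s
# => "b2317c38fac0333c8ff7d3ff31fcf3b7fb3f9a3ef3bf3c880cfc43ebf97f3cc73fbfc"
```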
Comparing two hashes with each other:
```ruby
require 'tlsh'

t1 = Tlsh.hash_file('./../fixtures/test_file_1')
t2 = Tlsh.hash_file('./../fixtures/test_file_2')

# Distance score between the two hash objects.
t1.diff(t2)
# => 488
```
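Getting a hash of bytes:
```ruby
require 'tlsh'

# The input is an array of byte values (elided here).
Tlsh.hash_bytes([11,4,...2])
# => a hash object (inspect output omitted)
```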
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/adamliesko/tlsh. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
## License
The gem is available as open source under the terms of the [Apache License 2.0](https://opensource.org/licenses/Apache-2.0).
## Code of Conduct
Everyone interacting in the Tlsh project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/adamliesko/tlsh/blob/master/CODE_OF_CONDUCT.md).