https://github.com/ritchie46/lsh-rs
Locality Sensitive Hashing in Rust with Python bindings
- Host: GitHub
- URL: https://github.com/ritchie46/lsh-rs
- Owner: ritchie46
- License: mit
- Created: 2020-03-06T14:16:40.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-06-20T06:19:59.000Z (over 2 years ago)
- Last Synced: 2025-06-20T15:10:53.937Z (4 months ago)
- Topics: cosine-similarity, l2-distance, lsh, lsh-algorithm, rust
- Language: Rust
- Homepage:
- Size: 511 KB
- Stars: 116
- Watchers: 3
- Forks: 22
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
# lsh-rs (Locality Sensitive Hashing)
[Rust docs](https://docs.rs/lsh-rs/latest/lsh_rs/)
[Build status](https://travis-ci.org/ritchie46/lsh-rs)

Locality sensitive hashing can help retrieve approximate nearest neighbors in sub-linear time.
For more information on the subject see:
* [Introduction on LSH](http://people.csail.mit.edu/gregory/annbook/introduction.pdf)
* [Section 2. describes the hash families used in this crate](https://arxiv.org/pdf/1411.3787.pdf)
* [LSH and neural networks](https://www.ritchievink.com/blog/2020/04/07/sparse-neural-networks-and-hash-tables-with-locality-sensitive-hashing/)

## Implementations
* **Base LSH**
  - Signed Random Projections *(Cosine similarity)*
  - L2 distance
  - MIPS *(Dot products / Maximum Inner Product Search)*
  - MinHash *(Jaccard Similarity)*
* **Multi Probe LSH**
  - **Step wise probing**
    - SRP (only bit shifts)
  - **Query directed probing**
    - L2
    - MIPS
* Generic numeric types

## Getting started
```rust
use lsh_rs::LshMem;

// 2 rows w/ dimension 3.
let p = &[vec![1., 1.5, 2.],
          vec![2., 1.1, -0.3]];

// Do one time expensive preprocessing.
let n_projections = 9;
let n_hash_tables = 30;
let dim = 3; // must match the length of the stored vectors
let mut lsh = LshMem::new(n_projections, n_hash_tables, dim)
    .srp()
    .unwrap();
lsh.store_vecs(p);

// Query in sublinear time.
let query = &[1.1, 1.2, 1.2];
lsh.query_bucket(query);
```

## Documentation
* [Read the Rust docs](https://docs.rs/lsh-rs/latest/lsh_rs/).
* [Read the Python docs](https://lsh-rs.readthedocs.io/en/latest/) for the Python bindings.

## Python
At the moment, the Python bindings are only compiled for Linux x86_64 systems.

`$ pip install floky`
```python
from floky import SRP
import numpy as np

N = 10000
n = 100
dim = 10

# Generate some random data points
data_points = np.random.randn(N, dim)

# Do a one time (expensive) fit.
lsh = SRP(n_projections=19, n_hash_tables=10)
lsh.fit(data_points)

# Query approximate nearest neighbors in sub-linear time
query = np.random.randn(n, dim)
results = lsh.predict(query)
```
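
To give a rough picture of what a Signed Random Projections index does with `n_projections` and `n_hash_tables`, here is a toy sketch using only the Python standard library. It is an illustration under assumptions, not the lsh-rs or floky implementation: `ToySrpIndex`, `make_hyperplanes`, `srp_hash` and `query_bucket_ids` are hypothetical names made up for this example. Each table hashes a vector to one bit per random hyperplane (the sign of the dot product), and a query collects every stored id that shares a bucket with it in at least one table.

```python
import random


def make_hyperplanes(n_projections, dim, rng):
    """Draw n_projections random Gaussian vectors of length dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_projections)]


def srp_hash(vec, hyperplanes):
    """Signed random projection: one bit per hyperplane (sign of the dot product)."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


class ToySrpIndex:
    """Hypothetical toy index for illustration; not the lsh-rs or floky API."""

    def __init__(self, n_projections, n_hash_tables, dim, seed=0):
        rng = random.Random(seed)
        # One independent set of hyperplanes and one bucket map per hash table.
        self.planes = [
            make_hyperplanes(n_projections, dim, rng) for _ in range(n_hash_tables)
        ]
        self.tables = [{} for _ in range(n_hash_tables)]
        self.vectors = []

    def store(self, vec):
        """Store a vector in every table and return its id."""
        idx = len(self.vectors)
        self.vectors.append(vec)
        for planes, table in zip(self.planes, self.tables):
            table.setdefault(srp_hash(vec, planes), set()).add(idx)
        return idx

    def query_bucket_ids(self, query):
        """Union of the ids that collide with the query in any table."""
        ids = set()
        for planes, table in zip(self.planes, self.tables):
            ids |= table.get(srp_hash(query, planes), set())
        return ids


index = ToySrpIndex(n_projections=9, n_hash_tables=30, dim=3)
a = index.store([1.0, 1.5, 2.0])
b = index.store([2.0, 1.1, -0.3])
candidates = index.query_bucket_ids([1.1, 1.2, 1.2])
```

In this sketch, more projections make each bucket narrower (fewer false candidates, more missed neighbors), while more hash tables give a close neighbor more chances to collide with the query. Because only the sign of each projection is kept, the hash depends on a vector's direction and not its length, which is why this family is associated with cosine similarity.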