https://github.com/ankane/ngt-ruby
High-speed approximate nearest neighbors for Ruby
- Host: GitHub
- URL: https://github.com/ankane/ngt-ruby
- Owner: ankane
- License: apache-2.0
- Created: 2019-10-22T21:57:19.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2024-07-27T04:08:59.000Z (about 2 months ago)
- Last Synced: 2024-07-27T05:24:26.866Z (about 2 months ago)
- Language: Ruby
- Homepage:
- Size: 101 KB
- Stars: 48
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
# NGT Ruby
[NGT](https://github.com/yahoojapan/NGT) - high-speed approximate nearest neighbors - for Ruby
[![Build Status](https://github.com/ankane/ngt-ruby/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/ngt-ruby/actions)
## Installation
Add this line to your application’s Gemfile:
```ruby
gem "ngt"
```

On Mac, also install OpenMP:
```sh
brew install libomp
```

NGT is not available for Windows.
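
Since NGT does not run on Windows, a cross-platform application may want to guard the `require`. The following is a minimal sketch under that assumption; `ngt_available?` is a hypothetical helper name and is not part of the gem. It relies only on `Gem.win_platform?` from RubyGems and treats a `LoadError` as "not installed".

```ruby
# Hypothetical helper (not part of the ngt gem): try to load NGT only on
# platforms where it can run, and report whether loading succeeded.
def ngt_available?
  return false if Gem.win_platform? # NGT is not available for Windows

  require "ngt"
  true
rescue LoadError
  false # the gem is not installed or could not be loaded
end

puts "NGT available: #{ngt_available?}"
```

Callers can then fall back to another search strategy when the helper returns `false`.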
## Getting Started
Prep your data
```ruby
objects = [
  [1, 1, 2, 1],
  [5, 4, 6, 5],
  [1, 2, 1, 2]
]
```

Create an index
```ruby
index = Ngt::Index.new(dimensions)
```

Insert objects
```ruby
index.batch_insert(objects)
```

Search the index
```ruby
index.search(query, size: 3)
```

Save the index
```ruby
index.save(path)
```

Load an index
```ruby
index = Ngt::Index.load(path)
```

Get an object by id
```ruby
index.object(id)
```

Insert a single object
```ruby
index.insert(object)
```

Remove an object by id
```ruby
index.remove(id)
```

Build the index
```ruby
index.build_index
```

Optimize the index
```ruby
optimizer = Ngt::Optimizer.new(outgoing: 10, incoming: 120)
optimizer.adjust_search_coefficients(index)
optimizer.execute(index, new_path)
```

## Full Example
```ruby
dim = 10
objects = []
100.times do
  objects << dim.times.map { rand(100) }
end

index = Ngt::Index.new(dim)
index.batch_insert(objects)

query = objects[0]
result = index.search(query, size: 3)

result.each do |res|
  puts "#{res[:id]}, #{res[:distance]}"
  p index.object(res[:id])
end
```

## Index Options
Defaults shown below
```ruby
Ngt::Index.new(dimensions,
  edge_size_for_creation: 10,
  edge_size_for_search: 40,
  object_type: :float, # :float, :integer
  distance_type: :l2, # :l1, :l2, :hamming, :angle, :cosine, :normalized_angle, :normalized_cosine, :jaccard
  path: nil
)
```

## Optimizer Options
Defaults shown below
```ruby
Ngt::Optimizer.new(
  outgoing: 10,
  incoming: 120,
  queries: 100,
  low_accuracy_from: 0.3,
  low_accuracy_to: 0.5,
  high_accuracy_from: 0.8,
  high_accuracy_to: 0.9,
  gt_epsilon: 0.1,
  merge: 0.2
)
```

## Data
Data can be an array of arrays
```ruby
[[1, 2, 3], [4, 5, 6]]
```

Or a Numo array
```ruby
Numo::NArray.cast([[1, 2, 3], [4, 5, 6]])
```

## Resources
- [ANN Benchmarks](https://github.com/erikbern/ann-benchmarks)
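
Because the search is approximate, it can be useful to check results against an exact search on a small sample, in the spirit of the recall figures that ANN Benchmarks reports. The sketch below uses plain Ruby only and does not call the gem. `exact_neighbors` and `recall` are hypothetical helper names, the `approx` ids are stand-in values instead of real search output, and L2 distance is assumed to match the default `distance_type`.

```ruby
# Exact (brute-force) k-nearest neighbors by L2 distance.
# Returns 0-based positions into `objects`, closest first.
# Hypothetical helper, not part of the ngt gem.
def exact_neighbors(objects, query, k)
  objects.each_with_index
         .map { |obj, i| [i, Math.sqrt(obj.zip(query).sum { |a, b| (a - b)**2 })] }
         .min_by(k) { |_, dist| dist }
         .map(&:first)
end

# Fraction of the exact neighbors that also appear in the approximate result.
def recall(approx_ids, exact_ids)
  return 1.0 if exact_ids.empty?

  (approx_ids & exact_ids).size.to_f / exact_ids.size
end

objects = [
  [1, 1, 2, 1],
  [5, 4, 6, 5],
  [1, 2, 1, 2]
]
query = [1, 1, 2, 1]

exact = exact_neighbors(objects, query, 2)
approx = [0, 1] # stand-in for ids taken from an approximate search

puts "exact: #{exact.inspect}, recall: #{recall(approx, exact)}"
```

When comparing against real results, first confirm how the ids returned by the index map to positions in your own array, since that mapping is not described here.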
## Credits
This library is modeled after NGT’s [Python API](https://github.com/yahoojapan/NGT/blob/master/python/README-ngtpy.md).
## History
View the [changelog](https://github.com/ankane/ngt-ruby/blob/master/CHANGELOG.md)
## Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- [Report bugs](https://github.com/ankane/ngt-ruby/issues)
- Fix bugs and [submit pull requests](https://github.com/ankane/ngt-ruby/pulls)
- Write, clarify, or fix documentation
- Suggest or add new features

To get started with development:
```sh
git clone https://github.com/ankane/ngt-ruby.git
cd ngt-ruby
bundle install
bundle exec rake vendor:all
bundle exec rake test
```