{"id":18581926,"url":"https://github.com/coder/hnsw","last_synced_at":"2025-04-10T11:35:41.572Z","repository":{"id":238923611,"uuid":"796750256","full_name":"coder/hnsw","owner":"coder","description":"In-memory vector index for Go","archived":false,"fork":false,"pushed_at":"2024-06-14T21:11:20.000Z","size":48,"stargazers_count":131,"open_issues_count":2,"forks_count":10,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-02T22:35:29.069Z","etag":null,"topics":["ai","faiss","go","golang","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coder.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-06T15:00:21.000Z","updated_at":"2025-03-27T18:23:19.000Z","dependencies_parsed_at":"2024-05-22T19:40:14.963Z","dependency_job_id":null,"html_url":"https://github.com/coder/hnsw","commit_stats":null,"previous_names":["coder/hnsw"],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coder%2Fhnsw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coder%2Fhnsw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coder%2Fhnsw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coder%2Fhnsw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coder","download_url":"https://codeload.github.com/coder/hnsw/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https:/
/github.com","kind":"github","repositories_count":248208688,"owners_count":21065205,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","faiss","go","golang","vector-database"],"created_at":"2024-11-07T00:08:23.791Z","updated_at":"2025-04-10T11:35:36.534Z","avatar_url":"https://github.com/coder.png","language":"Go","readme":"# hnsw\n[![GoDoc](https://godoc.org/github.com/golang/gddo?status.svg)](https://pkg.go.dev/github.com/coder/hnsw@main?utm_source=godoc)\n![Go workflow status](https://github.com/coder/hnsw/actions/workflows/go.yaml/badge.svg)\n\n\n\nPackage `hnsw` implements Hierarchical Navigable Small World graphs in Go. You\ncan read up about how they work [here](https://www.pinecone.io/learn/series/faiss/hnsw/). In essence,\nthey allow for fast approximate nearest neighbor searches with high-dimensional\nvector data.\n\nThis package can be thought of as an in-memory alternative to your favorite \nvector database (e.g. Pinecone, Weaviate). 
It implements just the essential\noperations:\n\n| Operation | Complexity             | Description                                  |\n| --------- | ---------------------- | -------------------------------------------- |\n| Insert    | $O(\\log(n))$           | Insert a vector into the graph               |\n| Delete    | $O(M^2 \\cdot \\log(n))$ | Delete a vector from the graph               |\n| Search    | $O(\\log(n))$           | Search for the nearest neighbors of a vector |\n| Lookup    | $O(1)$                 | Retrieve a vector by ID                      |\n\n\u003e [!NOTE]\n\u003e Complexities are approximate. Here $n$ is the number of vectors in the graph\n\u003e and $M$ is the maximum number of neighbors each node can have. This [paper](https://arxiv.org/pdf/1603.09320) is a good resource for understanding the effect of\n\u003e the various construction parameters.\n\n## Usage\n\n```\ngo get github.com/coder/hnsw@main\n```\n\n```go\ng := hnsw.NewGraph[int]()\ng.Add(\n    hnsw.MakeNode(1, []float32{1, 1, 1}),\n    hnsw.MakeNode(2, []float32{1, -1, 0.999}),\n    hnsw.MakeNode(3, []float32{1, 0, -0.5}),\n)\n\nneighbors := g.Search(\n    []float32{0.5, 0.5, 0.5},\n    1,\n)\nfmt.Printf(\"best friend: %v\\n\", neighbors[0].Vec)\n// Output: best friend: [1 1 1]\n```\n\n\n\n## Persistence\n\nWhile all graph operations are in-memory, `hnsw` provides facilities for loading from and saving to persistent storage.\n\nFor an `io.Reader`/`io.Writer` interface, use `Graph.Export` and `Graph.Import`.\n\nIf you're using a single file as the backend, `hnsw` provides a convenient `SavedGraph` type instead:\n\n```go\npath := \"some.graph\"\ng1, err := hnsw.LoadSavedGraph[int](path)\nif err != nil {\n    panic(err)\n}\n// Insert some vectors\nfor i := 0; i \u003c 128; i++ {\n    g1.Add(hnsw.MakeNode(i, []float32{float32(i)}))\n}\n\n// Save to disk\nerr = g1.Save()\nif err != nil {\n    panic(err)\n}\n\n// Later...\n// g2 is a copy of g1\ng2, err := hnsw.LoadSavedGraph[int](path)\nif err != nil {\n    
panic(err)\n}\n```\n\nSee more:\n* [Export](https://pkg.go.dev/github.com/coder/hnsw#Graph.Export)\n* [Import](https://pkg.go.dev/github.com/coder/hnsw#Graph.Import)\n* [SavedGraph](https://pkg.go.dev/github.com/coder/hnsw#SavedGraph)\n\nWe use a fast binary encoding for the graph, so you can expect to save/load\nnearly at disk speed. On my M3 MacBook, I get these benchmark results:\n\n```\ngoos: darwin\ngoarch: arm64\npkg: github.com/coder/hnsw\nBenchmarkGraph_Import-16            4029            259927 ns/op         796.85 MB/s      496022 B/op       3212 allocs/op\nBenchmarkGraph_Export-16            7042            168028 ns/op        1232.49 MB/s      239886 B/op       2388 allocs/op\nPASS\nok      github.com/coder/hnsw   2.624s\n```\n\nwhen saving/loading a graph of 100 vectors with 256 dimensions.\n\n## Performance\n\nBy and large, the greatest effect you can have on the performance of the graph\nis reducing the dimensionality of your data. At 1536 dimensions (OpenAI default),\n70% of the query process under default parameters is spent in the distance function.\n\nIf you're struggling with slowness or latency, consider:\n* Reducing dimensionality\n* Increasing $M$\n\nAnd, if you're struggling with excess memory usage, consider:\n* Reducing $M$, a.k.a. `Graph.M` (the maximum number of neighbors each node can have)\n* Reducing $m_L$, a.k.a. `Graph.Ml` (the level generation parameter)\n\n## Memory Overhead\n\nThe memory overhead of a graph looks like:\n\n$$\n\\displaylines{\nmem_{graph} = n \\cdot \\log(n) \\cdot \\text{size(key)} \\cdot M \\\\\nmem_{base} = n \\cdot d \\cdot 4 \\\\\nmem_{total} = mem_{graph} + mem_{base}\n}\n$$\n\nwhere:\n* $n$ is the number of vectors in the graph\n* $\\text{size(key)}$ is the average size of the key in bytes\n* $M$ is the maximum number of neighbors each node can have\n* $d$ is the dimensionality of the vectors\n* $4$ is the size in bytes of each `float32` vector component\n* $mem_{graph}$ is the memory used by the graph structure across all layers\n* $mem_{base}$ is the memory used by the 
vectors themselves in the base or 0th layer\n\nYou can infer that:\n* Connectivity ($M$) is very expensive if keys are large\n* If $d \\cdot 4$ is far larger than $M \\cdot \\text{size(key)}$, you should expect memory usage to grow linearly and to be dominated by vector data\n* If $d \\cdot 4$ is far smaller than $M \\cdot \\text{size(key)}$, you should expect memory usage to grow as $n \\cdot \\log(n)$ and to be dominated by graph structure\n\nFor example, in a graph with 256 dimensions, $M = 16$, and 8-byte keys, each vector takes:\n\n* $256 \\cdot 4 = 1024$ data bytes\n* $16 \\cdot 8 = 128$ metadata bytes\n\nand memory growth is mostly linear.\n","funding_links":[],"categories":["Sdks \u0026 Libraries"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoder%2Fhnsw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoder%2Fhnsw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoder%2Fhnsw/lists"}