{"id":17873929,"url":"https://github.com/kelindar/search","last_synced_at":"2025-05-16T06:04:58.383Z","repository":{"id":259725494,"uuid":"864976834","full_name":"kelindar/search","owner":"kelindar","description":"Go library for embedded vector search and semantic embeddings using llama.cpp","archived":false,"fork":false,"pushed_at":"2025-03-07T19:34:37.000Z","size":731,"stargazers_count":445,"open_issues_count":3,"forks_count":15,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-15T18:56:33.537Z","etag":null,"topics":["ai","bert","embeddings","gguf","gpu","llamacpp","search-engine","semantic-search","simd","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kelindar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["kelindar"]}},"created_at":"2024-09-29T17:05:04.000Z","updated_at":"2025-05-14T11:15:03.000Z","dependencies_parsed_at":"2024-10-27T16:24:37.156Z","dependency_job_id":"6e0e9231-92bf-4afe-bf50-14b690445883","html_url":"https://github.com/kelindar/search","commit_stats":null,"previous_names":["kelindar/search"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kelindar%2Fsearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kelindar%2Fsearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kelindar%2Fsearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/kelindar%2Fsearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kelindar","download_url":"https://codeload.github.com/kelindar/search/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254478187,"owners_count":22077676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","bert","embeddings","gguf","gpu","llamacpp","search-engine","semantic-search","simd","vector-search"],"created_at":"2024-10-28T11:06:00.224Z","updated_at":"2025-05-16T06:04:58.352Z","avatar_url":"https://github.com/kelindar.png","language":"Go","funding_links":["https://github.com/sponsors/kelindar"],"categories":["*Ops for AI"],"sub_categories":["Model Serving \u0026 Inference"],"readme":"\u003cp align=\"center\"\u003e\n\u003cimg width=\"330\" height=\"110\" src=\".github/logo.png\" border=\"0\" alt=\"kelindar/search\"\u003e\n\u003cbr\u003e\n\u003cimg src=\"https://img.shields.io/github/go-mod/go-version/kelindar/search\" alt=\"Go Version\"\u003e\n\u003ca href=\"https://pkg.go.dev/github.com/kelindar/search\"\u003e\u003cimg src=\"https://pkg.go.dev/badge/github.com/kelindar/search\" alt=\"PkgGoDev\"\u003e\u003c/a\u003e\n\u003ca href=\"https://goreportcard.com/report/github.com/kelindar/search\"\u003e\u003cimg src=\"https://goreportcard.com/badge/github.com/kelindar/search\" alt=\"Go Report Card\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opensource.org/licenses/MIT\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-MIT-blue.svg\" alt=\"License\"\u003e\u003c/a\u003e\n\u003ca 
href=\"https://coveralls.io/github/kelindar/search\"\u003e\u003cimg src=\"https://coveralls.io/repos/github/kelindar/search/badge.svg\" alt=\"Coverage\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n# Semantic Search\n\nThis library was created to provide an **easy and efficient solution for embedding and vector search**, making it perfect for small to medium-scale projects that still need some **serious semantic power**. It’s built around a simple idea: if your dataset is small enough, you can achieve accurate results with brute-force techniques, and with some smart optimizations like **SIMD**, you can keep things fast and lean.\n\nThe library’s strength lies in its simplicity and support for **GGUF BERT models**, letting you leverage sophisticated embeddings without getting bogged down by the complexities of traditional search systems. It offers **GPU acceleration**, enabling quick computations on supported hardware. If your dataset has fewer than 100,000 entries, this library is a great fit for integrating semantic search into your Go applications with minimal hassle.\n\n![demo](./.github/demo.gif)\n\n## 🚀 Key Features\n\n- **llama.cpp without cgo**: The library is built to work with [llama.cpp](https://github.com/ggerganov/llama.cpp) without using cgo. Instead, it relies on [purego](https://github.com/ebitengine/purego), which allows calling shared C libraries directly from Go code without the need for cgo. This design significantly simplifies integration, deployment, and cross-compilation, making it easier to build Go applications that interface with native libraries.\n- **Support for BERT Models**: The library supports BERT models via [llama.cpp](https://github.com/ggerganov/llama.cpp/pull/5423). A wide variety of BERT models can be used, as long as they are in GGUF format.\n- **Precompiled Binaries with Vulkan GPU Support**: Available for Windows and Linux in the [dist](dist) directory, compiled with Vulkan for GPU acceleration. 
However, you can compile the library yourself with or without GPU support.\n- **Search Index for Embeddings**: The library supports the creation of a search index from computed embeddings, which can be saved to disk and loaded later. This feature is suitable for basic vector-based searches in small-scale applications, but it may face efficiency challenges with large datasets due to the use of brute-force techniques.\n\n## 🤔 Limitations\n\nWhile simple vector search excels in small-scale applications, avoid using this library if you have any of the following requirements.\n\n- **Large Datasets**: The current implementation is designed for small-scale applications, and datasets exceeding 100,000 entries may suffer from performance bottlenecks due to the brute-force search approach. For larger datasets, approximate nearest neighbor (ANN) algorithms and specialized data structures should be considered for efficiency.\n- **Complex Query Requirements**: The library focuses on simple vector similarity search and does not support advanced query capabilities like multi-field filtering, fuzzy matching, or SQL-like operations that are common in more sophisticated search engines.\n- **High-Dimensional Complex Embeddings**: Large language models (LLMs) generate embeddings that are both high-dimensional and computationally intensive. Handling these embeddings in real-time can be taxing on the system unless sufficient GPU resources are available and optimized for low-latency inference.\n\n## 📚 How to Use the Library\n\nThis example demonstrates how to use the library to generate embeddings for text and perform a simple vector search. The code snippets below show how to load a model, generate embeddings for text, create a search index, and perform a search.\n\n1. **Install library**: Precompiled binaries for Windows and Linux are provided in the [dist](dist) directory. 
If your target architecture or platform isn't covered by these binaries, you'll need to compile the library from source. Drop these binaries in `/usr/lib` or equivalent.\n\n2. **Load a model**: The `search.NewVectorizer` function initializes a model using a GGUF file. This example loads the _MiniLM-L6-v2.Q8_0.gguf_ model. The second parameter indicates the number of GPU layers to enable (0 for CPU only).\n\n```go\nm, err := search.NewVectorizer(\"../dist/MiniLM-L6-v2.Q8_0.gguf\", 0)\nif err != nil {\n    // handle error\n}\ndefer m.Close()\n```\n\n3. **Generate text embeddings**: The `EmbedText` method is used to generate vector embeddings for a given text input. This converts your text into a dense numerical vector representation using the model you loaded in the previous step.\n\n```go\nembedding, err := m.EmbedText(\"Your text here\")\n```\n\n4. **Create an index and add vectors**: Create a new index using `search.NewIndex`. The type parameter `[string]` in this example specifies that each vector is associated with a string value. You can add multiple vectors with corresponding labels.\n\n```go\nindex := search.NewIndex[string]()\nindex.Add(embedding, \"Your text here\")\n```\n\n5. **Search the index**: Perform a search using the `Search` method, which takes an embedding vector and a number of results to retrieve. This example searches for the 10 most relevant results and prints them along with their relevance scores.\n\n```go\nresults := index.Search(embedding, 10)\nfor _, r := range results {\n    fmt.Printf(\"Result: %s (Relevance: %.2f)\\n\", r.Value, r.Relevance)\n}\n```\n\n## 🛠 Compile library\n\nFirst, clone the repository, then fetch its submodules and Git LFS files with the following commands. The `git submodule update --init --recursive` command pulls in the `ggml` submodule, a library for matrix operations, and `git lfs pull` downloads the files stored in Git LFS.\n\n```bash\ngit submodule update --init --recursive\ngit lfs pull\n```\n\n### Compile on Linux\n\nMake sure you have a C/C++ compiler and CMake installed. 
For Ubuntu, you can install them with the following commands:\n\n```bash\nsudo apt-get update\nsudo apt-get install build-essential cmake\n```\n\nThen you can compile the library with the following commands:\n\n```bash\nmkdir build \u0026\u0026 cd build\ncmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=gcc ..\ncmake --build . --config Release\n```\n\nThis should generate `libllama_go.so` that statically links everything necessary. You can also install the library by copying it into `/usr/lib`.\n\n### Compile on Windows\n\nMake sure you have a C/C++ compiler and CMake installed. For Windows, a simple option is to use [Build Tools for Visual Studio](https://visualstudio.microsoft.com/downloads/) (make sure CLI tools are included) and [CMake](https://cmake.org/download/).\n\n```bash\nmkdir build \u0026\u0026 cd build\ncmake -DCMAKE_BUILD_TYPE=Release ..\ncmake --build . --config Release\n```\n\nIf you are using Visual Studio, solution files are generated. You can open the solution file with Visual Studio and build the project from there. The `bin` directory would then contain `llamago.dll`.\n\n### GPU and other options\n\nTo enable GPU support (e.g. Vulkan), you'll need to add the appropriate flag to the CMake command. Refer to the [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#vulkan) build documentation for more details. For example, to compile with Vulkan support on Windows, make sure the Vulkan SDK is installed and then run the following commands:\n\n```bash\nmkdir build \u0026\u0026 cd build\ncmake -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON ..\ncmake --build . --config Release\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkelindar%2Fsearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkelindar%2Fsearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkelindar%2Fsearch/lists"}