{"id":13606148,"url":"https://github.com/localvoid/ndx","last_synced_at":"2025-04-06T12:07:53.718Z","repository":{"id":39004716,"uuid":"71558507","full_name":"localvoid/ndx","owner":"localvoid","description":":mag: Full text indexing and searching library","archived":false,"fork":false,"pushed_at":"2023-03-15T07:40:24.000Z","size":1110,"stargazers_count":155,"open_issues_count":1,"forks_count":11,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-30T11:09:14.667Z","etag":null,"topics":["full-text-search","inverted-index","javascript","search-engine","typescript"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/localvoid.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-10-21T11:13:53.000Z","updated_at":"2025-03-24T12:51:51.000Z","dependencies_parsed_at":"2023-02-03T06:31:41.601Z","dependency_job_id":"a6d5e3e9-a0d1-4fb8-bc53-ed9618814b81","html_url":"https://github.com/localvoid/ndx","commit_stats":null,"previous_names":["ndx-search/ndx"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localvoid%2Fndx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localvoid%2Fndx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localvoid%2Fndx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/localvoid%2Fndx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/localvoid","download_url":"https://codeload.github.com/localvoid/ndx/tar.gz/
refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247478320,"owners_count":20945266,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["full-text-search","inverted-index","javascript","search-engine","typescript"],"created_at":"2024-08-01T19:01:06.528Z","updated_at":"2025-04-06T12:07:53.698Z","avatar_url":"https://github.com/localvoid.png","language":"TypeScript","funding_links":[],"categories":["Text Search"],"sub_categories":["Reactive Programming"],"readme":"# [ndx](https://github.com/ndx-search/ndx) \u0026middot; [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/ndx-search/ndx/blob/master/LICENSE)\n\nLightweight Full-Text Indexing and Searching Library.\n\nThis library was designed for a specific use case where all documents are\nstored on disk (IndexedDB) and can be dynamically added to or removed from an\nindex.\n\nThe query function supports only disjunction operators. Queries like `one two` will\nwork as `\"one\" or \"two\"`.\n\nThe Inverted Index doesn't store term locations, so the query function can't\nsearch for phrases like `\"Super Mario\"`.\n\nThere are many [alternative solutions](https://github.com/leeoniya/uFuzzy#benchmark) with different tradeoffs that may better suit your\nparticular use case. 
For a simple document search with a static dataset, I\nwould recommend using something like [fst](https://github.com/BurntSushi/fst)\nand deploying it as an edge function (wasm).\n\n## Features\n\n- Multi-field full-text indexing and searching.\n- Per-field score boosting.\n- [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) ranking function to rank\nmatching documents.\n- [Trie](https://en.wikipedia.org/wiki/Trie) based dynamic\n[Inverted Index](https://en.wikipedia.org/wiki/Inverted_index).\n- Configurable tokenizer and term filter.\n- Free text queries with query expansion.\n\n## Example\n\n```js\nimport { createIndex, indexAdd, indexRemove, indexVacuum } from \"ndx\";\nimport { indexQuery } from \"ndx/query\";\n\nconst termFilter = (term) =\u003e term.toLowerCase();\n\nfunction createDocumentIndex(fields) {\n  // `createIndex()` creates an index data structure.\n  // First argument specifies how many different fields we want to index.\n  const index = createIndex(\n    fields.length,\n    // Tokenizer is a function that breaks text into words, phrases, symbols,\n    // or other meaningful elements called tokens.\n    (s) =\u003e s.split(\" \"),\n    // Filter is a function that processes tokens and returns terms; terms are\n    // used in the Inverted Index to index documents.\n    termFilter,\n  );\n  // `fieldGetters` is an array with functions that will be used to retrieve\n  // data from different fields.\n  const fieldGetters = fields.map((f) =\u003e (doc) =\u003e doc[f.name]);\n  // `fieldBoostFactors` is an array of boost factors for each field, in this\n  // example all fields will have identical weight.\n  const fieldBoostFactors = fields.map(() =\u003e 1);\n  // `removed` is a set of document keys that are marked as removed but still\n  // referenced by the index.\n  const removed = new Set();\n\n  return {\n    index,\n    // `add()` will add documents to the index.\n    add(doc) {\n      indexAdd(\n        index,\n        fieldGetters,\n        // Document key, it can be a unique document id or a reference to a\n        // document if you want to store all documents in memory.\n        doc.id,\n        // 
Document.\n        doc,\n      );\n    },\n    // `remove()` will remove documents from the index.\n    remove(id) {\n      // When a document is removed, we just mark its id as removed. The index\n      // data structure still contains references to the removed\n      // document.\n      indexRemove(index, removed, id);\n      if (removed.size \u003e 10) {\n        // `indexVacuum()` removes all references to removed documents from the\n        // index.\n        indexVacuum(index, removed);\n      }\n    },\n\n    // `search()` will be used to perform queries.\n    search(q) {\n      return indexQuery(\n        index,\n        fieldBoostFactors,\n        // BM25 ranking function constants:\n        // BM25 k1 constant, controls non-linear term frequency normalization\n        // (saturation).\n        1.2,\n        // BM25 b constant, controls to what degree document length normalizes\n        // tf values.\n        0.75,\n        q,\n      );\n    }\n  };\n}\n\n// Create a document index that will index `content` field.\nconst index = createDocumentIndex([{ name: \"content\" }]);\n\nconst docs = [\n  {\n    \"id\": \"1\",\n    \"content\": \"Lorem ipsum dolor\",\n  },\n  {\n    \"id\": \"2\",\n    \"content\": \"Lorem ipsum\",\n  }\n];\n\n// Add documents to the index.\ndocs.forEach((d) =\u003e { index.add(d); });\n\n// Perform a search query.\nindex.search(\"Lorem\");\n// =\u003e [{ key: \"2\" , score: ... }, { key: \"1\", score: ... } ]\n//\n// The document with id `\"2\"` is ranked higher because its `\"content\"`\n// field has fewer terms than that of the document with id `\"1\"`.\n\nindex.search(\"dolor\");\n// =\u003e [{ key: \"1\", score: ... }]\n```\n\n### Tokenizers and Filters\n\nThe `ndx` library doesn't provide any tokenizers or filters. 
There are other\nlibraries that implement tokenizers, for example\n[Natural](https://github.com/NaturalNode/natural/) has a good collection of\ntokenizers and stemmers.\n\n## License\n\n[MIT](http://opensource.org/licenses/MIT)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocalvoid%2Fndx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flocalvoid%2Fndx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocalvoid%2Fndx/lists"}