{"id":16618577,"url":"https://github.com/brentp/nim-lapper","last_synced_at":"2025-04-14T04:10:13.540Z","repository":{"id":66472207,"uuid":"110996836","full_name":"brentp/nim-lapper","owner":"brentp","description":"fast easy interval overlapping for nim-lang","archived":false,"fork":false,"pushed_at":"2021-10-27T21:11:11.000Z","size":29,"stargazers_count":27,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-12T04:51:24.584Z","etag":null,"topics":["interval","nim","nim-lang","overlap","search"],"latest_commit_sha":null,"homepage":"https://brentp.github.io/nim-lapper/index.html","language":"Nim","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brentp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-11-16T16:43:23.000Z","updated_at":"2025-01-30T03:04:45.000Z","dependencies_parsed_at":"2023-02-25T14:16:27.483Z","dependency_job_id":null,"html_url":"https://github.com/brentp/nim-lapper","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fnim-lapper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fnim-lapper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fnim-lapper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fnim-lapper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bre
ntp","download_url":"https://codeload.github.com/brentp/nim-lapper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248819404,"owners_count":21166477,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["interval","nim","nim-lang","overlap","search"],"created_at":"2024-10-12T02:20:43.892Z","updated_at":"2025-04-14T04:10:13.519Z","avatar_url":"https://github.com/brentp.png","language":"Nim","funding_links":[],"categories":[],"sub_categories":[],"readme":"simple, fast interval searches for nim\n\nThis uses a binary search in a sorted list of intervals along with knowledge of the longest interval.\nIt works when the size of the largest interval is smaller than the average distance between intervals.\nAs that ratio of largest-size::mean-distance increases, the performance decreases.\nOn realistic (for my use-case) data, this is 1000 times faster to query results and \u003e5000\ntimes faster to check for presence than a brute-force method. \n\nLapper also has a special-case `seek` method for when we know that the queries will be in order.\nThis method uses a cursor to indicate the start of the last search and does a linear search\nfrom that cursor to find matching intervals. 
This gives an additional 2-fold speedup over\nthe `find` method.\n\nAPI docs and examples in `nim-doc` format are available [here](https://brentp.github.io/nim-lapper/index.html)\n\nSee the `Performance` section for how large the intervals can be and still get a performance\nbenefit.\n\nTo use this, it's simply required that your type have a `start(m) int` and `stop(m) int` method to satisfy\nthe [concept](https://nim-lang.org/docs/manual.html#generics-concepts) used by `Lapper`\n\nYou can install this with `nimble install lapper`.\n\n## Example\n\n```nim\nimport lapper\nimport strutils\n\n# define an appropriate data-type. it must have a `start(m) int` and `stop(m) int` method.\n#type myinterval = tuple[start:int, stop:int, val:int]\n# if we want to modify the result, then we have to use a ref object type\ntype myinterval = ref object\n  start: int\n  stop: int\n  val: int\n\nproc start(m: myinterval): int {.inline.} = return m.start\nproc stop(m: myinterval): int {.inline.} = return m.stop\nproc `$`(m:myinterval): string = return \"(start:$#, stop:$#, val:$#)\" % [$m.start, $m.stop, $m.val]\n\n# create some fake data\nvar ivs = new_seq[myinterval]()\nfor i in countup(0, 100, 10):\n  ivs.add(myinterval(start:i, stop:i + 15, val:0))\n\n# make the Lapper \"data-structure\"\nvar l = lapify(ivs)\nvar empty:seq[myinterval]\n\nassert l.find(10, 20, empty)\nvar notfound = not l.find(200, 300, empty)\nassert notfound\n\nvar res = new_seq[myinterval]()\n\n# find is the more general case, l.seek gives a speed benefit when consecutive queries are in order.\necho l.find(50, 70, res)\necho res\n# @[(start: 40, stop: 55, val:0), (start: 50, stop: 65, val: 0), (start: 60, stop: 75, val: 0), (start: 70, stop: 85, val: 0)]\nfor r in res:\n  r.val += 1\n\n# or we can do a function on each overlapping interval\nl.each_seek(50, 60, proc(a:myinterval) = inc(a.val))\n# or\nl.each_find(50, 60, proc(a:myinterval) = a.val += 10)\n\ndiscard l.seek(50, 70, res)\necho res\n#@[(start:40, 
stop:55, val:12), (start:50, stop:65, val:12), (start:60, stop:75, val:1)]\n\n```\n\n\n## Performance\n\nThe output of running `bench.nim` (with -d:release) which generates *200K intervals*\nwith positions ranging from 0 to 50 million and max lengths from 10 to 1M is:\n\n| max interval size | lapper time | lapper seek time | brute-force time | speedup | seek speedup | each-seek speedup |\n| ----------------- | ----------- | ---------------- | ---------------  | ------- | ------------ | ----------------- |\n|10|0.06|0.04|387.44|6983.81|9873.11|9681.66|\n|100|0.05|0.04|384.92|7344.32|10412.97|15200.84|\n|1000|0.06|0.05|375.37|6250.23|7942.50|15703.24|\n|10000|0.15|0.14|377.29|2554.61|2702.13|15942.76|\n|100000|0.99|0.99|377.88|383.36|381.37|16241.61|\n|1000000|12.52|12.53|425.61|34.01|33.96|17762.58|\n\nNote that this is a worst-case scenario as we could also \nsimulate a case where there are only a few long intervals instead of\nmany long ones, as in this case. Even so, we get a 34X speedup with `lapper`.\n\nAlso note that testing for presence will be even faster than\nthe above comparisons as it returns true as soon as an overlap is found.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrentp%2Fnim-lapper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrentp%2Fnim-lapper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrentp%2Fnim-lapper/lists"}