{"id":17166360,"url":"https://github.com/andreaferretti/spills","last_synced_at":"2025-04-09T16:17:46.740Z","repository":{"id":66316547,"uuid":"59769874","full_name":"andreaferretti/spills","owner":"andreaferretti","description":"Disk-based sequences","archived":false,"fork":false,"pushed_at":"2018-11-07T14:01:56.000Z","size":26,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-09T16:17:37.489Z","etag":null,"topics":["disk-based","nim","sequence"],"latest_commit_sha":null,"homepage":"http://andreaferretti.github.io/spills","language":"Nim","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andreaferretti.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-05-26T17:29:33.000Z","updated_at":"2025-02-25T22:50:33.000Z","dependencies_parsed_at":"2023-03-22T07:48:11.697Z","dependency_job_id":null,"html_url":"https://github.com/andreaferretti/spills","commit_stats":{"total_commits":27,"total_committers":1,"mean_commits":27.0,"dds":0.0,"last_synced_commit":"1d29c86654ac9568ff5c9a01945d4203037a0e98"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaferretti%2Fspills","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaferretti%2Fspills/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaferretti%2Fspills/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/andreaferretti%2Fspills/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andreaferretti","download_url":"https://codeload.github.com/andreaferretti/spills/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248065282,"owners_count":21041872,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["disk-based","nim","sequence"],"created_at":"2024-10-14T23:05:23.304Z","updated_at":"2025-04-09T16:17:46.708Z","avatar_url":"https://github.com/andreaferretti.png","language":"Nim","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spills\n\nSpills are sequences that spill to disk when they do not fit in memory. They\nare simply represented by a memory-mapped file, hence they are to be used with\ntypes that are flat - that is, they are obtained by combining primitive types,\nobjects and arrays, but do not involve seqs, strings or references to heap\nmemory. In short, you should be able to compute their size statically.\n\nSpills work in two modes:\n\n* writable spills wrap a stream, and one can `add` to them, which just amounts\n  to writing to the stream;\n* normal spills are fixed-size. You can read and write their elements, iterate\n  over them and so on, but they cannot grow.\n\nUsually, one first populates a writable spill, then obtains the corresponding\nspill from that, and works from there.\n\nAn example:\n\n```nim\nimport spills\n\ntype Foo = object\n  a, b: int\n  c: float\n\ninitSpills()\n\nvar x = writableSpill[Foo]()\nfor i in 0 .. 
1000000:\n  x.add(Foo(a: i, b: i + 1, c: i.float))\nx.close()\n\nvar y = spill(x)\necho y\necho y[1234]\n\nvar z = y.map(proc(f: Foo): float = f.c)\n\necho z[1234]\n\ny.close()\nz.close()\n```\n\nTo avoid breaking with empty spills, the library always creates spills with a\nmagic number header, so that even an empty spill does not correspond to an\nempty file. To read files without this header (perhaps written by some external\ntool) one can do something like\n\n```nim\nvar y = spill[char](\"some file\", hasHeader = false)\n...\ny.close()\n```\n\n## Managing resources\n\nSince spills are associated with files, there are two concerns:\n\n* closing streams and other objects to make sure that changes to disk are\n  flushed and resources released;\n* removing intermediate temporary files.\n\nSpills are written to a temporary directory by default. To set this directory\nand create it, call `initSpills(dir)`. Just calling `initSpills()` will use a\ndefault directory of `/tmp/spills`.\n\nEvery method that creates a new spill object optionally accepts a path parameter.\nIf this parameter is missing, files are created in the temporary directory.\nAt the end, you can call `destroySpills()` to remove the files generated in this\ndirectory. In this way, you can choose which files to persist across sessions,\nand which ones to remove.\n\nFinally, spills (both simple and writable) have a `close` method that will unmap\nthe file from memory (respectively, close the associated stream).\n\n## Sequence operations\n\nSpills support a few standard sequence operations. Other than reading and writing\nsingle items, there are `map`, `filter`, `foldl` and `foldr`. 
These work like the\ncorresponding operations in `sequtils`, except that `map` and `filter` optionally\ntake a path parameter.\n\nThere are also functions `toSpill[T](s: seq[T]): Spill[T]` and\n`toSeq[T](s: Spill[T]): seq[T]` to convert back and forth from sequences.\n\n## Strings\n\nStrings are variable-length, and as such cannot be stored in spills. Since\nthey are quite a common type, we provide a `VarChar[N]` type under `spills/varchar`.\n\n`VarChar[N]` is a wrapper over an array of `N` chars and a length field. If you\nknow beforehand that none of your strings will be longer than `N`, you can use\nit instead. One can convert back and forth using\n\n```nim\nimport spills/varchar\n\nlet\n  a = \"Hello, world\"\n  b = a.varchar(15)\n  c = $b\n\nassert a == c\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandreaferretti%2Fspills","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandreaferretti%2Fspills","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandreaferretti%2Fspills/lists"}