{"id":16762426,"url":"https://github.com/tlack/atree","last_synced_at":"2025-04-06T21:15:21.607Z","repository":{"id":43998782,"uuid":"82355824","full_name":"tlack/atree","owner":"tlack","description":"Stevan Apter-style trees in C++17","archived":false,"fork":false,"pushed_at":"2023-12-17T11:12:31.000Z","size":16,"stargazers_count":370,"open_issues_count":1,"forks_count":9,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-30T19:08:14.167Z","etag":null,"topics":["apl","data-structures","kdb","tree","vector"],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tlack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-18T02:36:43.000Z","updated_at":"2024-12-03T22:21:32.000Z","dependencies_parsed_at":"2024-10-26T21:12:25.913Z","dependency_job_id":"8b4bd0d4-567c-4b8f-a2cf-0f97d85ebc1f","html_url":"https://github.com/tlack/atree","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tlack%2Fatree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tlack%2Fatree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tlack%2Fatree/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tlack%2Fatree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tlack","download_url":"https://codeload.github.com/tlack/atree/tar.gz/refs/heads/master","host":{"
name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247550691,"owners_count":20956987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apl","data-structures","kdb","tree","vector"],"created_at":"2024-10-13T04:44:43.742Z","updated_at":"2025-04-06T21:15:21.583Z","avatar_url":"https://github.com/tlack.png","language":"C++","readme":"# Apter Trees in C++\n\nApter Trees are a simpler representation of trees using just two vectors: `[nodevalues,\nparentindices]`.\n\nThis repo contains a tree-like data type implemented in C++17, in the style of Stevan Apter in \n[Treetable: a case-study in q](http://archive.vector.org.uk/art10500340).\n\n## Who cares?\n\nA tree is a data structure in which values have parent-child relationships to\neach other. They come in many forms.\n\nIn most software, trees are implemented like a typical binary tree, where each\nnode contains its own data and a pointer to each of its children, nominally just\nleft and right, which are also nodes. The cycle continues.\n\nUsing such a data structure can be challenging due to recursion and slow due to\ncache behavior in modern systems and frequent malloc()s. The concept of who\n\"owns\" a tree node in such a system can become complex in multi-layered\nsoftware.\n\nApter Trees are much faster, easier to reason about, and easier to implement.\n\n## How it works\n\nAn Apter tree is implemented as two same-sized arrays.\n\nOne is a vector (array) of data (we'll call it `d`). These correspond to the\nvalues, or things that each node contains.\n\nThe other is a vector of parent indices (`p`). 
The index of an item in the `d`\nvector is used as its key, which we will call `c` in the examples below. \n\nOften, the key/index `c` will just be an int. \n\nSo, if we had a dog family tree in which Coco was the father of Molly and Arca,\nand Arca had a son named Cricket, you might have a data structure like:\n\n```\n\ttree.d = [\"Coco\", \"Molly\", \"Arca\",\"Cricket\"]\n\ttree.p = [0,0,0,2]\n```\n\nA node with a key of `0` whose parent is zero is the root node. Apter trees\nrequire a root node, or the use of `-1` to mean \"no parent\", which is slightly\nless elegant so I'll ignore it.\n\nComputers are very, very fast at manipulating vectors. They're so much faster\nthan pointer operations that big-O comparisons between algorithms often\ndon't play out in practice. \n\n## Operations in pseudocode\n\nThe technique is applicable in all languages.  This library is written in C++\nbut I will use pseudocode to explain how it works.\n\n* Empty tree\n\n```\ntree() = { {d:[], p:[]} }       # some sort of [data,parentidxs] vector\n```\n\n* Number of nodes\n\n```\nnodecnt(t) = { len(t.p) }\n```\n\n* Keys of all nodes\n\n```\njoin(x,y) = { flatten(x,y) }    # append to vec. 
i.e., x.push_back(y), x[]=y, etc.\nrange(x,y) = { # Q til, APL/C++ iota; return [x, x+1, x+2, ...y-1]\n\ti=x; ret=[]; while(i\u003cy) { ret=join(ret,i); i=i+1 }; return ret\n}\nkeys(t) = { range(0, nodecnt(t)) }\n```\n\n* Add an item to the tree:\n\n```\ninsert(t, val, parentidx) = {\n\tt.d = join(t.d,val)\n\tt.p = join(t.p,parentidx)\n}\n```\n\n* Determine parent of a given node:\n\nRemember, we use the numeric index of a node (`childidx`) as its identifier:\n\n```\nparentof(t,childidx) = { t.p[childidx] }\n```\n\n* Retrieve value of node:\n\nWe'll use `c` instead of `childidx`, from here on out.\n\n```\ndata(t,c) = { t.d[c] }\n```\n\n* Scan for keys that have a given value:\n\n```\nwhere(vec,val) = { # return indices of vec that contain val\n\tmatches=[]\n\tfor idx,v in vec: if(v==val, {matches=join(matches,idx)})\n\treturn matches\n}\nsearch(t,val) = { where(t.d,val) }\n```\n\n* Determine children of a node:\n\n```\nchildnodes(t,c) = { where(t.p,c) }\n```\n\n* Retrieve child nodes' values\n\n```\n# We're assuming you can index with a vector; otherwise loop required\nchilddata(t,c) = { data(t, childnodes(t,c)) }\n```\n\n* Determine leaf nodes (those with no children):\n\nFirst, build a vector of all the indices. Then remove those indices that are\nalso in `p`. The pseudocode below is a slow implementation; it should be done as a \nsingle loop.\n\n```\nexcept(x,y) = { # return elements of x except those in y. set subtraction\n\tret=[]; \n\tfor idx,xx in x: if(!(xx in y), {ret=join(ret,xx)}); \n\treturn ret;\n}\nleaves(t) = { except(keys(t), t.p) }\n```\n\n* Determine vector of parents for a given node, or path to node:\n\nHere we keep going up the tree until we can't go any further (ends at 0). Because node zero's\nparent is zero, the value stops changing once we reach the root node, which is why the\n\"checking last value\" trick works. We call this form of iteration `exhaust`. It's called `scan` in K and Q. 
\n\nWe reverse the result so that it is in `parentA.parentB.parentC.child` order.\n\n```\nexhaust(vec,start) = {\n\tret=[]; last=x=start\n\tdo {\n\t\tlast=x\n\t\tret=join(ret,x)\n\t\tx=vec[x]\n\t} until x==last\n\treturn ret\n}\nparentnodes(t,c) = { reverse(exhaust(t.p, c)) }\n```\n\n* Determine data for path through tree (i.e., all parents of a node)\n\n```\nparentdata(t,c) = { data(t, parentnodes(t,c)) }\n```\n\n* In order traversal\n\nThe simplest way is to get the list of leaves, and then determine the path to each. We finally\nsort those vectors.\n\nI believe there is a simpler way that can work in a single pass, or at least a single pass\nwith a sufficiently large stack. I'm still exploring this idea. Let me know if you have any\nsuggestions.\n\nWe're assuming a fairly flexible sort function here which can handle sorting vectors of vectors.\n\n```\nall(t) = { sort(each(leaves(t), (c){ parentnodes(t, c) })) }\n```\n\n* Delete item \n\nThis can vary depending on your application. A sentinel value like `MAXINT` in\nthe parent column is probably easiest. Some systems use `-1` to represent an\nempty node if you can spare the sign bit.\n\n## Again, who cares? (Unfounded editorializing)\n\nI think this is the most elegant implementation style of trees I've seen. \n\nGiven the right vector operations library, it's by far the shortest, which\nmeans you can easily understand it and find bugs in it. \n\nGiven the simplicity, it's easy to adapt for other usage scenarios. For\ninstance, you could maintain a third index of keys to create low overhead\nsorted order for data. \n\nYou can ignore the parent index vector and iterate\nquickly through the values if you are searching for something, which is like a\ndeep map, for free. You can remap all parent-child relationships in one go. You\ncan build a serialized version instantly or transmit it over a network without\niteration. It's GPU friendly. It's easy to use in an embedded context. 
It's\nsecure because you can easily impose boundaries (by not allowing the vectors\nto grow beyond a certain size).\n\nGiven common sense it's also the fastest. There's very little memory overhead,\nand in many cases, access will be linear. Intermediate values and recursion are\nminimized. Computers are great at handling ints cuz that's what the benchmarks\ndo.\n\nPointers are annoying anyway.\n\n## Origins\n\nI've been lazily trying to figure out who invented this technique. It's so\nobvious I would imagine it must have had a name during the vector-oriented 60s\nand 70s.\n\nThe first full explanation I saw was from Apter as explained above, but it was\nalso documented widely as early as K3. Here's a version in Q:\n\n```\n/ nested directory: use a parent vector, e.g.\n/ a\n/ b\n/  c\n/   d\n/  e\np:0N 0N 1 2 1 / parent\nn:`a`b`c`d`e  / name\nc:group p     / children\nn p scan 3    / full path\n```\n\nI must have read that a hundred times before I internalized its genius.  [See\nfour other ways to represent trees in K](https://a.kx.com/q/tree.q) each in\nabout three lines of code.\n\nAPL, or at least Dyalog, seems to implement trees in a more traditional way,\nusing nested boxes: [see here for more](https://dfns.dyalog.com/n_BST.htm).\nThey do however use a [similar technique for vector\ngraphs](https://dfns.dyalog.com/n_Graphs.htm).\n\nIt appears to be known to J users as seen in [Rosetta Code's tree\nimplementation in\nJ](https://rosettacode.org/wiki/Tree_traversal#J:_Alternate_implementation)\n(with some illuminating comments).\n\nJohn Earnest goes into [much more detail about vector tree\nimplementations](https://github.com/JohnEarnest/ok/blob/gh-pages/docs/Trees.md), including the\n\"index of offsets\" approach to deleting entries. Worth a read.\n\nA more elaborate approach is to also track the depth of each item. [Details about\nthat approach](http://dl.acm.org/citation.cfm?id=2935331) can be found in Aaron\nW. 
Hsu's paper on the subject.\n\n## Other common tree implementations\n\nHere are some other well-known trees. \n\nNone of these do the same thing as an Apter tree, and some are far larger due\nto generalizations, but it's still interesting to consider how much code\ndifferent styles of trees require to do simple operations.\n\n* [FreeBSD's kernel tree implementation](https://svnweb.freebsd.org/base/head/sys/sys/tree.h?revision=277642\u0026view=markup)\n\n* [klib's tree](https://github.com/attractivechaos/klib/blob/master/kbtree.h)\n\n* [a tree class in Ruby](https://github.com/ealdent/simple-tree/blob/master/lib/simple_tree.rb)\n\n* [Python declarative tree class](https://github.com/ShuaiW/Python/blob/master/POC/Tree.py)\n\n## Status of this code\n\nI've made a sorta lazy attempt at implementing this in C++ as a way to learn C++17. It's not really ready\nfor use or fully fleshed out. I still have a lot to learn about C++.\n\n## Thanks\n\nArthur Whitney, Apter, others: inspiration. \n\nJohn Earnest: source materials\n\nDave Linn: proofreading.\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftlack%2Fatree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftlack%2Fatree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftlack%2Fatree/lists"}