{"id":15451490,"url":"https://github.com/davidar/sdbm","last_synced_at":"2025-04-28T17:39:07.147Z","repository":{"id":66883573,"uuid":"765179","full_name":"davidar/sdbm","owner":"davidar","description":"Git mirror of sdbm source code ⛺","archived":false,"fork":false,"pushed_at":"2010-07-09T02:55:45.000Z","size":140,"stargazers_count":26,"open_issues_count":0,"forks_count":7,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-30T11:41:33.679Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://www.cse.yorku.ca/~oz/sdbm.bun","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davidar.png","metadata":{"files":{"readme":"readme.ms","changelog":"CHANGES","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2010-07-09T02:51:37.000Z","updated_at":"2024-11-16T23:57:20.000Z","dependencies_parsed_at":"2023-02-20T13:30:40.731Z","dependency_job_id":null,"html_url":"https://github.com/davidar/sdbm","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidar%2Fsdbm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidar%2Fsdbm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidar%2Fsdbm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidar%2Fsdbm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/davidar","download_url":"https://codeload.github.com/davidar/sdbm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251357121,"owners_count":21576652,"icon_url":
"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-01T21:26:24.132Z","updated_at":"2025-04-28T17:39:07.121Z","avatar_url":"https://github.com/davidar.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":".\\\" tbl | readme.ms | [tn]roff -ms | ...\n.\\\" note the \"C\" (courier) and \"CB\" fonts: you will probably have to\n.\\\" change these.\n.\\\" $Id: readme.ms,v 1.1 90/12/13 13:09:15 oz Exp Locker: oz $\n\n.de P1\n.br\n.nr dT 4\n.nf\n.ft C\n.sp .5\n.nr t \\\\n(dT*\\\\w'x'u\n.ta 1u*\\\\ntu 2u*\\\\ntu 3u*\\\\ntu 4u*\\\\ntu 5u*\\\\ntu 6u*\\\\ntu 7u*\\\\ntu 8u*\\\\ntu 9u*\\\\ntu 10u*\\\\ntu 11u*\\\\ntu 12u*\\\\ntu 13u*\\\\ntu 14u*\\\\ntu\n..\n.de P2\n.br\n.ft 1\n.br\n.sp .5\n.br\n.fi\n..\n.\\\" CW uses the typewriter/courier font.\n.de CW\n\\fC\\\\$1\\\\fP\\\\$2\n..\n\n.\\\" Footnote numbering [by Henry Spencer]\n.\\\" \u003ctext\u003e\\*f for a footnote number..\n.\\\" .FS\n.\\\" \\*F \u003cfootnote text\u003e\n.\\\" .FE\n.\\\"\n.ds f \\\\u\\\\s-2\\\\n+f\\\\s+2\\\\d\n.nr f 0 1\n.ds F \\\\n+F.\n.nr F 0 1\n\n.ND\n.LP\n.TL\n\\fIsdbm\\fP \\(em Substitute DBM\n.br\nor\n.br\nBerkeley \\fIndbm\\fP for Every UN*X\\** Made Simple\n.AU\nOzan (oz) Yigit\n.AI\nThe Guild of PD Software Toolmakers\nToronto - Canada\n.sp\noz@nexus.yorku.ca\n.LP\n.FS\nUN*X is not a trademark of any (dis)organization.\n.FE\n.sp 2\n\\fIImplementation is the sincerest form of flattery. \\(em L. Peter Deutsch\\fP\n.SH\nThe Clone of the \\fIndbm\\fP library\n.PP\nThe sources accompanying this notice \\(em \\fIsdbm\\fP \\(em constitute\nthe first public release (Dec. 
1990) of a complete clone of\nthe Berkeley UN*X \\fIndbm\\fP library. The \\fIsdbm\\fP library is meant to\nclone the proven functionality of \\fIndbm\\fP as closely as possible,\nincluding a few improvements. It is practical, easy to understand, and\ncompatible.\nThe \\fIsdbm\\fP library is not derived from any licensed, proprietary or\ncopyrighted software.\n.PP\nThe \\fIsdbm\\fP implementation is based on a 1978 algorithm\n[Lar78] by P.-A. (Paul) Larson known as ``Dynamic Hashing''.\nIn the course of searching for a substitute for \\fIndbm\\fP, I\nprototyped three different external-hashing algorithms [Lar78, Fag79, Lit80]\nand ultimately chose Larson's algorithm as the basis of the \\fIsdbm\\fP\nimplementation. The Bell Labs\n\\fIdbm\\fP (and therefore \\fIndbm\\fP) is based on an algorithm invented by\nKen Thompson [Tho90, Tor87] and predates Larson's work.\n.PP\nThe \\fIsdbm\\fP programming interface is totally compatible\nwith \\fIndbm\\fP and includes a slight improvement in database initialization.\nIt is also expected to be binary-compatible under most UN*X versions that\nsupport the \\fIndbm\\fP library.\n.PP\nThe \\fIsdbm\\fP implementation shares the shortcomings of the \\fIndbm\\fP\nlibrary, as a side effect of various simplifications to the original Larson\nalgorithm. It does produce \\fIholes\\fP in the page file as it writes\npages past the end of file. (Larson's paper includes a clever solution to\nthis problem that is a result of using the hash value directly as a block\naddress.) On the other hand, extensive tests seem to indicate that \\fIsdbm\\fP\ncreates fewer holes in general, and the resulting pagefiles are\nsmaller. The \\fIsdbm\\fP implementation is also faster than \\fIndbm\\fP\nin database creation.\nUnlike \\fIndbm\\fP, the \\fIsdbm\\fP\n.CW store\noperation will not ``wander away'' trying to split its\ndata pages to insert a datum that \\fIcannot\\fP (due to elaborate worst-case\nsituations) be inserted. 
(It will fail after a pre-defined number of attempts.)\n.SH\nImportant Compatibility Warning\n.PP\nThe \\fIsdbm\\fP and \\fIndbm\\fP\nlibraries \\fIcannot\\fP share databases: one cannot read the (dir/pag)\ndatabase created by the other. This is due to the differences\nbetween the \\fIndbm\\fP and \\fIsdbm\\fP algorithms\\**, \n.FS\nTorek's discussion [Tor87]\nindicates that \\fIdbm/ndbm\\fP implementations use the hash\nvalue to traverse the radix trie differently than \\fIsdbm\\fP\nand as a result, the page indexes are generated in \\fIdifferent\\fP order.\nFor more information, send e-mail to the author.\n.FE\nand the hash functions\nused.\nIt is easy to convert between the \\fIdbm/ndbm\\fP databases and \\fIsdbm\\fP\nby ignoring the index completely: see\n.CW dbd ,\n.CW dbu\netc.\n.R\n.LP\n.SH\nNotice of Intellectual Property\n.LP\n\\fIThe entire\\fP sdbm  \\fIlibrary package, as authored by me,\\fP Ozan S. Yigit,\n\\fIis hereby placed in the public domain.\\fP As such, the author is not\nresponsible for the consequences of use of this software, no matter how\nawful, even if they arise from defects in it. There is no expressed or\nimplied warranty for the \\fIsdbm\\fP library.\n.PP\nSince the \\fIsdbm\\fP\nlibrary package is in the public domain, this \\fIoriginal\\fP\nrelease or any additional public-domain releases of the modified original\ncannot possibly (by definition) be withheld from you. 
Also by definition,\nYou (singular) have all the rights to this code (including the right to\nsell without permission, the right to hoard\\**\n.FS\nYou cannot really hoard something that is available to the public at\nlarge, but try if it makes you feel any better.\n.FE\nand the right to do other icky things as\nyou see fit) but those rights are also granted to everyone else.\n.PP\nPlease note that all previous distributions of this software contained\na copyright (which is now dropped) to protect its\norigins and its current public domain status against any possible claims\nand/or challenges.\n.SH\nAcknowledgments\n.PP\nMany people have been very helpful and supportive.  A partial list would\nnecessarily include Rayan Zacherissen (who contributed the man page,\nand also hacked a MMAP version of \\fIsdbm\\fP),\nArnold Robbins, Chris Lewis,\nBill Davidsen, Henry Spencer, Geoff Collyer, Rich Salz (who got me started\nin the first place), Johannes Ruschein\n(who did the minix port) and David Tilbrook. 
I thank you all.\n.SH\nDistribution Manifest and Notes\n.LP\nThis distribution of \\fIsdbm\\fP includes (at least) the following:\n.P1\n\tCHANGES\t\tchange log\n\tREADME\t\tthis file.\n\tbiblio\t\ta small bibliography on external hashing\n\tdba.c\t\ta crude (n/s)dbm page file analyzer\n\tdbd.c\t\ta crude (n/s)dbm page file dumper (for conversion)\n\tdbe.1\t\tman page for dbe.c\n\tdbe.c\t\tJanick's database editor\n\tdbm.c\t\ta dbm library emulation wrapper for ndbm/sdbm\n\tdbm.h\t\theader file for the above\n\tdbu.c\t\ta crude db management utility\n\thash.c\t\thashing function\n\tmakefile\tguess.\n\tpair.c\t\tpage-level routines (posted earlier)\n\tpair.h\t\theader file for the above\n\treadme.ms\ttroff source for the README file\n\tsdbm.3\t\tman page\n\tsdbm.c\t\tthe real thing\n\tsdbm.h\t\theader file for the above\n\ttune.h\t\tplace for tuning \u0026 portability thingies\n\tutil.c\t\tmiscellaneous\n.P2\n.PP\n.CW dbu\nis a simple database manipulation program\\** that tries to look\n.FS\nThe \n.CW dbd ,\n.CW dba ,\n.CW dbu\nutilities are quick hacks and are not fit for production use. They were\ndeveloped late one night, just to test out \\fIsdbm\\fP, and convert some\ndatabases.\n.FE\nlike Bell Labs'\n.CW cbt\nutility. It is currently incomplete in functionality.\nI use\n.CW dbu\nto test out the routines: it takes (from stdin) tab separated\nkey/value pairs for commands like\n.CW build\nor\n.CW insert\nor takes keys for\ncommands like\n.CW delete\nor\n.CW look .\n.P1\n\tdbu \u003cbuild|creat|look|insert|cat|delete\u003e dbmfile\n.P2\n.PP\n.CW dba\nis a crude analyzer of \\fIdbm/sdbm/ndbm\\fP\npage files. It scans the entire\npage file, reporting page level statistics, and totals at the end.\n.PP\n.CW dbd\nis a crude dump program for \\fIdbm/ndbm/sdbm\\fP\ndatabases. It ignores the\nbitmap, and dumps the data pages in sequence. 
It can be used to create\ninput for the\n.CW dbu\nutility.\nNote that\n.CW dbd\nwill skip any NULLs in the key and data\nfields, and is thus unsuitable for converting some peculiar databases that\ninsist on including the terminating null.\n.PP\nI have also included a copy of the\n.CW dbe\n(\\fIndbm\\fP DataBase Editor) by Janick Bergeron [janick@bnr.ca] for\nyour pleasure. You may find it more useful than the little\n.CW dbu\nutility.\n.PP\n.CW dbm.[ch]\nis a \\fIdbm\\fP library emulation on top of \\fIndbm\\fP\n(and hence suitable for \\fIsdbm\\fP). It was written by Robert Elz.\n.PP\nThe \\fIsdbm\\fP\nlibrary has been around in beta test for quite a long time, and from whatever\nlittle feedback I received (maybe no news is good news), I believe it has been\nfunctioning without any significant problems. I would, of course, appreciate\nall fixes and/or improvements. Portability enhancements would be especially\nuseful.\n.SH\nImplementation Issues\n.PP\nHash functions:\nThe algorithm behind the \\fIsdbm\\fP implementation needs a good bit-scrambling\nhash function to be effective. I ran into a set of constants for a simple\nhash function that seem to help \\fIsdbm\\fP perform better than \\fIndbm\\fP\nfor various inputs:\n.P1\n\t/*\n\t * polynomial conversion ignoring overflows\n\t * 65599 nice. 65587 even better.\n\t */\n\tlong\n\tdbm_hash(char *str, int len) {\n\t\tregister unsigned long n = 0;\n\t\n\t\twhile (len--)\n\t\t\tn = n * 65599 + *str++;\n\t\treturn n;\n\t}\n.P2\n.PP\nThere may be better hash functions for the purposes of dynamic hashing.\nTry your favorite, and check the pagefile. 
If it contains too many pages\nwith too many holes (in relation to this one, for example) or if\n\\fIsdbm\\fP\nsimply stops working (fails after \n.CW SPLTMAX\nattempts to split) when you feed your\nNEWS \n.CW history\nfile to it, you probably do not have a good hashing function.\nIf you do better (for different types of input), I would like to know\nabout the function you use.\n.PP\nBlock sizes: It seems (from various tests on a few machines) that a page\nfile block size\n.CW PBLKSIZ\nof 1024 is by far the best for performance, but\nthis also happens to limit the size of a key/value pair. Depending on your\nneeds, you may wish to increase the page size, and also adjust\n.CW PAIRMAX\n(the maximum size of a key/value pair allowed: should always be at least\nthree words smaller than\n.CW PBLKSIZ .)\naccordingly. The system-wide version of the library\nshould probably be\nconfigured with 1024 (distribution default), as this appears to be sufficient\nfor most common uses of \\fIsdbm\\fP.\n.SH\nPortability\n.PP\nThis package has been tested on many different UN*Xes, even including minix,\nand appears to be reasonably portable. This does not mean it will port\neasily to non-UN*X systems.\n.SH\nNotes and Miscellaneous\n.PP\nThe \\fIsdbm\\fP library is not a very complicated package, at least not after you\nfamiliarize yourself with the literature on external hashing. There are\nother interesting algorithms in existence that ensure (approximately)\nsingle-read access to a data value associated with any key. These are\ndirectory-less schemes such as \\fIlinear hashing\\fP [Lit80] (+ Larson\nvariations), \\fIspiral storage\\fP [Mar79] or directory schemes such as\n\\fIextendible hashing\\fP [Fag79] by Fagin et al. I do hope these sources\nprovide a reasonable playground for experimentation with other algorithms.\nSee the June 1988 issue of ACM Computing Surveys [Enb88] for an\nexcellent overview of the field. \n.PG\n.SH\nReferences\n.LP\n.IP [Lar78] 4m\nP.-A. 
Larson,\n``Dynamic Hashing'', \\fIBIT\\fP, vol. 18, pp. 184-201, 1978.\n.IP [Tho90] 4m\nKen Thompson, \\fIprivate communication\\fP, Nov. 1990.\n.IP [Lit80] 4m\nW. Litwin,\n``Linear Hashing: A new tool for file and table addressing'',\n\\fIProceedings of the 6th Conference on Very Large Databases (Montreal)\\fP,\npp. 212-223, Very Large Database Foundation, Saratoga, Calif., 1980.\n.IP [Fag79] 4m\nR. Fagin, J. Nievergelt, N. Pippenger, and H. R. Strong,\n``Extendible Hashing - A Fast Access Method for Dynamic Files'',\n\\fIACM Trans. Database Syst.\\fP, vol. 4, no. 3, pp. 315-344, Sept. 1979.\n.IP [Wal84] 4m\nRich Wales,\n``Discussion of \"dbm\" data base system'', \\fIUSENET newsgroup unix.wizards\\fP,\nJan. 1984.\n.IP [Tor87] 4m\nChris Torek,\n``Re: dbm.a and ndbm.a archives'', \\fIUSENET newsgroup comp.unix\\fP,\n1987.\n.IP [Mar79] 4m\nG. N. Martin,\n``Spiral Storage: Incrementally Augmentable Hash Addressed Storage'',\n\\fITechnical Report #27\\fP, University of Warwick, Coventry, U.K., 1979.\n.IP [Enb88] 4m\nR. J. Enbody and H. C. Du,\n``Dynamic Hashing Schemes'', \\fIACM Computing Surveys\\fP,\nvol. 20, no. 2, pp. 85-113, June 1988.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidar%2Fsdbm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavidar%2Fsdbm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidar%2Fsdbm/lists"}