{"id":13739934,"url":"https://github.com/lun-4/awtfdb","last_synced_at":"2025-08-03T10:31:05.660Z","repository":{"id":39672024,"uuid":"460139663","full_name":"lun-4/awtfdb","owner":"lun-4","description":"the Anime Woman's Tagged File Data Base.","archived":false,"fork":false,"pushed_at":"2024-11-15T15:23:33.000Z","size":925,"stargazers_count":38,"open_issues_count":14,"forks_count":2,"subscribers_count":1,"default_branch":"mistress","last_synced_at":"2024-12-01T10:54:15.199Z","etag":null,"topics":["booru","organizational","tagging","tagging-albums","zig"],"latest_commit_sha":null,"homepage":"","language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lun-4.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-16T19:01:30.000Z","updated_at":"2024-11-16T13:58:05.000Z","dependencies_parsed_at":"2023-11-28T02:26:05.874Z","dependency_job_id":"90ae2da5-694f-48d6-8691-c176abc0ee35","html_url":"https://github.com/lun-4/awtfdb","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lun-4%2Fawtfdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lun-4%2Fawtfdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lun-4%2Fawtfdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lun-4%2Fawtfdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lun-4","download_url":"https://codeload.github.com/lun-4/awtfdb/
tar.gz/refs/heads/mistress","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228540822,"owners_count":17934029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["booru","organizational","tagging","tagging-albums","zig"],"created_at":"2024-08-03T04:00:39.854Z","updated_at":"2025-08-03T10:31:05.640Z","avatar_url":"https://github.com/lun-4.png","language":"Zig","funding_links":[],"categories":["Applications"],"sub_categories":[],"readme":"# awtfdb\n\n(wip) a \"many-compromises\" file indexing system.\n\nunderstand what's up with it here: https://blog.l4.pm/the-system-is-the-solution\n\n## project state\n\nv0.2 is released.\n\ni run master branch, which may be unstable and corrupt your database (like it did to mine,\nthat was fun)\n\nv0.3 with the goodies of `master` branch will be carved onto stone once Zig reaches v0.11.0\n(i'll also attempt to only pin down zig stable versions after that release)\n\ni have been working on this since April 2022, and also been using it on a daily\nbasis. the base is relatively solid enough, but there are unanswered design\nquestions, see issues tab.\n\nhere are my awtfdb statistics for the past 7 months (May 2023):\n - i have indexed 167405 files.\n - i have created 56190 tags.\n - i have 2425603 file\u003c-\u003etag mappings in the database.\n - my database is 336MB.\n\nthere is no specification yet that would enable others to contribute to the\nsystem by writing their own tools on top, but _that is planned_.\n\nthis indexing system is very CLI based and there is a szurubooru frontend\nfor better viewing. 
see `extra/` for that respective tooling.\n\nas this is v0.x, it is **NOT PRODUCTION READY**. **THERE IS NO WARRANTY PROVIDED\nBY ME. I WORK ON THIS PART TIME. BEWARE OF BUGS.**\n\n## how build\n\n- almost everything is cross-platform, save for `awtfdb-watcher`\n  - this was not tested as part of the v0.1 effort.\n- get [zig](https://github.com/ziglang/zig) (tested with 0.13.0)\n- get GraphicsMagick and development headers\n  - https://github.com/lun-4/awtfdb/issues/67\n\n```\ngit clone https://github.com/lun-4/awtfdb\ncd awtfdb\nzig build\n```\n\n## install thing\n\n```\n# choose your own prefix thats on $PATH or something\nzig build install --prefix ~/.local/ --verbose -Drelease-safe=true\n```\n\n## using it\n\n```\n# create the awtfdb index file on your home directory\nawtfdb-manage create\n\n# add files into the index\n# the ainclude tool contains a lot of tag inferrers\n# you can see 'ainclude -h' for explanations into such\nainclude --tag meme:what_the_dog_doing ~/my/folder/dog_doing.jpg\n\n# find files with tags\nafind meme:what_the_dog_doing\n\n# spawn the rename watcher for libraries that have minimal rename changes\n# as the watcher is relatively unstable.\n#\n# install bpftrace and a recent linux kernel,\n# then add this to your init system\nawtfdb-watcher /home/user\n\n# list files and their tags with als\nals ~/my/folder\n\n# manage tags\natags create my_new_funny_tag\natags search funny\natags delete my_new_funny_tag\n\n# remove files or directories from the index (does not remove from filesystem)\narm -r ~/my/folder\n```\n\n## roadmap for the thing\n\nfuck if i know.\n\n## inspiration\n\n- hydrus\n- booru software (danbooru etc.)\n\n## (OLD-ISH) design notes\n\n(i leave them there for the funny)\n\nif i could make it a single phrase: awtfdb is an incremental non-destructive tagging system for your life's files full of compromises.\n\nyou start it with 'awtfdb path/to/home/directory', it'll create a sqlite file on homedir/awtfdb.db, and run necessary 
migrations\n\nthen, say, you have a folder full of music. you can track them with 'binclude mediadir', but that'd just track them without tags. we know its a media directory, why not infer tags based on content?\n\n'binclude --add-single-tag source:bandcamp --infer-media-tags mediadir/bd-dl'\n\nartist:, title:, album:, and others get inferred from the id3 tags, if those arent provided, inferring from path is done (say, we know title is equal to filename)\n\nyou can take a look at the changes binclude will make, it'll always batch-add tags, never remove. if you find that its inferring is wrong for an album, ctrl-c/say no, and redo it ignoring that specific album\n\n'binclude --add-single-tag source:bandcamp --infer-media-tags mediadir/bd-dl --exclude mediadir/bd-dl/album_with_zalgotext'\n\nyou can 'badd tag file' to add a single tag to a single file, or to a folder: 'badd -R tag folder'\n\n'bstat path' to see file tags\n\n'bfind \u003cpredicate\u003e' to execute search across all files e.g 'bfind format:flac \"artist:dj kuroneko\"' to return all flacs made by dj kuroneko\n\n### why isnt this a conventional danbooru/hydrus model\n\nthe name of a tag isnt unique, tags map to their own tag ids (tag \"cores\" as ids would be overused vocab from DB world), except, to make this work at universe-scale where i can share my tags with you without conflicting with anything pre-existing\n\nthe idea is shamelessly being copied from the proposal here: https://www.nayuki.io/page/designing-better-file-organization-around-tags-not-hierarchies#complex-indirect-tags\n\ni dont follow that proposal to the letter (storage pools do not apply to me at the moment), but some ideas from it are pretty good to follow\n\nwe use hash(random data) as the id, which enables Universal Ids But Theyre Not UUIDs You Cant Just Collide The Tags Lmao (since if you try to change the core_data by even 1 byte, crypto hash gets avalanche effect'd). 
this enables us to have 'tree (english)', while also having 'Árvore (portuguese)' map to the same id\n\nofc someone can create a different tag core for the idea of a tree, but thats out of scope. its better than hydrus's PTR because of the fact you can add different human representations to the same tag core, breaking your dependency on english, while also enabling metadata to be added to a tag core, so if i wanted to add the wikipedia link for a tree, i can do that\n\n### some implementation details\n\nnow that i have a bit of spec like the db, what happens for implementation?\n\ni want to have something that isnt monolithic, there is no http api of the sorts to manage the database, you just open it, and sqlite will take care of concurrent writers (file based locking)\n\nyou just use libawtfdb (name tbd) and itll provide you the high level api such as \"execute this query\"\n\nthere needs to be at least the watcher daemon, but what about secondary functionality? say, i want a program to ensure the hashes are fine for the files, but do it in an ultra slow way, generating reports or even calling notify-send when it finds a discrepancy in hashes\n\nthat is a feature way out of scope for the \"watcher daemon that only checks up for new renames in the entire filesystem. also the watcher requires root\", adding more things to that piece of code leads to opening my surface area for catastrophic failure. 
the system should handle all of those processes reading and writing to the database file\n\n#### the db is an IPC platform for all tools\n\nthis does infer that the database needs to have more than the tagging data\n\nbut a way to become an IPC format between all awtfdb utilities\n\nmaybe a janitor process is ensuring all files exist and you want to see the progress bar and plan accordingly, while also ensuring there isn't two of them having to clash with each other's work\n\nipc and job system\n\n#### one singular tool that does db administration\n\none tool will have to be database management though\n\nthe database migrations will have to go to A Tool Somewhere\n\nmaybe awtfdb-manage\n\ncreate db, run some statistics, show migration warnings, etc\n\nthe watcher stays as is, just a watcher for file renames\n\n#### job system\n\nthe purpose of a job system in awtfdb\n\n- schedule periodic awtfdb jobs (every week there should be a checkup of\n  all paths and their hashes, for example. that job would be a \"janitor\")\n- get job status and _historical_ reports\n  - understand failure rate of jobs, if a job fails too much, alerts should\n    be dispatched to the user via `notify-send`\n  - if the example \"janitor\" job found a discrepancy between file and hash,\n    should it email, fix it automatically, notify-send, or leave a track\n    record? 
that should be configurable by the user.\n\nthe implementation proposal for this is as follows:\n\n- job watcher daemon\n- `job_configs` table has:\n  - id of job\n  - configuration of job (json string with well-defined schema in tool?)\n  - enabled flag (removing jobs is always a soft delete)\n  - executable path of tool\n    - the watcher does not run jobs itself, just delegates and supervises\n      execution of other executables that actually run what theyre supposed to\n- `job_runs` table with historical evidence\n\ntechnically possible to leave some functionality of the daemon to the\nsystem's initd (systemd, runit+snooze, cron), but i think i might tire myself\nout from having to autogenerate service units and connect it all together\nand HOPE for a LIGHT FROM GOD that it's going to work as intended. god fuck\nlinux\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flun-4%2Fawtfdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flun-4%2Fawtfdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flun-4%2Fawtfdb/lists"}