{"id":20401449,"url":"https://github.com/jackrusher/mundaneum","last_synced_at":"2025-04-12T14:09:16.268Z","repository":{"id":13906195,"uuid":"75307474","full_name":"jackrusher/mundaneum","owner":"jackrusher","description":"A clojure wrapper around WikiData","archived":false,"fork":false,"pushed_at":"2023-07-07T17:06:11.000Z","size":390,"stargazers_count":128,"open_issues_count":0,"forks_count":16,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-26T08:47:35.629Z","etag":null,"topics":["clojure","dsl","sparql","wikidata"],"latest_commit_sha":null,"homepage":null,"language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"0bsd","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jackrusher.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-12-01T15:46:57.000Z","updated_at":"2024-09-04T13:34:03.000Z","dependencies_parsed_at":"2023-01-11T20:21:27.014Z","dependency_job_id":null,"html_url":"https://github.com/jackrusher/mundaneum","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackrusher%2Fmundaneum","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackrusher%2Fmundaneum/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackrusher%2Fmundaneum/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackrusher%2Fmundaneum/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jackrusher","download_url":"https://codeload.github.com/jackrusher/mundaneum/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.
com","kind":"github","repositories_count":248578870,"owners_count":21127713,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","dsl","sparql","wikidata"],"created_at":"2024-11-15T04:49:32.113Z","updated_at":"2025-04-12T14:09:16.243Z","avatar_url":"https://github.com/jackrusher.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mundaneum\n\nThis is a thin Clojure wrapper around the\n[Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) project's\nmassive semantic database. It's named after the\n[Mundaneum](https://en.wikipedia.org/wiki/Mundaneum), which was [Paul\nOtlet](https://en.wikipedia.org/wiki/Paul_Otlet)'s mad and wonderful\nc. 1910 vision for something like the World Wide Web.\n\n(There's a mini-doc about him and it\n[here](https://www.youtube.com/watch?v=hSyfZkVgasI).)\n\n## Coordinates\n\nRecent changes to the group-id policy at Clojars have made publishing\nartifacts there less appealing to me, so for the moment this library\nis available to `deps.edn` users at:\n\n``` clojure\nio.github.jackrusher/mundaneum {:git/sha \"SHA\"}\n```\n\n## Motivation\n\nWikidata is amazing! And it provides API access to all the knowledge\nit has collected! This is great, but exploratory programmatic access\nto that data can be fairly painful.\n\nThe official Wikidata API Java library offers a document-oriented\ninterface that makes it hard to ask interesting questions. 
A better\nway to do most things is with the Wikidata query service, which uses\nthe standard [Semantic Web](https://en.wikipedia.org/wiki/Semantic_Web)\nquery language, [SPARQL](https://en.wikipedia.org/wiki/SPARQL).\n\nThe SPARQL query service is nice, but because Wikidata's data model\nmust cope with (a) items with multiple names in multiple languages,\nand (b) single names that map to multiple items, they've used a layer\nof abstraction by which everything in the DB is referred to by an `id`\nthat looks like `P50` (property number 50, meaning \"author\") or\n`Q6882` (entity number 6882, the author \"James Joyce\").\n\nFor example, to get a selection of works authored by James Joyce,\none would issue a query like:\n\n``` sparql\nSELECT ?work\nWHERE { ?work wdt:P50 wd:Q6882. }\nLIMIT 10\n```\n\n(Users of [Datomic](http://www.datomic.com) will recognize the `?work`\nstyle of selector, which is not a coincidence as SPARQL and Datomic\nare both flavors of [Datalog](https://en.wikipedia.org/wiki/Datalog).)\n\nThe above query is simple enough, except for the non-human-readable\nidentifiers in the `WHERE` clause, which were both found by manually\nsearching the web interface at Wikidata.\n\nIn order to do exploratory programming against this API in a more\nhuman-friendly way without leaving my coding environment, I've built\nthis library. 
The approach I took was:\n\n* download and reformat the full list of ~2000 properties (fresh as of\n  2022-04-18) and shape them into a map of keyword/keyword pairs where\n  the key is made from the English name of the property and the value\n  is a namespaced keyword like `:prefix/id`:\n\n``` clojure\n(wdt :author)\n;;=\u003e :wdt/P50\n```\n\n* create a helper function that tries to correctly guess the id of an\n  entity based on a string that's similar to its \"label\" (common name,\n  currently sadly restricted to English in this code)\n\n``` clojure\n(entity \"James Joyce\")\n;;=\u003e :wd/Q6882\n\n;; the entity function tries to return the most notable entity \n;; that matches, but sometimes that isn't what you want.\n\n(describe (entity \"U2\"))\n;;=\u003e \"Irish alternative rock band\"\n\n;; not the one I meant, let's try with more info:\n(describe (entity \"U2\" (wdt :part-of) (entity \"Berlin U-Bahn\")))\n;;=\u003e \"underground line in Berlin\"\n```\n\nThis already helps to keep my emacs-driven process running\nsmoothly. The next point of irritation was assembling query strings by\nhand, like an animal. Luckily, the very well put together\n[Flint](https://github.com/yetanalytics/flint/) library provides an\nexcellent Clojure DSL for the SPARQL query language. 
Combined with my\nhelper functions, this looks like:\n\n``` clojure\n;; what are some works authored by James Joyce?\n(query `{:select [?work ?workLabel]\n         :where  [[?work ~(wdt :author) ~(entity \"James Joyce\")]]\n         :limit 10})\n;; [{:work \"Q864141\", :workLabel \"Eveline\"}\n;;  {:work \"Q861185\", :workLabel \"A Little Cloud\"}\n;;  {:work \"Q459592\", :workLabel \"Dubliners\"}\n;;  {:work \"Q682681\", :workLabel \"Giacomo Joyce\"}\n;;  {:work \"Q764318\", :workLabel \"Two Gallants\"}\n;;  {:work \"Q429967\", :workLabel \"Chamber Music\"}\n;;  {:work \"Q465360\", :workLabel \"A Portrait of the Artist as a Young Man\"}\n;;  {:work \"Q6511\", :workLabel \"Ulysses\"}\n;;  {:work \"Q866956\", :workLabel \"An Encounter\"}\n;;  {:work \"Q6507\", :workLabel \"Finnegans Wake\"}]\n```\n\nThis is actually quite similar to the programmatic query interface I\ncreated for the first\npurpose-built [TripleStore](https://en.wikipedia.org/wiki/Triplestore)\naround 20 years ago.\n\nThis code is much easier to understand if you have some familiarity\nwith SPARQL and how it can be used to query Wikidata. I strongly\nrecommend [this\nintroduction](https://m.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries)\nto get started. I've tried to make sure all the examples are easy to\ntranslate to the DSL used here.\n\n## Multilingual support\n\nIn queries, a language-specific string can be specified with a map\ncontaining an ISO language code keyword as key and a string as value,\nlike `{:en \"human\"}`. 
This can also be used to specify the language to\nuse for an entity lookup:\n\n``` clojure\n(entity {:de \"Mensch\"})\n;;=\u003e :wd/Q5\n```\n\nThe `label` and `describe` functions can also take an extra first\nparameter indicating the language to use:\n\n``` clojure\n(label :de :wd/Q5)\n;;=\u003e \"Mensch\"\n(describe :fr :wd/Q5)\n;;=\u003e \"individu appartenant à l’espèce Homo sapiens, la seule espèce restante du genre Homo – distinct de « humain fictif » et de « humain possiblement fictif »\"\n```\n\nNote that these calls are all memoized _per language_, so repeatedly\nlooking up a given entity/label/description causes no additional\nnetwork traffic.\n\nIn addition to these affordances, there is also a dynamic variable\n`mundaneum.query/*default-language*` which is an `atom` containing an\nISO language code keyword like `:en` that controls which language will\nbe used _by default_ for labels and description input/output. If you\nare planning to enjoy an interactive session in French you could set\nthe default like this:\n\n``` clojure\n(reset! *default-language* :fr)\n```\n\nOn the other hand, if you want to mix languages freely, you can use a\nlocal binding like this:\n\n``` clojure\n;; look up an entity using Thai as the default language, then get the\n;; English label for it.\n(let [thai-name \"กรุงเทพมหานคร\"\n      id (binding [*default-language* :th]\n           (entity thai-name))]\n  (str thai-name \" is called \" (label id) \" in English.\"))\n;;=\u003e \"กรุงเทพมหานคร is called Bangkok in English.\"\n```\n\nAlthough if one is doing something like this, it's probably nicer to\nuse the previously described API:\n\n``` clojure\n(describe (entity {:th \"กรุงเทพมหานคร\"}))\n;;=\u003e \"capital of Thailand\"\n```\n\n## Learn more\n\nAdditional documentation can be found in the Clerk notebooks in the\n`notebooks` directory, beginning with `basics.clj`. If you start your\nREPL with the `:dev` alias, you'll already have Clerk loaded. 
(This\nwill happen automatically if you use `cider-jack-in` from Emacs via a\nbit of configuration in this repo's `.dir-locals` file.)\n\nEnjoy!\n\n## License\n\nCopyright © 2016-2022 Jack Rusher. Distributed under the BSD 0-clause license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjackrusher%2Fmundaneum","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjackrusher%2Fmundaneum","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjackrusher%2Fmundaneum/lists"}