{"id":20842797,"url":"https://github.com/mudge/riveted","last_synced_at":"2025-05-08T22:42:29.530Z","repository":{"id":43865766,"uuid":"9642041","full_name":"mudge/riveted","owner":"mudge","description":"A Clojure library for the fast processing of XML with VTD-XML.","archived":false,"fork":false,"pushed_at":"2022-02-15T13:22:14.000Z","size":150,"stargazers_count":30,"open_issues_count":2,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-08T22:42:23.799Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://clojars.org/riveted","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mudge.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-04-24T07:39:30.000Z","updated_at":"2025-04-16T21:49:20.000Z","dependencies_parsed_at":"2022-09-17T14:30:50.239Z","dependency_job_id":null,"html_url":"https://github.com/mudge/riveted","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mudge%2Friveted","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mudge%2Friveted/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mudge%2Friveted/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mudge%2Friveted/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mudge","download_url":"https://codeload.github.com/mudge/riveted/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253160727,"ow
ners_count":21863624,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T01:25:31.292Z","updated_at":"2025-05-08T22:42:29.513Z","avatar_url":"https://github.com/mudge.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# riveted [![Clojure CI](https://github.com/mudge/riveted/actions/workflows/clojure.yml/badge.svg)](https://github.com/mudge/riveted/actions/workflows/clojure.yml)\n\nA Clojure library for the\n[fast](http://vtd-xml.sourceforge.net/benchmark1.html) processing of XML with\n[VTD-XML](http://vtd-xml.sourceforge.net), a [Virtual Token\nDescriptor](http://vtd-xml.sf.net/VTD.html) XML parser.\n\nIt provides a more Clojure-like abstraction over VTD while still exposing the\npower of its low-level interface.\n\n## Installation\n\nAs riveted is available on [Clojars](https://clojars.org/riveted), add the\nfollowing to your [Leiningen](https://github.com/technomancy/leiningen)\ndependencies:\n\n```clojure\n[riveted \"0.2.0\"]\n```\n\n## Compatibility\n\nriveted is tested against Clojure 1.3, 1.4, 1.5.1, 1.6, 1.7, 1.8, 1.9, 1.10.0,\n1.10.1, 1.10.2 and 1.10.3.\n\n## API Documentation\n\nThe latest [riveted API documentation](http://mudge.name/riveted/) is\nautomatically generated with [Codox](https://github.com/weavejester/codox).\n\n## Quick Start\n\nFor more details, see [Usage](#usage) below.\n\n```clojure\n(ns foo\n  (:require [riveted.core :as vtd]))\n\n(def nav (vtd/navigator (slurp \"foo.xml\")))\n\n;; Navigating by direction and returning text content.\n(-\u003e nav vtd/first-child vtd/next-sibling 
vtd/text) ;=\u003e \"Foo\"\n\n;; Navigating by direction, restricted by element and returning attribute\n;; value.\n(-\u003e nav (vtd/first-child :p) (vtd/attr :id)) ;=\u003e \"42\"\n\n;; Return the tag names of all children elements.\n(-\u003e\u003e nav vtd/children (map vtd/tag)) ;=\u003e (\"p\" \"a\" \"b\")\n\n;; Navigating by element name, regardless of location.\n(-\u003e nav (vtd/select :p) first vtd/text)\n\n;; Navigating by XPath, returning all matches.\n(map vtd/text (vtd/search nav \"//author\"))\n\n;; Navigating by XPath, returning the first match.\n(vtd/text (vtd/at nav \"/article/title\"))\n\n;; Calling seq (or any function that uses seq such as first, second, nth,\n;; last, etc.) on the navigator yields a sequence of all parsed tokens as\n;; simple maps with a type and value entry.\n(first nav) ;=\u003e {:type :start-tag, :value \"a\"}\n```\n\n## Usage\n\nOnce installed, you can include riveted into your desired namespace by\nrequiring `riveted.core` like so:\n\n```clojure\n(ns foo\n  (:require [riveted.core :as vtd]))\n```\n\nThe core data structure in riveted is the navigator: this represents both your\nXML document and your current location within it. 
It can be interrogated for\nthe tag name, attributes and text value of any given element and also provides\nthe ability to move around the document.\n\nLet's say we have a file called `foo.xml` with the following content:\n\n```xml\n\u003carticle\u003e\n  \u003ctitle\u003eFoo bar\u003c/title\u003e\n  \u003cauthor id=\"1\"\u003e\n    \u003cname\u003eRobert Paulson\u003c/name\u003e\n    \u003cname\u003eJoe Bloggs\u003c/name\u003e\n  \u003c/author\u003e\n  \u003cabstract\u003e\n    A \u003ci\u003egreat\u003c/i\u003e article all about things.\n  \u003c/abstract\u003e\n\u003c/article\u003e\n```\n\nLet's load this into an initial navigator with the `navigator` function,\npassing it a UTF-8 encoded string of XML and then storing the result in the\n[var](http://clojure.org/vars) `nav`:\n\n```clojure\n(def nav (vtd/navigator (slurp \"foo.xml\")))\n```\n\nIf you already have your XML in a byte array, you can pass this directly to `navigator` instead of a UTF-8 string:\n\n```clojure\n(def nav (vtd/navigator my-byte-array))\n```\n\n`navigator` also takes an optional second argument to enable XML namespace\nsupport, which is disabled by default. We'll look at this\n[later](#namespace-support) but, for now, we can process this document without\nusing namespaces.\n\nNow that we have a navigator, we can navigate the document in several ways\n(cf. [VTD-XML's explanation of its different\nviews](http://vtd-xml.sourceforge.net/userGuide/3.html)):\n\n* As a [cursor-based hierarchical view](#traversing-by-direction);\n* Using [element selectors](#traversing-by-element-name);\n* Using [XPath](#traversing-by-xpath);\n* As a [flat view of tokens](#flat-view-of-tokens).\n\nThere is also a [mutable interface](#mutable-interface) for more constrained\nmemory usage.\n\n### Traversing by direction\n\nAfter parsing a document, the navigator's cursor is always at the root element\nof our XML: for `foo.xml`, this means the `article` element. 
If we want to\nretrieve the `title` and we know it's the first child of the article we can\nsimply use riveted's `first-child` function:\n\n```clojure\n(vtd/first-child nav)\n```\n\nThis returns a new navigator with its cursor set to the `title` element. We\ncan check this by using the `text` and `tag` functions to return the text\ncontent and tag name of the current cursor respectively:\n\n```clojure\n(vtd/text (vtd/first-child nav)) ;=\u003e \"Foo bar\"\n(vtd/tag (vtd/first-child nav))  ;=\u003e \"title\"\n```\n\nIf we then want to move to the `author` element, we can use the `next-sibling`\nfunction in a similar way:\n\n```clojure\n(vtd/next-sibling (vtd/first-child nav))\n```\n\nIt may be more readable to use Clojure's [threading macro,\n`-\u003e`](http://clojuredocs.org/clojure_core/clojure.core/-%3E) when traversing\nin multiple directions:\n\n```clojure\n(-\u003e nav vtd/first-child vtd/next-sibling)\n```\n\nIf we want to test an element for its attributes, we can use `attr?` like so:\n\n```clojure\n(-\u003e nav vtd/first-child vtd/next-sibling (vtd/attr? :id)) ;=\u003e true\n```\n\nWe can then fetch the value of the attribute with `attr`:\n\n```clojure\n(-\u003e nav vtd/first-child vtd/next-sibling (vtd/attr :id)) ;=\u003e \"1\"\n\n;; equivalent to:\n(vtd/attr (vtd/next-sibling (vtd/first-child nav)) :id)\n```\n\nAs well as `first-child` and `next-sibling`, you can move in one direction\nwith the following functions:\n\n```clojure\n(vtd/previous-sibling nav) ;=\u003e move to the previous sibling element\n(vtd/last-child nav)       ;=\u003e move to the last child element\n(vtd/parent nav)           ;=\u003e move to the parent element\n(vtd/root nav)             ;=\u003e move to the root element\n```\n\nWe can also test navigators to distinguish elements from the entire document:\n\n```clojure\n(-\u003e nav vtd/first-child vtd/element?)   ;=\u003e true\n(-\u003e nav vtd/parent vtd/document?)       ;=\u003e true\n(-\u003e nav vtd/first-child vtd/attribute?) 
;=\u003e false\n```\n\nAs we are positioned on the `author` element, we might now want to collect the\ntext values of the `name` elements within it. We could do this using the\ndirectional functions above but riveted provides a `children` function to do\nthis for us:\n\n```clojure\n(-\u003e\u003e nav vtd/first-child vtd/next-sibling vtd/children (map vtd/text))\n;=\u003e (\"Robert Paulson\" \"Joe Bloggs\")\n\n;; or if you prefer not to use the threading macro:\n(map vtd/text (vtd/children (vtd/next-sibling (vtd/first-child nav))))\n```\n\nNote that `children`, along with `next-siblings` and `previous-siblings`,\nreturns a lazy sequence of matching elements. They also take an optional\nsecond argument which allows you to specify an element name which will\nrestrict results further.\n\nFor example, if you wanted to return the `author` element directly from the\noriginal navigator, you could ask for the first `author` child like so:\n\n```clojure\n(-\u003e nav (vtd/first-child :author))\n```\n\nOr ask the root for all child `author` elements:\n\n```clojure\n(-\u003e nav (vtd/children :author)) ;=\u003e a sequence of all author child elements\n```\n\nYou can also get the full text content of a mixed-content node with `text`\nwhich would be perfect for our `abstract` element:\n\n```clojure\n(-\u003e nav (vtd/first-child :abstract) vtd/text)\n;=\u003e \"A great article all about things.\"\n```\n\nIf you want to retrieve the raw XML contents of a node, you can use `fragment`\nto do so:\n\n```clojure\n(-\u003e nav (vtd/first-child :abstract) vtd/fragment)\n;=\u003e \"A \u003ci\u003egreat\u003c/i\u003e article all about things.\"\n```\n\n### Traversing by element name\n\nIf we'd rather not navigate a document in terms of directions, riveted also\nprovides a way to traverse XML by element names with `select`.\n\nTo continue our example from above, if we wanted to pull the `title` text, we\ncould ask the navigator for all `title` elements (regardless of location) 
like\nso:\n\n```clojure\n(vtd/select nav :title)\n```\n\nAs this is a lazy sequence, we can ask for the text of the first item like so:\n\n```clojure\n(-\u003e nav (vtd/select :title) first vtd/text) ;=\u003e \"Foo bar\"\n```\n\nSimilarly, we can ask for the text value of all `name` elements like so:\n\n```clojure\n(map vtd/text (vtd/select nav :name)) ;=\u003e (\"Robert Paulson\" \"Joe Bloggs\")\n```\n\nNote that this will return `name` elements *anywhere* in the document but we\ncould restrict its search by moving the navigator, perhaps using some of the\ndirection functions from above:\n\n```clojure\n(map vtd/text (-\u003e nav (vtd/first-child :author) (vtd/select :name)))\n;=\u003e (\"Robert Paulson\" \"Joe Bloggs\")\n```\n\nOr perhaps with `select` itself:\n\n```clojure\n(map vtd/text (-\u003e nav (vtd/select :author) first (vtd/select :name)))\n;=\u003e (\"Robert Paulson\" \"Joe Bloggs\")\n```\n\nFinally, we can return a lazy sequence of *all* elements by simply using a\nwildcard match:\n\n```clojure\n(vtd/select nav \"*\")\n```\n\n### Traversing by XPath\n\nThe last way to traverse a document is to use XPath 1.0 with the `search`\nfunction. 
Note that this is only used to navigate to elements (so it's not\npossible to directly return attribute values with an XPath expression).\n\nFor example, to select all `name` elements:\n\n```clojure\n(vtd/search nav \"//name\")\n```\n\nIf you are expecting only one match then you can use the `at` function to\nreturn only one result:\n\n```clojure\n(vtd/at nav \"/article/title\")\n```\n\nIf accessing attributes via XPath, you can use `text` to return the value of\nthe attribute:\n\n```clojure\n(vtd/text (vtd/at nav \"/article/@id\"))\n```\n\n### Namespace support\n\nIf you wish to use namespace-aware features, you will need to enable namespace\nsupport when creating the initial navigator like so:\n\n```clojure\n(def ns-nav (vtd/navigator (slurp \"namespaced.xml\") true))\n```\n\nYou can then pass a prefix and URL when using `search` and `at` like so:\n\n```clojure\n(vtd/search ns-nav \"//ns1:name\" \"ns1\" \"http://purl.org/dc/elements/1.1/\")\n```\n\n### Flat view of tokens\n\nIf you need lower level access to the parsed document, you can exploit the\nfact that navigators implement [Clojure's `Seqable`\ninterface](http://clojure.org/sequences) and can be traversed as a flat\nsequence much like a list or vector:\n\n```clojure\n(first nav)  ;=\u003e {:type :start-tag, :value \"article\"}\n(second nav) ;=\u003e {:type :start-tag, :value \"title\"}\n(nth nav 2)  ;=\u003e {:type :character-data, :value \"Foo bar\"}\n(nth nav 4)  ;=\u003e {:type :attribute-name, :value \"id\"}\n(seq nav)    ;=\u003e the full sequence of tokens\n\n;; Return all comments from a document.\n(filter (comp #{:comment} :type) nav)\n```\n\nThis gives you access to *all* tokens in the document including XML\ndeclarations, doctypes, comments, processing instructions, etc. 
However, it is\na very low level of abstraction and if you only care about navigating\nelements, it might be better to use a cursor-based view instead.\n\n### Mutable interface\n\nriveted also provides a mutable interface to\n[VTDNav](http://vtd-xml.sourceforge.net/javadoc/com/ximpleware/VTDNav.html)\n(much like Clojure's [transient](http://clojure.org/transients) data\nstructures) for lower-memory usage (at the cost of immutability):\n\n```clojure\n;; Create an initial navigator as per usual.\n(def nav (vtd/navigator \"\u003croot\u003e\u003ca\u003eFoo\u003c/a\u003e\u003cb\u003eBar\u003c/b\u003e\u003c/root\u003e\"))\n\n;; Mutate nav to point to the a element.\n(vtd/first-child! nav)\n\n(vtd/text nav)\n;=\u003e \"Foo\"\n\n;; Mutate nav to point to the b element.\n(vtd/next-sibling! nav)\n\n(vtd/text nav)\n;=\u003e \"Bar\"\n\n;; Mutate nav to point to the a element again.\n(vtd/previous-sibling! nav)\n\n;; Mutate nav to point to the root element.\n(vtd/parent! nav)\n\n;; Mutate nav to point to the root of the document (regardless of location).\n(vtd/root! nav)\n```\n\nIn order to mitigate the problems with mutable state, it might be best to use\nthe above functions much like you would `transient`; viz. within the confines\nof a function like so:\n\n```clojure\n(defn title [nav]\n  (-\u003e (vtd/root nav)                    ; Create a new navigator to the root\n      (vtd/first-child! :front)         ; for mutation.\n      (vtd/first-child! :article-meta)\n      (vtd/first-child! :title-group)\n      (vtd/first-child! 
:article-title)\n      vtd/text))\n```\n\nIn this way, only one extra navigator is created.\n\n## Acknowledgements\n\n[Andrew Diamond's `clj-vtd-xml`](https://github.com/diamondap/clj-vtd-xml) and\n[Tim Williams' gist](https://gist.github.com/willtim/822769) are existing\ninterfaces to VTD-XML from Clojure that were great sources of inspiration.\n\n[Dave Ray's `seesaw`](https://github.com/daveray/seesaw) set the standard for\nhelpful docstrings.\n\nClojure's\n[`core.clj`](https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj)\nprovided fascinating reading, particularly regarding the use of `:inline`\nmetadata.\n\nThanks to [Heikki Hämäläinen](https://github.com/hjhamala) for contributing a\ncharacter encoding fix for Windows users.\n\nThanks to [Eugen Stan](https://github.com/ieugen) for suggesting that\n`navigator` should also accept byte arrays as well as UTF-8 strings.\n\n## License\n\nCopyright © 2013-2022 Paul Mucur.\n\nDistributed under the Eclipse Public License, the same as Clojure.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmudge%2Friveted","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmudge%2Friveted","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmudge%2Friveted/lists"}