{"id":13509766,"url":"https://github.com/kbrw/sweet_xml","last_synced_at":"2025-05-14T03:05:44.228Z","repository":{"id":18970696,"uuid":"22191596","full_name":"kbrw/sweet_xml","owner":"kbrw","description":null,"archived":false,"fork":false,"pushed_at":"2025-01-07T09:32:12.000Z","size":357,"stargazers_count":365,"open_issues_count":28,"forks_count":61,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-03-31T15:11:47.614Z","etag":null,"topics":["elixir","stream","xml","xpath"],"latest_commit_sha":null,"homepage":"","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kbrw.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-07-24T01:32:45.000Z","updated_at":"2025-03-24T18:30:56.000Z","dependencies_parsed_at":"2023-11-09T11:56:31.481Z","dependency_job_id":"bee314c0-a7ae-4079-bce5-7a8484341f3d","html_url":"https://github.com/kbrw/sweet_xml","commit_stats":{"total_commits":114,"total_committers":40,"mean_commits":2.85,"dds":0.8245614035087719,"last_synced_commit":"5b154155bfde94def80c6e3503c8fa4b6dc8fd48"},"previous_names":["awetzel/sweet_xml"],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbrw%2Fsweet_xml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbrw%2Fsweet_xml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbrw%2Fsweet_xml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kbrw%2Fsweet_xml/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kbrw","download_url":"https://codeload.github.com/kbrw/sweet_xml/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246998135,"owners_count":20866690,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elixir","stream","xml","xpath"],"created_at":"2024-08-01T02:01:12.697Z","updated_at":"2025-04-03T12:04:55.125Z","avatar_url":"https://github.com/kbrw.png","language":"Elixir","funding_links":[],"categories":["XML"],"sub_categories":[],"readme":"# SweetXml\n\n[![Build Status](https://api.travis-ci.org/kbrw/sweet_xml.svg)](http://travis-ci.org/kbrw/sweet_xml)\n[![Module Version](https://img.shields.io/hexpm/v/sweet_xml.svg)](https://hex.pm/packages/sweet_xml)\n[![Hex Docs](https://img.shields.io/badge/hex-docs-lightgreen.svg)](https://hexdocs.pm/sweet_xml/)\n[![Total Download](https://img.shields.io/hexpm/dt/sweet_xml.svg)](https://hex.pm/packages/sweet_xml)\n[![License](https://img.shields.io/hexpm/l/sweet_xml.svg)](https://github.com/kbrw/sweet_xml/blob/master/LICENSE)\n[![Last Updated](https://img.shields.io/github/last-commit/kbrw/sweet_xml.svg)](https://github.com/kbrw/sweet_xml/commits/master)\n\n`SweetXml` is a thin wrapper around `:xmerl`. 
It allows you to convert a\n`char_list` or `xmlElement` record as defined in `:xmerl` to an elixir value such\nas `map`, `list`, `string`, `integer`, `float` or any combination of these.\n\n## Installation\n\nAdd dependency to your project's `mix.exs`:\n\n```elixir\ndef deps do\n  [{:sweet_xml, \"~\u003e 0.7.5\"}]\nend\n```\n\n`SweetXml` depends on `:xmerl`. On some Linux systems, you might need\nto install the package `erlang-xmerl`.\n\n## Examples\n\nGiven an XML document such as below:\n\n```xml\n\u003c?xml version=\"1.05\" encoding=\"UTF-8\"?\u003e\n\u003cgame\u003e\n  \u003cmatchups\u003e\n    \u003cmatchup winner-id=\"1\"\u003e\n      \u003cname\u003eMatch One\u003c/name\u003e\n      \u003cteams\u003e\n        \u003cteam\u003e\n          \u003cid\u003e1\u003c/id\u003e\n          \u003cname\u003eTeam One\u003c/name\u003e\n        \u003c/team\u003e\n        \u003cteam\u003e\n          \u003cid\u003e2\u003c/id\u003e\n          \u003cname\u003eTeam Two\u003c/name\u003e\n        \u003c/team\u003e\n      \u003c/teams\u003e\n    \u003c/matchup\u003e\n    \u003cmatchup winner-id=\"2\"\u003e\n      \u003cname\u003eMatch Two\u003c/name\u003e\n      \u003cteams\u003e\n        \u003cteam\u003e\n          \u003cid\u003e2\u003c/id\u003e\n          \u003cname\u003eTeam Two\u003c/name\u003e\n        \u003c/team\u003e\n        \u003cteam\u003e\n          \u003cid\u003e3\u003c/id\u003e\n          \u003cname\u003eTeam Three\u003c/name\u003e\n        \u003c/team\u003e\n      \u003c/teams\u003e\n    \u003c/matchup\u003e\n    \u003cmatchup winner-id=\"1\"\u003e\n      \u003cname\u003eMatch Three\u003c/name\u003e\n      \u003cteams\u003e\n        \u003cteam\u003e\n          \u003cid\u003e1\u003c/id\u003e\n          \u003cname\u003eTeam One\u003c/name\u003e\n        \u003c/team\u003e\n        \u003cteam\u003e\n          \u003cid\u003e3\u003c/id\u003e\n          \u003cname\u003eTeam Three\u003c/name\u003e\n        \u003c/team\u003e\n      \u003c/teams\u003e\n    
\u003c/matchup\u003e\n  \u003c/matchups\u003e\n\u003c/game\u003e\n```\nWe can do the following:\n\n```elixir\nimport SweetXml\ndoc = \"...\" # as above\n```\n\nGet the name of the first match:\n\n```elixir\nresult = doc |\u003e xpath(~x\"//matchup/name/text()\") # `sigil_x` for (x)path\nassert result == 'Match One'\n```\n\nGet the XML record of the name of the first match:\n\n```elixir\nresult = doc |\u003e xpath(~x\"//matchup/name\"e) # `e` is the modifier for (e)ntity\nassert result == {:xmlElement, :name, :name, [], {:xmlNamespace, [], []},\n        [matchup: 2, matchups: 2, game: 1], 2, [],\n        [{:xmlText, [name: 2, matchup: 2, matchups: 2, game: 1], 1, [],\n          'Match One', :text}], [],\n        ...}\n```\n\nGet the full list of matchup name:\n\n```elixir\nresult = doc |\u003e xpath(~x\"//matchup/name/text()\"l) # `l` stands for (l)ist\nassert result == ['Match One', 'Match Two', 'Match Three']\n```\n\nGet a list of winner-id by attributes:\n\n```elixir\nresult = doc |\u003e xpath(~x\"//matchup/@winner-id\"l)\nassert result == ['1', '2', '1']\n```\n\nGet a list of matchups with different map structure:\n\n```elixir\nresult = doc |\u003e xpath(\n  ~x\"//matchups/matchup\"l,\n  name: ~x\"./name/text()\",\n  winner: [\n    ~x\".//team/id[.=ancestor::matchup/@winner-id]/..\",\n    name: ~x\"./name/text()\"\n  ]\n)\nassert result == [\n  %{name: 'Match One', winner: %{name: 'Team One'}},\n  %{name: 'Match Two', winner: %{name: 'Team Two'}},\n  %{name: 'Match Three', winner: %{name: 'Team One'}}\n]\n```\n\nOr directly return a mapping of your liking:\n\n```elixir\nresult = doc |\u003e xmap(\n  matchups: [\n    ~x\"//matchups/matchup\"l,\n    name: ~x\"./name/text()\",\n    winner: [\n      ~x\".//team/id[.=ancestor::matchup/@winner-id]/..\",\n      name: ~x\"./name/text()\"\n    ]\n  ],\n  last_matchup: [\n    ~x\"//matchups/matchup[last()]\",\n    name: ~x\"./name/text()\",\n    winner: [\n      ~x\".//team/id[.=ancestor::matchup/@winner-id]/..\",\n      
name: ~x\"./name/text()\"\n    ]\n  ]\n)\nassert result == %{\n  matchups: [\n    %{name: 'Match One', winner: %{name: 'Team One'}},\n    %{name: 'Match Two', winner: %{name: 'Team Two'}},\n    %{name: 'Match Three', winner: %{name: 'Team One'}}\n  ],\n  last_matchup: %{name: 'Match Three', winner: %{name: 'Team One'}}\n}\n```\n\n## The ~x Sigil\n\nWarning ! Because we use `xmerl` internally, only XPath 1.0 paths are handled.\n\nIn the above examples, we used the expression `~x\"//some/path\"` to\ndefine the path. The reason is it allows us to more precisely specify what\nis being returned.\n\n  * `~x\"//some/path\"`\n\n    without any modifiers, `xpath/2` will return the value of the entity if\n    the entity is of type `xmlText`, `xmlAttribute`, `xmlPI`, `xmlComment`\n    as defined in `:xmerl`\n\n  * `~x\"//some/path\"e`\n\n    `e` stands for (e)ntity. This forces `xpath/2` to return the entity with\n    which you can further chain your `xpath/2` call\n\n  * `~x\"//some/path\"l`\n\n    'l' stands for (l)ist. This forces `xpath/2` to return a list. Without\n    `l`, `xpath/2` will only return the first element of the match\n\n  * `~x\"//some/path\"k`\n\n     'k' stands for (k)eyword. This forces `xpath/2` to return a Keyword instead of a Map.\n\n  * `~x\"//some/path\"el` - mix of the above\n\n  * `~x\"//some/path\"s`\n\n    's' stands for (s)tring. This forces `xpath/2` to return the value as\n    string instead of a char list.\n\n  * `~x\"//some/path\"S`\n\n    'S' stands for soft (S)tring. This forces `xpath/2` to return the value as\n    string instead of a char list, but if node content is incompatible with a string,\n    set `\"\"`.\n\n  * `~x\"//some/path\"o`\n\n    'o' stands for (o)ptional. This allows the path to not exist, and will return nil.\n\n  * `~x\"//some/path\"sl` - string list.\n\n  * `~x\"//some/path\"i`\n\n    'i' stands for (i)nteger. 
This forces `xpath/2` to return the value as\n    integer instead of a char list.\n\n  * `~x\"//some/path\"I`\n\n    'I' stands for soft (I)nteger. This forces `xpath/2` to return the value as\n    integer instead of a char list, but if node content is incompatible with an integer,\n    the result is `0`.\n\n  * `~x\"//some/path\"f`\n\n    'f' stands for (f)loat. This forces `xpath/2` to return the value as\n    float instead of a char list.\n\n  * `~x\"//some/path\"F`\n\n    'F' stands for soft (F)loat. This forces `xpath/2` to return the value as\n    float instead of a char list, but if node content is incompatible with a float,\n    the result is `0.0`.\n\n  * `~x\"//some/path\"il` - integer list.\n\nIf you use the *optional* modifier `o` together with a *soft* cast modifier\n(uppercase), then the value is set to `nil` when the value is not compatible.\nFor instance, `~x\"//some/path/text()\"Fo` returns `nil` if the text is not a number.\n\nAlso, in the examples section, we always import SweetXml first. This\nmakes `sigil_x` available in the current scope. 
Without it, instead of using\n`~x`, you can use the `%SweetXpath` struct:\n\n```elixir\nassert ~x\"//some/path\"e == %SweetXpath{path: '//some/path', is_value: false, is_list: false, cast_to: false}\n```\n\nNote the use of a char list in the path definition.\n\n## Namespace support\n\nGiven an XML document such as the one below:\n\n```xml\n\u003c?xml version=\"1.05\" encoding=\"UTF-8\"?\u003e\n\u003cgame xmlns=\"http://example.com/fantasy-league\" xmlns:ns1=\"http://example.com/baseball-stats\"\u003e\n  \u003cmatchups\u003e\n    \u003cmatchup winner-id=\"1\"\u003e\n      \u003cname\u003eMatch One\u003c/name\u003e\n      \u003cteams\u003e\n        \u003cteam\u003e\n          \u003cid\u003e1\u003c/id\u003e\n          \u003cname\u003eTeam One\u003c/name\u003e\n          \u003cns1:runs\u003e5\u003c/ns1:runs\u003e\n        \u003c/team\u003e\n        \u003cteam\u003e\n          \u003cid\u003e2\u003c/id\u003e\n          \u003cname\u003eTeam Two\u003c/name\u003e\n          \u003cns1:runs\u003e2\u003c/ns1:runs\u003e\n        \u003c/team\u003e\n      \u003c/teams\u003e\n    \u003c/matchup\u003e\n  \u003c/matchups\u003e\n\u003c/game\u003e\n```\n\nWe can do the following:\n\n```elixir\nimport SweetXml\nxml_str = \"...\" # as above\ndoc = parse(xml_str, namespace_conformant: true)\n```\n\nNote that we explicitly parse the XML with the `namespace_conformant:\ntrue` option. 
This is needed to allow nodes to be identified in a\nprefix-independent way.\n\nWe can use namespace prefixes of our preference, regardless of what prefix is\nused in the document:\n\n```elixir\nresult = doc\n  |\u003e xpath(~x\"//ff:matchup/ff:name/text()\"\n           |\u003e add_namespace(\"ff\", \"http://example.com/fantasy-league\"))\n\nassert result == 'Match One'\n```\n\nWe can specify multiple namespace prefixes:\n\n```elixir\nresult = doc\n  |\u003e xpath(~x\"//ff:matchup//bb:runs/text()\"\n           |\u003e add_namespace(\"ff\", \"http://example.com/fantasy-league\")\n           |\u003e add_namespace(\"bb\", \"http://example.com/baseball-stats\"))\n\nassert result == '5'\n```\n\n## From Chaining to Nesting\n\nHere's a brief explanation of how nesting came about.\n\n### Chaining\n\nBoth `xpath` and `xmap` can take an `:xmerl` XML record as the first argument.\nTherefore you can chain calls to these functions as shown below:\n\n```elixir\ndoc\n|\u003e xpath(~x\"//li\"l)\n|\u003e Enum.map(fn li_node -\u003e\n  %{\n    name: li_node |\u003e xpath(~x\"./name/text()\"),\n    age: li_node |\u003e xpath(~x\"./age/text()\")\n  }\nend)\n```\n\n### Mapping to a structure\n\nSince the previous example is such a common use case, SweetXml lets you\nsimply do the following:\n\n```elixir\ndoc\n|\u003e xpath(\n  ~x\"//li\"l,\n  name: ~x\"./name/text()\",\n  age: ~x\"./age/text()\"\n)\n```\n\n### Nesting\n\nSometimes what you want is more complex than that, so SweetXml also\nallows nesting:\n\n```elixir\ndoc\n|\u003e xpath(\n  ~x\"//li\"l,\n  name: [\n    ~x\"./name\",\n    first: ~x\"./first/text()\",\n    last: ~x\"./last/text()\"\n  ],\n  age: ~x\"./age/text()\"\n)\n```\n\n### Transform By\n\nSometimes we need to transform a value into the form we need. SweetXml supports that\nvia `transform_by/2`:\n\n```elixir\ndoc = 
\"\u003cli\u003e\u003cname\u003e\u003cfirst\u003ejohn\u003c/first\u003e\u003clast\u003edoe\u003c/last\u003e\u003c/name\u003e\u003cage\u003e30\u003c/age\u003e\u003c/li\u003e\"\n\nresult = doc |\u003e xpath(\n  ~x\"//li\"l,\n  name: [\n    ~x\"./name\",\n    first: ~x\"./first/text()\"s |\u003e transform_by(\u0026String.capitalize/1),\n    last: ~x\"./last/text()\"s |\u003e transform_by(\u0026String.capitalize/1)\n  ],\n  age: ~x\"./age/text()\"i\n)\n\n^result = [%{age: 30, name: %{first: \"John\", last: \"Doe\"}}]\n```\n\nThe same can be used to break parsing code into reusable functions that can be\nused in nesting:\n\n```elixir\ndoc = \"\u003cli\u003e\u003cname\u003e\u003cfirst\u003ejohn\u003c/first\u003e\u003clast\u003edoe\u003c/last\u003e\u003c/name\u003e\u003cage\u003e30\u003c/age\u003e\u003c/li\u003e\"\n\nparse_name = fn xpath_node -\u003e\n  xpath_node |\u003e xmap(\n    first: ~x\"./first/text()\"s |\u003e transform_by(\u0026String.capitalize/1),\n    last: ~x\"./last/text()\"s |\u003e transform_by(\u0026String.capitalize/1)\n  )\nend\n\nresult = doc |\u003e xpath(\n  ~x\"//li\"l,\n  name: ~x\"./name\" |\u003e transform_by(parse_name),\n  age: ~x\"./age/text()\"i\n)\n\n^result = [%{age: 30, name: %{first: \"John\", last: \"Doe\"}}]\n```\n\nFor more examples, please take a look at the tests and help.\n\n## Streaming\n\n`SweetXml` now also supports streaming in various forms. 
Here's a sample XML doc.\nNotice that certain lines have XML tags that span multiple lines:\n\n```xml\n\u003c?xml version=\"1.05\" encoding=\"UTF-8\"?\u003e\n\u003chtml\u003e\n  \u003chead\u003e\n    \u003ctitle\u003eXML Parsing\u003c/title\u003e\n    \u003chead\u003e\u003ctitle\u003eNested Head\u003c/title\u003e\u003c/head\u003e\n  \u003c/head\u003e\n  \u003cbody\u003e\n    \u003cp\u003eNeato €\u003c/p\u003e\u003cul\u003e\n      \u003cli class=\"first star\" data-index=\"1\"\u003e\n        First\u003c/li\u003e\u003cli class=\"second\"\u003eSecond\n      \u003c/li\u003e\u003cli\n            class=\"third\"\u003eThird\u003c/li\u003e\n    \u003c/ul\u003e\n    \u003cdiv\u003e\n      \u003cul\u003e\n        \u003cli\u003eForth\u003c/li\u003e\n      \u003c/ul\u003e\n    \u003c/div\u003e\n    \u003cspecial_match_key\u003efirst star\u003c/special_match_key\u003e\n  \u003c/body\u003e\n\u003c/html\u003e\n```\n\n### Working with `File.stream!/1`\n\nWorking with streams is exactly the same as working with binaries:\n\n```elixir\nFile.stream!(\"file_above.xml\") |\u003e xpath(...)\n```\n\n### `SweetXml` element streaming\n\nOnce you have a file stream, you may prefer not to load the entire document, in order to\nsave memory:\n\n```elixir\nfile_stream = File.stream!(\"file_above.xml\")\n\nresult = file_stream\n|\u003e stream_tags([:li, :special_match_key])\n|\u003e Stream.map(fn\n    {_, doc} -\u003e\n      xpath(doc, ~x\"./text()\")\n  end)\n|\u003e Enum.to_list()\n\nassert result == ['\\n        First', 'Second\\n      ', 'Third', 'Forth', 'first star']\n```\n\n**Warning:** For large documents, you may want to use the `discard`\noption to avoid a memory leak.\n\n```elixir\nresult = file_stream\n|\u003e stream_tags([:li, :special_match_key], discard: [:li, :special_match_key])\n```\n\n## Security\n\nWhenever you have to deal with some XML that was not generated by your system (untrusted document),\nit is highly recommended that you separate the parsing step from the mapping step, in 
order to be able\nto prevent some default behavior through options. You can check the doc for `SweetXml.parse/2` for more details.\nThe current recommendations are:\n```\ndoc |\u003e parse(dtd: :none) |\u003e xpath(spec, subspec)\nenum |\u003e stream_tags(tags, dtd: :none)\n```\n\n## Copyright and License\n\nCopyright (c) 2014, Frank Liu\n\nSweetXml source code is licensed under the [MIT License](https://github.com/kbrw/sweet_xml/blob/master/LICENSE).\n\n# CONTRIBUTING\n\nHi, and thank you for wanting to contribute.\nPlease refer to the centralized information available at: https://github.com/kbrw#contributing\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkbrw%2Fsweet_xml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkbrw%2Fsweet_xml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkbrw%2Fsweet_xml/lists"}