{"id":13509764,"url":"https://github.com/qcam/saxy","last_synced_at":"2025-05-14T13:08:10.398Z","repository":{"id":28103789,"uuid":"115469363","full_name":"qcam/saxy","owner":"qcam","description":"Fast SAX parser and encoder for XML in Elixir","archived":false,"fork":false,"pushed_at":"2024-10-22T13:45:43.000Z","size":1697,"stargazers_count":290,"open_issues_count":19,"forks_count":41,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-11T20:44:00.712Z","etag":null,"topics":["elixir","elixir-lang","xml","xml-builder","xml-builder-library","xml-library","xml-parser"],"latest_commit_sha":null,"homepage":"https://hexdocs.pm/saxy","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qcam.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-27T01:44:33.000Z","updated_at":"2025-04-03T09:43:58.000Z","dependencies_parsed_at":"2024-01-05T22:00:12.322Z","dependency_job_id":"0aecf881-7d69-4ce5-b7fc-9405a0451e28","html_url":"https://github.com/qcam/saxy","commit_stats":{"total_commits":186,"total_committers":19,"mean_commits":9.789473684210526,"dds":"0.12903225806451613","last_synced_commit":"ba05ecf90c46f0d5fcfbc050fb6af3a96e6b9fd1"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qcam%2Fsaxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qcam%2Fsaxy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qcam%2Fsaxy/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qcam%2Fsaxy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qcam","download_url":"https://codeload.github.com/qcam/saxy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254149960,"owners_count":22022851,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elixir","elixir-lang","xml","xml-builder","xml-builder-library","xml-library","xml-parser"],"created_at":"2024-08-01T02:01:12.663Z","updated_at":"2025-05-14T13:08:05.387Z","avatar_url":"https://github.com/qcam.png","language":"Elixir","readme":"Saxy\n====\n\n[![Test suite](https://github.com/qcam/saxy/actions/workflows/test.yml/badge.svg)](https://github.com/qcam/saxy/actions/workflows/test.yml)\n[![Module Version](https://img.shields.io/hexpm/v/saxy.svg)](https://hex.pm/packages/saxy)\n\nSaxy (Sá xị) is an XML SAX parser and encoder in Elixir that focuses on speed, usability and standard compliance.\n\nComply with [Extensible Markup Language (XML) 1.0 (Fifth Edition)](https://www.w3.org/TR/xml/).\n\n## Features highlight\n\n* An incredibly fast XML 1.0 SAX parser.\n* An extremely fast XML encoder.\n* Native support for streaming parsing large XML files.\n* Parse XML documents into simple DOM format.\n* Support quick returning in event handlers.\n\n## Installation\n\nAdd `:saxy` to your `mix.exs`.\n\n```elixir\ndef deps() do\n  [\n    {:saxy, \"~\u003e 1.6\"}\n  ]\nend\n```\n\n## Overview\n\nFull documentation is available on [HexDocs](https://hexdocs.pm/saxy/).\n\nIf 
you have never worked with a SAX parser before, please check out [this\nguide][sax-guide].\n\n### SAX parser\n\nYou need to implement a SAX event handler before you can start parsing.\n\n```elixir\ndefmodule MyEventHandler do\n  @behaviour Saxy.Handler\n\n  def handle_event(:start_document, prolog, state) do\n    IO.inspect(\"Start parsing document\")\n    {:ok, [{:start_document, prolog} | state]}\n  end\n\n  def handle_event(:end_document, _data, state) do\n    IO.inspect(\"Finish parsing document\")\n    {:ok, [{:end_document} | state]}\n  end\n\n  def handle_event(:start_element, {name, attributes}, state) do\n    IO.inspect(\"Start parsing element #{name} with attributes #{inspect(attributes)}\")\n    {:ok, [{:start_element, name, attributes} | state]}\n  end\n\n  def handle_event(:end_element, name, state) do\n    IO.inspect(\"Finish parsing element #{name}\")\n    {:ok, [{:end_element, name} | state]}\n  end\n\n  def handle_event(:characters, chars, state) do\n    IO.inspect(\"Receive characters #{chars}\")\n    {:ok, [{:characters, chars} | state]}\n  end\n\n  def handle_event(:cdata, cdata, state) do\n    IO.inspect(\"Receive CData #{cdata}\")\n    {:ok, [{:cdata, cdata} | state]}\n  end\nend\n```\n\nThen start parsing XML documents with:\n\n```elixir\niex\u003e xml = \"\u003c?xml version='1.0' ?\u003e\u003cfoo bar='value'\u003e\u003c/foo\u003e\"\niex\u003e Saxy.parse_string(xml, MyEventHandler, [])\n{:ok,\n [{:end_document},\n  {:end_element, \"foo\"},\n  {:start_element, \"foo\", [{\"bar\", \"value\"}]},\n  {:start_document, [version: \"1.0\"]}]}\n```\n\n### Streaming parsing\n\nSaxy also accepts a file stream as input:\n\n```elixir\nstream = File.stream!(\"/path/to/file\")\n\nSaxy.parse_stream(stream, MyEventHandler, initial_state)\n```\n\nIt even supports parsing a regular stream:\n\n```elixir\nstream = File.stream!(\"/path/to/file\") |\u003e Stream.filter(\u0026(\u00261 != \"\\n\"))\n\nSaxy.parse_stream(stream, MyEventHandler, initial_state)\n```\n\n### 
Partial parsing\n\nSaxy can parse an XML document partially. This feature is useful when the\ndocument cannot be turned into a stream, e.g. when it is received over a socket.\n\n```elixir\nalias Saxy.Partial\n\n{:ok, partial} = Partial.new(MyEventHandler, initial_state)\n{:cont, partial} = Partial.parse(partial, \"\u003cfoo\u003e\")\n{:cont, partial} = Partial.parse(partial, \"\u003cbar\u003e\u003c/bar\u003e\")\n{:cont, partial} = Partial.parse(partial, \"\u003c/foo\u003e\")\n{:ok, state} = Partial.terminate(partial)\n```\n\n### Simple DOM format exporting\n\nSometimes it is convenient to export the XML document into the simple DOM\nformat, a 3-element tuple containing the tag name, the attributes, and a\nlist of children.\n\nThe `Saxy.SimpleForm` module supports this:\n\n```elixir\nSaxy.SimpleForm.parse_string(data)\n\n{\"menu\", [],\n [\n   {\"movie\",\n    [{\"id\", \"tt0120338\"}, {\"url\", \"https://www.imdb.com/title/tt0120338/\"}],\n    [{\"name\", [], [\"Titanic\"]}, {\"characters\", [], [\"Jack \u0026amp; Rose\"]}]},\n   {\"movie\",\n    [{\"id\", \"tt0109830\"}, {\"url\", \"https://www.imdb.com/title/tt0109830/\"}],\n    [\n      {\"name\", [], [\"Forest Gump\"]},\n      {\"characters\", [], [\"Forest \u0026amp; Jenny\"]}\n    ]}\n ]}\n```\n\n### XML builder\n\nSaxy offers two APIs to build simple form and encode XML documents.\n\nUse `Saxy.XML` to build and compose XML simple form, then `Saxy.encode!/2`\nto encode the built element into an XML binary.\n\n```elixir\niex\u003e import Saxy.XML\niex\u003e element = element(\"person\", [gender: \"female\"], \"Alice\")\n{\"person\", [{\"gender\", \"female\"}], [{:characters, \"Alice\"}]}\niex\u003e Saxy.encode!(element, [])\n\"\u003c?xml version=\\\"1.0\\\"?\u003e\u003cperson gender=\\\"female\\\"\u003eAlice\u003c/person\u003e\"\n```\n\nSee `Saxy.XML` for more XML building APIs.\n\nSaxy also provides the `Saxy.Builder` protocol to help compose structs into simple form.\n\n```elixir\ndefmodule Person do\n  @derive {Saxy.Builder, 
name: \"person\", attributes: [:gender], children: [:name]}\n\n  defstruct [:gender, :name]\nend\n\niex\u003e jack = %Person{gender: :male, name: \"Jack\"}\niex\u003e john = %Person{gender: :male, name: \"John\"}\niex\u003e import Saxy.XML\niex\u003e root = element(\"people\", [], [jack, john])\niex\u003e Saxy.encode!(root, [])\n\"\u003c?xml version=\\\"1.0\\\"?\u003e\u003cpeople\u003e\u003cperson gender=\\\"male\\\"\u003eJack\u003c/person\u003e\u003cperson gender=\\\"male\\\"\u003eJohn\u003c/person\u003e\u003c/people\u003e\"\n```\n\n## FAQs with Saxy/XMLs\n\n### Saxy sounds cool! But I just wanted to quickly convert some XMLs into maps/JSON...\n\nSaxy does not offer XML-to-map conversion, because many awesome people\nhave already made it happen 💪:\n\n* https://github.com/bennyhat/xml_json\n* https://github.com/xinz/sax_map\n\nAlternatively, this [pull request](https://github.com/qcam/saxy/pull/78) could\nserve as a good reference if you want to implement your own map-based handler.\n\n### Does Saxy work with XPath?\n\nSaxy is at its core a SAX parser, so it does not, and likely will\nnot, offer any XPath functionality.\n\n[SweetXml][sweet_xml] is a wonderful library to work with XPath. However,\n`:xmerl`, the library used by SweetXml, is not always memory efficient and\nspeedy. You can combine the best of both sides with [Saxmerl][saxmerl], which\nis a Saxy extension converting XML documents into SweetXml compatible format.\nPlease check that library out for more information.\n\n### Saxy! Where did the name come from?\n\n![Sa xi Chuong Duong](./assets/saxi.jpg)\n\nSa Xi, pronounced like `sa-see`, is an awesome soft drink made by [Chuong Duong](http://www.cdbeco.com.vn/en).\n\n## Benchmarking\n\nNote that benchmarking XML parsers is difficult and highly depends on the complexity\nof the documents being parsed. 
Even though I try hard to make the benchmarking suite\nfair, it is hard to avoid bias when choosing the documents to benchmark\nagainst.\n\nTherefore the conclusions in this section are for reference only. Please\nfeel free to benchmark against your target documents. The benchmark suite can be found\nin [bench/](https://github.com/qcam/saxy/tree/master/bench).\n\nA rule of thumb is that we should compare apples to apples. Some XML parsers\ntarget only specific types of XML. Therefore some indicators are provided in the\ntest suite to show how fair the benchmark results are.\n\nSome quick and biased conclusions from the benchmark suite:\n\n* For SAX parsing, Saxy is usually 1.4 times faster than [Erlsom](https://github.com/willemdj/erlsom).\n  With deeply nested documents, Saxy is noticeably faster (4 times faster).\n* For XML building and encoding, Saxy is usually 10 to 30 times faster than [XML Builder](https://github.com/joshnuss/xml_builder).\n  With deeply nested documents, it could be 180 times faster.\n* Saxy uses significantly less memory than XML Builder (4 times to 25 times).\n* Saxy uses significantly less memory than Xmerl, Erlsom and Exomler (1.4 times to\n  10 times).\n\n## Limitations\n\n* XSD is not supported.\n* DTD is not supported; when Saxy encounters a `\u003c!DOCTYPE`, it skips it.\n* Only UTF-8 encoding is supported.\n\n## Contributing\n\nIf you have any issues or ideas, feel free to write to https://github.com/qcam/saxy/issues.\n\nTo start developing:\n\n1. Fork the repository.\n2. Write your code and related tests.\n3. 
Create a pull request at https://github.com/qcam/saxy/pulls.\n\n## Copyright and License\n\nCopyright (c) 2018 Cẩm Huỳnh\n\nThis software is licensed under [the MIT license](./LICENSE.md).\n\n[saxmerl]: https://github.com/qcam/saxmerl\n[sweet_xml]: https://github.com/kbrw/sweet_xml\n[sax-guide]: https://hexdocs.pm/saxy/getting-started-with-sax.html\n","funding_links":[],"categories":["XML"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqcam%2Fsaxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqcam%2Fsaxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqcam%2Fsaxy/lists"}