{"id":16065473,"url":"https://github.com/razrfalcon/roxmltree","last_synced_at":"2025-05-14T02:08:22.951Z","repository":{"id":46614993,"uuid":"146915271","full_name":"RazrFalcon/roxmltree","owner":"RazrFalcon","description":"Represent an XML document as a read-only tree.","archived":false,"fork":false,"pushed_at":"2025-01-28T18:20:57.000Z","size":1973,"stargazers_count":469,"open_issues_count":7,"forks_count":38,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-22T11:42:26.975Z","etag":null,"topics":["xml"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RazrFalcon.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-08-31T16:12:31.000Z","updated_at":"2025-04-20T02:29:27.000Z","dependencies_parsed_at":"2023-02-18T17:30:42.681Z","dependency_job_id":"fb02b790-1887-401b-91f9-363aee34ad2f","html_url":"https://github.com/RazrFalcon/roxmltree","commit_stats":{"total_commits":203,"total_committers":17,"mean_commits":"11.941176470588236","dds":"0.24630541871921185","last_synced_commit":"75f088c979346960038870eca975a62f46b42f1b"},"previous_names":[],"tags_count":32,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RazrFalcon%2Froxmltree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RazrFalcon%2Froxmltree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RazrFalcon%2Froxmltree/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RazrFalcon%2Froxmltree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RazrFalcon","download_url":"https://codeload.github.com/RazrFalcon/roxmltree/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254053166,"owners_count":22006717,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["xml"],"created_at":"2024-10-09T05:12:56.113Z","updated_at":"2025-05-14T02:08:17.937Z","avatar_url":"https://github.com/RazrFalcon.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# roxmltree\n![Build Status](https://github.com/RazrFalcon/roxmltree/workflows/Rust/badge.svg)\n[![Crates.io](https://img.shields.io/crates/v/roxmltree.svg)](https://crates.io/crates/roxmltree)\n[![Documentation](https://docs.rs/roxmltree/badge.svg)](https://docs.rs/roxmltree)\n[![Rust 1.60+](https://img.shields.io/badge/rust-1.60+-orange.svg)](https://www.rust-lang.org)\n\nRepresents an [XML](https://www.w3.org/TR/xml/) document as a read-only tree.\n\n```rust\n// Find element by id.\nlet doc = roxmltree::Document::parse(\"\u003crect id='rect1'/\u003e\")?;\nlet elem = doc.descendants().find(|n| n.attribute(\"id\") == Some(\"rect1\"))?;\nassert!(elem.has_tag_name(\"rect\"));\n```\n\n## Why read-only?\n\nBecause in some cases all you need is to retrieve some data from an XML document.\nAnd for such cases, we can make a lot of optimizations.\n\n## Parsing behavior\n\nSadly, XML can be parsed in many different ways. 
*roxmltree* tries to mimic the\nbehavior of Python's [lxml](https://lxml.de/).\nFor more details see [docs/parsing.md](https://github.com/RazrFalcon/roxmltree/blob/master/docs/parsing.md).\n\n## Alternatives\n\n| Feature/Crate                   | roxmltree        | [libxml2]           | [xmltree]        | [sxd-document]   |\n| ------------------------------- | :--------------: | :-----------------: | :--------------: | :--------------: |\n| Element namespace resolving     | ✓                | ✓                   | ✓                | ~\u003csup\u003e1\u003c/sup\u003e    |\n| Attribute namespace resolving   | ✓                | ✓                   |                  | ✓                |\n| [Entity references]             | ✓                | ✓                   | ×                | ×                |\n| [Character references]          | ✓                | ✓                   | ✓                | ✓                |\n| [Attribute-Value normalization] | ✓                | ✓                   |                  |                  |\n| Comments                        | ✓                | ✓                   |                  | ✓                |\n| Processing instructions         | ✓                | ✓                   | ✓                | ✓                |\n| UTF-8 BOM                       | ✓                | ✓                   | ×                | ×                |\n| Non UTF-8 input                 |                  | ✓                   |                  |                  |\n| Complete DTD support            |                  | ✓                   |                  |                  |\n| Position preserving\u003csup\u003e2\u003c/sup\u003e | ✓                | ✓                   |                  |                  |\n| HTML support                    |                  | ✓                   |                  |                  |\n| Tree modification               |                  | ✓                   | ✓                | ✓                |\n| 
Writing                         |                  | ✓                   | ✓                | ✓                |\n| No **unsafe**                   | ✓                |                     | ✓                |                  |\n| Language                        | Rust             | C                   | Rust             | Rust             |\n| Dependencies                    | **0**            | -                   | 2                | 2                |\n| Tested version                  | 0.20.0           | Apple-provided      | 0.10.3           | 0.3.2            |\n| License                         | MIT / Apache-2.0 | MIT                 | MIT              | MIT              |\n\nLegend:\n\n- ✓ - supported\n- × - parsing error\n- ~ - partial\n- *nothing* - not supported\n\nNotes:\n\n1. No default namespace propagation.\n2. *roxmltree* keeps all node and attribute positions in the original document,\n   so you can easily retrieve them if you need them.\n   See [examples/print_pos.rs](examples/print_pos.rs) for details.\n\nThere are also the `elementtree` and `treexml` crates, but they have been abandoned for a long time.\n\n[Entity references]: https://www.w3.org/TR/REC-xml/#dt-entref\n[Character references]: https://www.w3.org/TR/REC-xml/#NT-CharRef\n[Attribute-Value Normalization]: https://www.w3.org/TR/REC-xml/#AVNormalize\n\n[libxml2]: http://xmlsoft.org/\n[xmltree]: https://crates.io/crates/xmltree\n[sxd-document]: https://crates.io/crates/sxd-document\n\n## Performance\n\nHere are some benchmarks comparing `roxmltree` to other XML tree libraries.\n\n```text\ntest huge_roxmltree      ... bench:   2,997,887 ns/iter (+/- 48,976)\ntest huge_libxml2        ... bench:   6,850,666 ns/iter (+/- 306,180)\ntest huge_sdx_document   ... bench:   9,440,412 ns/iter (+/- 117,106)\ntest huge_xmltree        ... bench:  41,662,316 ns/iter (+/- 850,360)\n\ntest large_roxmltree     ... bench:   1,494,886 ns/iter (+/- 30,384)\ntest large_libxml2       ... 
bench:   3,250,606 ns/iter (+/- 140,201)\ntest large_sdx_document  ... bench:   4,242,162 ns/iter (+/- 99,740)\ntest large_xmltree       ... bench:  13,980,228 ns/iter (+/- 229,363)\n\ntest medium_roxmltree    ... bench:     421,137 ns/iter (+/- 13,855)\ntest medium_libxml2      ... bench:     950,984 ns/iter (+/- 34,099)\ntest medium_sdx_document ... bench:   1,618,270 ns/iter (+/- 23,466)\ntest medium_xmltree      ... bench:   4,315,974 ns/iter (+/- 31,849)\n\ntest tiny_roxmltree      ... bench:       2,522 ns/iter (+/- 31)\ntest tiny_libxml2        ... bench:       8,931 ns/iter (+/- 235)\ntest tiny_sdx_document   ... bench:      11,658 ns/iter (+/- 82)\ntest tiny_xmltree        ... bench:      20,215 ns/iter (+/- 303)\n```\n\nWhen comparing to streaming XML parsers `roxmltree` is slightly slower than `quick-xml`,\nbut still way faster than `xmlrs`.\nNote that streaming parsers usually do not provide a proper string unescaping,\nDTD resolving and namespaces support.\n\n```text\ntest huge_quick_xml      ... bench:   2,997,887 ns/iter (+/- 48,976)\ntest huge_roxmltree      ... bench:   3,147,424 ns/iter (+/- 49,153)\ntest huge_xmlrs          ... bench:  36,258,312 ns/iter (+/- 180,438)\n\ntest large_quick_xml     ... bench:   1,250,053 ns/iter (+/- 21,943)\ntest large_roxmltree     ... bench:   1,494,886 ns/iter (+/- 30,384)\ntest large_xmlrs         ... bench:  11,239,516 ns/iter (+/- 76,937)\n\ntest medium_quick_xml    ... bench:     206,232 ns/iter (+/- 2,157)\ntest medium_roxmltree    ... bench:     421,137 ns/iter (+/- 13,855)\ntest medium_xmlrs        ... bench:   3,975,916 ns/iter (+/- 44,967)\n\ntest tiny_quick_xml      ... bench:       2,233 ns/iter (+/- 70)\ntest tiny_roxmltree      ... bench:       2,522 ns/iter (+/- 31)\ntest tiny_xmlrs          ... 
bench:      17,155 ns/iter (+/- 429)\n```\n\n### Notes\n\nThe benchmarks were taken on an Apple M1 Pro.\nYou can run the benchmarks yourself with `cargo bench` in the `benches` dir.\n\n- Since the libraries differ in how much of XML they support, benchmarking is a bit pointless.\n- We bench *libxml2* using the *[rust-libxml]* wrapper crate.\n\n[xml-rs]: https://crates.io/crates/xml-rs\n[quick-xml]: https://crates.io/crates/quick-xml\n[rust-libxml]: https://github.com/KWARC/rust-libxml\n\n## Memory overhead\n\n`roxmltree` tries to use as little memory as possible to allow parsing\nvery large (multi-GB) XML files.\n\nThe peak memory usage doesn't directly correlate with the file size\nbut rather with the number of nodes and attributes a file has,\nhow many attributes had to be normalized (i.e. allocated),\nand how many text nodes had to be preprocessed (i.e. allocated).\n\n`roxmltree` never allocates element and attribute names, processing instructions\nand comments.\n\nBy disabling the `positions` feature, you can shave 8 bytes from each node and attribute.\n\nOn average, the overhead is around 6-8x the file size.\nFor example, our 1.1GB sample XML will peak at 7.6GB RAM with default features enabled\nand at 6.8GB RAM when `positions` is disabled.\n\n## Safety\n\n- This library must not panic. 
Any panic should be considered a critical bug and reported.\n- This library forbids `unsafe` code.\n\n## API\n\nThis library uses Rust's idiomatic API based on iterators.\nIn case you are more familiar with browser/JS DOM APIs - you can check out\n[tests/dom-api.rs](tests/dom-api.rs) to see how it can be mapped onto the Rust one.\n\nBuilt on top of this API, a mapping to the [Serde data model](https://serde.rs/data-model.html)\nis available via the [`serde-roxmltree` crate](https://crates.io/crates/serde-roxmltree).\n\n## License\n\nLicensed under either of\n\n- [Apache License v2.0](LICENSE-APACHE)\n- [MIT license](LICENSE-MIT)\n\nat your option.\n\n## Contribution\n\nUnless you explicitly state otherwise, any contribution intentionally submitted\nfor inclusion in the work by you, as defined in the Apache-2.0 license, shall be\ndual licensed as above, without any additional terms or conditions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frazrfalcon%2Froxmltree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frazrfalcon%2Froxmltree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frazrfalcon%2Froxmltree/lists"}