https://github.com/awolverp/markupever
The fast, most optimal, and correct HTML & XML parsing library for Python written in Rust.
- Host: GitHub
- URL: https://github.com/awolverp/markupever
- Owner: awolverp
- License: mpl-2.0
- Created: 2024-12-30T15:32:00.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-05-22T07:46:20.000Z (8 months ago)
- Last Synced: 2025-10-27T05:24:00.363Z (3 months ago)
- Topics: html5ever, library, markup-languages, parser, python, rust, scraping, selectors, web-scraping
- Language: Rust
- Homepage: https://awolverp.github.io/markupever
- Size: 1.32 MB
- Stars: 25
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
The fast, most optimal, and correct HTML & XML parsing library
Documentation | Releases | Benchmarks

------
MarkupEver is a modern, high-performance HTML and XML parsing library for Python, written in Rust.
**KEY FEATURES:**
* **Fast**: High performance, thanks to **[html5ever](https://github.com/servo/html5ever)** and **[selectors](https://github.com/servo/stylo/tree/main/selectors)**.
* **Easy**: Designed to be easy to learn and use, with editor completion everywhere.
* **Low-Memory**: Written in Rust with a low memory footprint. Memory is managed by Rust's allocator, so you don't need to worry about memory leaks.
* **Thread-safe**: Completely thread-safe.
* **Querying**: Use your **CSS** knowledge to select elements from an HTML or XML document.
* **Streaming**: Supports incremental (streaming) parsing.
## Installation
Installing inside a virtual environment is recommended. Install MarkupEver with **pip**:
```console
$ pip3 install markupever
```
## Example
### Parse
Parse an HTML document and select elements:
```python
import markupever
dom = markupever.parse_file("file.html", "html")
# Or parse HTML content directly from a string:
# dom = markupever.parse("... content ...", "html")
for element in dom.select("div.section > p:nth-child(1)"):
    print(element.text())
```
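The same steps can be wrapped in a small helper that works on an in-memory string. This is a minimal sketch that uses only the calls shown above (`markupever.parse`, `select`, `text`); the helper name `first_paragraph_texts` and the sample markup are hypothetical, and it assumes `text()` returns a plain string.

```python
def first_paragraph_texts(html: str) -> list[str]:
    """Collect the text of every <p> that is the first child of a div.section."""
    # Imported here so the helper can be defined even before the package is installed.
    import markupever

    # Parse the string as HTML, mirroring the commented-out call in the example above.
    dom = markupever.parse(html, "html")

    # Standard CSS selector: a <p> that is the first child of <div class="section">.
    return [element.text() for element in dom.select("div.section > p:nth-child(1)")]


if __name__ == "__main__":
    # Hypothetical sample document, used only for illustration.
    sample = '<div class="section"><p>First</p><p>Second</p></div>'
    for text in first_paragraph_texts(sample):
        print(text)
```

Returning a list instead of printing inside the loop keeps the selection logic reusable from other code.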
### Create DOM
Build a DOM from scratch:
```python
from markupever import dom
# Use a separate name so the imported `dom` module is not shadowed.
tree = dom.TreeDom()
root: dom.Document = tree.root()
root.create_doctype("html")
html = root.create_element("html", {"lang": "en"})
body = html.create_element("body")
body.create_text("Hello Everyone ...")
print(root.serialize())
# <!DOCTYPE html>
# <html lang="en">
# <body>Hello Everyone ...</body>
# </html>
```