https://github.com/picomet/htmst
HTML to AST with positions
- Host: GitHub
- URL: https://github.com/picomet/htmst
- Owner: picomet
- License: MIT
- Created: 2024-11-22T10:08:49.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-29T16:58:36.000Z (over 1 year ago)
- Last Synced: 2025-10-30T10:46:28.175Z (8 months ago)
- Topics: ast, html, parser, python
- Language: Python
- Homepage: https://pypi.org/project/htmst
- Size: 66.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
README
# htmst

htmst is a Python library for parsing HTML into an AST in which every node records its source position.
## Installation
```bash
uv add htmst
```
or
```bash
pip install htmst
```
## Usage
```python
from htmst import HtmlAst
html = """<span foo="bar">hi</span>"""
ast = HtmlAst(html)
print(ast.root.children[0].tag) # span
print(ast.root.children[0].start.row) # 0
print(ast.root.children[0].start.col) # 0
print(ast.root.children[0].end.row) # 0
print(ast.root.children[0].end.col) # 25
print(ast.root.children[0].attrs[0].name) # foo
print(ast.root.children[0].attrs[0].value) # bar
print(ast.root.children[0].attrs[0].start.row) # 0
print(ast.root.children[0].attrs[0].start.col) # 6
print(ast.root.children[0].attrs[0].end.row) # 0
print(ast.root.children[0].attrs[0].end.col) # 15
```
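The usage above only inspects `ast.root.children[0]`. A minimal sketch of visiting every node, assuming only the attribute names shown in the example (`children`, `tag`, `start`, `end`, `row`, `col`): the `walk` helper is hypothetical, not part of htmst, and the tree is built from stand-in objects so the snippet runs without htmst installed.

```python
from types import SimpleNamespace


def walk(node, depth=0):
    """Yield (depth, node) pairs depth-first, parents before children."""
    yield depth, node
    # Assumption: leaf nodes may not expose `children`; treat that as empty.
    for child in getattr(node, "children", None) or []:
        yield from walk(child, depth + 1)


# Stand-in tree (hypothetical objects, NOT htmst classes) mirroring the
# attribute names used in the usage example above.
pos = SimpleNamespace
span = SimpleNamespace(
    tag="span",
    start=pos(row=0, col=0),
    end=pos(row=0, col=25),
    children=[],
)
root = SimpleNamespace(children=[span])

for depth, node in walk(root):
    tag = getattr(node, "tag", None)
    if tag is not None:  # skip nodes without a tag, such as the root here
        span_text = f"{node.start.row}:{node.start.col}-{node.end.row}:{node.end.col}"
        print(f"{'  ' * depth}{tag} {span_text}")
```

With htmst installed, passing `HtmlAst(html).root` to `walk` should work the same way, provided the real nodes expose `children` as the example suggests.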
### Nodes
- `DoubleTagNode`: represents a double tag (an opening tag with a matching closing tag)
- `SingleTagNode`: represents a single tag (one with no closing tag)
- `AttrNode`: represents an attribute
- `TextNode`: represents text
- `CommentNode`: represents a comment
- `DoctypeNode`: represents a doctype

Each node has a `start` and an `end` position, and each position has a `row` and a `col`.
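Positions are most useful for mapping a node back to the text it came from. The sketch below assumes that `row` and `col` are zero-based and that `end` is exclusive, which is what the numbers in the usage comments suggest but is not stated explicitly. `Pos`, `to_offset`, and `slice_source` are hypothetical helpers, not part of htmst, and the HTML string is chosen to match those numbers.

```python
from collections import namedtuple

# Hypothetical stand-in for a position object; only `row` and `col` are used.
Pos = namedtuple("Pos", ["row", "col"])


def to_offset(source, pos):
    """Convert a zero-based (row, col) position into a string offset."""
    lines = source.splitlines(keepends=True)
    # Sum the full length (including line endings) of every preceding row.
    return sum(len(line) for line in lines[: pos.row]) + pos.col


def slice_source(source, start, end):
    """Return the text between `start` (inclusive) and `end` (exclusive)."""
    return source[to_offset(source, start) : to_offset(source, end)]


html = '<span foo="bar">hi</span>'
# Attribute span reported in the usage comments: cols 6 to 15 on row 0.
print(slice_source(html, Pos(0, 6), Pos(0, 15)))  # foo="bar"
# Element span reported in the usage comments: cols 0 to 25 on row 0.
print(slice_source(html, Pos(0, 0), Pos(0, 25)))  # <span foo="bar">hi</span>
```

The same helpers accept any object with `row` and `col` attributes, so a node's `start` and `end` can be passed directly if the assumptions above hold.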
## Contributing
Contributions are welcome! Please read the [contributing guidelines](CONTRIBUTING.md) for more information.
## License
This project is licensed under the [MIT License](LICENSE).