https://github.com/suhteevah/wraith-dom
no_std HTML parser with CSS selectors and Cloudflare challenge solver
browser-engine cloudflare-bypass css-selectors html-parser no-std rust web-scraping
- Host: GitHub
- URL: https://github.com/suhteevah/wraith-dom
- Owner: suhteevah
- License: apache-2.0
- Created: 2026-04-02T17:07:38.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-06-09T09:32:48.000Z (17 days ago)
- Last Synced: 2026-06-09T11:16:05.612Z (17 days ago)
- Topics: browser-engine, cloudflare-bypass, css-selectors, html-parser, no-std, rust, web-scraping
- Language: Rust
- Size: 49.8 KB
- Stars: 3
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE-APACHE
# wraith-dom
A minimal `#![no_std]` HTML parser with CSS selectors and a Cloudflare challenge solver.
Designed for environments without the standard library: bare-metal systems, WebAssembly, embedded devices, and anywhere you need lightweight HTML processing without pulling in a full browser engine.
## Features
- **HTML parsing** -- tokenizer and tree builder producing a flat node arena with parent/child indices
- **CSS selectors** -- tag, id, class, attribute presence/value, descendant combinators, comma-separated alternatives
- **Form detection** -- extract all forms and inputs; heuristic login/OAuth form finder
- **Text extraction** -- visible text (skipping script/style), page title, and link extraction
- **Cloudflare IUAM bypass** (optional, behind `cloudflare` feature) -- detect and solve Cloudflare "Under Attack Mode" challenge pages via [js-lite](https://github.com/suhteevah/js-lite)
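To make the "flat node arena with parent/child indices" idea concrete, here is a minimal, self-contained sketch. It is not wraith-dom's actual data model: the `Arena`, `Node`, `NodeKind`, `push`, `visible_text`, and `sample` names are all hypothetical and exist only for illustration. The sketch stores every node in one `Vec`, links nodes by index instead of by pointer, and shows how visible-text extraction can skip `script` and `style` subtrees.

```rust
/// Hypothetical node payload: an element with a tag name, or a text run.
#[derive(Debug)]
enum NodeKind {
    Element(String),
    Text(String),
}

/// One arena slot. Links are indices into `Arena::nodes`, not pointers.
#[derive(Debug)]
struct Node {
    kind: NodeKind,
    parent: Option<usize>,
    children: Vec<usize>,
}

#[derive(Debug, Default)]
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Append a node, register it with its parent, and return its index.
    fn push(&mut self, kind: NodeKind, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node { kind, parent, children: Vec::new() });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    /// Collect text below `id` depth-first, skipping script and style subtrees.
    fn visible_text(&self, id: usize, out: &mut String) {
        match &self.nodes[id].kind {
            NodeKind::Text(t) => out.push_str(t),
            NodeKind::Element(tag) if tag == "script" || tag == "style" => {}
            NodeKind::Element(_) => {
                for &child in &self.nodes[id].children {
                    self.visible_text(child, out);
                }
            }
        }
    }
}

/// Hand-builds <body><p>Hello</p><script>var x = 1;</script></body>.
fn sample() -> (Arena, usize) {
    let mut arena = Arena::default();
    let body = arena.push(NodeKind::Element("body".to_string()), None);
    let p = arena.push(NodeKind::Element("p".to_string()), Some(body));
    arena.push(NodeKind::Text("Hello".to_string()), Some(p));
    let script = arena.push(NodeKind::Element("script".to_string()), Some(body));
    arena.push(NodeKind::Text("var x = 1;".to_string()), Some(script));
    (arena, body)
}
```

Because nodes refer to each other by `usize` index, the whole tree lives in a single allocation-backed `Vec`, which tends to suit `alloc`-only environments better than a graph of reference-counted pointers.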
## Usage
Add to your `Cargo.toml`:
```toml
[dependencies]
wraith-dom = "0.1"
```
To enable Cloudflare challenge solving:
```toml
[dependencies]
wraith-dom = { version = "0.1", features = ["cloudflare"] }
```
### Parse HTML and query elements
```rust
use wraith_dom::{parse, Selector, select};
let doc = parse(r#"
<p class="intro">Hello</p>
<p>World</p>
"#);
// Select by tag
let sel = Selector::parse("p").unwrap();
let matches = select(&doc, &sel);
assert_eq!(matches.len(), 2);
// Select by class
let sel = Selector::parse(".intro").unwrap();
let matches = select(&doc, &sel);
assert_eq!(matches.len(), 1);
// Get text content
let text = doc.inner_text(matches[0]);
assert_eq!(text, "Hello");
```
### Extract forms
```rust
use wraith_dom::{parse, find_forms, find_login_form};
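let doc = parse(r#"
<form action="/login" method="POST">
  <input type="text" name="username">
  <input type="password" name="password">
  <button type="submit">Sign In</button>
</form>
"#);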
let forms = find_forms(&doc);
assert_eq!(forms.len(), 1);
assert_eq!(forms[0].action, "/login");
assert_eq!(forms[0].method, "POST");
// Heuristic login form detection
let login = find_login_form(&doc).unwrap();
assert!(login.inputs.iter().any(|i| i.input_type == "password"));
```
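This README does not describe how the login heuristic works, so the following is only one plausible sketch and not the crate's implementation. It assumes that a form is a login candidate when it has a password input, and that identity-like field names and auth-like action URLs make it a stronger candidate. The `FormSketch`, `InputSketch`, `form`, `login_score`, and `pick_login_form` names, the `name` field, and the scoring weights are all hypothetical.

```rust
/// Hypothetical stand-ins for extracted form data (not the crate's types).
struct InputSketch {
    name: String,
    input_type: String,
}

struct FormSketch {
    action: String,
    inputs: Vec<InputSketch>,
}

/// Test helper: build a form from (name, type) pairs.
fn form(action: &str, inputs: &[(&str, &str)]) -> FormSketch {
    FormSketch {
        action: action.to_string(),
        inputs: inputs
            .iter()
            .map(|&(name, ty)| InputSketch {
                name: name.to_string(),
                input_type: ty.to_string(),
            })
            .collect(),
    }
}

/// Higher means "more login-like"; 0 means the form has no password field.
fn login_score(form: &FormSketch) -> u32 {
    let has_password = form
        .inputs
        .iter()
        .any(|i| i.input_type.eq_ignore_ascii_case("password"));
    if !has_password {
        return 0;
    }
    let mut score = 2;
    // An email-typed input or an identity-like field name adds confidence.
    let has_identity = form.inputs.iter().any(|i| {
        let name = i.name.to_ascii_lowercase();
        i.input_type.eq_ignore_ascii_case("email")
            || name.contains("user")
            || name.contains("email")
            || name.contains("login")
    });
    if has_identity {
        score += 1;
    }
    // So does an action URL that looks like an authentication endpoint.
    let action = form.action.to_ascii_lowercase();
    if action.contains("login") || action.contains("signin") || action.contains("auth") {
        score += 1;
    }
    score
}

/// Return the best-scoring candidate, or None if no form has a password field.
fn pick_login_form(forms: &[FormSketch]) -> Option<&FormSketch> {
    forms
        .iter()
        .filter(|f| login_score(f) > 0)
        .max_by_key(|f| login_score(f))
}
```

Scoring rather than returning the first match lets a page with both a sign-up form and a sign-in form resolve to the more login-like one, at the cost of weights that need tuning against real pages.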
### Extract text and links
```rust
use wraith_dom::{parse, extract_text, extract_title, extract_links};
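let doc = parse(r#"
<html>
<head><title>My Page</title></head>
<body>
  <h1>Welcome</h1>
  <p>Visit <a href="https://example.com">Example</a></p>
  <script>var x = 1;</script>
</body>
</html>
"#);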
let title = extract_title(&doc);
assert_eq!(title, Some("My Page".into()));
let text = extract_text(&doc);
assert!(text.contains("Welcome"));
assert!(!text.contains("var x")); // scripts are excluded
let links = extract_links(&doc);
assert_eq!(links[0].0, "https://example.com");
assert_eq!(links[0].1, "Example");
```
### CSS selector syntax
| Pattern | Matches |
|---------|---------|
| `p` | All `<p>` elements |
| `#main` | Element with `id="main"` |
| `.active` | Elements with class `active` |
| `[type]` | Elements with a `type` attribute |
| `[type="email"]` | Elements where `type="email"` |
| `form input` | `<input>` elements inside a `<form>` (descendant) |
| `input, select` | `<input>` or `<select>` elements |
| `button.submit` | `<button>` elements with class `submit` |
| `input[type="email"]` | `<input>` elements where `type="email"` |
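As an illustration of how compound patterns such as `button.submit` or `input[type="email"]` can be broken down and matched, here is a small standalone sketch. It is not the crate's `Selector` implementation: `Compound`, `Element`, `el`, `parse_compound`, and `matches_compound` are hypothetical names, and the sketch deliberately ignores descendant combinators, comma-separated alternatives, and attribute values that contain `]`.

```rust
/// Hypothetical parsed form of one compound selector, e.g. `input[type="email"]`.
#[derive(Debug, Default, PartialEq)]
struct Compound {
    tag: Option<String>,
    id: Option<String>,
    classes: Vec<String>,
    attrs: Vec<(String, Option<String>)>, // (name, required value if any)
}

/// Hypothetical element: a tag name plus attribute pairs.
struct Element {
    tag: String,
    attrs: Vec<(String, String)>,
}

impl Element {
    fn attr(&self, name: &str) -> Option<&str> {
        self.attrs.iter().find(|(k, _)| k == name).map(|(_, v)| v.as_str())
    }
}

/// Test helper: build an element from (name, value) pairs.
fn el(tag: &str, attrs: &[(&str, &str)]) -> Element {
    Element {
        tag: tag.to_string(),
        attrs: attrs.iter().map(|&(k, v)| (k.to_string(), v.to_string())).collect(),
    }
}

fn is_delim(c: char) -> bool {
    c == '#' || c == '.' || c == '['
}

/// Parse `tag#id.class[attr="value"]` pieces; returns None on malformed input.
fn parse_compound(src: &str) -> Option<Compound> {
    let mut rest = src.trim();
    if rest.is_empty() {
        return None;
    }
    let mut out = Compound::default();
    // Optional leading tag name, up to the first '#', '.', or '['.
    let end = rest.find(is_delim).unwrap_or(rest.len());
    if end > 0 {
        out.tag = Some(rest[..end].to_ascii_lowercase());
    }
    rest = &rest[end..];
    while let Some(first) = rest.chars().next() {
        let body = &rest[first.len_utf8()..];
        match first {
            '#' | '.' => {
                let end = body.find(is_delim).unwrap_or(body.len());
                if end == 0 {
                    return None; // "#" or "." with no name after it
                }
                let name = body[..end].to_string();
                if first == '#' {
                    out.id = Some(name);
                } else {
                    out.classes.push(name);
                }
                rest = &body[end..];
            }
            '[' => {
                let close = body.find(']')?;
                let inner = &body[..close];
                let attr = match inner.find('=') {
                    Some(eq) => (
                        inner[..eq].trim().to_string(),
                        Some(inner[eq + 1..].trim().trim_matches('"').to_string()),
                    ),
                    None => (inner.trim().to_string(), None),
                };
                if attr.0.is_empty() {
                    return None;
                }
                out.attrs.push(attr);
                rest = &body[close + 1..];
            }
            _ => return None,
        }
    }
    Some(out)
}

/// True when every part of the compound selector is satisfied by the element.
fn matches_compound(sel: &Compound, e: &Element) -> bool {
    let tag_ok = sel.tag.as_ref().map_or(true, |t| t.eq_ignore_ascii_case(&e.tag));
    let id_ok = sel.id.as_ref().map_or(true, |id| e.attr("id") == Some(id.as_str()));
    let classes_ok = sel.classes.iter().all(|c| {
        e.attr("class").map_or(false, |v| v.split_whitespace().any(|x| x == c))
    });
    let attrs_ok = sel.attrs.iter().all(|(name, want)| match (e.attr(name), want) {
        (Some(have), Some(want)) => have == want,
        (Some(_), None) => true,
        (None, _) => false,
    });
    tag_ok && id_ok && classes_ok && attrs_ok
}
```

A descendant pattern like `form input` could then be modelled as a list of compounds checked against an element and its ancestors, and `input, select` as a list of alternatives where any match succeeds.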
## no_std
This crate is `#![no_std]` and requires only `alloc`. It has no dependencies beyond `log` (for optional debug logging). The `cloudflare` feature adds a dependency on `js-lite` for JavaScript evaluation.
## License
Licensed under either of
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT license ([LICENSE-MIT](LICENSE-MIT))
at your option.
## Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
---
## Support This Project
If you find this project useful, consider buying me a coffee! Your support helps me keep building and sharing open-source tools.
**PayPal:** [baal_hosting@live.com](https://paypal.me/baal_hosting)
Every donation, no matter how small, is greatly appreciated and motivates continued development. Thank you!