https://github.com/easonzero/sthe

# STHE
A library to provide an easy way to extract data from HTML.

## Example

```rust
// Build the extraction options from TOML:
// `selector` picks the elements, `target` names what to extract from them.
let opt: ExtractOpt = toml::from_str(r#"
target = "href"
selector = "a"
"#).unwrap();

// Compile the options and run them against an HTML fragment.
let extract = extract_fragment(r#"<a href="www.xxx.com"></a>"#, &opt.compile().unwrap());

// Serialize the result to a TOML value and compare it with the expected output.
let extract_value = toml::Value::try_from(extract).unwrap();
let expect_value: toml::Value = toml::from_str("text = \"www.xxx.com\"").unwrap();

assert_eq!(extract_value, expect_value);
```

See also [examples/crawler.rs](examples/crawler.rs), which you can run with `cargo run --example crawler -- -c examples/opt.toml`.