https://github.com/GULPF/nimquery
Nim library for querying HTML using CSS selectors (like JavaScript's `document.querySelector`)
- Host: GitHub
- URL: https://github.com/GULPF/nimquery
- Owner: GULPF
- License: mit
- Created: 2017-08-13T20:31:07.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-06T20:36:52.000Z (almost 2 years ago)
- Last Synced: 2024-05-12T09:34:27.640Z (6 months ago)
- Topics: html, nim, scraping, web
- Language: Nim
- Homepage:
- Size: 126 KB
- Stars: 131
- Watchers: 8
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: changelog.md
- License: LICENSE
Awesome Lists containing this project
- awesome-nim - Nimquery - Library for querying HTML using CSS selectors, like JavaScript's `document.querySelector`. (Web / HTML Parsers)
README
# Nimquery ![CI](https://github.com/GULPF/nimquery/workflows/CI/badge.svg)
A library for querying HTML using CSS selectors, like JavaScript's `document.querySelector`/`document.querySelectorAll`.

## Installation
Nimquery is available on Nimble:
```
nimble install nimquery
```

## Usage
```nim
from xmltree import `$`
from htmlparser import parseHtml
import nimquery

let html = """
<html>
  <head><title>Example</title></head>
  <body>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
  </body>
</html>
"""
let xml = parseHtml(html)
let elements = xml.querySelectorAll("p:nth-child(odd)")
echo elements
# => @[<p>1</p>, <p>3</p>]
```

## API
```nim
proc querySelectorAll*(root: XmlNode,
                       queryString: string,
                       options: set[QueryOption] = DefaultQueryOptions): seq[XmlNode]
```
Get all elements matching `queryString`.
Raises `ParseError` if parsing of `queryString` fails.
See [Options](#options) for information about the `options` parameter.

- - -
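Since `querySelectorAll` raises `ParseError` on an invalid query string, code that accepts selectors from outside may want to catch it. The following is a minimal sketch, assuming `ParseError` is exported by `nimquery` as the description above implies. `safeQuery` is a hypothetical helper, not part of the library, and `"p["` is only one example of a selector that is expected to fail parsing.

```nim
import htmlparser, xmltree
import nimquery

proc safeQuery(root: XmlNode, selector: string): seq[XmlNode] =
  ## Hypothetical helper (not part of Nimquery): returns an empty seq
  ## instead of propagating ParseError for a malformed selector.
  try:
    result = root.querySelectorAll(selector)
  except ParseError as e:
    echo "invalid selector: ", e.msg
    result = @[]

let root = parseHtml("<html><body><p>text</p></body></html>")
echo safeQuery(root, "p").len   # a valid selector
echo safeQuery(root, "p[").len  # unterminated attribute selector
```

Returning an empty sequence keeps the call site simple. Whether to swallow the error or re-raise it depends on the application.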
```nim
proc querySelector*(root: XmlNode,
                    queryString: string,
                    options: set[QueryOption] = DefaultQueryOptions): XmlNode
```
Get the first element matching `queryString`, or `nil` if no such element exists.
Raises `ParseError` if parsing of `queryString` fails.
See [Options](#options) for information about the `options` parameter.

- - -
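Because `querySelector` returns `nil` when nothing matches, the result should be checked before it is used. A short sketch of that pattern follows. The markup and the selectors `#title` and `table` are made up for illustration, and `innerText` comes from the standard `xmltree` module.

```nim
import htmlparser, xmltree
import nimquery

let doc = parseHtml("""
<html>
  <body>
    <h1 id="title">Hello</h1>
  </body>
</html>
""")

# A match: the returned node can be used directly.
let heading = doc.querySelector("#title")
if heading != nil:
  echo heading.innerText

# No match: querySelector returns nil rather than raising.
let missing = doc.querySelector("table")
if missing.isNil:
  echo "no <table> element found"
```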
```nim
proc parseHtmlQuery*(queryString: string,
                     options: set[QueryOption] = DefaultQueryOptions): Query
```
Parses a query for later use.
Raises `ParseError` if parsing of `queryString` fails.
See [Options](#options) for information about the `options` parameter.

- - -
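Parsing a query once and reusing it can be convenient when the same selector is applied to many documents. The sketch below combines `parseHtmlQuery` with `exec` (documented next). The two HTML strings and the selector `p.note` are invented for the example.

```nim
import htmlparser, xmltree
import nimquery

# Parse the selector a single time.
let noteQuery = parseHtmlQuery("p.note")

let pages = [
  """<html><body><p class="note">a</p><p>b</p></body></html>""",
  """<html><body><p class="note">c</p><p class="note">d</p></body></html>"""
]

var counts: seq[int]
for page in pages:
  let root = parseHtml(page)
  # single = false collects every match in the document.
  let all = noteQuery.exec(root, single = false)
  # single = true yields at most one element.
  let first = noteQuery.exec(root, single = true)
  counts.add all.len
  echo all.len, " match(es); single query returned ", first.len
```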
```nim
proc exec*(query: Query,
           root: XmlNode,
           single: bool): seq[XmlNode]
```
Execute an already parsed query. If `single = true`, it will never return more than one element.

### Options
The `QueryOption` enum contains flags for configuring the behavior when parsing/searching:

- `optUniqueIds`: Indicates if id attributes should be assumed to be unique.
- `optSimpleNot`: Indicates if only simple selectors are allowed as an argument to the `:not(...)` pseudo-class. Note that combinators are not allowed in the argument even if this flag is excluded.
- `optUnicodeIdentifiers`: Indicates if Unicode characters are allowed inside identifiers. Doesn't affect strings, where Unicode is always allowed.

The default options are defined as `const DefaultQueryOptions* = { optUniqueIds, optUnicodeIdentifiers, optSimpleNot }`.
Below is an example of using the options parameter to allow a complex `:not(...)` selector.
```nim
import xmltree
import htmlparser
import streams
import nimquery

let html = """
<html>
  <head><title>Example</title></head>
  <body>
    <p>1</p>
    <p class="maybe-skip">2</p>
    <p class="maybe-skip">3</p>
    <p>4</p>
  </body>
</html>
"""
let xml = parseHtml(newStringStream(html))
let options = DefaultQueryOptions - { optSimpleNot }
let elements = xml.querySelectorAll("p:not(.maybe-skip:nth-child(even))", options)
echo elements
# => @[<p>1</p>, <p class="maybe-skip">3</p>, <p>4</p>]
```

## Unsupported selectors
Nimquery supports all [CSS3 selectors](https://www.w3.org/TR/css3-selectors) except the following: `:root`, `:link`, `:visited`, `:active`, `:hover`, `:focus`, `:target`, `:lang(...)`, `:enabled`, `:disabled`, `:checked`, `::first-line`, `::first-letter`, `::before`, `::after`. These selectors will not be implemented because they don't make much sense in the situations where Nimquery is useful.
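Selectors outside that list, such as combinators and attribute selectors, fall within the CSS3 set that Nimquery covers. As a rough illustration for a scraping task, the sketch below combines an id selector, the child combinator, and an attribute-prefix selector. The markup and URLs are made up, and `attr` comes from the standard `xmltree` module.

```nim
import htmlparser, xmltree
import nimquery

let root = parseHtml("""
<html>
  <body>
    <ul id="links">
      <li><a href="https://nim-lang.org">Nim</a></li>
      <li><a href="/docs">Docs</a></li>
      <li><span>no link</span></li>
    </ul>
  </body>
</html>
""")

# Anchors that are grandchildren of #links and whose href starts with "https".
var hrefs: seq[string]
for a in root.querySelectorAll("""#links > li > a[href^="https"]"""):
  hrefs.add a.attr("href")
echo hrefs
```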