https://github.com/guzba/smelly

# Smelly

`nimble install smelly`

[API reference](https://guzba.github.io/smelly/)

Sometimes you need to parse XML. Pinch your nose and get it over with.

This package is an alternative to Nim's standard library XML parser (std/xmlparser + std/xmltree).

## Using Smelly

```nim
import smelly

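# Illustrative SVG-like input; the elements and attributes match the output below.
let s = """
<svg>
  <rect stroke-width="2"/>
  <ellipse stroke-width="20"/>
  Some text content here
</svg>
"""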

let root = parseXml(s)

for child in root.children:
  case child.kind:
  of ElementNode:
    echo child.tag, ' ', child.attributes["stroke-width"]
  of TextNode:
    echo child.content
```

```
rect 2
ellipse 20
Some text content here
```

## Why create an alternative?

After working with Nim's standard library XML parsing for a while, I have found some sources of frustration that I want to avoid.

Nim's std/xmltree uses `[]` on a node to access its children and uses `.attrs[]` to access attributes. This isn't so bad (it makes reaching deep into a node tree easy), but I always have to re-learn whether `[]` accesses children or attributes after I've been away from XML parsing for a bit. With Smelly there is no ambiguity: it's just `.children[]` and `.attributes[]`.
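For reference, here is a minimal sketch of the standard library side of that comparison. The input string is made up for illustration; it only uses `std/xmlparser`, `std/xmltree`, and `std/strtabs` (imported explicitly for indexing the attribute table).

```nim
import std/[strtabs, xmlparser, xmltree]

# Hypothetical input, chosen only to illustrate the accessors.
let root = parseXml("""<svg><rect stroke-width="2"/></svg>""")

# `[]` with an integer index returns a child node...
let rect = root[0]
echo rect.tag                    # rect

# ...while attributes are read through `attr` or the `attrs` table.
echo rect.attr("stroke-width")   # 2
echo rect.attrs["stroke-width"]  # 2
```

With Smelly, the same lookups would read `root.children[0]` and `root.children[0].attributes["stroke-width"]`, as in the usage example above.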

Even more annoyingly, Nim's std/xmlparser treats every encoded entity (e.g. `&lt;`) as a separate node in the tree instead of as a text-encoding detail.

If an encoded entity is present:

```nim
import std/xmlparser, std/xmltree

let root = parseXml("<thing>1 &lt; 2</thing>")
echo root.tag # thing
echo root.len # 4 ?????
echo '"', root[0], '"' # "1 " ?????
```

And if an encoded entity is not present:

```nim
import std/xmlparser, std/xmltree

let root = parseXml("<thing>1 or 2</thing>")
echo root.tag # thing
echo root.len # 1
echo '"', root[0], '"' # "1 or 2"
```

This drastic difference in behavior based on the presence of an encoded entity is not cool with me.

Here is how Smelly handles this:

```nim
import smelly

let root = parseXml("<thing>1 &lt; 2</thing>")
echo root.tag # thing
echo root.children.len # 1
echo '"', root.children[0].content, '"' # "1 < 2"
```

And if an encoded entity is not present:

```nim
import smelly

let root = parseXml("<thing>1 or 2</thing>")
echo root.tag # thing
echo root.children.len # 1
echo '"', root.children[0].content, '"' # "1 or 2"
```

## Performance

Smelly is between 2x and 7x faster than `std/xmlparser` depending on the XML input.

See `tests/bench.nim` to test this for yourself.
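
If you just want a rough feel for the difference on your own input, a quick timing sketch might look like the following. This is not the repository's benchmark: the `timeIt` template and the generated input are assumptions made up for illustration, and a single run like this is noisy. Compile with `-d:release` for meaningful numbers.

```nim
import std/[monotimes, strutils, times]
import std/xmlparser
import smelly

# Hypothetical workload: one root element holding many small children.
let input = "<root>" & "<thing>1 &lt; 2</thing>".repeat(10_000) & "</root>"

template timeIt(label: string, body: untyped) =
  # Times a single run of `body` using the monotonic clock.
  let start = getMonoTime()
  body
  let elapsed = getMonoTime() - start
  echo label, ": ", elapsed.inMicroseconds, " us"

# Both modules export `parseXml`, so the calls are module-qualified.
timeIt "std/xmlparser":
  discard xmlparser.parseXml(input)

timeIt "smelly":
  discard smelly.parseXml(input)
```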

## Testing

`nimble test`