https://github.com/guzba/smelly

Sometimes you have to parse XML 💩
nim xml xml-parser

Host: GitHub
URL: https://github.com/guzba/smelly
Owner: guzba
License: mit
Created: 2024-03-10T02:06:19.000Z (about 1 year ago)
Default Branch: master
Last Pushed: 2024-12-22T07:37:26.000Z (5 months ago)
Last Synced: 2025-04-09T16:20:40.227Z (about 1 month ago)
Topics: nim, xml, xml-parser
Language: Nim
Homepage:
Size: 122 KB
Stars: 8
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Smelly

`nimble install smelly`

[API reference](https://guzba.github.io/smelly/)

Sometimes you need to parse XML. Pinch your nose and get it over with.

This package is an alternative to Nim's standard library XML parser (std/xmlparser + std/xmltree).

## Using Smelly

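```nim
import smelly

# The root element name is a placeholder; the child tags and their
# stroke-width values match the output shown below.
let s = """
<root>
  <rect stroke-width="2" />
  <ellipse stroke-width="20" />
  Some text content here
</root>
"""

let root = parseXml(s)

for child in root.children:
  case child.kind:
  of ElementNode:
    echo child.tag, ' ', child.attributes["stroke-width"]
  of TextNode:
    echo child.content
```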

```
rect 2
ellipse 20
Some text content here
```

## Why create an alternative?

After working with Nim's standard library XML parsing for a while, I have found some sources of frustration that I want to avoid.

Nim's std/xmltree uses `[]` on a node to access its children and `.attr()` (or `.attrs[]`) to access attributes. This isn't so bad (it makes reaching deep into a node tree easy), but I always have to relearn whether `[]` accesses children or attributes after I've been away from XML parsing for a bit. With Smelly there is no ambiguity: it's just `.children[]` and `.attributes[]`.
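
For reference, here is a minimal sketch of the std/xmltree access style described above. The `<shapes>` document is made up for illustration; `[]`, `tag`, `attr`, and `attrs` come from the standard library.

```nim
import std/xmlparser, std/xmltree, std/strtabs

# Hypothetical input, used only for illustration.
let doc = parseXml("""<shapes><rect stroke-width="2" /></shapes>""")

let first = doc[0]                   # `[]` on a node indexes its children
echo first.tag                       # rect
echo first.attr("stroke-width")      # 2 (attribute lookup by proc)
echo first.attrs["stroke-width"]     # 2 (attribute lookup via the string table)
```

The same `[]` operator means "child" on the node and "attribute" on `attrs`, which is the overlap Smelly avoids by naming both fields explicitly.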

Even more annoyingly, Nim's std/xmlparser treats every encoded entity (e.g. `&lt;`) as a separate node in the tree instead of as a text-encoding detail.

If an encoded entity is present:

```nim
import std/xmlparser, std/xmltree

let root = parseXml("<thing>1 &lt; 2</thing>")

echo root.tag # thing
echo root.len # 4 ?????
echo '"', root[0], '"' # "1 " ?????
```

And if an encoded entity is not present:

```nim
import std/xmlparser, std/xmltree

let root = parseXml("<thing>1 or 2</thing>")

echo root.tag # thing
echo root.len # 1
echo '"', root[0], '"' # "1 or 2"
```

This drastic difference in behavior based on the presence of an encoded entity is not cool with me.
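
To see how the standard library splits the text, you can list the child nodes and then rejoin them with `innerText` from std/xmltree. This is only a sketch of a workaround, assuming the same `<thing>` input as the entity example; it is not part of Smelly.

```nim
import std/xmlparser, std/xmltree

let node = parseXml("<thing>1 &lt; 2</thing>")

# Print each child separately to show where the text was split.
for child in node:
  echo '"', child, '"'

# innerText concatenates the text of all descendant nodes into one string.
let joined = node.innerText
echo joined
```

Rejoining works, but every caller has to remember to do it, and indexing with `root[0]` still returns only the first fragment.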

Here is how Smelly handles this:

```nim
import smelly

let root = parseXml("<thing>1 &lt; 2</thing>")

echo root.tag # thing
echo root.children.len # 1
echo '"', root.children[0].content, '"' # "1 < 2"
```

And if an encoded entity is not present:

```nim
import smelly

let root = parseXml("<thing>1 or 2</thing>")

echo root.tag # thing
echo root.children.len # 1
echo '"', root.children[0].content, '"' # "1 or 2"
```

## Performance

Smelly is between 2x and 7x faster than `std/xmlparser` depending on the XML input.

See `tests/bench.nim` to test this for yourself.

## Testing

`nimble test`
