https://github.com/madeindjs/crystagiri
An HTML parser library for Crystal (like Nokogiri for Ruby)
crystal html-parser-library
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/madeindjs/crystagiri
- Owner: madeindjs
- License: mit
- Created: 2016-12-06T15:41:15.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2020-05-01T17:29:09.000Z (over 5 years ago)
- Last Synced: 2025-04-09T01:36:58.967Z (6 months ago)
- Topics: crystal, html-parser-library
- Language: Crystal
- Homepage: https://madeindjs.github.io/Crystagiri/
- Size: 76.2 KB
- Stars: 135
- Watchers: 8
- Forks: 9
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# Crystagiri
An HTML parser library for Crystal like the amazing [Nokogiri](https://github.com/sparklemotion/nokogiri) Ruby gem.
> I won't pretend that **Crystagiri** does as much as **Nokogiri**. All help is welcome! :)
## Installation
Add this to your application's `shard.yml`:
```yaml
dependencies:
  crystagiri:
    github: madeindjs/crystagiri
```

and then run
```bash
$ shards install
```

## Usage
```crystal
require "crystagiri"
```

Then you can simply instantiate a `Crystagiri::HTML` object from an HTML `String` like this:
```crystal
doc = Crystagiri::HTML.new "<h1>Crystagiri is awesome!!</h1>"
```

... or directly load it from a Web URL or a pathname:
```crystal
doc = Crystagiri::HTML.from_file "README.md"
doc = Crystagiri::HTML.from_url "http://example.com/"
```

> You can also pass the `follow: true` flag if you want to follow URL redirects.

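
To illustrate what following a redirect involves, here is a minimal sketch that uses only Crystal's standard library. It is not Crystagiri's implementation of `follow: true`; the helpers `redirect_target` and `fetch_following_redirects` and the redirect limit are hypothetical names chosen for this example.

```crystal
require "http/client"
require "uri"

# Hypothetical helper (not part of Crystagiri): resolve a `Location` header
# against the URL that produced it, since the header may be relative.
def redirect_target(current_url : String, location : String) : String
  URI.parse(current_url).resolve(location).to_s
end

# Hypothetical helper: fetch `url`, following at most `limit` redirects,
# and return the body of the final response.
def fetch_following_redirects(url : String, limit : Int32 = 5) : String
  response = HTTP::Client.get(url)
  location = response.headers["Location"]?
  if response.status.redirection? && location
    raise "too many redirects" if limit <= 0
    return fetch_following_redirects(redirect_target(url, location), limit - 1)
  end
  response.body
end

puts redirect_target("http://example.com/old/page", "/new/page")
# => http://example.com/new/page
```

The string returned by `fetch_following_redirects` could then be handed to `Crystagiri::HTML.new`, which accepts an HTML `String` as shown above.
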
Then you can search all [`XML::Node`](https://crystal-lang.org/api/XML/Node.html)s from the `Crystagiri::HTML` instance. The tags found will be `Crystagiri::Tag` objects with the `.node` property:
* CSS query
```Crystal
doc.css("li > strong.title") { |tag| puts tag.node }
# => ..
# => ..
```

> **Known limitations**: Currently, you can't use CSS queries with complex search specifiers like `:nth-child`

* HTML tag
```Crystal
doc.where_tag("h2") { |tag| puts tag.content }
# => Development
# => Contributing
```

* HTML id
```Crystal
puts doc.at_id("main-content").tagname
# => div
```

* HTML class attribute
```Crystal
doc.where_class("summary") { |tag| puts tag.node }
# =>..
# =>..
# =>..
```

## Benchmark
I know you love benchmarks between **Ruby** & **Crystal**, so here's one:
```ruby
require "nokogiri"
t1 = Time.now
doc = Nokogiri::HTML File.read("spec/fixture/HTML.html")
100_000.times do
  doc.at_css("h1")
  doc.css(".step-title") { |tag| tag }
end
# Subtracting two Time objects yields seconds, not milliseconds
puts "executed in #{Time.now - t1} seconds"
```

> executed in 00:00:11.10 seconds with Ruby 2.6.0 with RVM on an old Mac
```crystal
require "crystagiri"
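# Time.monotonic is intended for measuring elapsed time;
# Time.now was removed from later Crystal releases
t = Time.monotonic
doc = Crystagiri::HTML.from_file "./spec/fixture/HTML.html"
100_000.times do
  doc.at_css("h1")
  doc.css(".step-title") { |tag| tag }
end
# The difference is a Time::Span, printed as hh:mm:ss
puts "executed in #{Time.monotonic - t}"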
```

> executed in 00:00:03.09 seconds on Crystal 0.27.2 on LLVM 6.0.1 with release flag

Crystagiri is more than **three times faster** than Nokogiri in this benchmark!!
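
The manual timestamps above can also be replaced by Crystal's standard `Benchmark` module. The sketch below is only an illustration of that approach: it times XPath lookups from the stdlib `XML` module on a small inline document as a stand-in workload, so the document, the iteration count, and the queries are assumptions and its timings are not comparable to the figures reported above.

```crystal
require "benchmark"
require "xml"

# Stand-in document (assumption); the benchmark above reads spec/fixture/HTML.html instead.
html = <<-HTML
  <html><body>
    <h1>Title</h1>
    <p class="step-title">One</p>
    <p class="step-title">Two</p>
  </body></html>
  HTML

doc = XML.parse_html(html)
matches = 0

# Benchmark.realtime runs the block once and returns the elapsed Time::Span.
elapsed = Benchmark.realtime do
  10_000.times do
    doc.xpath_node("//h1")
    matches += doc.xpath_nodes("//*[@class='step-title']").size
  end
end

puts "executed in #{elapsed.total_seconds.round(3)} seconds (#{matches} matches)"
```

As with the original benchmark, build with the `--release` flag before comparing numbers, since unoptimized builds are considerably slower.
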
## Development
Clone this repository and navigate to it:
```bash
$ git clone https://github.com/madeindjs/crystagiri.git
$ cd crystagiri
```

You can generate all documentation with
```bash
$ crystal doc
```

And run **spec** tests to ensure everything works correctly
```bash
$ crystal spec
```

## Contributing
Do you like this project? You can find some issues to get started [here](https://github.com/madeindjs/Crystagiri/issues/).

Contributing is simple:
1. Fork it ( https://github.com/madeindjs/crystagiri/fork )
2. Create your feature branch `git checkout -b my-new-feature`
3. Commit your changes `git commit -am "Add some feature"`
4. Push to the branch `git push origin my-new-feature`
5. Create a new Pull Request

## Contributors
See the [list on GitHub](https://github.com/madeindjs/Crystagiri/graphs/contributors)