https://github.com/rgladwell/microtesia
Simple microdata parsing library for Scala.
- Host: GitHub
- URL: https://github.com/rgladwell/microtesia
- Owner: rgladwell
- License: lgpl-3.0
- Created: 2015-10-07T13:43:51.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2020-02-03T09:46:58.000Z (about 6 years ago)
- Last Synced: 2025-07-05T06:54:07.486Z (8 months ago)
- Topics: html, html-parsing, html5, microdata, parsing-library, scala
- Language: Scala
- Homepage: http://rgladwell.github.io/microtesia/latest/api/#microtesia.package
- Size: 1.3 MB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# Microtesia
[Microdata](http://www.w3.org/TR/microdata/) parsing library for
Scala.
To install, add the following line to your SBT configuration:
```
libraryDependencies += "me.gladwell.microtesia" %% "microtesia" % "0.5.1"
```
To use it, bring the Microtesia API into scope and call the `parseMicrodata`
method as follows:
```scala
scala> import microtesia._
import microtesia._
scala> parseMicrodata("""<div itemscope itemtype="http://schema.org/Movie">
|   <h1 itemprop="name">Avatar</h1>
| </div>""")
res0: scala.util.Try[microtesia.MicrodataDocument] = Success(MicrodataDocument(List(MicrodataItem(List((name,MicrodataString(Avatar))),Some(http://schema.org/Movie),None))))
```
See the [API reference](http://rgladwell.github.io/microtesia/latest/api) for
more information.
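Because `parseMicrodata` returns a `scala.util.Try`, callers will usually want to handle both outcomes explicitly. The following is a minimal sketch of one way to do that, pasteable into the REPL or a script; the HTML snippet is an illustrative assumption rather than markup taken from the project, and the messages printed are arbitrary.

```scala
import scala.util.{Failure, Success}
import microtesia._

// Illustrative input only: this markup is an assumption made for the example.
val html =
  """<div itemscope itemtype="http://schema.org/Movie">
    |  <h1 itemprop="name">Avatar</h1>
    |</div>""".stripMargin

// Match on the Try instead of calling .get, so a parse error does not throw.
parseMicrodata(html) match {
  case Success(document) => println(s"Parsed: $document")
  case Failure(error)    => println(s"Could not parse microdata: ${error.getMessage}")
}
```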
Once the HTML has been parsed, you can extract microdata values using for-comprehensions:
```scala
scala> import microtesia._
import microtesia._
scala> val items = List(MicrodataItem(properties = Seq(("name", MicrodataString("Brian")))))
scala> for {
| MicrodataItem(properties, _, _) <- items
| MicrodataProperty("name", MicrodataString(string)) <- properties
| } yield string
res1: List[String] = List(Brian)
```
See [MicrodataValueSpec.scala](https://github.com/rgladwell/microtesia/blob/master/src/test/scala/microtesia/MicrodataValueSpec.scala) for more examples of microdata for-comprehensions.
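The same for-comprehension pattern can be wrapped in a small reusable function. The sketch below assumes the `MicrodataItem`, `MicrodataProperty` and `MicrodataString` extractors behave as in the example above; `stringValues` is a hypothetical helper written for this illustration and is not part of Microtesia.

```scala
import microtesia._

// Hypothetical helper (not part of Microtesia): collect every string value
// stored under `key` across the given items. The backticks make the pattern
// compare against the `key` parameter instead of binding a new variable.
def stringValues(items: Seq[MicrodataItem], key: String): Seq[String] =
  for {
    MicrodataItem(properties, _, _)                  <- items
    MicrodataProperty(`key`, MicrodataString(value)) <- properties
  } yield value

val people = List(
  MicrodataItem(properties = Seq(("name", MicrodataString("Brian")))),
  MicrodataItem(properties = Seq(("name", MicrodataString("Judith")), ("age", MicrodataString("42"))))
)

// Properties that do not match the pattern (here "age") are skipped.
val names = stringValues(people, "name")
```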
You can also query microdata using an XPath-like syntax:
```scala
scala> import microtesia._
import microtesia._
scala> val item = MicrodataItem(properties = Seq(("name", MicrodataString("Brian"))))
scala> item \ "name"
res1: MicrodataQuery = QueryResults(List(MicrodataString(Brian)))
```
## Readers
While you can use for-comprehensions to write custom parsers, Microtesia also provides a `formats` API (based on [shapeless](https://github.com/milessabin/shapeless)) that automatically deserialises `MicrodataValue` instances into value types and case classes:
```scala
scala> import microtesia._, formats._
import microtesia._
import formats._
scala> case class Person(name: String, age: Int, adult: Boolean)
defined class Person
scala> MicrodataItem(
| Seq(
| ("name", MicrodataString("hello")),
| ("age", MicrodataString("13")),
| ("adult", MicrodataString("true"))
| )
| ).convertTo[Person]
res0: scala.util.Try[Person] = Success(Person(hello,13,true))
```
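Since `convertTo` returns a `Try`, a conversion that cannot be completed can be handled without exceptions escaping. The sketch below reuses the `Person` example from above and is meant to be pasted into the REPL or a script; `describe` is a hypothetical helper introduced for illustration, and the wording of its messages is arbitrary.

```scala
import scala.util.{Failure, Success}
import microtesia._, formats._

case class Person(name: String, age: Int, adult: Boolean)

// Hypothetical helper (not part of Microtesia): summarise a conversion result.
def describe(item: MicrodataItem): String =
  item.convertTo[Person] match {
    case Success(person) => s"Parsed ${person.name}, aged ${person.age}"
    case Failure(error)  => s"Could not convert item: ${error.getMessage}"
  }

val item = MicrodataItem(
  Seq(
    ("name", MicrodataString("hello")),
    ("age", MicrodataString("13")),
    ("adult", MicrodataString("true"))
  )
)

println(describe(item))
```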
## Releasing
To release a new, [tagged](https://git-scm.com/book/en/v2/Git-Basics-Tagging) version of Microtesia, execute the following:
```sh
$ sbt +publish
$ sbt ghpages-push-site
```
## License
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this program. If not, see
<http://www.gnu.org/licenses/>.
Copyright 2015-2018 [Ricardo Gladwell](http://gladwell.me).