HTML Parser
- Host: GitHub
- URL: https://github.com/teverett/htmlparser
- Owner: teverett
- License: gpl-2.0
- Created: 2014-04-12T21:02:51.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2021-08-02T16:06:18.000Z (over 3 years ago)
- Last Synced: 2024-10-14T12:48:54.156Z (3 months ago)
- Topics: antlr, html-parser, java
- Language: HTML
- Size: 451 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
[![Travis](https://travis-ci.org/teverett/HTMLParser.svg?branch=master)](https://travis-ci.org/teverett/HTMLParser)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/9ebea7ee219e4210bf17ac5f99b73303)](https://www.codacy.com/app/teverett/HTMLParser?utm_source=github.com&utm_medium=referral&utm_content=teverett/HTMLParser&utm_campaign=Badge_Grade)

HTMLParser
==========

A simple HTML Parser using [ANTLR4](http://www.antlr.org/)

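
Maven Coordinates
--------

    <dependency>
      <groupId>com.khubla.htmlparser</groupId>
      <artifactId>htmlparser</artifactId>
      <version>1.0</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>
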
Fetching and Validating a Page
---------

HTMLParser can be used as a command-line jar file to fetch a single page and parse it. Parse errors will be logged to the console. For example:

    sh fetch.sh http://www.slashdot.org

Example Usage of the Library
---------

To parse an arbitrary HTML document using the callback parser, provide an implementation of [HTMLParserListener](https://github.com/teverett/HTMLParser/blob/master/src/main/java/com/khubla/htmlparser/grammar/HTMLParserListener.java) along with an InputStream of HTML to [HTMLDocumentParser:parse](https://github.com/teverett/HTMLParser/blob/master/src/main/java/com/khubla/htmlparser/HTMLDocumentParser.java):

    final InputStream inputStream = TestTreeWalk.class.getResourceAsStream("/example1.html");
    final HTMLParserListener htmlParserListener = new ExampleListener();
    HTMLDocumentParser.parse(inputStream, htmlParserListener);

Licensing
---------

HTMLParser is licensed under the [GPLv2](https://github.com/teverett/HTMLParser/blob/master/LICENSE).
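
Parsing HTML Held in a String
---------

The library usage example reads its HTML from a classpath resource, but the parse call is described as taking an InputStream, so markup that is already in memory can presumably be wrapped in a stream first. The sketch below uses only the JDK. The class `StringInputExample` and the helper `toInputStream` are hypothetical names for illustration and are not part of HTMLParser. The call into the library is left as a comment so that the example compiles without the jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StringInputExample {

    /** Wraps an HTML string in an InputStream (hypothetical helper, not part of HTMLParser). */
    static InputStream toInputStream(final String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(final String[] args) throws IOException {
        final String html = "<html><body><p>Hello</p></body></html>";
        try (InputStream inputStream = toInputStream(html)) {
            // With the library on the classpath, the stream would be handed to the
            // parser as in the usage example (ExampleListener is the listener
            // implementation named there):
            // HTMLDocumentParser.parse(inputStream, new ExampleListener());

            // Without the library, read the bytes back to show the stream holds the document.
            System.out.println(new String(inputStream.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

UTF-8 is an assumption here. If the parser expects a different encoding, the charset passed to `getBytes` would need to match. `readAllBytes` requires Java 9 or later.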