https://github.com/wmentor/html
HTML data fetcher
- Host: GitHub
- URL: https://github.com/wmentor/html
- Owner: wmentor
- License: MIT
- Created: 2020-03-11T20:22:48.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-12-12T06:48:34.000Z (over 2 years ago)
- Last Synced: 2025-01-14T06:54:11.669Z (over 1 year ago)
- Topics: go, go-lib, go-library, golang, golang-library, html, html-parser, parser
- Language: Go
- Size: 24.4 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# HTML
[Coverage Status](https://coveralls.io/github/wmentor/html?branch=master)
[Go Report Card](https://goreportcard.com/report/github.com/wmentor/html)
[Go Reference](https://pkg.go.dev/github.com/wmentor/html)
[License: MIT](https://opensource.org/licenses/MIT)
A simple HTML parser and data fetcher library written in Go and released under the MIT License.
## Requirements
* Go (version >= 1.20)
* golang.org/x/net
## Install
```
go get github.com/wmentor/html
```
## Usage
### Fetch data from a URL
```golang
package main

import (
	"fmt"
	"time"

	"github.com/wmentor/html"
)

func main() {
	src := "https://edition.cnn.com"

	parser := html.New()

	// Request options: the agent string and the request timeout.
	opts := &html.GetOpts{
		Agent:   "Mozilla/5.0 (compatible; MSIE 10.0)",
		Timeout: time.Second * 60,
	}

	// Fetch the page and parse it.
	parser.Get(src, opts)

	// Print the text extracted from the page.
	fmt.Println(string(parser.Text()))

	// Print every link, image, and iframe URL found on the page.
	parser.EachLink(func(link string) {
		fmt.Println("url=" + link)
	})

	parser.EachImage(func(link string) {
		fmt.Println("img=" + link)
	})

	parser.EachIframe(func(link string) {
		fmt.Println("iframe=" + link)
	})
}
```
### Fetch data from a file or stdin
```golang
package main

import (
	"fmt"
	"os"

	"github.com/wmentor/html"
)

func main() {
	parser := html.New()

	// Parse accepts any io.Reader, such as os.Stdin or an opened file.
	parser.Parse(os.Stdin)

	// Print the text extracted from the input.
	fmt.Println(string(parser.Text()))

	// Print every link, image, and iframe URL found in the input.
	parser.EachLink(func(link string) {
		fmt.Println("url=" + link)
	})

	parser.EachImage(func(link string) {
		fmt.Println("img=" + link)
	})

	parser.EachIframe(func(link string) {
		fmt.Println("iframe=" + link)
	})
}
```