https://github.com/siddhant-vij/html-link-parser
Go-based package to parse links (<a> tags) from an HTML file.
dfs gophercises io-readers parsing-html recursion
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/siddhant-vij/html-link-parser
- Owner: siddhant-vij
- License: mit
- Created: 2024-04-09T05:57:02.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-04-09T07:40:53.000Z (9 months ago)
- Last Synced: 2024-06-19T14:47:52.091Z (7 months ago)
- Topics: dfs, gophercises, io-readers, parsing-html, recursion
- Language: Go
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# HTML Link Parser
[Gophercises](https://gophercises.com/) Exercise Details:
In this exercise your goal is to create a package that makes it easy to parse an HTML file and extract all of the links (`<a href="">...</a>` tags). For each extracted link you should return a data structure that includes the `href`.
Links will be nested in different HTML elements, and it is very possible that you will have to deal with HTML similar to the code below.
```html
<a href="/dog">
  <span>Something in a span</span>
  Text not in a span
  <b>Bold text!</b>
</a>
```

In situations like these we want to get output that looks roughly like:
```go
Link{
Href: "/dog",
}
```

Once you have a working program, try to write some tests for it to practice using the `testing` package in Go.
## Technical Notes
- Use the `golang.org/x/net/html` package, which implements an HTML5-compliant tokenizer and parser.
- Ignore nested links. For example, with the following HTML:
```html
<a href="#">
  Something here <a href="/dog">nested dog link</a>
</a>
```
For the purposes of this exercise, it is okay if your code returns only the outer link.
*Include the nested links as well in the output.*
- Test the code with the example files included in the project repository. *Improve your tests and edge-case coverage.* Add examples and documentation for the code. Run the following in this order, using Go tooling:
  - tests
    - `go test`
  - coverage
    - `go test -cover`
    - `go test -coverprofile coverage.out`
  - coverage shown in a web browser
    - `go tool cover -html=coverage.out`
  - examples shown in documentation in a web browser
    - `godoc -http=:8080`