https://github.com/demogorgonz/broken-link-checker

Dockerized approach for broken link checking

# broken-link-checker

Dockerized approach for broken link checking

# Description

Based on [broken-link-checker/blc](https://www.npmjs.com/package/broken-link-checker).
The image ships with an integrated web server, so you can also test pages produced by static site generators such as MkDocs or Hugo.

---

## Examples:

1. Mount local files into the web server and run the checks:

```bash
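# Start the container with the local ./site directory mounted into the web server's document root
docker run -d --name broken-link-checker -v "$(pwd)/site:/var/www/localhost/htdocs/site" filips92/broken-link-checker
# Run blc in the container (-e: skip external links, -o: keep link order, -r: crawl recursively), then echo its exit code.
# Note: the trailing grep filters act only on the echoed exit-code line, not on blc's own output.
docker exec -ti broken-link-checker /bin/bash -c 'set -o pipefail; blc -eor http://localhost/site';echo "Container exit code (broken control DEBUG): $?" | grep -v "0 broken" | grep --color=auto -B 5 -A 5 broken
# Stop and remove the container
docker rm -f broken-link-checker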
```

2. Start a container that you can query repeatedly to check multiple sources:

```bash
# Start the container once and leave it running
docker run -d --name broken-link-checker -v "$(pwd)/site:/var/www/localhost/htdocs/site" filips92/broken-link-checker

# Repeat this command for each site you want to check
docker exec -ti broken-link-checker /bin/bash -c 'set -o pipefail; blc -eor http://somewebsite.tld';echo "Container exit code (broken control DEBUG): $?" | grep -v "0 broken" | grep --color=auto -B 5 -A 5 broken
```

Here `somewebsite.tld` is the fully qualified domain name (FQDN) of the site to check.
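
When the container from example 2 is queried for several sites, the repeated `docker exec` calls can be wrapped in a small function. The following is only a sketch under a few assumptions: `check_links`, `DOCKER` and `CONTAINER` are hypothetical names that are not part of the image, and `blc` is assumed to exit with a non-zero status when it finds broken links.

```shell
#!/usr/bin/env bash
# Sketch: run blc against several URLs in an already running container.
# DOCKER and CONTAINER are illustrative variables, not part of the image.
DOCKER="${DOCKER:-docker}"
CONTAINER="${CONTAINER:-broken-link-checker}"

check_links() {
  local url status=0
  for url in "$@"; do
    echo "== Checking ${url}"
    # -t/-i are omitted so the call also works without a TTY (e.g. in CI).
    # The URL is handed to the inner shell as $1 instead of being spliced
    # into the command string.
    if ! "$DOCKER" exec "$CONTAINER" /bin/bash -c 'blc -eor "$1"' _ "$url"; then
      echo "!! blc reported a problem for ${url}" >&2
      status=1
    fi
  done
  # Non-zero if any of the checks failed (assumption: blc signals broken links via its exit code).
  return "$status"
}
```

With the container running, `check_links http://localhost/site http://somewebsite.tld` checks both sources in turn and returns non-zero if either run failed.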

---

## Build image yourself

From inside the cloned `broken-link-checker` repository, run:

```bash
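# Build the image and tag it locally as broken-link-checker
docker build -t broken-link-checker .

# Run the locally built image (the image name must match the tag used above)
docker run -d --name broken-link-checker -v "$(pwd)/site:/var/www/localhost/htdocs/site" broken-link-checker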
```

---

## [BLC Usage](https://www.npmjs.com/package/broken-link-checker):

```bash
Usage
  blc [OPTIONS] [ARGS]

Options
  --exclude               A keyword/glob to match links against. Can be used multiple times.
  --exclude-external, -e  Will not check external links.
  --exclude-internal, -i  Will not check internal links.
  --filter-level          The types of tags and attributes that are considered links.
                            0: clickable links
                            1: 0 + media, iframes, meta refreshes
                            2: 1 + stylesheets, scripts, forms
                            3: 2 + metadata
                            Default: 1
  --follow, -f            Force-follow robot exclusions.
  --get, -g               Change request method to GET.
  --help, -h, -?          Display this help text.
  --input                 URL to an HTML document.
  --host-requests         Concurrent requests limit per host.
  --ordered, -o           Maintain the order of links as they appear in their HTML document.
  --recursive, -r         Recursively scan ("crawl") the HTML document(s).
  --requests              Concurrent requests limit.
  --user-agent            The user agent to use for link checks.
  --verbose, -v           Display excluded links.
  --version, -V           Display the app version.

Arguments
  INPUT                   Alias to --input
```
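
As an illustration of how the options above combine, here is a hedged sketch of a stricter scan that raises `--filter-level` to 3 and repeats `--exclude` once per pattern. `strict_scan` and `BLC` are hypothetical names introduced for this example, and the patterns in the usage note are made up. The function is meant to run where `blc` is on the `PATH`, for instance in a shell inside the container.

```shell
#!/usr/bin/env bash
# Sketch: crawl a site with the widest link filter and optional exclusions.
# BLC is an illustrative override for the blc executable.
BLC="${BLC:-blc}"

strict_scan() {
  local url="$1" n
  shift
  n=$#
  # Rewrite each remaining argument PATTERN as the pair: --exclude PATTERN
  while [ "$n" -gt 0 ]; do
    set -- "$@" --exclude "$1"
    shift
    n=$((n - 1))
  done
  # Only flags listed in the help text are used; the URL is the INPUT argument.
  "$BLC" --recursive --ordered --exclude-external --filter-level 3 "$@" "$url"
}
```

For example, `strict_scan http://localhost/site '*.pdf' '/drafts/'` would crawl the mounted site while skipping links that match either pattern.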