https://github.com/sohunn/status-crawler
A tool to detect dead links on a website and summarize their HTTP statuses in a clear table, written using Golang.
- Host: GitHub
- URL: https://github.com/sohunn/status-crawler
- Owner: sohunn
- License: mit
- Created: 2024-12-01T08:37:24.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2024-12-10T14:52:24.000Z (about 2 months ago)
- Last Synced: 2024-12-31T07:42:16.553Z (about 1 month ago)
- Topics: concurrent-programming, deadlink-finder, golang, playwright, webscraper, webscraping
- Language: Go
- Homepage: https://sohunn.me
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# StatusCrawler
This is a simple tool used to detect dead links on a website and summarize their HTTP statuses in a clear table, written in Golang.

## Features✨
- Supports and validates links using `http` and `https` schemes.
- Uses [playwright](https://pkg.go.dev/github.com/playwright-community/playwright-go) to perform efficient web scraping.
- Leverages goroutines with mutexes, wait groups and distributed locking mechanisms to increase performance and concurrency 🚀
- Clean summary in a tabular format.

## How to use❓
- Make sure you have the latest version of [Go](https://go.dev/dl/) installed.
- Clone the repository using the following command:
```
git clone https://github.com/sohunn/status-crawler.git
```

- Install dependencies:
```
go mod tidy
```

- Make sure to install the browsers and OS dependencies:
```
go run github.com/playwright-community/playwright-go/cmd/playwright@latest install --with-deps
```

- From the root of the project:
```
go run ./
```

## Example
```
go run ./ "https://sohunn.me"
```

## Building 🛠️
Check your Go environment variables (`GOOS` and `GOARCH`, visible via `go env`) to make sure you are building the executable for the right platform. Once verified, run:
```
go build -o crawler.exe ./
```

**Note:** You can name the executable whatever you want; `crawler` is just the name used in this example.

Once built, run the executable with the same arguments you would pass to `go run`:
```
crawler.exe "https://sohunn.me"
```
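
The scheme validation and the goroutine, mutex and wait group pattern listed under Features can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the names `isHTTPLink` and `checkAll` are hypothetical, it uses plain `net/http` requests instead of Playwright, and it records a status of `0` for requests that fail before a response arrives.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
	"sync"
	"time"
)

// isHTTPLink (hypothetical helper) reports whether raw parses as an
// absolute URL with an http or https scheme and a non-empty host.
func isHTTPLink(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

// checkAll (hypothetical helper) requests every link in its own goroutine
// and returns a map of link to HTTP status code. A status of 0 marks a
// request that failed before any response arrived; that convention is an
// assumption of this sketch.
func checkAll(client *http.Client, links []string) map[string]int {
	var (
		mu      sync.Mutex     // guards results, which the goroutines share
		wg      sync.WaitGroup // lets the caller wait for every request
		results = make(map[string]int, len(links))
	)
	for _, link := range links {
		wg.Add(1)
		go func(link string) {
			defer wg.Done()
			status := 0
			if resp, err := client.Get(link); err == nil {
				status = resp.StatusCode
				resp.Body.Close()
			}
			mu.Lock()
			results[link] = status
			mu.Unlock()
		}(link)
	}
	wg.Wait()
	return results
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: go run sketch.go <url> [<url>...]")
		return
	}
	var links []string
	for _, arg := range os.Args[1:] {
		if isHTTPLink(arg) {
			links = append(links, arg)
		} else {
			fmt.Printf("skipping %q: not an http(s) URL\n", arg)
		}
	}
	client := &http.Client{Timeout: 10 * time.Second}
	for link, status := range checkAll(client, links) {
		fmt.Printf("%-50s %d\n", link, status)
	}
}
```

The mutex is needed because Go maps are not safe for concurrent writes, and the wait group keeps `checkAll` from returning before every request has finished. A real crawler would probably also cap the number of goroutines in flight, which this sketch does not do.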