Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pocka/ccht
A simple command-line tool to crawl and test your website resources' HTTP status code, like broken link checker
https://github.com/pocka/ccht
crawlers puppeteer
Last synced: about 1 month ago
JSON representation
A simple command-line tool to crawl and test your website resources' HTTP status code, like broken link checker
- Host: GitHub
- URL: https://github.com/pocka/ccht
- Owner: pocka
- License: mit
- Created: 2020-12-29T02:15:41.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-07-02T13:34:04.000Z (over 2 years ago)
- Last Synced: 2024-09-14T14:48:02.091Z (2 months ago)
- Topics: crawlers, puppeteer
- Language: TypeScript
- Homepage: https://pocka.github.io/ccht/
- Size: 1.64 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# ccht
[![npm](https://img.shields.io/npm/v/ccht)](https://www.npmjs.com/package/ccht)
[![Test and Lint Workflow Status](https://img.shields.io/github/workflow/status/pocka/ccht/Test%20and%20Lint?label=test)](https://github.com/pocka/ccht/actions?query=workflow%3A%22Test+and+Lint%22)
[![Publish Package Workflow Status](https://img.shields.io/github/workflow/status/pocka/ccht/Publish%20package?label=publish)](https://github.com/pocka/ccht/actions?query=workflow%3A%22Publish+package%22)> Command-line Crawling HTTP Testing tool
ccht is a simple command-line tool to crawl and test your website resources' HTTP status code, like broken link checker.
## Installation
You can skip installation if you use `npx` for one-time invocation.
```sh
$ npm i -D ccht# or
$ yarn add -D ccht
```## Usage
```
ccht [options]
``````sh
# to crawl and test "https://example.com"
$ npx ccht 'https://example.com'# to show help
$ npx ccht --help
```ccht will crawl the site starting from the given URL.
## Options
To see more options, run `npx ccht --help`.
### Global Options
#### `--crawler `
Choose crawler. Available crawlers:
##### `node-http`
Default. Crawls pages by using Node.js' HTTP module and [cheerio](https://www.npmjs.com/package/cheerio).
##### `puppeteer`
Crawls pages by using a real browser through [Puppeteer](https://pptr.dev/).
You need to install puppeteer (`npm i -D puppeteer`) or configure your environment (browser DevTool protocol connection, executable.)#### `--reporter `
Specify reporter, which formats and outputs the test result.
##### `code-frame`
Default. Outputs human-friendly visuallized result.
##### `json`
Prints JSON string.
Useful for a programmatic access to results.```sh
$ npx ccht 'https://example.com' --reporter=json | jq
```#### `--include `
A comma separated list of a URL to include in a result.
Any URLs forward-matching will be crawled and be reported.Defaults to the given URL.
For example, given `npx ccht 'https://example.com'` then `--include` will be `https://example.com`.#### `--exclude `
A comma separated list of a URL to exclude from a result.
Any URLs forward-matching will be skipped nor be removed from a result.#### `--expected-status `
A comma separated list of an expected HTTP status code for pages.
Any pages responded with other status codes result in error (`unexpected_status`).Defaults to `200`.
#### `--exit-error-severity `
Change which severities occurs exit status `1`.
Available severities are below:- `danger`
- `warning`
- `info`
- `debug`Defaults to `danger`.
### Crawler Options
#### `--timeout `
Timeout for each page to load/response in crawling phase.
This option value is directly goes to `node-fetch`'s or `puppeteer`'s one.Defaults to `3000` (3s).
#### `--concurrency `
How many connection can exist at the same time, a size for connection pool.
Defaults to `1`.