Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nagilum/focus
Simple CLI tool, written in C#, to crawl a site and log the responses.
- Host: GitHub
- URL: https://github.com/nagilum/focus
- Owner: nagilum
- License: mit
- Created: 2024-04-16T13:19:40.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-05-02T20:32:41.000Z (8 months ago)
- Last Synced: 2024-05-03T01:35:05.390Z (8 months ago)
- Topics: cli, crawl, crawler, csharp, playwright
- Language: C#
- Homepage:
- Size: 65.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Focus
Simple CLI tool to crawl a site and log the responses.
## Clone and Build
```shell
git clone https://github.com/nagilum/focus
cd focus
dotnet build -c Release
```

## Usage

```shell
focus https://example.com
```

## Parameters
### Set Rendering Engine
Focus uses Playwright behind the scenes for all HTML-related requests.
You can choose `chromium`, `firefox`, or `webkit` as the rendering engine.
Focus defaults to using `chromium`.
To set the rendering engine, use the `-e` option.

```shell
focus https://example.com -e firefox
```

*This will set the rendering engine to `Firefox`.*
### Set Retry Attempts
You can have Focus retry failed requests `n` times.
A request counts as failed if an error prevented it from completing or if the response's HTTP status code is outside the 2xx range.
By default, Focus does not retry failed requests.
To set the number of retry attempts, use the `-r` option.

```shell
focus https://example.com -r 1
```

*This will retry each failed request `1` time.*
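
The failure rule and retry count described above can be sketched as a pair of shell functions. This is only an illustration of the documented behavior, not code from Focus: `is_failed` and `max_attempts` are hypothetical names, and the assumption that `-r n` allows at most `n + 1` attempts per request is inferred from the example above rather than taken from the source code.

```shell
# Hypothetical helpers, not part of Focus.

# Succeeds (exit code 0) when a request would count as failed.
# Pass an empty string for a request that errored before completing.
is_failed() {
  status="$1"
  # No status at all: the request never completed.
  [ -z "$status" ] && return 0
  # Any status outside 200-299 counts as failed.
  [ "$status" -lt 200 ] || [ "$status" -gt 299 ]
}

# Assumed upper bound on attempts: the first try plus the retries.
max_attempts() {
  echo $(( $1 + 1 ))
}
```

For example, `is_failed 404` succeeds, so that request would be retried, while `is_failed 200` does not.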
### Set Request Timeout
You can set the request timeout for all requests.
The default timeout is `10` seconds.
Set the timeout to `0` to disable it.
To set a new timeout, use the `-t` option.

```shell
focus https://example.com -t 3
```

*This will set the request timeout to `3` seconds.*

```shell
focus https://example.com -t 0
```

*This will disable the timeout feature.*
### Add Multiple URLs
You can set up Focus to crawl more than one URL by adding more URLs to the parameter list.
```shell
focus https://example.com https://another-domain.com https://example.com/some-hidden-page
```

This adds all three URLs to the queue from the start, and Focus crawls from there.
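
If the list of starting URLs grows long, one option is to keep it in a file and expand it into arguments. The sketch below assumes a hypothetical `urls.txt` with one URL per line and a hypothetical helper named `crawl_from_file`; it relies only on the multiple-URL behavior described above. The `FOCUS_BIN` variable is also an assumption, added so the command can be swapped out to preview what would run.

```shell
# Hypothetical helper, not part of Focus: pass every URL listed in a file
# to a single focus invocation.
crawl_from_file() {
  file="$1"
  set --                                   # start with an empty argument list
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;                 # skip blank lines and comments
    esac
    set -- "$@" "$url"                     # append the URL as one argument
  done < "$file"
  # FOCUS_BIN is an assumed override; it falls back to focus on the PATH.
  "${FOCUS_BIN:-focus}" "$@"
}
```

Running `FOCUS_BIN=echo crawl_from_file urls.txt` prints the argument list instead of starting a crawl, which is a quick way to check the file before running it for real with `crawl_from_file urls.txt`.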