https://github.com/victormartinez/shub_cli
A CLI for dealing with the features of ScrapingHub
cli crawler scrapinghub scrapinghub-api scrapy shub-cli spider spiders
- Host: GitHub
- URL: https://github.com/victormartinez/shub_cli
- Owner: victormartinez
- License: mit
- Created: 2016-09-25T18:00:40.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2021-04-20T17:05:34.000Z (about 5 years ago)
- Last Synced: 2025-12-29T07:26:20.978Z (6 months ago)
- Topics: cli, crawler, scrapinghub, scrapinghub-api, scrapy, shub-cli, spider, spiders
- Language: Python
- Homepage:
- Size: 58.6 KB
- Stars: 16
- Watchers: 1
- Forks: 0
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Scrapinghub CLI
A command-line interface for working with the features of ScrapingHub.
[Code Health](https://landscape.io/github/victormartinez/shub_cli/master) [Build Status](https://travis-ci.org/victormartinez/shub_cli)

[Python Package Index](https://pypi.python.org/pypi/shub-cli)
### Install
Install it using pip...
```
$ pip install shub-cli
```
... or [pipsi](https://github.com/mitsuhiko/pipsi)
```
$ pipsi install shub-cli
```
### Configuration
Shub CLI will look for the `.scrapinghub.yml` created by [ScrapingHub](https://doc.scrapinghub.com/shub.html?highlight=yml#quickstart) in your home directory and read the default API_KEY and PROJECT_ID.
If you do not have that file, set it up according to the example below:
```
# ~/.scrapinghub.yml
apikeys:
  default: <API_KEY>
projects:
  default: <PROJECT_ID>
```
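As an illustration of the lookup described above, here is a minimal Python sketch that reads the default API key and project id from `~/.scrapinghub.yml`. It is not shub-cli's own code: the function names `parse_scrapinghub_yml` and `load_defaults` are hypothetical, and the hand-written parser only handles the simple two-level layout with a `default` entry under `apikeys` and `projects`. A real tool would more likely use a YAML library.

```python
from pathlib import Path


def parse_scrapinghub_yml(text):
    """Parse a two-level `section:` / `  key: value` layout into nested dicts.

    Assumption: this is a simplified stand-in for a real YAML parser and
    only understands the layout used in this README.
    """
    config = {}
    section = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        key, _, value = line.strip().partition(":")
        if not line.startswith((" ", "\t")):
            # An unindented key such as `apikeys:` opens a new section.
            section = config.setdefault(key.strip(), {})
        elif section is not None:
            # An indented `key: value` pair belongs to the current section.
            section[key.strip()] = value.strip()
    return config


def load_defaults(path=Path.home() / ".scrapinghub.yml"):
    """Return (api_key, project_id), or (None, None) if the file is missing."""
    if not path.exists():
        return None, None
    config = parse_scrapinghub_yml(path.read_text())
    return (
        config.get("apikeys", {}).get("default"),
        config.get("projects", {}).get("default"),
    )
```

Calling `load_defaults()` with no arguments checks the home directory; pass another `Path` to read a file elsewhere.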
### Start
If you have set up the `~/.scrapinghub.yml` file:
```
$ shub-cli repl
```
Otherwise...
```
$ shub-cli -api <API_KEY> -project <PROJECT_ID> repl
```
If you just want to run a command
```
$ shub-cli [credentials|spiders|job|jobs|schedule]
```
### Cheatsheet
```
> credentials
> spiders
> job [-show|-cancel|-delete id]
> jobs [-spider spider] [-tag tag] [-lacks tag] [-state pending|running|finished|deleted] [-count count]
> schedule [-spider spider] [-tags tag1,tag2] [-priority 1|2|3|4]
```
### Commands
#### Credentials
Check what credentials are being used to connect to Scrapinghub.
```
> credentials
```
#### Spiders
List all spiders available.
```
> spiders
```
#### Jobs
List the last 10 jobs or the ones according to your criteria.
```
> jobs
> jobs -spider <spider> -tag <tag> -lacks <tag> -state <[pending,finished,running,deleted]> -count <[0,1000]>
```
Example:
```
> jobs
> jobs -spider example -tag production -lacks consumed -state finished -count 100
```
**Attention:** By default, shub-cli lists the last 10 jobs. To override that behaviour, use the `-count` parameter with the number of jobs you want to show.
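If you generate `jobs` commands from a script, a small helper can check the documented constraints before anything reaches the REPL. The sketch below is an illustration only: `build_jobs_command` is a hypothetical name, and it uses just the flags, states, and the 0 to 1000 count range listed above.

```python
VALID_STATES = ("pending", "running", "finished", "deleted")


def build_jobs_command(spider=None, tag=None, lacks=None, state=None, count=None):
    """Build a `jobs` command string, validating -state and -count first."""
    if state is not None and state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    if count is not None and not 0 <= count <= 1000:
        raise ValueError(f"count must be between 0 and 1000, got {count}")

    parts = ["jobs"]
    options = (
        ("-spider", spider),
        ("-tag", tag),
        ("-lacks", lacks),
        ("-state", state),
        ("-count", count),
    )
    for flag, value in options:
        if value is not None:  # omitted options are simply left out
            parts += [flag, str(value)]
    return " ".join(parts)
```

With no arguments the helper returns the bare `jobs` command, which falls back to the default of 10 jobs.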
#### Job
Show, delete or cancel a job by its id.
```
> job -show <id>
> job -show <id> --with-log
> job -delete <id>
> job -cancel <id>
```
Example:
```
> job -show 11/23/19801
> job -show 11/23/19801 --with-log
> job -delete 11/23/19801
> job -cancel 11/23/19801
```
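The ids in the examples above have three numeric parts separated by slashes, which on Scrapinghub correspond to project, spider and job number. A minimal sketch for validating and splitting such an id before passing it to `job` might look like this; `JobId` and `parse_job_id` are hypothetical names, not part of shub-cli.

```python
from typing import NamedTuple


class JobId(NamedTuple):
    project: int
    spider: int
    job: int


def parse_job_id(text):
    """Split an id such as '11/23/19801' into its three numeric parts."""
    parts = text.strip().split("/")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected <project>/<spider>/<job>, got {text!r}")
    return JobId(*(int(part) for part in parts))
```

For example, `parse_job_id("11/23/19801").job` gives the job number, and a malformed id raises `ValueError` instead of being sent to the API.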
#### Schedule
Schedule a spider execution.
```
> schedule -spider <spider> -priority <[1,2,3,4]> -tags <tag1,tag2>
```
Example:
```
> schedule -spider my-spider
> schedule -spider my-spider -priority 4 -tags production,periodic
> schedule -spider my-spider -priority 3 -tags test
```
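For scripted scheduling, the same idea applies: check the priority and join the tags with commas before building the command. This is a hedged sketch with a hypothetical helper name (`build_schedule_command`); it only relies on the `-spider`, `-priority` and `-tags` options shown above.

```python
def build_schedule_command(spider, priority=None, tags=()):
    """Build a `schedule` command string from a spider, priority and tags."""
    if not spider:
        raise ValueError("a spider name is required")
    if priority is not None and priority not in (1, 2, 3, 4):
        raise ValueError(f"priority must be 1, 2, 3 or 4, got {priority}")
    for tag in tags:
        # Tags are comma-separated on the command line, so a tag cannot
        # itself contain a comma or whitespace.
        if "," in tag or any(ch.isspace() for ch in tag):
            raise ValueError(f"invalid tag: {tag!r}")

    parts = ["schedule", "-spider", spider]
    if priority is not None:
        parts += ["-priority", str(priority)]
    if tags:
        parts += ["-tags", ",".join(tags)]
    return " ".join(parts)
```

Passing `tags=["production", "periodic"]` with `priority=4` reproduces the second example above.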
### Help
For help or suggestions, please open an issue on the [GitHub Issues page](https://github.com/victormartinez/shub_cli/issues).