https://github.com/rly0nheart/tarantula

Python web crawler tool

crawling scraping web-crawler web-scraping


![Python Version](https://img.shields.io/badge/python-3.x-blue?style=flat&logo=python)
![OS](https://img.shields.io/badge/OS-GNU%2FLinux-red?style=flat&logo=linux)
![GitHub](https://img.shields.io/github/license/rlyonheart/tarantula?style=flat)
![GitHub repo size](https://img.shields.io/github/repo-size/rlyonheart/tarantula)
![Lines of code](https://img.shields.io/tokei/lines/github/rlyonheart/tarantula)
![CodeFactor](https://www.codefactor.io/repository/github/rlyonheart/tarantula/badge)
![Twitter](https://img.shields.io/twitter/follow/rly0nheart?&style=flat&logo=twitter)
[![asciicast](https://asciinema.org/a/446985.svg)](https://asciinema.org/a/446985)

A Python web crawler tool that scrapes internal and external URLs.
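
To illustrate what "internal" and "external" URLs mean for a crawler, here is a minimal sketch that uses only the Python standard library. It is not tarantula's own code: the names `LinkCollector` and `classify_links` are hypothetical, and the sketch assumes a link counts as internal when its host matches the host of the page it was found on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(base_url, html):
    """Split the links found in `html` into internal and external sets."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    internal, external = set(), set()
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        if parsed.netloc == base_host:
            internal.add(absolute)  # assumption: same host means internal
        else:
            external.add(absolute)
    return internal, external


page = (
    '<a href="/about">About</a> '
    '<a href="https://other.example/x">X</a> '
    '<a href="mailto:a@b.c">Mail</a>'
)
internal, external = classify_links("https://example.com/", page)
print(sorted(internal))
print(sorted(external))
```

Comparing hosts exactly treats subdomains as external; a real crawler may choose a looser rule.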

# Installation
**Clone this repo and install the dependencies:**
```
git clone https://github.com/rly0nheart/tarantula.git
```

```
cd tarantula
```

```
pip install -r requirements.txt
```

# Optional Arguments
| Flag | MetaVar | Usage |
| ------------- |:----------------------:|:---------:|
| `-c`/`--count` | **NUMBER** | *Number of links to crawl (default is 30)* |
| `-v`/`--verbose` | | *Run tarantula in verbose mode* |
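
The two flags in the table could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the project's actual parser: the `build_parser` name, the `int` type for `--count`, and the `prog` value are illustrative, and any other arguments the tool takes are omitted because the table does not document them.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the two documented optional flags.
    parser = argparse.ArgumentParser(
        prog="tarantula", description="Python web crawler tool"
    )
    parser.add_argument(
        "-c", "--count", metavar="NUMBER", type=int, default=30,
        help="number of links to crawl (default is 30)",
    )
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="run tarantula in verbose mode",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-c", "50", "-v"])
print(args.count, args.verbose)

defaults = build_parser().parse_args([])
print(defaults.count, defaults.verbose)
```

With no flags given, `count` falls back to the documented default of 30 and verbose mode stays off.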