https://github.com/gabrielolobo/crawley

This project is designed to run crawlers and process the results based on the specified output format. It takes command-line arguments to select the crawler and output format.

Host: GitHub
URL: https://github.com/gabrielolobo/crawley
Owner: Gabrielolobo
Created: 2023-05-29T21:08:37.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2023-06-07T03:50:23.000Z (about 3 years ago)
Last Synced: 2025-02-28T23:28:40.644Z (over 1 year ago)
Topics: crawler, poetry, python, scrapping
Language: Python
Homepage:
Size: 35.2 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Crawley

This project allows you to run crawlers and save the results in various formats.

# Installing and Running

## 1. Clone Repository

- git clone https://github.com/gabrielolobo/crawley.git

## 2. Install with Poetry

### Inside the Crawley directory, run:

- poetry install

This installs the environment via Poetry.

### Then, run:

- poetry shell

This activates the virtual environment, which is now set up to run the CLI.

## 3. Running Crawley

### Available Crawlers:

- VultrCrawler
- HostgatorCrawler

### Arguments:

- print
- save_json
- save_csv

### Inside the Crawley directory, run the command:

- python -m crawley.cli (Crawler) (argument)
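
To illustrate the shape of this command line, here is a minimal sketch of how such arguments could be parsed with `argparse`. The crawler and argument names come from the lists above; the function names `build_parser` and `parse_cli` are hypothetical, and the actual `crawley.cli` module may be organized differently.

```python
import argparse

# Names taken from the README lists; how crawley.cli really wires them up is an assumption.
CRAWLERS = ("VultrCrawler", "HostgatorCrawler")
ACTIONS = ("print", "save_json", "save_csv")


def build_parser():
    """Build a parser for: python -m crawley.cli (Crawler) (argument) [--filename FILE]."""
    parser = argparse.ArgumentParser(prog="crawley.cli")
    parser.add_argument("crawler", choices=CRAWLERS, help="crawler to run")
    parser.add_argument("action", choices=ACTIONS, help="what to do with the results")
    parser.add_argument("--filename", default=None, help="output file for the save actions")
    return parser


def parse_cli(argv):
    """Parse a list of command-line tokens into a namespace (hypothetical helper)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_cli(["VultrCrawler", "save_json", "--filename", "output.json"])
    print(args.crawler, args.action, args.filename)
```

Restricting both positionals with `choices` means a misspelled crawler or argument name is rejected with a usage message before any crawling starts.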

### To print the information crawled from the website:

- python -m crawley.cli (Crawler) print

The crawled information is printed to standard output.

### To save the results as a .json file, execute:

- python -m crawley.cli (Crawler) save_json --filename (output).json

The .json file will be saved in the current directory.

* Replace (output) with the file name of your choice.
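
As a rough idea of what saving to JSON involves, the sketch below writes a list of records to a file relative to the current directory. It assumes the crawled results are a list of dictionaries; the helper name `write_json` and the sample record are made up for illustration and are not taken from the project.

```python
import json
from pathlib import Path


def write_json(records, filename):
    """Write records to a JSON file; a relative name lands in the current directory."""
    path = Path(filename)
    with path.open("w", encoding="utf-8") as fh:
        # indent=2 keeps the file human-readable; ensure_ascii=False preserves non-ASCII text.
        json.dump(records, fh, indent=2, ensure_ascii=False)
    return path


if __name__ == "__main__":
    # Hypothetical sample data, not real crawler output.
    sample = [{"name": "example-plan", "price": "10.00"}]
    print(write_json(sample, "output.json"))
```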

### To save the results as a .csv file, execute:

- python -m crawley.cli (Crawler) save_csv --filename (output).csv

The .csv file will be saved in the current directory.

* Replace (output) with the file name of your choice.
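
A CSV export can be sketched in the same spirit with the standard `csv` module. This assumes the results are a list of dictionaries that all share the same keys; `write_csv` and the sample records are hypothetical and only illustrate the idea, not the project's own code.

```python
import csv
from pathlib import Path


def write_csv(records, filename):
    """Write a list of dicts to a CSV file, one row per record."""
    path = Path(filename)
    # The keys of the first record become the header row (assumes uniform records).
    fieldnames = list(records[0].keys()) if records else []
    # newline="" lets the csv module control line endings on every platform.
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return path


if __name__ == "__main__":
    # Hypothetical sample data, not real crawler output.
    sample = [{"name": "example-plan", "price": "10.00"}]
    print(write_csv(sample, "output.csv"))
```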

## Final Note

That's the end of it. I hope this application lives up to the standards of a good Python implementation.
