https://github.com/william1nguyen/carlist-crawler-python

A pipeline for extracting data from Carlist.my and loading it into Elasticsearch
crawling-python elasticsearch etl-pipeline scrapy


## CRAWLY

Scrapy scripts for scraping data from Carlist.my.

### Run Commands

* Clone the project and change into `crawly/`:

```
$ git clone git@github.com:natalieconan/crawly.git
$ cd crawly
```

* (Optional) Install `pipenv` with Homebrew:
```
$ brew install pipenv
```

* Activate the Python virtual environment with Pipenv and install the packages:
```
$ pipenv shell
$ pipenv install
```

* Finally, run a spider to start crawling:
```
$ scrapy crawl ${spider_name}
```
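
If the crawl needs to be started from a script (for example a scheduled job), the `${spider_name}` placeholder can be filled in programmatically. The following is a minimal sketch, not part of the project: the helper names `build_crawl_command` and `run_spider` are hypothetical, and it assumes the `scrapy` executable is on `PATH` (that is, the Pipenv environment is active) and that the script runs from the directory containing `scrapy.cfg`.

```python
import subprocess


def build_crawl_command(spider_name, output_path=None):
    """Return the argument list for `scrapy crawl <spider_name>`.

    Hypothetical helper. When `output_path` is given, Scrapy's `-o`
    option is added so scraped items are also written to that feed file.
    """
    if not spider_name or not spider_name.strip():
        raise ValueError("spider_name must be a non-empty string")
    command = ["scrapy", "crawl", spider_name.strip()]
    if output_path:
        command += ["-o", output_path]
    return command


def run_spider(spider_name, output_path=None):
    """Run the spider and raise CalledProcessError if Scrapy exits non-zero."""
    # Assumption: the current working directory is the Scrapy project root.
    return subprocess.run(build_crawl_command(spider_name, output_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the spider name or output path comes from configuration.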

In this project the spider is named `carlist`, so run this command to start crawling:

```
$ scrapy crawl carlist
```
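
The project description mentions loading the scraped data into Elasticsearch, but this README does not show how that step is done. As an illustration only, the sketch below turns scraped items into the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint accepts (one action line followed by one document line per item). The function name `to_bulk_payload`, the index name `carlist`, and the item fields are all assumptions made for the example, not the project's actual schema.

```python
import json


def to_bulk_payload(items, index_name, id_field=None):
    """Serialize items into an NDJSON body for Elasticsearch's _bulk API.

    Hypothetical helper: each item becomes an `index` action line
    followed by the document itself.
    """
    lines = []
    for item in items:
        action = {"index": {"_index": index_name}}
        if id_field and id_field in item:
            # A stable ID lets a re-crawl overwrite a listing instead of duplicating it.
            action["index"]["_id"] = str(item[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(item, ensure_ascii=False))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample items; the real spider's fields are not documented here.
sample_items = [
    {"listing_id": "a1", "title": "Example car A", "price": 45000},
    {"listing_id": "b2", "title": "Example car B", "price": 32000},
]
payload = to_bulk_payload(sample_items, index_name="carlist", id_field="listing_id")
print(payload)
```

A body built this way could be sent to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header, or the same items could be handed to an Elasticsearch client library from a Scrapy item pipeline.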