https://github.com/anthonysigogne/scrapy

A list of simple scrapers made with Scrapy

Host: GitHub
URL: https://github.com/anthonysigogne/scrapy
Owner: AnthonySigogne
License: mit
Created: 2017-07-20T22:11:38.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2017-07-22T05:10:43.000Z (almost 9 years ago)
Last Synced: 2025-02-28T12:07:40.691Z (over 1 year ago)
Topics: crawler, elasticsearch, python, scrapy, spider
Language: Python
Homepage:
Size: 7.81 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Scrapy
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) ![Python 3.5](https://img.shields.io/badge/python-3.5-blue.svg)

This repository contains a set of simple scrapers built with Scrapy:

- basicspider.py - A simple spider that scrapes data from a list of pages:
```
$ scrapy runspider basicspider.py -a file=list_pages.txt -o data.csv
```

- inscriptspider.py - A simple spider that scrapes data from a list of pages and is launched from inside a Python script via CrawlerProcess:
```
$ scrapy runspider inscriptspider.py url1 url2 url3 ... urlx
```

- basicrawler.py - A simple crawler that scrapes data from a list of domains:
```
$ scrapy runspider basicrawler.py -a file=list_pages.txt -o data.csv
```

- persistencespider.py - A simple spider that scrapes data from a list of pages and saves it to the Elasticsearch instance running at http://localhost:9200/:
```
$ scrapy runspider persistencespider.py -a file=list_pages.txt
```

- persistencecrawler.py - A simple crawler that scrapes data from a list of domains and saves it to the Elasticsearch instance running at http://localhost:9200/:
```
$ scrapy runspider persistencecrawler.py -a file=list_pages.txt
```
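
One possible way to persist scraped items in Elasticsearch is a Scrapy item pipeline that sends each item to the index API. The sketch below is an assumption, not the approach taken in persistencespider.py or persistencecrawler.py, which may well use the official Elasticsearch client: it relies only on the standard library, and the index name `pages`, the helper `build_index_request` and the class name are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # address given in this README
ES_INDEX = "pages"                # hypothetical index name


def build_index_request(item, base_url=ES_URL, index=ES_INDEX):
    """Build, without sending, a POST request that indexes one item as JSON."""
    body = json.dumps(dict(item)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


class ElasticsearchPipeline:
    """Item pipeline: Scrapy calls process_item() once per scraped item."""

    def process_item(self, item, spider):
        # Send the document and read the reply; an HTTP error raises here.
        with urllib.request.urlopen(build_index_request(item), timeout=10) as reply:
            reply.read()
        # Return the item so that later pipelines and exporters still see it.
        return item
```

A pipeline like this is switched on through the `ITEM_PIPELINES` setting. Sending one request per item keeps the sketch short; a real project would more likely batch documents through the bulk API.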

## LICENCE
MIT
