https://github.com/voliveirajr/seleniumcrawler
An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site
- Host: GitHub
- URL: https://github.com/voliveirajr/seleniumcrawler
- Owner: voliveirajr
- License: gpl-3.0
- Created: 2013-08-26T16:18:50.000Z (almost 13 years ago)
- Default Branch: master
- Last Pushed: 2019-02-28T14:18:19.000Z (over 7 years ago)
- Last Synced: 2025-04-08T19:52:36.047Z (about 1 year ago)
- Topics: asp-net, python, scraper, scraping, scraping-websites, scrapper, scrapy, selenium, selenium-webdriver, webcrawler, webcrawling
- Language: Python
- Size: 20.5 KB
- Stars: 127
- Watchers: 9
- Forks: 45
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
seleniumcrawler
===============
This is a web crawler based on the Scrapy and Selenium frameworks.
The spider crawls through the directferries.com website and generates a JSON file with all tickets available for one
of two directions, Dublin-Liverpool or Liverpool-Dublin, with departure tomorrow and returning in 3 days.
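
The search dates described above can be sketched as follows. This is an illustrative sketch, not the project's code: the function name `trip_dates` and its parameters are hypothetical, and it assumes that "returning in 3 days" is counted from today rather than from the departure date.

```python
from datetime import date, timedelta


def trip_dates(today=None, return_offset_days=3):
    """Return (departure, return) dates for the ticket search.

    Hypothetical helper: the name and parameters are not from the project.
    """
    if today is None:
        today = date.today()
    # "Departure tomorrow": one day after the reference date.
    departure = today + timedelta(days=1)
    # Assumption: the return date is counted from today, not from departure.
    return_date = today + timedelta(days=return_offset_days)
    return departure, return_date


if __name__ == "__main__":
    dep, ret = trip_dates()
    print("departure: %s, return: %s" % (dep.isoformat(), ret.isoformat()))
```

If the return is meant to be 3 days after departure instead, the offset would be applied to `departure` rather than `today`.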
HOW TO EXECUTE:
An environment with the following installed is required:
- Python 2.7
- Scrapy 0.18
- Selenium WebDriver
To execute the crawler, run the following command from the project path:
scrapy crawl crawlermate_selenium -a category=[dublin or liverpool] -o [filename] -t json
For example, to write the tickets for Dublin to Liverpool to the items.json file, run:
scrapy crawl crawlermate_selenium -a category=dublin -o items.json -t json
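
Scrapy passes each `-a name=value` option to the spider as a keyword argument, so the `category` value has to be turned into a route somewhere in the spider. A minimal sketch of that mapping is shown below. The names `ROUTES` and `resolve_route` are hypothetical, and the `liverpool` entry is inferred from the route list above rather than confirmed by the project's code.

```python
# Hypothetical mapping from the -a category value to (origin, destination).
ROUTES = {
    "dublin": ("Dublin", "Liverpool"),     # stated in the example above
    "liverpool": ("Liverpool", "Dublin"),  # assumption: the reverse direction
}


def resolve_route(category):
    """Translate the spider's category argument into an (origin, destination) pair."""
    key = (category or "").strip().lower()
    if key not in ROUTES:
        raise ValueError(
            "category must be one of %s, got %r" % (sorted(ROUTES), category)
        )
    return ROUTES[key]


if __name__ == "__main__":
    origin, destination = resolve_route("dublin")
    print("%s -> %s" % (origin, destination))
```

Rejecting unknown values early makes a mistyped `category` fail before any browser session is started.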
References:
http://docs.seleniumhq.org/
http://scrapy.org/