Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/igapyon/selecrawler
Simple selenium based web crawler
- Host: GitHub
- URL: https://github.com/igapyon/selecrawler
- Owner: igapyon
- License: lgpl-3.0
- Created: 2017-04-01T13:00:45.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-04-16T12:46:35.000Z (almost 8 years ago)
- Last Synced: 2024-11-10T05:24:58.687Z (3 months ago)
- Topics: chrome, crawler, java, selenium, web
- Language: Java
- Homepage: https://igapyon.github.io/selecrawler/
- Size: 94.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# selecrawler
A simple Java utility kit for building web crawlers with Selenium and Chrome.
## About robots rules
If you want to build a web crawler, you need to understand the robots exclusion rules (`robots.txt`), which tell crawlers which parts of a site they should not fetch.
The rules are described here:
* https://en.wikipedia.org/wiki/Robots_exclusion_standard
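
As a rough illustration of what honoring these rules involves, here is a minimal sketch in plain Java. `RobotsRules` is a hypothetical helper, not part of selecrawler, and it makes simplifying assumptions: it only reads `User-agent` and `Disallow` lines, treats each `Disallow` value as a plain path prefix, and merges the `*` group with any group matching the crawler's name. It ignores `Allow`, wildcards, and group precedence, so treat it as a starting point rather than a complete implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not part of selecrawler): a simplified robots.txt check. */
final class RobotsRules {
    private final List<String> disallowed = new ArrayList<>();

    /** Collects the Disallow prefixes from every group that applies to userAgent. */
    RobotsRules(String robotsTxt, String userAgent) {
        String agent = userAgent.toLowerCase(Locale.ROOT);
        boolean applies = false;      // does the current group apply to this crawler?
        boolean inAgentLines = false; // are we inside a run of User-agent lines?
        for (String raw : robotsTxt.split("\\r?\\n")) {
            // Strip comments and surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    applies = false; // a new group starts here
                }
                inAgentLines = true;
                String v = value.toLowerCase(Locale.ROOT);
                if (v.equals("*") || (!v.isEmpty() && agent.contains(v))) {
                    applies = true;
                }
            } else {
                inAgentLines = false;
                // An empty Disallow value means "nothing is disallowed".
                if (field.equals("disallow") && applies && !value.isEmpty()) {
                    disallowed.add(value);
                }
            }
        }
    }

    /** Returns false if the path starts with any collected Disallow prefix. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nDisallow: /private/\n";
        RobotsRules rules = new RobotsRules(robots, "selecrawler");
        System.out.println(rules.isAllowed("/index.html"));     // true
        System.out.println(rules.isAllowed("/private/a.html")); // false
    }
}
```

A crawler would typically download `/robots.txt` once per host, build the rules from its content, and call `isAllowed` before requesting each page.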
## Maven notes
```sh
mvn archetype:generate -DgroupId=jp.igapyon.selecrawler -DartifactId=selecrawler -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
```