https://github.com/mariusvanderwijden/swcl


# swcl
Simple Webcrawler

This is my first GitHub project, so be gentle ;)

SWCL is a simple web crawler that is meant to crawl sites starting from a specified base URL. It supports the following options:

- Crawl only subpages of the base URL
- Crawl only subpages of the base URL and save them to a local directory
- Crawl every link found on the site, recursively following links on the linked sites
- Crawl based on a dictionary (works like a spider)
- Crawl based on a dictionary and save everything to a local directory
- Save the found URLs to a database or print them to the command line
- Run on multiple cores (the number can be specified) to use the whole bandwidth
- Limit the crawl to a specified recursion depth
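
To illustrate how the base-URL restriction and the recursion depth interact, here is a minimal Python sketch of a breadth-first crawl. It is not SWCL's actual code: the names `crawl`, `fetch_html`, `LinkParser`, `max_depth` and `only_subpages` are assumptions made for this example, and the "subpage" test is a plain prefix check on the URL, which is probably cruder than what a real crawler needs.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher (hypothetical helper): download a page as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(base_url, max_depth=2, only_subpages=True, fetch=fetch_html):
    """Breadth-first crawl from base_url; returns fetched URLs in visit order."""
    seen = {base_url}
    order = []
    queue = deque([(base_url, 0)])
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that cannot be fetched
        order.append(url)
        if depth >= max_depth:
            continue  # recursion depth reached: do not follow links further
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            if only_subpages and not link.startswith(base_url):
                continue  # assumed meaning of "subpage": URL starts with base_url
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order
```

Calling `crawl("http://example.com/", max_depth=1)` would fetch the base page and the pages it links to directly, but nothing deeper. The `fetch` parameter exists so the function can be tried against an in-memory dictionary of pages instead of the network.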

Doesn't work with js-parameters!