https://github.com/shou/crawler
Concurrent Haskell website crawler outputting sitemaps
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/shou/crawler
- Owner: Shou
- License: bsd-3-clause
- Created: 2016-10-04T11:26:43.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2016-11-11T08:50:00.000Z (over 9 years ago)
- Last Synced: 2025-01-10T01:22:44.831Z (over 1 year ago)
- Language: Haskell
- Homepage:
- Size: 31.3 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
### A concurrent Haskell website crawler.
## Install
You need Stack to install this Haskell program.
[Get it here.](https://docs.haskellstack.org/en/stable/README/#how-to-install)
Once you're in the project git directory, run `stack install`.
You can run the small tests with `stack build --test`.
## Run
The program takes a list of homepage URLs to crawl as a command-line argument.
`crawler 'http://example.com/'`
Optionally, you can specify the number of threads to use with the `-t` flag.
`crawler -t 10 'http://example.com/'`
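To illustrate what a thread limit such as `-t 10` can mean in practice, here is a minimal sketch of bounded concurrency in Haskell using only `base`. It is not taken from this project's source (see DESIGN.md for the actual design). The names `mapBounded`, `trySome`, and `fakeFetch` are hypothetical, and `fakeFetch` only stands in for a real HTTP request. The sketch caps the number of actions in flight with a semaphore instead of running a fixed pool of worker threads.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, throwIO, try)
import Control.Monad (forM)

-- | Run 'action' on every input with at most 'limit' actions in flight.
-- 'limit' must be positive. Results come back in input order, and the
-- first failure encountered while collecting results is rethrown.
mapBounded :: Int -> (a -> IO b) -> [a] -> IO [b]
mapBounded limit action inputs = do
  sem <- newQSem limit
  slots <- forM inputs $ \input -> do
    slot <- newEmptyMVar
    _ <- forkIO $ do
      -- Hold one semaphore unit for the duration of the action.
      outcome <- trySome (bracket_ (waitQSem sem) (signalQSem sem) (action input))
      putMVar slot outcome
    pure slot
  -- Wait for each forked thread in turn and unwrap its outcome.
  mapM (\slot -> takeMVar slot >>= either throwIO pure) slots

-- | 'try' specialised to 'SomeException' so no type annotation is needed.
trySome :: IO b -> IO (Either SomeException b)
trySome = try

-- | Hypothetical stand-in for fetching a page: returns the URL's length.
fakeFetch :: String -> IO Int
fakeFetch url = pure (length url)

main :: IO ()
main = do
  sizes <- mapBounded 10 fakeFetch ["http://example.com/", "http://example.com/about"]
  print sizes
```

A semaphore keeps the sketch short. A real crawler also discovers new URLs while it runs, so it would more likely feed a shared work queue than a fixed input list.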
## Design
[DESIGN.md](DESIGN.md)
## License
BSD3. See the LICENSE file for more information.