Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lerouxrgd/sws
Sitemap Web Scraper
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/lerouxrgd/sws
- Owner: lerouxrgd
- License: apache-2.0
- Created: 2021-06-26T16:50:31.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-12-22T21:20:43.000Z (about 1 year ago)
- Last Synced: 2024-04-23T22:39:12.733Z (8 months ago)
- Language: Rust
- Homepage: https://lerouxrgd.github.io/sws/
- Size: 216 KB
- Stars: 4
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
README
# Sitemap Web Scraper
Sitemap Web Scraper (sws) is a tool for simple, flexible, and yet performant web
page scraping.

It consists of a CLI written in Rust that crawls web pages and executes a
[LuaJIT][lua-jit] script to scrape them, outputting results to a [CSV][] file.

```sh
sws crawl --script examples/fandom_mmh7.lua -o result.csv
```

Check out the [doc][sws-doc] for more details.

[lua-jit]: https://luajit.org/luajit.html
[csv]: https://en.wikipedia.org/wiki/Comma-separated_values
[sws-doc]: https://lerouxrgd.github.io/sws/
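
As a rough illustration of the flow described above (sitemap in, scraped records out as CSV), here is a minimal Python sketch. It is not how sws is implemented and does not use its Lua scripting API. The sample sitemap and the names `sitemap_urls`, `scrape_page`, and `to_csv` are hypothetical and exist only for this example; a real run would download the sitemap and pages over HTTP.

```python
import csv
import io
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, inlined so the sketch runs offline.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return the page URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def scrape_page(url):
    """Stand-in for the per-page scraping step (assumption: one record per page)."""
    return {"url": url, "path": url.rsplit("/", 1)[-1]}


def to_csv(records):
    """Serialize scraped records to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["url", "path"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


csv_text = to_csv(scrape_page(u) for u in sitemap_urls(SAMPLE_SITEMAP))
print(csv_text)
```

In sws itself, the per-page step is the part delegated to the user's Lua script; see the linked documentation for the actual scripting interface.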