https://github.com/supercuber/pastebin-crawler
- Host: GitHub
- URL: https://github.com/supercuber/pastebin-crawler
- Owner: SuperCuber
- Created: 2022-07-17T11:32:07.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-07-17T14:29:14.000Z (almost 4 years ago)
- Last Synced: 2025-02-05T20:57:29.979Z (over 1 year ago)
- Language: Python
- Size: 6.84 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Setup
1. Install Docker.
1. In the root of the repository, run `docker-compose build`.
This step will need to be re-executed if the code changes.
### Usage
1. Run `docker-compose run --rm crawler [ARGS]` to start the crawler.
For example, `docker-compose run --rm crawler --help` lists the available options.
1. To stop crawling, send SIGINT (for example, by pressing CTRL-C).
1. To use crawling results, connect to `mongodb://localhost:27018/` (database: crawler, collection: pastes)
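
The last step can be sketched in Python. This is a minimal example, not part of the repository: it assumes the third-party `pymongo` package is installed (`pip install pymongo`) and that the DB container is still running. The connection string, database name, and collection name come from the step above; the helper `field_names` is hypothetical, and because the README does not describe the layout of a stored paste, the script only reports which fields are present.

```python
from typing import Any, Iterable, List, Mapping

# Values taken from the README.
MONGO_URI = "mongodb://localhost:27018/"
DB_NAME = "crawler"
COLLECTION_NAME = "pastes"


def field_names(docs: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the sorted union of top-level field names across documents.

    Hypothetical helper: the paste schema is not documented, so this
    inspects the documents instead of assuming specific fields.
    """
    names = set()
    for doc in docs:
        names.update(doc.keys())
    return sorted(names)


def main() -> None:
    # Imported here so the helper above works without pymongo installed.
    from pymongo import MongoClient

    # Fail after 5 seconds instead of hanging if the DB is unreachable.
    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=5000)
    try:
        pastes = client[DB_NAME][COLLECTION_NAME]
        print("stored pastes:", pastes.count_documents({}))
        sample = list(pastes.find().limit(5))
        print("fields seen in sample:", field_names(sample))
    finally:
        client.close()


if __name__ == "__main__":
    main()
```

Printing the field names first is a cautious way to discover the schema before writing queries against specific fields.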
### Cleanup
1. To turn off the DB, run `docker-compose down`.
Note that this preserves the volume containing the results.
1. To remove the volume as well, run `docker-compose down --volumes`.