https://github.com/feluelle/pastebin-crawler-lib

# PastebinCrawlerLib
A library for web crawling http://pastebin.com

## Is this legal?
According to Pastebin's official website, it is.
See http://pastebin.com/faq#17:
```
I got blocked! Can I scrape your website?

Yes, but we do limit the amount of requests that people can make,
so it is very possible that you get blocked from time to time.
If you want to scrape our platform more intensely,
we have a custom scraper API available where we can whitelist your IP, so you don't get blocked anymore.
This feature is only available for LIFETIME PRO users. To learn more, visit our scraping page.
```
See the full FAQ at http://pastebin.com/faq for more information.
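
Since the FAQ says requests are limited, a crawler would likely want to space its requests out. The following is a minimal sketch in Python, not part of this library: `Throttle` and `min_interval` are hypothetical names, and the interval value is an assumption because the FAQ quote does not state the actual limit.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        # min_interval is an assumed value in seconds; Pastebin's real limit is not given here.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep until min_interval has passed since the previous call.

        Returns the number of seconds slept (0.0 on the first call).
        """
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        # Record the time after any sleep so the next call measures from here.
        self._last = self._clock()
        return delay
```

A crawler loop would call `wait()` before each request. The clock and sleep functions are injectable only so the behavior can be checked without real delays.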

## ToDo
- pastebin-crawler-lib [contains PastebinCrawler, PastebinDocument]

- text-analytics-api [returns {name:string, result:ICollection}]
  - if contains password then split by ' '
  - if contains email then

- pastebin-data-sniffer (+ Filter)
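
The planned text-analysis result shape and the password rule from the list above could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the project's implementation: `AnalysisResult` and `analyze_passwords` are hypothetical names, matching is done per line and case-insensitively (the notes do not say), and the email rule is omitted because the notes leave it unspecified.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnalysisResult:
    """Mirrors the noted shape {name: string, result: collection}."""
    name: str
    result: List[str] = field(default_factory=list)


def analyze_passwords(text: str) -> AnalysisResult:
    """Collect space-separated tokens from every line that mentions 'password'."""
    tokens: List[str] = []
    for line in text.splitlines():
        # Assumption: the check is applied line by line, ignoring case.
        if "password" in line.lower():
            # Split by ' ' as the note says; drop empty tokens from repeated spaces.
            tokens.extend(tok for tok in line.split(" ") if tok)
    return AnalysisResult(name="password", result=tokens)
```

For example, `analyze_passwords("user: bob\npassword: hunter2")` yields a result named `"password"` whose tokens are `["password:", "hunter2"]`.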