Projects in Awesome Lists tagged with storm-crawler
A curated list of projects in awesome lists tagged with storm-crawler.
https://github.com/commoncrawl/news-crawl
News crawling with StormCrawler - stores content as WARC
apache-storm common-crawl commoncrawl crawler news storm-crawler warc web-crawler
Last synced: 10 May 2025
https://github.com/tokenmill/crawling-framework
Easily crawl news portals or blog sites using StormCrawler.
crawler crawling crawling-framework elasticsearch java scraping storm storm-crawler vaadin
Last synced: 22 Apr 2025
https://github.com/tokenmill/crawling-framework-example
Demonstration of how to use the Crawling Framework to set up a simple science news crawler and store results in Elasticsearch. Use this configuration to set up your own crawler.
crawler crawling-framework elasticsearch storm-crawler
Last synced: 24 Feb 2025