https://github.com/m-osource/cassiopeiabot

C++ multithread Linux Web Crawler
algorithm berkeleydb bot cassiopeia cplusplus crawler download engine hashing html-parser information-retrieval link-analysis multithread open-source regex search web web-crawler webcrawler www

README

CassiopeiaBot is a fork of the WIRE-Nic open source project, which is itself a fork of WIRE (Web Information Retrieval Environment), an open source project developed by the Center for Web Research at the University of Chile.
More information about both projects can be found at http://www.cwr.cl/projects/WIRE/ and https://sourceforge.net/projects/wire-nic/.

CassiopeiaBot was tested by downloading Sardinian web content and organizing it internally. The primary objective was to let the search engine return relevant results with drastically reduced latency, thanks in part to the indexing strategy adopted.