https://github.com/moredure/drum
Golang implementation of the disk repository with update management (DRUM) framework as presented by Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, and Dmitri Loguinov in the paper "IRLbot: Scaling to 6 Billion Pages and Beyond"
- Host: GitHub
- URL: https://github.com/moredure/drum
- Owner: moredure
- License: MIT
- Created: 2021-05-18T18:35:35.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-28T12:04:27.000Z (about 5 years ago)
- Last Synced: 2025-03-17T22:08:53.721Z (over 1 year ago)
- Topics: drum, golang, url, webcrawler
- Language: Go
- Homepage:
- Size: 211 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# DRUM
[GoDoc](https://godoc.org/github.com/moredure/drum)
A Go implementation of the Disk Repository with Update Management (DRUM) framework, as presented by Hsin-Tsang Lee, Derek Leonard, Xiaoming Wang, and Dmitri Loguinov in the paper "IRLbot: Scaling to 6 Billion Pages and Beyond". DRUM is a disk-based alternative to a Bloom filter that can also store data.
## Credits
- [DRUM - A C++ Implementation for the URL-seen Test of a Web Crawler](https://www.codeproject.com/Articles/36221/DRUM-A-C-Implementation-for-the-URL-seen-Test-of-a)
- [Scaling to 6 Billion Pages and Beyond](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.145.7075&rep=rep1&type=pdf)
- [JDRUM](https://github.com/RovoMe/JDrum)