https://github.com/shiyanhui/dht

BitTorrent DHT Protocol && DHT Spider.

bittorrent-dht-protocol dht go spider

Host: GitHub
URL: https://github.com/shiyanhui/dht
Owner: shiyanhui
License: mit
Created: 2016-08-07T05:35:02.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2021-08-17T01:48:19.000Z (over 4 years ago)
Last Synced: 2025-04-13T23:55:16.357Z (11 months ago)
Topics: bittorrent-dht-protocol, dht, go, spider
Language: Go
Homepage: http://bthub.io
Size: 10.3 MB
Stars: 2,751
Watchers: 122
Forks: 489
Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-hacking-lists - shiyanhui/dht - BitTorrent DHT Protocol && DHT Spider. (Go)
awesome-crawler - dht - BitTorrent DHT Protocol && DHT Spider. (Go)
awesome-crawler-cn - dht - A web crawler that supports the DHT protocol. (Go)
my-awesome - shiyanhui/dht - dht-protocol,dht,go,spider pushed_at:2021-08 star:2.8k fork:0.5k BitTorrent DHT Protocol && DHT Spider. (Go)
awesome-golang-repositories - dht

README

![](https://raw.githubusercontent.com/shiyanhui/dht/master/doc/screen-shot.png)

See the video on [YouTube](https://www.youtube.com/watch?v=AIpeQtw22kc).

[中文版README](https://github.com/shiyanhui/dht/blob/master/README_CN.md)

## Introduction

DHT implements the BitTorrent DHT protocol in Go. It currently includes:

- [BEP-3 (part)](http://www.bittorrent.org/beps/bep_0003.html)

- [BEP-5](http://www.bittorrent.org/beps/bep_0005.html)

- [BEP-9](http://www.bittorrent.org/beps/bep_0009.html)

- [BEP-10](http://www.bittorrent.org/beps/bep_0010.html)

The library has two modes: standard mode and crawling mode. Standard mode follows the BEPs, so you can use it as a standard DHT server. Crawling mode aims to collect as much torrent metadata as possible and does not follow the standard BEPs. With crawling mode, you can build another [BTDigg](http://btdigg.org/).

[bthub.io](http://bthub.io) is a BT search engine based on the crawling mode.

## Installation

    go get github.com/shiyanhui/dht

## Example

Below is a simple spider. See the [sample directory](https://github.com/shiyanhui/dht/blob/master/sample) for more examples.

```go
package main

import (
    "fmt"

    "github.com/shiyanhui/dht"
)

func main() {
    // downloader fetches torrent metadata from announced peers.
    downloader := dht.NewWire(65535)
    go func() {
        // Print each result as soon as a request completes.
        for resp := range downloader.Response() {
            fmt.Println(resp.InfoHash, resp.MetadataInfo)
        }
    }()
    go downloader.Run()

    config := dht.NewCrawlConfig()
    config.OnAnnouncePeer = func(infoHash, ip string, port int) {
        // Request the metadata for the announced torrent.
        downloader.Request([]byte(infoHash), ip, port)
    }
    d := dht.New(config)

    // Run blocks and drives the crawler.
    d.Run()
}
```
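
The spider above converts `infoHash` with `[]byte(infoHash)`, which suggests that the callback receives raw bytes, not hex text. Assuming it is the 20-byte SHA-1 digest, the sketch below shows one way to turn it into a magnet link for display or storage. The function `magnetLink` is a hypothetical helper written for this example and is not provided by this package.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// magnetLink converts a raw info hash into a magnet URI.
// Assumption: infoHash holds the 20 raw digest bytes, not a hex string.
func magnetLink(infoHash string) (string, error) {
	if len(infoHash) != 20 {
		return "", fmt.Errorf("info hash must be 20 bytes, got %d", len(infoHash))
	}
	return "magnet:?xt=urn:btih:" + hex.EncodeToString([]byte(infoHash)), nil
}

func main() {
	// A made-up 20-byte info hash, used only for demonstration.
	raw := "\x01\x23\x45\x67\x89\xab\xcd\xef\x01\x23\x45\x67\x89\xab\xcd\xef\x01\x23\x45\x67"

	link, err := magnetLink(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(link)
	// prints magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567
}
```

Inside `OnAnnouncePeer`, such a helper could be called on `infoHash` before logging, which gives more readable output than printing the raw bytes.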

## Download

You can download a precompiled demo binary [here](https://github.com/shiyanhui/dht/files/407021/spider.zip).

## Note

- The default crawl-mode configuration uses about 300 MB of RAM. Adjust **MaxNodes** and **BlackListMaxSize** to fit your environment.

- It currently can't run inside a LAN because of NAT.
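
To give a feel for why a cap such as **BlackListMaxSize** bounds memory use, here is a minimal sketch of a size-capped blacklist. It is not this library's implementation: the type `boundedBlacklist`, its methods, and the oldest-first eviction policy are all assumptions made for illustration.

```go
package main

import (
	"container/list"
	"fmt"
)

// boundedBlacklist remembers at most maxSize addresses. When it is
// full, the oldest entry is evicted, so memory use stays bounded.
type boundedBlacklist struct {
	maxSize int
	order   *list.List               // oldest entry at the front
	entries map[string]*list.Element // address -> position in order
}

func newBoundedBlacklist(maxSize int) *boundedBlacklist {
	return &boundedBlacklist{
		maxSize: maxSize,
		order:   list.New(),
		entries: make(map[string]*list.Element),
	}
}

// Add blocks addr, evicting the oldest entry if the cap is reached.
func (b *boundedBlacklist) Add(addr string) {
	if b.maxSize <= 0 {
		return // a non-positive cap disables the blacklist
	}
	if _, ok := b.entries[addr]; ok {
		return // already blocked
	}
	if b.order.Len() >= b.maxSize {
		oldest := b.order.Front()
		b.order.Remove(oldest)
		delete(b.entries, oldest.Value.(string))
	}
	b.entries[addr] = b.order.PushBack(addr)
}

// Contains reports whether addr is currently blocked.
func (b *boundedBlacklist) Contains(addr string) bool {
	_, ok := b.entries[addr]
	return ok
}

// Len returns the number of blocked addresses.
func (b *boundedBlacklist) Len() int {
	return b.order.Len()
}

func main() {
	bl := newBoundedBlacklist(2)
	bl.Add("192.0.2.1:6881")
	bl.Add("192.0.2.2:6881")
	bl.Add("192.0.2.3:6881") // evicts 192.0.2.1:6881

	fmt.Println(bl.Contains("192.0.2.1:6881"), bl.Contains("192.0.2.3:6881"), bl.Len())
	// prints false true 2
}
```

A smaller cap saves memory but forgets bad peers sooner; a larger cap does the opposite. The same kind of trade-off applies when tuning the real settings.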

## TODO

- [ ] NAT Traversal.

- [ ] Implement the full BEP-3.

- [ ] Optimization.

## FAQ

#### Why is it slow compared to other spiders?

There are several possible reasons:

- DHT aims to implement the standard BitTorrent DHT protocol; it was not built primarily for crawling the DHT network.

- NAT traversal is not implemented, which hurts if you run the crawler in a local network.

- It blocks IPs that look bad, and a good IP may be misjudged.

## License

MIT, read more [here](https://github.com/shiyanhui/dht/blob/master/LICENSE)
