DHT Network Crawler

Architecture

DHT Server -> Redis

Redis <- Peer -> (Mongodb && local)
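
The two arrows above can be read as a producer/consumer pipeline: the DHT server pushes discovered info hashes into Redis, and peers pop them, fetch the metadata, and store it in MongoDB and on local disk. The sketch below only illustrates that flow under stated assumptions and is not the project's code: `InMemoryList` stands in for a Redis list so the example runs without a server, and `publish`, `drain`, and the `fetch_metadata` callback are hypothetical names.

```python
from collections import deque


class InMemoryList:
    """Stand-in for a Redis list (push on the left, pop on the right)."""

    def __init__(self):
        self._items = deque()

    def lpush(self, value):
        self._items.appendleft(value)

    def rpop(self):
        return self._items.pop() if self._items else None


def publish(queue, info_hash: bytes) -> None:
    """DHT server side: hand a discovered info hash to the queue."""
    queue.lpush(info_hash.hex())


def drain(queue, fetch_metadata, store: dict) -> int:
    """Peer side: pop hashes, fetch metadata, keep what was retrieved."""
    stored = 0
    while (info_hash := queue.rpop()) is not None:
        metadata = fetch_metadata(info_hash)  # hypothetical callback
        if metadata is not None:
            store[info_hash] = metadata
            stored += 1
    return stored
```

Because the queue sits between the two sides, several servers and several peers can run against the same Redis instance without knowing about each other.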

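Modules

  • dht-common: shared variables and methods
  • dht-fresh: counts daily active info hashes over the last 7 days
  • dht-krpc: KRPC protocol implementation
  • dht-peer: peer client implementation (TCP); exchanges data with the remote peer to fetch its metadata and store it
  • dht-routing-table: internal routing table used by dht-server
  • dht-server: server that exchanges Bencode-encoded messages on the DHT network over UDP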
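
To make the KRPC and Bencode modules above more concrete, here is a minimal sketch of how a `find_node` query is serialized according to the DHT Protocol specification (BEP 5). It is an illustration, not the dht-krpc implementation; the function names `bencode` and `find_node_query` are hypothetical.

```python
def bencode(value) -> bytes:
    """Encode ints, bytes/str, lists and dicts using Bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Bencode requires dictionary keys to be byte strings in sorted order.
        pairs = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        ]
        pairs.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def find_node_query(transaction_id: bytes, node_id: bytes, target: bytes) -> bytes:
    """Build a KRPC find_node query: t = transaction id, y = 'q' (query)."""
    return bencode({
        "t": transaction_id,
        "y": "q",
        "q": "find_node",
        "a": {"id": node_id, "target": target},
    })
```

The resulting bytes are what would be sent as a single UDP datagram to another node; the reply carries compact node info that feeds the routing table.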

config.properties configuration file

DHT Server

server.port=6881                #listening port
server.nodes.min=20             #minimum number of nodes
server.nodes.max=3000           #maximum number of nodes
server.findNode.interval=60     #interval between find_node calls (seconds)
server.ping.interval=300        #interval between ping calls (seconds)
server.removeNode.interval=300  #interval between removals of stale nodes (seconds)
server.fresh=false              #enable hash statistics; dht-fresh must be running, otherwise the Redis list fills up
redis.host=127.0.0.1            #Redis host
redis.port=6379                 #Redis port
redis.password=                 #Redis password
redis.database=0                #Redis database
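
As a rough illustration of how these keys map to typed settings, the sketch below parses text in the layout shown above. It is not how the project loads its configuration: `parse_properties` and `server_settings` are hypothetical helpers, and the fallback values are simply the ones listed in this README. One caveat worth knowing: `java.util.Properties` only treats `#` as a comment at the start of a line, so if the project relies on it, the inline comments shown here may need to move to their own lines.

```python
def parse_properties(text: str) -> dict:
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        # Assumption: '#' always starts a comment, so values cannot contain '#'.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def server_settings(props: dict) -> dict:
    """Convert raw strings to typed values, falling back to the README values."""
    return {
        "port": int(props.get("server.port", "6881")),
        "nodes_min": int(props.get("server.nodes.min", "20")),
        "nodes_max": int(props.get("server.nodes.max", "3000")),
        "fresh": props.get("server.fresh", "false").lower() == "true",
        "redis_password": props.get("redis.password") or None,
    }
```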

Peer

peers.core.pool.size=5          #peer core thread pool size
peers.maximum.pool.size=10      #peer maximum thread pool size
redis.host=127.0.0.1            #Redis host
redis.port=6379                 #Redis port
redis.password=                 #Redis password
redis.database=0                #Redis database
mongodb.url=                    #MongoDB URL

Implemented protocols

✔️ DHT Protocol

✔️ Extension for Peers to Send Metadata Files

✔️ Extension Protocol
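
For readers unfamiliar with the metadata extension (BEP 9), the sketch below shows the three pieces of arithmetic a client needs: how many 16 KiB pieces the metadata spans, what the bencoded request for one piece looks like, and how the assembled metadata is checked against the info hash. It is a sketch based on the specification rather than the dht-peer code, and the function names are hypothetical. On the wire, each payload is additionally wrapped in an extended message (BitTorrent message id 20) using the `ut_metadata` id negotiated in the extension handshake (BEP 10); that framing is omitted here.

```python
import hashlib

METADATA_PIECE_SIZE = 16 * 1024  # BEP 9 splits metadata into 16 KiB pieces


def piece_count(metadata_size: int) -> int:
    """Number of pieces needed for metadata_size bytes (ceiling division)."""
    return (metadata_size + METADATA_PIECE_SIZE - 1) // METADATA_PIECE_SIZE


def request_payload(piece: int) -> bytes:
    """Bencoded ut_metadata request (msg_type 0) for one piece."""
    return b"d8:msg_typei0e5:piecei%dee" % piece


def matches_info_hash(metadata: bytes, info_hash: bytes) -> bool:
    """The SHA-1 of the complete metadata must equal the 20-byte info hash."""
    return hashlib.sha1(metadata).digest() == info_hash
```

A client would request pieces `0` to `piece_count(size) - 1`, concatenate the data replies in order, and discard the result if `matches_info_hash` returns false.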

Running

The jar file and the config.properties file must be in the same directory.

java -jar dht-server-1.0-SNAPSHOT-jar-with-dependencies.jar &
java -jar dht-peer-1.0-SNAPSHOT-jar-with-dependencies.jar &

Docker

Running in Docker

dht-server

dht-peer

Environment variables

DHT Server

PORT = 6881                 #port
MIN_NODES = 20              #minimum number of nodes
MAX_NODES = 5000            #maximum number of nodes
FRESH = false               #enable hash statistics; dht-fresh must be running, otherwise the Redis list fills up
REDIS_HOST = 127.0.0.1      #Redis host
REDIS_PORT = 6379           #Redis port
REDIS_PASSWORD = ''         #Redis password
REDIS_DATABASE = 0          #Redis database

DHT Peer

REDIS_HOST = 127.0.0.1              #Redis host
REDIS_PORT = 6379                   #Redis port
REDIS_PASSWORD = ''                 #Redis password
REDIS_DATABASE = 0                  #Redis database
MONGODB_URL = 'mongodb://localhost' #MongoDB URL
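
The `FRESH` switch relates to the hash statistics that dht-fresh is described as keeping: daily active info hashes over the last 7 days. How the module does this internally is not documented here, so the following is only a sketch of one way such a rolling window could work; the class `SevenDayActivity` and its methods are hypothetical, and it keeps state in memory rather than in Redis.

```python
from collections import defaultdict
from datetime import date, timedelta


class SevenDayActivity:
    """Track distinct info hashes per day and keep only the last N days."""

    def __init__(self, window_days: int = 7):
        self.window_days = window_days
        self._seen = defaultdict(set)  # day -> set of info hashes seen that day

    def record(self, info_hash: str, day: date) -> None:
        self._seen[day].add(info_hash)
        self._prune(day)

    def _prune(self, today: date) -> None:
        # Drop every day that falls outside the window ending today.
        cutoff = today - timedelta(days=self.window_days - 1)
        for old_day in [d for d in self._seen if d < cutoff]:
            del self._seen[old_day]

    def daily_counts(self) -> dict:
        """Return {day: number of distinct hashes}, oldest day first."""
        return {day: len(hashes) for day, hashes in sorted(self._seen.items())}
```

Using a set per day means an info hash announced many times on the same day is counted once, which is what "daily active" usually implies.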

Quick start

docker

docker run -d --name redis --network host redis:5.0.10
docker run -d --name dht-server --network host zpqsunny/dht-server:latest
docker run -d --name mongo --network host -v /docker/mongo/db:/data/db -e MONGO_INITDB_ROOT_USERNAME=admin -e MONGO_INITDB_ROOT_PASSWORD=admin mongo:4.4.1
docker run -d --name dht-peer --network host -v /metadata:/metadata -e MONGODB_URL="mongodb://admin:admin@127.0.0.1:27017/?authSource=admin" -e REDIS_HOST=127.0.0.1 -e REDIS_PORT=6379 zpqsunny/dht-peer:latest

docker-compose

services:
  redis:
    container_name: redis
    image: redis:5.0.10
    network_mode: host
    restart: unless-stopped
  dht-server-1: &dht-server
    depends_on:
      - redis
    image: zpqsunny/dht-server:latest
    build:
      context: dht-server
      dockerfile: Dockerfile
    network_mode: host
    restart: unless-stopped
    environment:
      PORT: 6881
      REDIS_HOST: 127.0.0.1
      REDIS_PORT: 6379
      REDIS_PASSWORD:
      REDIS_DATABASE: 0
  dht-server-2:
    <<: *dht-server
    environment:
      PORT: 6882
  dht-server-3:
    <<: *dht-server
    environment:
      PORT: 6883
  mongo:
    container_name: mongo
    image: mongo:4.4.1
    volumes:
      - /docker/mongo/db:/data/db
      - /docker/mongo/backup:/backup
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: admin
    network_mode: host
    restart: unless-stopped
  dht-peer:
    depends_on:
      - redis
      - mongo
    deploy:
      mode: replicated
      replicas: 3
    image: zpqsunny/dht-peer:latest
    build:
      context: dht-peer
      dockerfile: Dockerfile
    network_mode: host
    restart: unless-stopped
    volumes:
      - /metadata:/metadata
    environment:
      MONGODB_URL: mongodb://admin:admin@127.0.0.1:27017/?authSource=admin
      REDIS_HOST: 127.0.0.1
      REDIS_PORT: 6379
      REDIS_PASSWORD:
      REDIS_DATABASE: 0
docker-compose up
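
One detail of the compose file is easy to miss: the YAML merge key (`<<: *dht-server`) is shallow. A service that merges the anchor and then declares its own `environment:` replaces the whole inherited mapping instead of adding to it. The sketch below models that behaviour with plain dictionaries; `yaml_merge` is a hypothetical helper, not part of Docker Compose or any YAML library.

```python
def yaml_merge(base: dict, overrides: dict) -> dict:
    """Model the YAML '<<' merge: explicit keys replace inherited ones wholesale."""
    merged = dict(base)       # inherit every top-level key from the anchor
    merged.update(overrides)  # explicit keys win; nested mappings are not deep-merged
    return merged


dht_server_1 = {
    "image": "zpqsunny/dht-server:latest",
    "network_mode": "host",
    "environment": {
        "PORT": 6881,
        "REDIS_HOST": "127.0.0.1",
        "REDIS_PORT": 6379,
        "REDIS_PASSWORD": None,
        "REDIS_DATABASE": 0,
    },
}

# Equivalent of "<<: *dht-server" followed by "environment: {PORT: 6882}".
dht_server_2 = yaml_merge(dht_server_1, {"environment": {"PORT": 6882}})
```

In this model `dht_server_2["environment"]` contains only `PORT`, so the `REDIS_*` values for dht-server-2 and dht-server-3 come from whatever defaults the image provides. With a local Redis on the default port that makes no difference, but a non-default Redis address would have to be repeated in each service.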

Sample data

example-data

Stargazers over time

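Acknowledgements

IntelliJ IDEA is an IDE for JVM languages designed to maximize developer productivity in every respect.

Special thanks to JetBrains for providing free licenses for IntelliJ IDEA and other IDEs to open source projects.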
