mldht

A Java library and standalone node implementing the Kademlia-based BitTorrent mainline DHT, with long-running server-class nodes in mind.

Originally developed as the DHT plugin for Azureus/Vuze.

Features

Implemented specs:

Spec            Title                            Status
BEP5            Bittorrent DHT                   Yes
BEP32           IPv6                             Yes
BEP33           Scrapes                          Yes
BEP42           DHT Announce Security            Partial; only the ip fields for external address discovery are supported
BEP9            Metadata exchange                Partial; only fetching is supported
libtorrent.org  Extended get_peers response,     Yes
                Forward compatibility,
                Client identification
BEP45           multi-homing/multi-address mode  Yes
BEP44           Arbitrary data storage           Yes

Additional:

  • high-performance implementation without compromising correctness, i.e. the node will be a good citizen
  • can process 20k packets per second on a single Xeon core
  • low latency lookups by using adaptive timeouts and a secondary routing table/cache tuned for RTT instead of stability
  • export of passively observed <infohash, ip> tuples to redis to survey torrent activity
  • remote CLI for common DHT operations
  • fully automatic torrent indexing

Dependencies

  • java 8
  • maven 3.1 (building)
  • junit 4.x (tests)

build

git clone --recursive https://github.com/the8472/mldht.git .
mvn package appassembly:assemble
# install symlink scripts to ~/bin/ 
mvn antrun:run@link

run DHT node in standalone mode

mkdir -p work
cd work
../bin/mldht-daemon
# or manually
# java -cp "../target/*" the8472.mldht.Launcher &

This will create various files in the current working directory:

  • config.xml: change settings as needed; core settings are picked up when the file is modified
  • shutdown: touch this file to cleanly shut down the running process (SIGHUP works too)
  • *-table.cache: persisted routing tables for the IPv4 and IPv6 DHTs, respectively
  • baseID.config: persisted node ID
  • logs/*: various diagnostics and log files
  • .keys/: default storage directory for BEP44 private keys, used by the CLI

Security note: the shell script launches the JVM with a debug port bound to localhost for easier maintenance, thus allowing arbitrary code execution with the current user's permissions. In a multi-user environment, a custom script with debugging disabled should be used.

embedding as library

It is not necessary to use the standalone Launcher; instead, you can create DHT instances and control their configuration and lifecycle directly.

The Launcher serves as an example of how to instantiate DHT nodes.

Hooking into the stream of incoming messages

After creating DHT instances, register a callback via addIncomingMessageListener(DHT.IncomingMessageListener l). It will be called for most incoming messages. Some but not all bogus/invalid ones will be prefiltered.

The callback is called from the message processing threads, so it should be non-blocking and thread-safe.

Message objects and their contents should not be modified.
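
One way to keep the callback non-blocking is to hand each message to a bounded queue and do the actual work on a separate thread. The sketch below is self-contained and illustrative only: `MessageListener`, `QueueingListener` and the `String` message type are hypothetical stand-ins, not mldht's actual `DHT.IncomingMessageListener` signature, which should be checked against the library source.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the library's listener interface; the real
// DHT.IncomingMessageListener may declare a different method signature.
interface MessageListener<M> {
    void received(M message);
}

// Hands messages to a bounded queue without ever blocking the caller.
final class QueueingListener<M> implements MessageListener<M> {
    private final BlockingQueue<M> queue;
    private final AtomicLong dropped = new AtomicLong();

    QueueingListener(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    @Override
    public void received(M message) {
        // offer() returns immediately; when the queue is full the message is
        // counted as dropped instead of stalling a message processing thread.
        if (!queue.offer(message)) {
            dropped.incrementAndGet();
        }
    }

    // Blocks the consumer thread until a message is available.
    M take() throws InterruptedException {
        return queue.take();
    }

    // Non-blocking variant; returns null when nothing is queued.
    M poll() {
        return queue.poll();
    }

    long droppedCount() {
        return dropped.get();
    }
}

class ListenerSketch {
    public static void main(String[] args) throws InterruptedException {
        QueueingListener<String> listener = new QueueingListener<>(1024);

        // Slow work (I/O, database writes) belongs on this consumer thread,
        // not in the callback. Messages are only read here, never modified.
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    System.out.println("processing " + listener.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        // Stands in for the DHT's threads invoking the registered callback.
        listener.received("ping");
        listener.received("get_peers");
        Thread.sleep(100); // give the daemon consumer time to drain the queue
    }
}
```

Dropping on overflow is a deliberate choice in this sketch: losing a few observations is usually preferable to slowing down message processing. A counter such as `droppedCount()` makes the loss visible.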

network configuration

  • Stateful NATs or firewalls should be switched to stateless mode (connection tracking disabled) and use static forwarding rules for the configured local ports [default: 49001].
    Otherwise state table overflows may occur, which lead to dropped packets and degraded performance.
  • NAT/firewall rules should not assume any particular remote port, as other DHT nodes are free to choose their own.
  • If no publicly routable IPv6 address is available then IPv6 should be disabled.
  • If only NATed IPv4 addresses are available then multihoming mode should be disabled.
  • The length of network interface send queues should be increased when the DHT node is operated in multihoming mode on a server with many public IPs.
    This is necessary because UDP sends may be silently dropped when the send queue is full, and DHT traffic can be very bursty, easily saturating queues that are too small.
    Check system logs or netstat statistics to see if outgoing packets are dropped.
  • For similar reasons the maximum socket receive buffer size should be set to at least 2MB, which is the amount this implementation will request when configuring its sockets.
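
To check whether the operating system would grant a 2MB receive buffer, a small standalone probe along these lines can help. It is a sketch using only the standard `java.nio` API and is not part of mldht; the class and method names are made up for illustration. The value reported back is platform-dependent (Linux, for example, reports a doubled value, and its limit is controlled by the `net.core.rmem_max` sysctl), so the comparison is only a rough indicator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.DatagramChannel;

class ReceiveBufferCheck {
    // 2MB, the amount the text above says the implementation requests.
    static final int REQUESTED_BYTES = 2 * 1024 * 1024;

    // Requests the given receive buffer size on a fresh, unbound UDP channel
    // and returns the size the operating system reports afterwards.
    static int grantedReceiveBuffer(int requestedBytes) {
        try (DatagramChannel channel = DatagramChannel.open()) {
            channel.setOption(StandardSocketOptions.SO_RCVBUF, requestedBytes);
            return channel.getOption(StandardSocketOptions.SO_RCVBUF);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int granted = grantedReceiveBuffer(REQUESTED_BYTES);
        System.out.println("requested=" + REQUESTED_BYTES + " granted=" + granted);
        if (granted < REQUESTED_BYTES) {
            // The OS silently clamps the request to its configured maximum.
            System.out.println("OS limit appears lower than requested; "
                    + "consider raising the maximum socket receive buffer size");
        }
    }
}
```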

optional components

Some features are not enabled out of the box because they require external infrastructure, provide public services or would cause extra traffic.

They are enabled by adding or uncommenting a <component><className>...</className></component> entry in the config.xml:

  • the8472.mldht.cli.Server to enable the remote CLI
  • the8472.mldht.indexing.TorrentDumper obtains infohashes from incoming traffic, then does all the necessary work to fetch the corresponding torrents. It can acquire approximately 0.3 torrents per second on a single-homed setup without a firewall.
  • the8472.mldht.indexing.ActiveLookupProvider provides a raw TCP interface on port 36578 for requesting DHT scrapes. Just send infohashes in hex, newline-separated.
  • the8472.mldht.indexing.OpentrackerLiveSync implements a LAN-local multicast sender for opentracker's IPv4 live sync. Passively observed DHT lookups are inserted as peers into the opentracker instance, which can then be used as a source for DHT statistics as if it were just another tracker.
  • the8472.mldht.PassiveRedisIndexer obtains statistics on peers seen on particular infohashes

remote-cli

launch daemon with

    <component>
      <className>the8472.mldht.cli.Server</className>
    </component>

run CLI client with

bin/mldht-remote-cli help
# or manually:
# java -cp "target/*" the8472.mldht.cli.Client help

available commands (subject to change):

HELP                                                 - prints this help
PING ip port                                         - continuously pings a DHT node with a 1 second interval
GET hash [salt]                                      - perform a BEP44 get
PUT -f <input-path> [-keyfile <path>] [-salt <salt>]
PUT <input> [-keyfile <path>] [-salt <salt>]         - perform a BEP44 put, specifying a salt or keyfile implies a mutable put, immutable otherwise. data will be read from file or as single argument
GETTORRENT [infohash...]                             - peer lookup for <infohash(es)>, then attempt metadata exchange, then write .torrent file(s) to the current working directory
GETPEERS [infohash...] [-fast]                       - peer lookup for <infohash(es)>, print ip address/port tuples
BURST [count]                                        - run a batch of find_node lookups to random target IDs. intended to test the attainable throughput for active lookups, subject to internal throttling
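
For orientation on the hash argument of GET: per BEP44, the target of an immutable item is the SHA-1 of the bencoded value. The sketch below computes that target for a value stored as a plain byte string. It is independent of mldht, and how the CLI encodes its PUT input into the stored value is an assumption here, so a target computed this way may not match what the CLI prints.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Bep44TargetSketch {
    // Bencodes a byte string as "<length>:<bytes>".
    static byte[] bencodeString(byte[] value) {
        byte[] prefix = (value.length + ":").getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[prefix.length + value.length];
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        System.arraycopy(value, 0, out, prefix.length, value.length);
        return out;
    }

    // Returns the SHA-1 digest of the input as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Immutable BEP44 target: SHA-1 over the bencoded value.
    // Assumption: the value is a single bencoded byte string.
    static String immutableTarget(byte[] value) {
        return sha1Hex(bencodeString(value));
    }

    public static void main(String[] args) {
        byte[] value = "Hello World!".getBytes(StandardCharsets.UTF_8);
        System.out.println(immutableTarget(value));
    }
}
```

Mutable items are addressed differently: their target is derived from the public key (plus the salt, if any) rather than from the value, which is why specifying a keyfile or salt implies a mutable put.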

Security note: The CLI Server component listens on localhost, accepting commands without authentication from any user on the system. It is recommended to not use this component in a multi-user environment.

redis statistics export

    <component xsi:type="mldht:redisIndexerType">
      <className>the8472.mldht.PassiveRedisIndexer</className>
      <address>127.0.0.1</address><!-- additional parameter to allow exporting to other hosts -->
    </component>

custom components

Simply implement Component and configure the launcher to include it on startup through the config.xml:

    <component>
      <className>your.class.name.Here</className>
    </component>
