peer-observer's Introduction

peer-observer

Tool to monitor for P2P anomalies and attacks using well-behaving, passive Bitcoin Core honeynodes (honeypot nodes).

Components and their interaction

peer-observer consists of multiple components: primarily an extractor that extracts events from a Bitcoin Core node, and multiple tools that process the extracted data. The extractor and tools are connected with a nanomsg-based PUB-SUB TCP connection. The exchanged messages are serialized protobuf structures.

The extractor is written in Rust and uses the Bitcoin Core tracepoints to extract events like received and sent P2P messages, opened and closed P2P connections, mempool changes, and more. This is implemented using the USDT capabilities of libbpf-rs. The Bitcoin P2P protocol messages are deserialized using rust-bitcoin.

The tools are written in Python or Rust (or any other language that supports nanomsg and protobuf). They subscribe to the nanomsg publisher. For example, the logger tool simply prints out all messages that it receives, the metrics tool produces Prometheus metrics, and the addr-connectivity tool tests whether received addresses are reachable. Python tools can use protobuf/python-types to deserialize the protobuf messages, while Rust tools can use the types from the shared Rust module.

                                                ┌──────────────────────┐
                                      Nanomsg   │ Tools                │
                                      PUB-SUB   │                      │
                                         ┌──────┼──►logger             │
              Tracepoints                │      │                      │
┌───────────┐ via libbpf                 ├──────┼──►metrics            │
│  Bitcoin  │          ┌───────────┐     │      │                      │
│ Core Node ├──────────► extractor ├─────┼──────┼──►archiver           │
└───────────┘          └───────────┘     │      │                      │
                                         ├──────┼──►addr-connectivity  │
                                         │      │                      │
                                         └──────┼──►...                │
                                      protobuf  │                      │
                                      messages  └──────────────────────┘

Real-world usage

On public.peer.observer, I run a peer-observer instance with multiple Bitcoin Core honeynodes. To avoid leaking the IP addresses of these honeynodes (a P2P attacker who knew them could simply avoid attacking these nodes), public access is limited.

Setting up a peer-observer instance is non-trivial as hooking into the Bitcoin Core tracepoints requires elevated system privileges. Additionally, a few not-yet-merged patches to Bitcoin Core are required at the moment. Documentation is sparse or non-existent. Feel free to open an issue if you still want to set up an instance and I'll do my best to add more documentation.

peer-observer's People

Contributors

0xb10c, i-am-yuvi

peer-observer's Issues

metric: on closed connection, measure the connection duration

A spike in this metric would allow us to detect many old connections being dropped at the same time. Currently, most dropped connections are very short-lived (a second or less), but we can't differentiate them from the long-lived (and probably more important) connections.
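
As a rough illustration of the idea, the following Python sketch remembers when each connection was opened and sorts its duration into a bucket when it closes. The class name, the bucket boundaries, and the event shape (a connection id plus a timestamp in seconds) are assumptions made for this example, not part of peer-observer.

```python
import bisect
from collections import Counter

# Hypothetical bucket upper bounds in seconds (1 second, 1 minute, 1 hour, 1 day).
BUCKETS = [1, 60, 3600, 86400]
LABELS = ["<=1s", "<=1m", "<=1h", "<=1d", ">1d"]


class ConnectionDurations:
    """Tracks open connections and buckets their duration on close."""

    def __init__(self):
        self.opened_at = {}         # connection id -> open timestamp (seconds)
        self.histogram = Counter()  # bucket label -> number of closed connections

    def on_open(self, conn_id, timestamp):
        self.opened_at[conn_id] = timestamp

    def on_close(self, conn_id, timestamp):
        opened = self.opened_at.pop(conn_id, None)
        if opened is None:
            # The connection was opened before we started observing.
            return None
        duration = timestamp - opened
        # bisect_left picks the first bucket whose upper bound is >= duration.
        label = LABELS[bisect.bisect_left(BUCKETS, duration)]
        self.histogram[label] += 1
        return duration
```

With buckets like these, a burst of closes in the longer-duration buckets stands out from the usual noise of sub-second connections.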

Extractor (and other tool) arguments with clap

It should be possible to disable certain tracepoint groups (e.g. connection tracepoints) with a command line argument.

Also, being able to set a custom port with a command line argument would be good.

We could use the clap crate for that.

Implement filter for `logger` tool

The logger tool is currently quite spammy, as it prints all events (P2P messages, connections, addrman (if enabled), validation, and mempool). It would be good to have a feature where a command line argument enables only the desired event types. E.g. passing --log-p2pmsgs would only log P2P messages. Passing --log-p2pmsgs --log-connections would log P2P messages and P2P connections.
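
A minimal sketch of such a filter, written in Python with argparse for brevity. Only --log-p2pmsgs and --log-connections come from the description above; the remaining flags, the event type names, and the "no flag means log everything" default are assumptions for illustration.

```python
import argparse

# Hypothetical mapping from flag destination to event type name. Only
# --log-p2pmsgs and --log-connections are named in the issue text; the
# other flags follow the same pattern and are assumptions.
EVENT_FLAGS = {
    "log_p2pmsgs": "p2pmsg",
    "log_connections": "connection",
    "log_addrman": "addrman",
    "log_validation": "validation",
    "log_mempool": "mempool",
}


def parse_enabled_events(argv):
    """Returns the set of event types that should be logged."""
    parser = argparse.ArgumentParser(description="logger event filter sketch")
    for dest in EVENT_FLAGS:
        flag = "--" + dest.replace("_", "-")
        parser.add_argument(flag, dest=dest, action="store_true")
    args = parser.parse_args(argv)
    enabled = {event for dest, event in EVENT_FLAGS.items() if getattr(args, dest)}
    # Assumption: with no flag given, keep the current behavior and log everything.
    return enabled or set(EVENT_FLAGS.values())


def should_log(event_type, enabled):
    """True if an event of this type passes the filter."""
    return event_type in enabled
```

For example, `parse_enabled_events(["--log-p2pmsgs", "--log-connections"])` yields a set containing only the P2P message and connection event types.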

Improve logging for all tools

Currently, most tools and the extractor don't have proper logging. We might want to:

  • move the simple_logger dependency into shared to have only one place to upgrade and to be sure we're only using one version across all tools
  • replace all println! macros with their respective log::<level>! macros
  • insert SimpleLogger::new().init() at startup
  • (optionally) make logging levels configurable through clap
  • ...

Anomaly detection and alerting for interesting Bitcoin P2P metrics

The current Grafana dashboards show the raw numbers from Prometheus (via the metrics tool). Anomaly detection and alerting are not yet implemented.

For example:
image

Here, an anomaly could be a sudden drop in inbound peers connected to one or more nodes, as in https://b10c.me/observations/05-inbound-connection-flooder-down/. To detect this, a Z-score could be used: if the absolute Z-score exceeds a certain threshold, send an alert.

image

Here, a spike in outbound (and inbound) address messages across all nodes could indicate an anomaly. A Z-score could be used here as well. There may be other approaches worth exploring for detecting anomalies.

This issue can be used for discussion and brainstorming.
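
To make the Z-score idea concrete, here is a small Python sketch of a rolling-window detector. The class name, window size, and threshold are illustrative assumptions. In practice the samples would come from Prometheus rather than being fed in by hand.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreDetector:
    """Flags samples that deviate strongly from a rolling window of history."""

    def __init__(self, window=60, threshold=3.0):
        self.history = deque(maxlen=window)  # most recent samples
        self.threshold = threshold           # alert when |z| exceeds this

    def observe(self, value):
        """Returns (z_score, is_anomaly) for value, then adds it to the window."""
        z = 0.0
        if len(self.history) >= 2:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            # A perfectly flat window gives no scale to judge deviations
            # against, so this sketch reports z = 0 in that case.
            if sigma > 0:
                z = (value - mu) / sigma
        self.history.append(value)
        return z, abs(z) > self.threshold
```

Using the absolute value means both sudden drops (such as inbound peers disappearing) and sudden spikes (such as a burst of address messages) trigger an alert. One design caveat: anomalous samples are added to the window too, so a long-lasting anomaly eventually becomes the new baseline.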

metrics: conn_closed_address and conn_inbound_address grow too large

The Prometheus metrics endpoint currently lists a counter for each address that ever opened or closed a connection to us. This can grow quite large, and scraping it can become resource-intensive.

A few options:

  • Write a custom text encoder that only includes address metrics if we saw that address more than e.g. 5 times in total. Downside: We potentially don't see long-running connections that only ever connected once.
  • Only include the subnet and not the full address. This should reduce the number of lines drastically
  • Use compression in the metrics webserver
  • Reset metrics every 24h in some way:
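
As a sketch of the subnet option listed above, the following Python snippet collapses addresses into their subnets before counting. The /24 and /48 prefix lengths, the function names, and the "other" label for non-IP addresses (for example Tor or I2P) are assumptions for illustration. It also assumes the port has already been stripped from the address.

```python
import ipaddress
from collections import Counter


def subnet_label(address, v4_prefix=24, v6_prefix=48):
    """Maps an IP address string to the subnet it belongs to."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Assumption: non-IP addresses (e.g. Tor, I2P) share a single label.
        return "other"
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its network back.
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def count_by_subnet(addresses):
    """Counts events per subnet instead of per full address."""
    return Counter(subnet_label(a) for a in addresses)
```

Two connections from 203.0.113.7 and 203.0.113.200 then share a single 203.0.113.0/24 label instead of producing two separate metric lines.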

Automatically Detecting Spy Peers

Currently, there is no feature to detect spy peers/nodes, which has been discussed here.

Some important anomalies to consider are:

  • When INV is sent to the peer, but the peer doesn't send GETDATA to our node
  • Spy nodes never send us INV for transactions we sent

We only handle INV, GETDATA, and TX p2p messages. We need to maintain a shared state – one entry (IP address + Port) for each connection with the number of INV, GETDATA, and TX sent and received. Additionally, spy peers will close the connection, so we also need to account for handling closed connections. Once all this is implemented, we can have some stats on normal/spy peers/nodes.

One approach could be to find the ratio of INV/GETDATA for each peer, but there might be other heuristics to detect peer identity.
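
The shared state and the two anomalies listed above could be sketched in Python as follows. This is not the reference implementation. The class names, the `min_inv_sent` cutoff, and the event shape (IP, port, lowercase message type, direction) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    """Per-connection message counters ("sent" means sent by our node)."""
    inv_sent: int = 0
    inv_received: int = 0
    getdata_sent: int = 0
    getdata_received: int = 0
    tx_sent: int = 0
    tx_received: int = 0


class SpyHeuristic:
    def __init__(self, min_inv_sent=50):
        self.peers = {}                   # (ip, port) -> PeerStats
        self.min_inv_sent = min_inv_sent  # hypothetical minimum sample size

    def on_message(self, ip, port, msg_type, inbound):
        """Counts one INV, GETDATA, or TX message; ignores everything else."""
        if msg_type not in ("inv", "getdata", "tx"):
            return
        stats = self.peers.setdefault((ip, port), PeerStats())
        field = f"{msg_type}_{'received' if inbound else 'sent'}"
        setattr(stats, field, getattr(stats, field) + 1)

    def on_close(self, ip, port):
        """Returns the final stats of a closed connection and drops its entry."""
        return self.peers.pop((ip, port), None)

    def looks_like_spy(self, stats):
        """We announced plenty via INV, but the peer never asked or announced."""
        if stats.inv_sent < self.min_inv_sent:
            return False  # too little data to judge
        return stats.getdata_received == 0 and stats.inv_received == 0
```

Evaluating `looks_like_spy` on the stats returned by `on_close` covers peers that disconnect before a periodic check would see them.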

A reference implementation has been done at the following URL: https://github.com/i-am-yuvi/peer-observer/tree/spy-detection.

Test the tools

By recording and then replaying some of the nanomsg/protobuf communication that the extractor publishes, we can write regression tests for the tools in the CI. This would be useful to have!

See also: #25 (comment)
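
One possible way to store recorded messages is a simple length-prefixed file, sketched below in Python. The framing (a 4-byte big-endian length followed by the raw serialized message) and the function names are assumptions for illustration, not an existing peer-observer format.

```python
import io
import struct


def record(stream, payload):
    """Appends one serialized message to the recording."""
    stream.write(struct.pack(">I", len(payload)))  # 4-byte big-endian length
    stream.write(payload)


def replay(stream):
    """Yields the recorded messages in the order they were written."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of recording
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated recording: stop instead of yielding garbage
        yield payload
```

A regression test could then replay a recording into a tool and compare its output against a stored expectation:

```python
buf = io.BytesIO()
for message in [b"\x08\x01", b"", b"hello"]:
    record(buf, message)
buf.seek(0)
replayed = list(replay(buf))
```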

extract nng crate to `shared` module

Multiple tools currently each define a dependency on the nng (nanomsg) crate. We could move this to the shared module, as it is needed by every tool.
