strfry's Introduction

strfry - a nostr relay

strfry is a relay for the nostr protocol

  • Supports most applicable NIPs: 1, 2, 4, 9, 11, 12, 15, 16, 20, 22, 28, 33, 40
  • No external database required: All data is stored locally on the filesystem in LMDB
  • Hot reloading of config file: No server restart needed for many config param changes
  • Zero downtime restarts, for upgrading binary without impacting users
  • Websocket compression: permessage-deflate with optional sliding window, when supported by clients
  • Built-in support for real-time streaming (up/down/both) of events from remote relays, and bulk import/export of events from/to jsonl files
  • negentropy-based set reconciliation for efficient syncing with remote relays

If you are using strfry, please join our telegram chat. Hopefully soon we'll migrate this to nostr.

Syncing

The most original feature of strfry is a set reconciliation protocol based on negentropy. This is implemented over a nostr protocol extension that allows two parties to synchronise their sets of stored messages with minimal bandwidth overhead. Although primarily designed for relay-to-relay communication, this can also be used by clients.

Either the full set of messages in the DB can be synced, or the results of one or more nostr filter expressions. If the two parties to the sync share common subsets of identical events, then there will be significant bandwidth savings compared to downloading the full set.

Usage

Compile

A C++20 compiler is required, along with a few other common dependencies. On Debian/Ubuntu use these commands:

Linux

sudo apt install -y git build-essential libyaml-perl libtemplate-perl libregexp-grammars-perl libssl-dev zlib1g-dev liblmdb-dev libflatbuffers-dev libsecp256k1-dev libzstd-dev
git clone https://github.com/hoytech/strfry && cd strfry/
git submodule update --init
make setup-golpe
make -j4

FreeBSD

pkg install -y gcc gmake cmake git perl5 openssl lmdb flatbuffers libuv libinotify zstr secp256k1 zlib-ng p5-Regexp-Grammars p5-Module-Install-Template p5-YAML
git clone https://github.com/hoytech/strfry && cd strfry/
git submodule update --init
gmake setup-golpe
gmake -j4

Running a relay

Here is how to run the relay:

./strfry relay

For dev/testing, the config file ./strfry.conf is used by default. It stores data in the ./strfry-db/ directory.

In production, you'll probably want a systemd unit file and a reverse proxy such as nginx (details coming soon).

Importing data

The strfry import command reads line-delimited JSON (jsonl) from its standard input and imports events that pass validation into the DB, in batches of 10,000 at a time:

cat my-nostr-dump.jsonl | ./strfry import
  • By default, it will verify the signatures and other fields of the events. If you know the messages are valid, you can speed up the import a bit by passing the --no-verify flag.

Exporting data

The strfry export command will print events from the DB to standard output in jsonl, ordered by their created_at field (ascending).

Optionally, you can limit the time period exported with the --since and --until flags.
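
For example, to export only events created within a particular window (the timestamps below are arbitrary Unix-epoch seconds, shown purely for illustration):

./strfry export --since 1672531200 --until 1675209600 > window.jsonl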

DB Upgrade

In the past, incompatible changes have been made to the DB format. If you try to use a strfry binary with an incompatible DB version, an error will be thrown. Only the strfry export command will work.

In order to upgrade the DB, you should export and then import again:

./strfry export > dbdump.jsonl
mv strfry-db/data.mdb data.mdb.bak
./strfry import < dbdump.jsonl

After you have confirmed everything is working OK, the dbdump.jsonl and data.mdb.bak files can be deleted.

Zero Downtime Restarts

strfry can have multiple instances running simultaneously, all listening on the same port, because it uses the REUSE_PORT (SO_REUSEPORT) Linux socket option. One of the reasons you may want to do this is to restart the relay without impacting currently connected users. This allows you to upgrade the strfry binary, or perform major configuration changes (for the subset of config options that require a restart).
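
For background, the underlying mechanism is the SO_REUSEPORT socket option, which lets several processes bind and listen on the same port. strfry sets this up through its websocket library, so the standalone C++ sketch below is only illustrative:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdint>

// Minimal listener that other processes can also bind to (error handling omitted)
int makeReusePortListener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));  // allow multiple listeners

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    listen(fd, SOMAXCONN);
    return fd;
}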

If you send a SIGUSR1 signal to a strfry process, it will initiate a "graceful shutdown". This means that it will no longer accept new websocket connections, and after its last existing websocket connection is closed, it will exit.

So, the typical flow for a zero downtime restart is:

  • Record the PID of the currently running strfry instance.

  • Start a new relay process using the same configuration as the currently running instance:

    strfry relay
    

    At this point, both instances will be accepting new connections.

  • Initiate the graceful shutdown:

    kill -USR1 $OLD_PID
    

    Now only the new strfry instance will be accepting connections. The old one will exit once all its connections have been closed.

Stream

This command opens a websocket connection to the specified relay and makes a nostr REQ request with filter {"limit":0}:

./strfry stream wss://relay.example.com

All events that are streamed back are inserted into the DB (after validation, checking for duplicates, etc). If the connection is closed for any reason, the command will try reconnecting every 5 seconds.

You can also run it in the opposite direction, which monitors your local DB for any new events and posts them to the specified relay:

./strfry stream wss://relay.example.com --dir up

Both of these operations can be concurrently multiplexed over the same websocket:

./strfry stream wss://relay.example.com --dir both

strfry stream will compress messages with permessage-deflate in both directions, if supported by the server. Sliding window compression is not supported for now.

Sync

This command uses the negentropy protocol and performs a set reconciliation between the local DB and the specified relay's remote DB.

Effectively what this does is figure out which events the remote relay has that you don't, and vice versa. Assuming that you both have common subsets of events, it does this more efficiently than simply transferring the full set of events (or even just their ids).

You can read about the algorithm used on the negentropy project page. There are both C++ and Javascript reference implementations.

Here is how to perform a "full DB" set reconciliation against a remote server:

./strfry sync wss://relay.example.com

This will download all missing events from the remote relay and insert them into your DB. Similar to stream, you can also sync in the up or both directions:

./strfry sync wss://relay.example.com --dir both

both is especially efficient, because performing the set reconciliation automatically determines the missing members on each side.

Instead of a "full DB" sync, you can also sync the result of a nostr filter (or multiple filters, use a JSON array of them):

./strfry sync wss://relay.example.com '{"authors":["003b"]}'

Warning: Syncing can consume a lot of memory and bandwidth if the DBs are highly divergent (for example if your local DB is empty and your filter matches many events).

Architecture

strfry uses concepts from various proprietary systems I have worked on in the past but consists solely of independently-developed open source code.

The golpe application framework is used for basic services such as command-line arg parsing, logging, config files, etc.

Database

strfry is built on the embedded LMDB database (using the lmdbxx C++ interface). This means that records are accessed directly from the page cache. The read data-path requires no locking/system calls and it scales optimally with additional cores.

Database records are serialised with Flatbuffers serialisation, which allows fast and zero-copy access to individual fields within the records. A RasgueaDB layer is used for maintaining indices and executing queries.

The query engine is quite a bit less flexible than a general-purpose SQL engine, however the types of queries that can be performed via the nostr protocol are fairly constrained, so we can ensure that almost all queries have good index support. All possible query plans are determined at compile-time, so there is no SQL generation/parsing overhead, or risk of SQL injection.

When an event is inserted, indexable data (id, pubkey, tags, kind, and created_at) is loaded into a flatbuffers object. Signatures and non-indexed tags are removed, along with recommended relay fields, etc, to keep the record size minimal (and therefore improve cache usage). The full event's raw JSON is stored separately. The raw JSON is re-serialised to remove any unauthenticated fields from the event.

Various indices are created based on the indexed fields. Almost all indices are "clustered" with the event's created_at timestamp, allowing efficient since/until scans. Many queries can be serviced by index-only scans, and don't need to load the flatbuffers object at all.
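
As a rough illustration of this clustering (not strfry's actual key layout), an index key can append a big-endian created_at to the indexed value, so that lexicographic key order within one value doubles as chronological order:

#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical composite key: <pubkey bytes><created_at, big-endian>
std::string makeAuthorIndexKey(std::string_view pubkey, uint64_t createdAt) {
    std::string key(pubkey);
    for (int i = 7; i >= 0; i--) key += static_cast<char>((createdAt >> (8 * i)) & 0xff);
    return key;
}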

I've tried to build the query engine with efficiency and performance in mind, but it is possible a SQL engine could find better execution plans, perhaps depending on the query. I haven't done any benchmarking or profiling yet, so your mileage may vary.

One benefit of a custom query engine is that we have the flexibility to optimise it for real-time streaming use-cases more than we could a general-purpose DB. For example, a user on a slow connection should not unnecessarily tie up resources. Our query engine supports pausing a query and storing it (it takes up a few hundred to a few thousand bytes, depending on query complexity), and resuming it later when the client's socket buffer has drained. Additionally, we can pause long-running queries to satisfy new queries as quickly as possible. This is all done without any database thread pools. There are worker threads, but they only exist to take advantage of multiple CPUs, not to block on client I/O.

Threads and Inboxes

strfry starts multiple OS threads that communicate with each other via two channels:

  • Non-copying message queues
  • The LMDB database

This means that no in-memory data-structures are accessed concurrently. This is sometimes called "shared nothing" architecture.

Each individual thread has an "inbox". Typically a thread will block waiting for a batch of messages to arrive in its inbox, process them, queue up new messages in the inboxes of other threads, and repeat.
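
A minimal sketch of that inbox pattern is shown below (the real queues come from the golpe framework; the types here are illustrative only):

#include <condition_variable>
#include <deque>
#include <iterator>
#include <mutex>
#include <vector>

template <typename Msg>
struct Inbox {
    std::mutex m;
    std::condition_variable cv;
    std::deque<Msg> q;

    // Called by other threads: move a message in, no copying of payloads
    void push(Msg msg) {
        { std::lock_guard<std::mutex> lk(m); q.push_back(std::move(msg)); }
        cv.notify_one();
    }

    // Called by the owning thread: block until something arrives, then drain the batch
    std::vector<Msg> popAll() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return !q.empty(); });
        std::vector<Msg> batch(std::make_move_iterator(q.begin()),
                               std::make_move_iterator(q.end()));
        q.clear();
        return batch;
    }
};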

Websocket

This thread is responsible for accepting new websocket connections, routing incoming requests to the Ingesters, and replying with responses.

The Websocket thread is a single thread that multiplexes IO to/from multiple connections using the most scalable OS-level interface available (for example, epoll on Linux). It uses my fork of uWebSockets.

Since there is only one of these threads, it is critical for system latency that it perform as little CPU-intensive work as possible. No request parsing or JSON encoding/decoding is done on this thread, nor any DB operations.

The Websocket thread does however handle compression and TLS, if configured. In production it is recommended to terminate TLS before strfry, for example with nginx.

Compression

If supported by the client, compression can reduce bandwidth consumption and improve latency.

Compression can run in two modes, either "per-message" or "sliding-window". Per-message uses much less memory, but it cannot take advantage of cross-message redundancy. Sliding-window uses more memory for each client, but the compression is typically better since nostr messages often contain serial redundancy (subIds, repeated pubkeys and event IDs in subsequent messages, etc).

The CPU usage of compression is typically small enough to make it worth it. However, strfry also supports running multiple independent strfry instances on the same machine (using the same DB backing store). This can distribute the compression overhead over several threads, according to the kernel's REUSE_PORT policy.

Ingester

These threads perform the CPU-intensive work of processing incoming messages:

  • Decoding JSON
  • Validating and hashing new events
  • Verifying event signatures
  • Compiling filters

A particular connection's requests are always routed to the same ingester.

Writer

This thread is responsible for most DB writes:

  • Adding new events to the DB
  • Performing event deletion (NIP-09)
  • Deleting replaceable events (NIP-16)

It is important that there is only one writer thread, because LMDB has an exclusive write lock, so multiple writers would imply contention. Additionally, when multiple events queue up, there is work that can be amortised across the batch. This serves as a natural counterbalance against high write volumes.

ReqWorker

Incoming REQ messages have two stages. The first stage is retrieving "old" data that already existed in the DB at the time of the request.

Servicing this stage is the job of the ReqWorker thread pool. Like Ingester, messages are consistently delivered to a thread according to connection ID. This is important so that (for example) CLOSE messages are matched with corresponding REQs.

When this stage is complete the next stage (monitoring) begins. When a ReqWorker thread completes the first stage for a subscription, the subscription is then sent to a ReqMonitor thread. ReqWorker is also responsible for forwarding unsubscribe (CLOSE) and socket disconnection messages to ReqMonitor. This forwarding is necessary to avoid a race condition where a message closing a subscription would be delivered while that subscription is pending in the ReqMonitor thread's inbox.

Filters

In nostr, each REQ message from a subscriber can contain multiple filters. We call this collection a FilterGroup. If one or more of the filters in the group matches an event, that event should be sent to the subscriber.

A FilterGroup is a vector of Filter objects. When the Ingester receives a REQ, the JSON filter items are compiled into Filters and the original JSON is discarded. Each filter item's specified fields are compiled into sorted lookup tables called filter sets.

In order to determine if an event matches against a Filter, first the since and until fields are checked. Then, each field of the event for which a filter item was specified is looked up in the corresponding lookup table. Specifically, the upper-bound index is determined using a binary search (for example std::upper_bound). This is the first element greater than the event's item. Then the preceding table item is checked for either a prefix (ids/authors) or exact (everything else) match.
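
A simplified sketch of that lookup, assuming a plain sorted vector of filter values (the compact in-memory representation described next is ignored here):

#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Returns true if eventField matches one of the sorted filter values.
// prefixMatch is true for ids/authors, false for everything else.
bool fieldMatches(const std::vector<std::string> &sortedValues,
                  std::string_view eventField, bool prefixMatch) {
    auto it = std::upper_bound(sortedValues.begin(), sortedValues.end(), eventField);
    if (it == sortedValues.begin()) return false;   // no value <= eventField
    const std::string &candidate = *std::prev(it);   // the preceding table item
    if (prefixMatch) return eventField.substr(0, candidate.size()) == candidate;
    return candidate == eventField;
}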

Since testing Filters against events is performed so frequently, it is a performance-critical operation and some optimisations have been applied. For example, each filter item in the lookup table is represented by a 4-byte data structure: one byte is the first byte of the field value, and the rest are offset/size lookups into a single memory allocation containing the remaining bytes. Under typical scenarios, this greatly reduces the amount of memory that needs to be loaded to process a filter. Filters with 16 or fewer items can often be rejected with the load of a single cache line. Because filters aren't scanned linearly, the number of items in a filter (ie the number of pubkeys) doesn't have a significant impact on processing resources.
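
A guess at what one of those 4-byte entries could look like; the exact field split is an assumption, and the real layout lives in strfry's filter code:

#include <cstdint>

struct FilterItemRef {
    uint8_t  firstByte;  // first byte of the filter value, for cheap early rejection
    uint8_t  size;       // length of the value
    uint16_t offset;     // offset of the remaining bytes in one shared allocation
};
static_assert(sizeof(FilterItemRef) == 4, "packs into a single 4-byte entry");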

DBScan

The DB querying engine used by ReqWorker is called DBScan. This engine is designed to take advantage of indices that have been added to the database. The indices have been selected so that no filters require full table scans (over the created_at index), except ones that only use since/until (or nothing).

Because events are stored in the same flatbuffers format in memory and "in the database" (there isn't really any difference with LMDB), compiled filters can be applied to either.

When a user's REQ is being processed for the initial "old" data, each Filter in its FilterGroup is analysed and the best index is determined according to a simple heuristic. For each filter item in the Filter, the index is scanned backwards starting at the upper-bound of that filter item. Because all indices are composite keyed with created_at, the scanner also jumps to the until time when possible. Each event is compared against the compiled Filter and, if it matches, sent to the Websocket thread to be sent to the subscriber. The scan completes when one of the following is true:

  • The key no longer matches the filter item (exact or prefix, depending on field)
  • The event's created_at is before the since filter field
  • The filter's limit field of delivered events has been reached

Once this completes, a scan begins for the next item in the filter field. Note that a filter only ever uses one index. If a filter specifies both ids and authors, only the ids index will be scanned. The authors filters will be applied when the whole filter is matched prior to sending.

An important property of DBScan is that queries can be paused and resumed with minimal overhead. This allows us to ensure that long-running queries don't negatively affect the latency of short-running queries. When ReqWorker first receives a query, it creates a DBScan for it. The scan will be run with a "time budget" (for example 10 milliseconds). If this is exceeded, the query is put to the back of a queue and new queries are checked for. This means that new queries will always be processed before resuming any queries that have already run for 10ms.
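
A rough sketch of that scheduling policy, assuming a pausable Scan type (the real DBScan interface differs):

#include <chrono>
#include <deque>
#include <memory>

struct Scan {
    // Run until finished or until the budget is exhausted; returns true when done.
    // A paused scan keeps its position internally and resumes on the next call.
    virtual bool run(std::chrono::milliseconds budget) = 0;
    virtual ~Scan() = default;
};

void serviceOne(std::deque<std::unique_ptr<Scan>> &fresh,
                std::deque<std::unique_ptr<Scan>> &paused) {
    using namespace std::chrono_literals;
    auto &queue = !fresh.empty() ? fresh : paused;  // new queries always run first
    if (queue.empty()) return;

    auto scan = std::move(queue.front());
    queue.pop_front();
    if (!scan->run(10ms)) paused.push_back(std::move(scan));  // over budget: park it for later
}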

ReqMonitor

The second stage of a REQ request is comparing newly-added events against the REQ's filters. If they match, the event should be sent to the subscriber.

ReqMonitor is not directly notified when new events have been written. This is important because new events can be added in a variety of ways. For instance, the strfry import command, event syncing, and multiple independent strfry servers using the same DB (ie, REUSE_PORT).

Instead, ReqMonitor watches for file change events using the OS's inotify API. When the file has changed, it scans all the events that were added to the DB since the last time it ran.

Note that because of this design decision, ephemeral events work differently than in other relay implementations. They are stored to the DB, however they have a very short retention-policy lifetime and will be deleted after 5 minutes (by default).

ActiveMonitors

Even though filter scanning is quite fast, strfry further attempts to optimise the case where a large number of concurrent REQs need to be monitored.

When ReqMonitor first receives a subscription, it first compares its filter group against all the events that have been written since the subscription's DBScan started (since those are omitted from DBScan).

After the subscription is all caught up to the current transaction's snapshot, the filter group is broken up into its individual filters, and then each filter has one field selected (because all fields in a query must have a match, it is sufficient to choose one). This field is broken up into its individual filter items (ie a list of ids) and these are added to a sorted data-structure called a monitor set.

Whenever a new event is processed, all of its fields are looked up in the various monitor sets, which provides a list of filters that should be fully processed to check for a match. If an event has no fields in common with a filter, a match will not be attempted for this filter.

For example, for each prefix in the authors field in a filter, an entry is added to the allAuthors monitor set. When a new event is subsequently detected, the pubkey is looked up in allAuthors according to a binary search. Then the data-structure is scanned until it stops seeing records that are prefix matches against the pubkey. All of these matching records are pointers to corresponding Filters of the REQs that have subscribed to this author. The filters must then be processed to determine if the event satisfies the other parameters of each filter (since/until/etc).
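
A much-simplified sketch of that inverted index, using exact pubkeys rather than prefixes (the real allAuthors structure also handles prefix entries):

#include <map>
#include <string>
#include <vector>

struct Filter;  // stands in for a compiled Filter

// Monitor set: indexed field value -> filters that reference it
using MonitorSet = std::multimap<std::string, Filter *>;

std::vector<Filter *> candidateFilters(const MonitorSet &allAuthors,
                                       const std::string &pubkey) {
    std::vector<Filter *> out;
    auto [lo, hi] = allAuthors.equal_range(pubkey);
    for (auto it = lo; it != hi; ++it) out.push_back(it->second);
    return out;  // each candidate still needs a full Filter check (since/until/etc)
}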

After comparing the event against each filter detected via the inverted index, that filter is marked as "up-to-date" with this event's ID, whether the filter matched or not. This prevents needlessly re-comparing this filter against the same event in the future (in case one of the other index lookups matches it). If a filter does match, then the entire filter group is marked as up-to-date. This prevents sending the same event multiple times in case multiple filters in a filter group match, and also prevents needlessly comparing other filters in the group against an event that has already been sent.

After an event has been processed, all the matching connections and subscription IDs are sent to the Websocket thread along with a single copy of the event's JSON. This prevents intermediate memory bloat that would occur if a copy was created for each subscription.

Negentropy

These threads implement the provider side of the negentropy syncing protocol.

When NEG-OPEN requests are received, these threads perform DB queries in the same way as ReqWorker threads do. However, instead of sending the results back to the client, the IDs of the matching events are kept in memory, so they can be queried with future NEG-MSG queries.

Cron

This thread is responsible for periodic maintenance operations. Currently this consists of applying a retention-policy and deleting ephemeral events.

Testing

How to run the tests is described in the test/README.md file.

Fuzz tests

The query engine is the most complicated part of the relay, so there is a differential fuzzing test framework to exercise it.

To bootstrap the tests, we load in a set of real-world nostr events.

There is a simple but inefficient filter implementation in test/dumbFilter.pl that can be used to check if an event matches a filter. In a loop, we randomly generate a complicated filter group and pipe the entire DB's worth of events through the dumb filter and record which events it matched. Next, we perform the query using strfry's query engine (using a strfry scan) and ensure it matches. This gives us confidence that querying for "old" records in the DB will be performed correctly.

Next, we need to verify that monitoring for "new" records also functions correctly. For this, in a loop we create a set of hundreds of random filters and install them in the monitoring engine, one of which is selected as a sample. The entire DB's worth of events is "posted to the relay" (actually just iterated over in the DB using strfry monitor), and we record which events were matched. This is then compared against a full-DB scan using the same query.

Both of these tests have run for several hours with no observed failures.

Author and Copyright

strfry © 2023 Doug Hoyte.

GPLv3 license. See the LICENSE file.

strfry's People

Contributors

alexgleason, cosmicpsyop, dimi8146, fiatjaf, foxytanuki, hoytech, jaschadub, jb55, litch, niteshbalusu11, overload3910, theakito, v0l

strfry's Issues

Compilation error: "src/events.h:4:10: fatal error: secp256k1_schnorrsig.h: No such file or directory"

When I run the make -j4 command, this is the full output I get, even after doing a git pull with the latest update you pushed.

g++ -std=c++2a -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/RelayWebsocket.o -MF src/RelayWebsocket.d -c src/RelayWebsocket.cpp -o src/RelayWebsocket.o

g++ -std=c++2a -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/RelayIngester.o -MF src/RelayIngester.d -c src/RelayIngester.cpp -o src/RelayIngester.o

g++ -std=c++2a -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/cmd_sync.o -MF src/cmd_sync.d -c src/cmd_sync.cpp -o src/cmd_sync.o

g++ -std=c++2a -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/RelayReqWorker.o -MF src/RelayReqWorker.d -c src/RelayReqWorker.cpp -o src/RelayReqWorker.o

In file included from src/WriterPipeline.h:7,
                 from src/cmd_sync.cpp:9:
src/events.h:4:10: fatal error: secp256k1_schnorrsig.h: No such file or directory
    4 | #include <secp256k1_schnorrsig.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [golpe/rules.mk:26: src/cmd_sync.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from src/RelayServer.h:19,
                 from src/RelayIngester.cpp:1:
src/events.h:4:10: fatal error: secp256k1_schnorrsig.h: No such file or directory
    4 | #include <secp256k1_schnorrsig.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [golpe/rules.mk:26: src/RelayIngester.o] Error 1
In file included from src/RelayServer.h:19,
                 from src/RelayWebsocket.cpp:1:
src/events.h:4:10: fatal error: secp256k1_schnorrsig.h: No such file or directory
    4 | #include <secp256k1_schnorrsig.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [golpe/rules.mk:26: src/RelayWebsocket.o] Error 1
In file included from src/RelayServer.h:19,
                 from src/RelayReqWorker.cpp:1:
src/events.h:4:10: fatal error: secp256k1_schnorrsig.h: No such file or directory
    4 | #include <secp256k1_schnorrsig.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [golpe/rules.mk:26: src/RelayReqWorker.o] Error 1

systemd service unit example

The readme mentions "coming soon" for the service unit, so I just wanted to share mine which works while substantially restricting system access on Ubuntu 22.04:

[Unit]
Description=Nostr relay

[Service]
User=strfry
Group=strfry
WorkingDirectory=/opt/strfry
ExecStart=/opt/bin/strfry --config=strfry.conf relay
Restart=on-failure
RestartSec=5
ProtectHome=yes
NoNewPrivileges=yes
ProtectSystem=full
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

Creating a restricted strfry user:

useradd -mb /opt -k /dev/null -s $(which nologin) strfry

For clarity, here's my paths under /opt:

├── bin
│   └── strfry
└── strfry
    ├── strfry.conf
    └── strfry-db

ProtectSystem=strict and the other hardening options below require at least systemd version 232; on newer systemd you can use them in place of ProtectSystem=full:

ProtectSystem=strict
ProtectControlGroups=true
ProtectKernelModules=true
ProtectKernelTunables=yes

Can I log the reason for "Request ID too long"?

This is more out of curiosity, really. But I saw this today when updating strfry to latest master.

[Ingester 1      ]INFO| sending error to [10]: bad req: subscription id too long

I'd love to know what the ID was.

Thanks!

Allow tags with empty string value

Currently strfry rejects events with a tag having an empty string value (e.g. ["d", ""]) because of this line:

if (tagVal.size() == 0) throw herr("tag val empty");

But:

  • NIP-33 has example of empty d tags.
  • Some events with empty tags have already been published. You can see NIP-23 posts (kind:30023) and NIP-58 badges (kind:30009) having tags like ["d", ""], ["thumb", ""] at wss://nostr-pub.wellorder.net.
    • Send ["REQ", "subId", { "#d": [""] }] with Nostr Playground to see some cases.

EDIT: If this is allowed, it would be even better if it could also be searched by query { "#d": [""] } which is currently rejected by error "filter item too small".

Compilation Errors

I'm trying to compile & run strfry and when running make I receive an error message of:

ubuntu@x11r0n:~/strfry$ make -j4
perl golpe/gen-fbs.pl
perl golpe/gen-config.pl
golpe/external/rasgueadb/rasgueadb-generate golpe.yaml build
perl golpe/gen-main.cpp.pl
error: /home/ubuntu/strfry/fbs/nostr-index.fbs:4: 16: error: expecting: ] instead got: :
flatc failure building fbs/nostr-index.fbs at golpe/gen-fbs.pl line 22, <$fh> line 1.
make: *** [golpe/rules.mk:44: build/golpe.h] Error 1
make: *** Waiting for unfinished jobs....
error: /home/ubuntu/strfry/fbs/nostr-index.fbs:4: 16: error: expecting: ] instead got: :
flatc failure at golpe/external/rasgueadb/rasgueadb-generate line 146, <$fh> line 1.
make: *** [golpe/rules.mk:54: build/defaultDb.h] Error 1

The steps I took to reproduce are:

  • sudo apt install -y git build-essential libyaml-perl libtemplate-perl libregexp-grammars-perl libssl-dev zlib1g-dev liblmdb-dev libflatbuffers-dev libsecp256k1-dev libzstd-dev
  • git clone https://github.com/hoytech/strfry.git
  • cd strfry/
  • git checkout tags/0.9.3
  • git submodule update --init
  • make setup-golpe
  • make -j4

This machine is running Ubuntu 20.04.6 LTS.

Any assistance appreciated!

Document all the secret strfry commands

I am motivated to build a CLI wrapper around the ./strfry command in my strfry-policies repo so I can control strfry programmatically. But only a few of the commands are actually documented in the README. It would be good to get a list of everything that exists.

Probabilistic filters to reduce memory footprint of ReqWorker for monitoring

Clients with big filters can demand a lot of RAM on the relay. This can be reduced significantly by using probabilistic filters. While these filters are not great to query a database, they are great to check individual events against.

I had implemented this idea using Cuckoo filters here. While the REQ is received as a normal JSON filter object, it is converted into a probabilistic filter after EOSE if, for example, it filters for more than 10 pubkeys. The probabilistic filter is set to a false positive rate of 1/10,000,000, with the client filtering out those false positives anyway.

Logging and DDOS protection

This morning I’m getting a whole lot (thousands) of this error message repeated in my strfry logs:

too many concurrent REQs

Unfortunately, that error is logged without the IP address that is intentionally or unintentionally DDOSing my instance.

So I turned on verbose REQ logging, found the IP address in the initial connect, and manually blocked it, but that's obviously not a sustainable workflow. I'm thinking it would be good for certain error messages (this one for sure) to log the IP address along with the error every time, so that log monitors like fail2ban could keep an eye on things and automatically impose some IP blocking and unblocking by policy.

Please provide an api for "Read Policy Plugins"

A relay can be strained by spam it gets to store forever, and that can be addressed with the Write Policy Plugin.
Equally, a relay can be strained by spam requests. Here, strfry at this moment provides very limited tools, and I would like to better control the kind of queries clients may direct at the relay. Ideally in combination with #47.

Compression question

Not an issue, just a question.

In the docs it says there is a sliding window compression which applies across messages?

I was a bit confused by this, as I thought each nostr message is its own websocket message, and AFAICT websocket has only per-message compression? I was under the impression that the compression gains occur when a single websocket message is very large (and potentially fragmented into many frames). A large yesstr message, perhaps.

I am quite new to this whole compression thing, and would like to know if it really is possible to achieve cross-message compression. As you say, there is a lot of redundancy across nostr messages.

Large database performance

I noticed that as my database grew, the time between:

CONFIG: successfully installed

and

Filter matched XXX records

seemed to increase exponentially. When I had about 5 million events, it took a few minutes, and now that I have 15 million events it takes hours (even though it doesn't max out the CPU). This makes it impossible to sync anymore.

By looking at the code, my guess is that it's stuck at this loop:

while (1) {
            bool complete = query.process(txn, [&](const auto &sub, uint64_t levId, std::string_view eventPayload){
                auto ev = lookupEventByLevId(txn, levId);
                ne.addItem(ev.flat_nested()->created_at(), sv(ev.flat_nested()->id()).substr(0, ne.idSize));

                numEvents++;
            });

            if (complete) break;
        }

At first sight it's just adding events to a list, so it should not take exponentially more time? Is there anything that can be done to optimize this part?

strfry distribution channels

I'm working on a piece of software to replace Mastodon, and it will use strfry as its database, storing even things like config as Nostr events. I want users to be able to snap install strfry or flatpak install strfry. Any thoughts or feelings? Nerds prefer flatpak, but Snap is easier and comes preinstalled on Ubuntu, so I lean towards that.

Other possibilities:

  • apt install - not sure what hoops you have to jump through to make it into Debian or Ubuntu proper.
  • precompiled binary built by GitHub CI served from GitHub - it wouldn't hurt. maybe in addition to a snap/flatpak.

g++: fatal error: Killed signal terminated program cc1plus

g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make: *** [golpe/rules.mk:28: build/config.o] Error 1
make: *** Waiting for unfinished jobs....

Trying to install on an Ubuntu 22. This occurs when running make -j4

I ran sudo apt update and sudo apt upgrade followed by the commands in the docs. Any ideas?

Compilation error: too few arguments to function ‘int secp256k1_schnorrsig_verify

Running into this when trying to make HEAD of the master branch. (First pre-coffee thought: do I have the right secp256k1 C++ lib, or should I be downloading and installing https://github.com/bitcoin-core/secp256k1 instead?)

# uname -a
Linux relay-001 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/events.o -MF src/events.d -c src/events.cpp -o src/events.o
src/events.cpp: In function ‘std::string nostrHash(const value&)’:
src/events.cpp:66:16: warning: ‘int SHA256_Init(SHA256_CTX*)’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
   66 |     SHA256_Init(&sha256);
      |     ~~~~~~~~~~~^~~~~~~~~
In file included from src/events.h:3,
                 from src/events.cpp:1:
/usr/include/openssl/sha.h:73:27: note: declared here
   73 | OSSL_DEPRECATEDIN_3_0 int SHA256_Init(SHA256_CTX *c);
      |                           ^~~~~~~~~~~
src/events.cpp:67:18: warning: ‘int SHA256_Update(SHA256_CTX*, const void*, size_t)’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
   67 |     SHA256_Update(&sha256, encoded.data(), encoded.size());
      |     ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/events.h:3,
                 from src/events.cpp:1:
/usr/include/openssl/sha.h:74:27: note: declared here
   74 | OSSL_DEPRECATEDIN_3_0 int SHA256_Update(SHA256_CTX *c,
      |                           ^~~~~~~~~~~~~
src/events.cpp:68:17: warning: ‘int SHA256_Final(unsigned char*, SHA256_CTX*)’ is deprecated: Since OpenSSL 3.0 [-Wdeprecated-declarations]
   68 |     SHA256_Final(hash, &sha256);
      |     ~~~~~~~~~~~~^~~~~~~~~~~~~~~
In file included from src/events.h:3,
                 from src/events.cpp:1:
/usr/include/openssl/sha.h:76:27: note: declared here
   76 | OSSL_DEPRECATEDIN_3_0 int SHA256_Final(unsigned char *md, SHA256_CTX *c);
      |                           ^~~~~~~~~~~~
src/events.cpp: In function ‘bool verifySig(secp256k1_context*, std::string_view, std::string_view, std::string_view)’:
src/events.cpp:79:102: error: invalid conversion from ‘secp256k1_xonly_pubkey*’ to ‘size_t’ {aka ‘long unsigned int’} [-fpermissive]
   79 |     return secp256k1_schnorrsig_verify(ctx, (const uint8_t*)sig.data(), (const uint8_t*)hash.data(), &pubkeyParsed);
      |                                                                                                      ^~~~~~~~~~~~~
      |                                                                                                      |
      |                                                                                                      secp256k1_xonly_pubkey*
src/events.cpp:79:39: error: too few arguments to function ‘int secp256k1_schnorrsig_verify(const secp256k1_context*, const unsigned char*, const unsigned char*, size_t, const secp256k1_xonly_pubkey*)’
   79 |     return secp256k1_schnorrsig_verify(ctx, (const uint8_t*)sig.data(), (const uint8_t*)hash.data(), &pubkeyParsed);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/events.h:4,
                 from src/events.cpp:1:
/usr/include/secp256k1_schnorrsig.h:158:48: note: declared here
  158 | SECP256K1_API SECP256K1_WARN_UNUSED_RESULT int secp256k1_schnorrsig_verify(
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
make: *** [golpe/rules.mk:26: src/events.o] Error 1

Write policy applied to sync?

I created a whitelist policy, because I don't want spammers to be able to post using my instance.

But now when I run sync or stream those also fail to add anything because the events are not whitelisted.

So how do you make a policy that doesn't allow external users to use the relay, but still accepts incoming data from sync or stream?

Build error on FreeBSD

uWebSockets doesn't directly have kqueue support. We'll need to use its libuv wrapper, I believe.

Compilation error: could not convert ‘cfg().ConfigValues::relay__logging__dumpInAll’ from ‘const string’ to bool

$ uname -a
Linux relay-001 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ make
perl golpe/gen-config.pl
golpe/external/rasgueadb/rasgueadb-generate golpe.yaml build
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT golpe/logging.o -MF golpe/logging.d -c golpe/logging.cpp -o golpe/logging.o
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT build/main.o -MF build/main.d -c build/main.cpp -o build/main.o
g++ -std=c++20 -O0 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT build/config.o -MF build/config.d -c build/config.cpp -o build/config.o
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/RelayCron.o -MF src/RelayCron.d -c src/RelayCron.cpp -o src/RelayCron.o
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/quadrable/include -MMD -MP -MT src/RelayIngester.o -MF src/RelayIngester.d -c src/RelayIngester.cpp -o src/RelayIngester.o
src/RelayIngester.cpp: In member function ‘void RelayServer::runIngester(ThreadPool<MsgIngester>::Thread&)’:
src/RelayIngester.cpp:20:35: error: could not convert ‘cfg().ConfigValues::relay__logging__dumpInAll’ from ‘const string’ {aka ‘const std::__cxx11::basic_string<char>’} to ‘bool’
   20 |                         if (cfg().relay__logging__dumpInAll) LI << "[" << msg->connId << "] dumpInAll: " << msg->payload;
      |                             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
      |                                   |
      |                                   const string {aka const std::__cxx11::basic_string<char>}
src/RelayIngester.cpp:29:39: error: could not convert ‘cfg().ConfigValues::relay__logging__dumpInEvents’ from ‘const string’ {aka ‘const std::__cxx11::basic_string<char>’} to ‘bool’
   29 |                             if (cfg().relay__logging__dumpInEvents) LI << "[" << msg->connId << "] dumpInEvent: " << msg->payload;
      |                                 ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                       |
      |                                       const string {aka const std::__cxx11::basic_string<char>}
src/RelayIngester.cpp:38:39: error: could not convert ‘cfg().ConfigValues::relay__logging__dumpInReqs’ from ‘const string’ {aka ‘const std::__cxx11::basic_string<char>’} to ‘bool’
   38 |                             if (cfg().relay__logging__dumpInReqs) LI << "[" << msg->connId << "] dumpInReq: " << msg->payload;
      |                                 ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                       |
      |                                       const string {aka const std::__cxx11::basic_string<char>}
src/RelayIngester.cpp:46:39: error: could not convert ‘cfg().ConfigValues::relay__logging__dumpInReqs’ from ‘const string’ {aka ‘const std::__cxx11::basic_string<char>’} to ‘bool’
   46 |                             if (cfg().relay__logging__dumpInReqs) LI << "[" << msg->connId << "] dumpInReq: " << msg->payload;
      |                                 ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                       |
      |                                       const string {aka const std::__cxx11::basic_string<char>}
make: *** [golpe/rules.mk:26: src/RelayIngester.o] Error 1

(I've reported a similar error before; if the solution here is to update / change versions lmk, happy to just switch to e.g. whatever version of Debian strfry is being developed on. Cheers!)

My strfry.service (path to strfry.conf issue)

I read the home page (https://github.com/hoytech/strfry) and the deploy page for installation (https://github.com/hoytech/strfry/blob/master/docs/DEPLOYMENT.md). For me, the inferred methodology is slightly different between the two pages. So I made a user and installed under that user (which seems to be the indication from the home page), whereas on the deployment page you install as a no-login user.

Anyway, I am running Ubuntu 22.04 and for the life of me I could not get the relay to run as a service, only by logging in and executing ./strfry relay from the install directory.

After 4-5 hours of banging my head against the wall I found the problem: it was the path to the config file. This is how I got it to work. The user is "strfry" and the packages are installed in his home directory.

sudo nano /etc/systemd/system/strfry.service

[Unit]
Description=strfry relay service

[Service]
ExecStart=/home/strfry/strfry/strfry --config=/home/strfry/strfry/strfry.conf relay
LimitNOFILE=1000000
LimitNPROC=1000000

[Install]
WantedBy=multi-user.target

nip-11 pubkey as bech32 should be rejected or warned or well documented to avoid

On too many relays, people are putting npub keys into the 'pubkey' field of nip-11, such as wss://relay.mostr.pub/, wss://relay.current.fyi/, wss://nostr.fmt.wiz.biz/, wss://nostr.mom/ (some of those are nostream, I'll file a separate bug for that).

NIP-11 states:

An administrative contact may be listed with a pubkey, in the same format as Nostr events (32-byte hex for a secp256k1 public key)

Could you (1) validate this and error or warn, or (2) make the comments/documentation very clear on this point so people don't make this mistake?

The gossip client is using strict typing and fails to deserialize these NIP-11 JSON objects, meaning the client is not providing any interesting relay information to the user, and gossip is presuming these clients do not support NIP-11 and not utilizing them fully.

Of course I could accept npub, but I fall into the camp of people who believe we should hold the line and defend the standards lest they proliferate into too many defacto variants.

Build error on latest master

It seems that after pulling the latest master, quadrable.h can not be found.

Any dependency I missed? My previous build worked; I just wanted to update it.

Full output:

root@birb:/opt/strfry# git pull
remote: Enumerating objects: 153, done.
remote: Counting objects: 100% (153/153), done.
remote: Compressing objects: 100% (102/102), done.
remote: Total 153 (delta 96), reused 108 (delta 51), pack-reused 0
Receiving objects: 100% (153/153), 58.00 KiB | 322.00 KiB/s, done.
Resolving deltas: 100% (96/96), completed with 7 local objects.
From https://github.com/hoytech/strfry
   fb21e10..798522a  web        -> origin/web
Fetching submodule golpe
From https://github.com/hoytech/golpe
   99fa9be..c0367d6  master     -> origin/master
Fetching submodule golpe/external/uWebSockets
From https://github.com/hoytech/uWebSockets
   8670f28..1e0fda7  master     -> origin/master
Already up to date.
root@birb:/opt/strfry# git submodule update
root@birb:/opt/strfry# git submodule ^C
root@birb:/opt/strfry# git submodule update --recursive --remote
warning: unable to rmdir 'external/quadrable': Directory not empty
Submodule path 'golpe': checked out 'c0367d6554bad33cecbf46763d0a1934891ff737'
Submodule path 'golpe/external/PEGTL': checked out 'eac50a85e3fd1ee3623cfa150eed457aa61f7b9e'
Submodule path 'golpe/external/config': checked out '5e726d1442beb225789ed0889e8aa4fbc75bea7a'
Submodule path 'golpe/external/docopt.cpp': checked out '400e6dd8e59196c914dcc2c56caf7dae7efa5eb3'
Submodule path 'golpe/external/hoytech-cpp': checked out '121eb4252d38e3a2c3359aee3329d1d4cdf4f512'
Submodule path 'golpe/external/json': checked out '330129305f15fbfba5e0716db25e245f0b4d8b0f'
Submodule path 'golpe/external/loguru': checked out '4adaa185883e3c04da25913579c451d3c32cfac1'
Submodule path 'golpe/external/parallel-hashmap': checked out '79cbd2dafd5aab3829064d1b48b71137623d8ff2'
Submodule path 'golpe/external/rasgueadb': checked out 'cb5631f0fa05622282b6c127b5817df1315254d2'
Submodule path 'golpe/external/uWebSockets': checked out '1e0fda756ad2b64a3e71428ad18cb75f1832781b'
root@birb:/opt/strfry# make setup-golpe
cd golpe && git submodule update --init
Submodule 'external/templar' (https://github.com/hoytech/templar.git) registered for path 'external/templar'
Cloning into '/opt/strfry/golpe/external/templar'...
Submodule path 'external/PEGTL': checked out '9afe8a71920b9dadf309a503d734143e1ff78b3e'
Submodule path 'external/config': checked out 'ab8c38a2d00e58dd004fd71da7f0e70749993fc1'
Submodule path 'external/docopt.cpp': checked out '6f5de76970be94a6f1e4556d1716593100e285d2'
Submodule path 'external/json': checked out 'd73d01389660084a8dbedd44eb674da57f26aba6'
Submodule path 'external/loguru': checked out '644f60dca77de3b0f718a03d370c8ebdf5f97968'
Submodule path 'external/parallel-hashmap': checked out '87ece91c6e4c457c5faac179dae6e11e2cd39b16'
Submodule path 'external/templar': checked out '13961f0c0bff435f045cf62864f2ef4c6f2730cc'
root@birb:/opt/strfry# make -j2
perl golpe/gen-fbs.pl
perl golpe/gen-config.pl
perl golpe/gen-golpe.h.pl
perl golpe/gen-main.cpp.pl
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/lmdbxx/include -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/parallel-hashmap -MMD -MP -MT golpe/logging.o -MF golpe/logging.d -c golpe/logging.cpp -o golpe/logging.o
g++ -std=c++20 -O3 -g -Wall -fPIC  -DDOCOPT_HEADER_ONLY -Iinclude -Ibuild -Isrc -Igolpe/external -Igolpe/external/lmdbxx/include -Igolpe/external/config/include -Igolpe/external/json/include -Igolpe/external/PEGTL/include -Igolpe/external/hoytech-cpp -Igolpe/external/docopt.cpp -Igolpe/external/loguru -Igolpe/external/parallel-hashmap -MMD -MP -MT build/main.o -MF build/main.d -c build/main.cpp -o build/main.o
In file included from build/golpe.h:64,
                 from golpe/logging.cpp:1:
src/global.h:9:10: fatal error: quadrable.h: No such file or directory
    9 | #include <quadrable.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
In file included from build/golpe.h:64,
                 from build/main.cpp:13:
src/global.h:9:10: fatal error: quadrable.h: No such file or directory
    9 | #include <quadrable.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
make: *** [golpe/rules.mk:31: golpe/logging.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [golpe/rules.mk:31: build/main.o] Error 1

Yay spammers arrived 🎉

Need to think seriously about spam.

Basic rate limiting, but also allowing bursts of events/msgs. A longer time window with a bigger limit can achieve this.

Instead of 3/sec maybe 20/min.

If nobody is following the guy, he should get more penalties, or more rate limiting. If the pubkey is new, more rate limiting. It quickly turns into a statistics problem which bloats the relay, but these are some of the easiest things that can be implemented, I think.

Another suggestion is incremental PoW. The relay requires more and more PoW from an IPv4 address when it finds it is spamming. IPv6 is harder to control, I think, because it is cheaper. I don't know if there is a NIP for this. When a relay rejects the spam, the client tries harder to find more PoW and resubmits it.

Similar messages can also be slowed down even though they come from different IPs. The current spammer sends 'similar' good morning messages constantly. I don't think it is using different IPs though; this is more for future-proofing.

Implement nip 42 client authentication

Please implement client auth as defined in nip 42. This is also needed for #17.

I am primarily interested in my relay requiring auth at some point. Only serve REQ from

  1. paying users
  2. their follows
  3. their follows follows
  4. everybody else. (no auth would count as everybody else)

But ... as clients don't support NIP-42 yet, the very first step is to let users auth without caring which key they auth with, and maybe notify them that auth failed (without consequences), to get client devs moving on this topic.

As mentioned in the comments below, full nip-42 support could also mean support for sync and stream.

  • strfry relay supports nip42
  • strfry stream supports nip42
  • strfry sync supports nip42

Track and limit by proximity to group of users

I want to run a relay financed by a tiny percentage of its users and strongly believe in the following being a way to align incentives for all clients and relay operators:

  • Implement authentication. The relay only processes REQs and EVENTs from clients linked to pubkeys
  • Measure resource use per pubkey: milliseconds spent on queries, query count, events sent, event kBs sent, etc.
  • Define group of primary users (how this works is independent of this issue but in my case it might be people I follow or people that pay $x/month).
  • Secondary users are follows of primary users etc.
  • Define limits depending on follows distance to primary users (0, 1, 2, 3, 4, 5, >5 for example)

With this in place, the relay can reject events from "distance >5" with links, longer than 50 chars, other than kind-1 or kind-4, ... It can delete events if no follows were achieved within 7 days, ...

On the other hand, the relay can treat "distance 1" and "distance 2" users almost as primary users and manually deal with primary users that actively follow spam accounts.

With users coming from Twitter for example quickly gaining follows from their Twitter followers, their on-boarding would only be mildly affected by this spam mitigation while the ">5" limits could be designed to gain followers without spamming.

With admission to "primary users" group carrying a cost, spam moderation would be paid for but for personal relays, the relay operator can choose to be the primary user with his follows being privileged and sponsored by him, too.

I was tempted to lay the above out in reply to #9, but it's a feature request independent of other spam mitigations, and I would offer a $500 bounty for its implementation.

REQ plugin

I haven't fully thought this through yet, but I sometimes wish I could hook into client requests. Like, to be able to get ["REQ", ...] and potentially change the filter or response, or trigger side-effects. Maybe AUTH could even be implemented in a req plugin.

I still don't know if this is even a good idea yet, but I figured I'd drop it in case it strikes anyone else who might be having similar ideas.

Whitelist / Blacklist pubkeys

Is there, or could there be, a way to configure strfry for whitelisting or blacklisting pubkeys allowing or disallowing posting to the instance?

Export fails

After running 'stream's for hours I tried to do an export:

$ ./strfry export | wc

strfry error: couldn't find leaf node in quadrable, corrupted DB?
2023-01-11 15:40:33.138 ( 0.064s) [main thread ]INFO| atexit

3391 22061 1768370

If a Japanese character is set as the subscriber ID and a REQ message is sent, relay will return "NOTICE".

Describe the bug
"If a Japanese character is set as the subscriber ID and a REQ message is sent from the client to the relay, it will return ["NOTICE","ERROR: bad req: invalid character in subscription id"]."

To Reproduce
For example, sending ["REQ", "日本語のサブスクライバーID", {"kinds": 1, "limit": 10}] from the client to the relay.

Expected behavior
It is possible to use Japanese characters(and other language too) as the subscriber ID.

Additional context
For <subscription_id>, the following rule is defined in NIP-01: "<subscription_id> is an arbitrary, non-empty string of max length 64 chars, that should be used to represent a subscription."
The specification for characters is not provided.

Error running sync

I just migrated nostr.lu.ke to strfry. I have my old nostream instance exposed at nostream.lu.ke. I tried to migrate the DB with ./strfry sync wss://nostream.lu.ke but I'm getting:

date       time         ( uptime  ) [ thread name/id ]   v| 
2023-03-27 09:35:19.848 (   0.031s) [main thread     ]INFO| arguments: ./strfry sync wss://nostream.lu.ke
2023-03-27 09:35:19.848 (   0.031s) [main thread     ]INFO| Current dir: /app
2023-03-27 09:35:19.848 (   0.031s) [main thread     ]INFO| stderr verbosity: 0
2023-03-27 09:35:19.848 (   0.031s) [main thread     ]INFO| -----------------------------------
2023-03-27 09:35:19.848 (   0.031s) [main thread     ]INFO| CONFIG: Loading config from file: /etc/strfry.conf
2023-03-27 09:35:19.857 (   0.040s) [main thread     ]INFO| CONFIG: successfully installed
2023-03-27 09:35:19.906 (   0.089s) [main thread     ]INFO| Filter matched 34213 local events
terminate called without an active exception

Loguru caught a signal: SIGABRT
Stack trace:
13      0x56229354aca5 ./strfry(+0x2eca5) [0x56229354aca5]
12      0x7face7b2fe40 __libc_start_main + 128
11      0x7face7b2fd90 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7face7b2fd90]
10      0x56229354a720 ./strfry(+0x2e720) [0x56229354a720]
9       0x56229356b83a ./strfry(+0x4f83a) [0x56229356b83a]
8       0x562293546e44 ./strfry(+0x2ae44) [0x562293546e44]
7       0x562293849554 ./strfry(+0x32d554) [0x562293849554]
6       0x7face7ee32b7 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae2b7) [0x7face7ee32b7]
5       0x7face7ee324c /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae24c) [0x7face7ee324c]
4       0x7face7ed7bbe /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2bbe) [0x7face7ed7bbe]
3       0x7face7b2e7f3 abort + 211
2       0x7face7b48476 raise + 22
1       0x7face7b9ca7c pthread_kill + 300
0       0x7face7b48520 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7face7b48520]
2023-03-27 09:35:19.907 (   0.090s) [main thread     ]FATL| Signal: SIGABRT
Aborted (core dumped)

I'm running e5ec135 built from the provided Dockerfile. Any ideas?

Detect which strfry instance is accessing the policy plugin

Finally found a potential use-case for extending the policy plugin input message: figuring out which strfry instance originated the message (when multiple strfry instances use the same plugin on a shared machine).

Inspired by this thread: https://gitlab.com/soapbox-pub/strfry-policies/-/issues/4#note_1426085367

cc @Giszmo

btw, not sure it's even a good idea yet or the best solution to that problem, just want to brainstorm. strfry doesn't even know its own relay URL anyway, right? just the port it's running on.

stream ws://SERVER_IP:PORT --dir up makes strfry not receive events

Running strfry in a Docker container: with "relay" it works well, but with "stream" it doesn't receive events.

console with stream:

date       time         ( uptime  ) [ thread name/id ]   v|
2023-03-04 03:19:09.991 (   0.016s) [main thread     ]INFO| arguments: /app/strfry stream ws://SERVER_IP:PORT --dir up
2023-03-04 03:19:09.991 (   0.016s) [main thread     ]INFO| Current dir: /app
2023-03-04 03:19:09.991 (   0.016s) [main thread     ]INFO| stderr verbosity: 0
2023-03-04 03:19:09.991 (   0.016s) [main thread     ]INFO| -----------------------------------
2023-03-04 03:19:09.991 (   0.016s) [main thread     ]INFO| CONFIG: Loading config from file: /etc/strfry.conf
2023-03-04 03:19:09.996 (   0.020s) [main thread     ]INFO| CONFIG: successfully installed
2023-03-04 03:19:09.997 (   0.021s) [main thread     ]INFO| Attempting to connect to ws://SERVER_IP:PORT
2023-03-04 03:19:09.998 (   0.022s) [main thread     ]INFO| Connected to SERVER_IP

Client (noscl) says: error opening websocket to ws://STRFRY_SERVER_IP:PORT read tcp balabala read: connection reset by peer.

Docker support

Please add a Dockerfile and a sample docker-compose file.

subscribe with prefixed filter ids (nip-01)

I'd like to subscribe to event ids starting with "0" as described in NIP-01, but it seems no relays support that yet. Reason: some clients support NIP-13 PoW and add a nonce tag, but filtering by nonce on the client side still means that all events are subscribed to, which results in a lot of bandwidth usage. Therefore I'd like to subscribe only to events whose ids start with '0' or '00'.

"ids": <a list of event ids or prefixes>
The ids and authors lists contain lowercase hexadecimal strings, which may either be an exact 64-character match, or a prefix of the event value. A prefix match is when the filter string is an exact string prefix of the event value. The use of prefixes allows for more compact filters where a large number of values are queried, and can provide some privacy for clients that may not want to disclose the exact authors or events they are searching for.

https://github.com/nostr-protocol/nips/blob/master/01.md

example: filter: { ids: ['0'] }

Core dump when running multiple instances in stream mode

I am collecting events from relays by running multiple instances of strfry in stream mode, and this happens every few hours or so.

terminate called after throwing an instance of 'std::runtime_error'
what(): duplicate insert into Event

Loguru caught a signal: SIGABRT
Stack trace:
13 0x7f3c179f0a00 /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f3c179f0a00]
12 0x7f3c1795eb43 /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f3c1795eb43]
11 0x7f3c17cd52b3 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc2b3) [0x7f3c17cd52b3]
10 0x5639ff8e37c3 ./strfry(+0x2b97c3) [0x5639ff8e37c3]
9 0x5639ff8e3110 ./strfry(+0x2b9110) [0x5639ff8e3110]
8 0x5639ff64d76d ./strfry(+0x2376d) [0x5639ff64d76d]
7 0x7f3c17ca7518 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae518) [0x7f3c17ca7518]
6 0x7f3c17ca72b7 /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae2b7) [0x7f3c17ca72b7]
5 0x7f3c17ca724c /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae24c) [0x7f3c17ca724c]
4 0x7f3c17c9bbbe /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2bbe) [0x7f3c17c9bbbe]
3 0x7f3c178f27f3 abort + 211
2 0x7f3c1790c476 raise + 22
1 0x7f3c17960a7c pthread_kill + 300
0 0x7f3c1790c520 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f3c1790c520]
2023-01-11 08:15:48.106 (2499.091s) [Writer ]FATL| Signal: SIGABRT
Aborted (core dumped)

"bad event id" on actually valid object.

Long story short:

Jun 15 00:19:28 birb strfry[1775541]: 2023-06-15 00:19:28.688 ( 146.073s) [Ingester 1      ]INFO| [1] dumpInEvent: ["EVENT",{"id":"0bc534823aa6c247766d598ad42cc0726911b183145340887a26e3bccd2cf28c","pubkey":"5e336907a3dda5cd58f11d162d8a4c9388f9cfb2f8dc4b469c8151e379c63bc9","created_at":1686759536,"kind":7,"content":"+","tags":[["e","caeb5711dc51a534b1b599c4b34820262003d7e8367f91dff51d84dc07021672"],["p","89d1ce9164f1f172daaa9c784153178cb1dec7912bf55f5dc07e0f1dabe40e6c"]],"sig":"9b1e2d07682f8e14658c78124537ae2bd9d4851cc1c988075ef043071200b6f0cd167bc35661511dd4da292ac3b6148c976d12be694aae2819f7fdc5b85832dc"}]
Jun 15 00:19:28 birb strfry[1775541]: 2023-06-15 00:19:28.688 ( 146.073s) [Ingester 1      ]INFO| Rejected invalid event: bad event id

But the ID looks perfectly fine.

Anything I missed or can do?

Config:

db = "/srv/strfry/db/"
dbParams {
    maxreaders = 256
    mapsize = 10995116277760
}
relay {
    bind = "127.0.0.1"
    port = 7777
    nofiles = 0
    realIpHeader = "x-forwarded-for"
    info {
        name = "Just a relay"
        description = "Just a relay I host for fun."
        pubkey = "npub1tcekjparmkju6k83r5tzmzjvjwy0nnajlrwyk35us9g7x7wx80ys9hjmky"
        contact = "@ingwiephoenix:ingwie.me"
    }
    maxWebsocketPayloadSize = 131072
    autoPingSeconds = 55
    enableTcpKeepalive = true
    queryTimesliceBudgetMicroseconds = 10000
    maxFilterLimit = 500
    maxSubsPerConnection = 20
    writePolicy {
        plugin = ""
        lookbackSeconds = 0
    }
    compression {
        enabled = true
        slidingWindow = true
    }
    logging {
        dumpInAll = false
        dumpInEvents = true # <- I changed this to see the logs.
        dumpInReqs = false
        dbScanPerf = false
    }
    numThreads {
        ingester = 3
        reqWorker = 3
        reqMonitor = 3
        yesstr = 1
    }
}
events {
    maxEventSize = 65536
    rejectEventsNewerThanSeconds = 900
    rejectEventsOlderThanSeconds = 94608000
    rejectEphemeralEventsOlderThanSeconds = 60
    ephemeralEventsLifetimeSeconds = 300
    maxNumTags = 2000
    maxTagValSize = 1024
}

I am using Caddy as a reverse proxy to handle TLS/SSL. Strfry is running on Ubuntu 22.04 arm64.

Anything I can do to fix that?
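
One way to sanity-check this is to recompute the event id per NIP-01, which defines the id as the sha256 of the serialized array [0, pubkey, created_at, kind, tags, content], and compare it with the claimed id. A minimal sketch, assuming the bare event JSON on stdin (escaping edge cases in content may differ slightly from strfry's serializer):

#!/usr/bin/env python3
# Sketch: recompute a nostr event id per NIP-01 and compare with the claimed id.
import hashlib
import json
import sys

event = json.loads(sys.stdin.read())  # the bare event object, not the ["EVENT", ...] envelope
serialized = json.dumps(
    [0, event["pubkey"], event["created_at"], event["kind"], event["tags"], event["content"]],
    separators=(",", ":"),
    ensure_ascii=False,
)
computed = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
print("claimed: ", event["id"])
print("computed:", computed)
print("match" if computed == event["id"] else "MISMATCH")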

Parser for x-forwarded-for does not work with IPv6 or multiple proxies

I have set up haproxy (SSL termination) ---> nginx ---> strfry.
The x-forwarded-for header therefore contains all proxies along the way, but the parser is not happy with it. I don't know if that's because of the multiple IPs or because the client uses IPv6.

Websocket ]WARN| Couldn't parse IP from header x-forwarded-for: 2a00:abcd:401a:abcd:ca43:f3a7:c615:aaaa, 192.168.0.1
Websocket ]WARN| Couldn't parse IP from header x-forwarded-for: ffff:123.83.123.203, 192.168.0.1
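
For reference, the usual convention is that X-Forwarded-For lists the original client first, followed by each intermediate proxy, and the entries may be IPv4 or IPv6. A rough sketch of parsing that would cope with headers like the ones above (not strfry's actual parser):

#!/usr/bin/env python3
# Sketch: take the leftmost X-Forwarded-For entry as the client address,
# tolerating multiple proxies and IPv6 addresses.
import ipaddress

def client_ip(xff: str) -> str:
    first = xff.split(",")[0].strip()
    return str(ipaddress.ip_address(first))  # raises ValueError if it isn't an IP

print(client_ip("2a00:abcd:401a:abcd:ca43:f3a7:c615:aaaa, 192.168.0.1"))
print(client_ip("203.0.113.7, 192.168.0.1"))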

Using a Reverse Proxy (e.g. NGINX)

The README mentions that details are coming soon on how the NGINX configuration should look. However, #50 makes me wonder whether some special settings need to be enabled to make this fully work.

Can I use a standard "101" NGINX proxy configuration, as used by most services?
How can I make sure, i.e. test, that the service works fine and all functionality is working?

I suggest the README gets a section about NGINX.
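
For what it's worth, a minimal sketch of what such a configuration could look like, assuming strfry listening on its default 127.0.0.1:7777 with TLS terminated by nginx (the hostname is hypothetical; the WebSocket upgrade headers and a long read timeout are the parts that matter most):

server {
    listen 443 ssl;
    server_name relay.example.com;   # hypothetical hostname

    # ssl_certificate / ssl_certificate_key ... (your usual TLS setup)

    location / {
        proxy_pass http://127.0.0.1:7777;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;       # required for WebSocket upgrades
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 3600s;                     # keep long-lived subscriptions open
    }
}

Passing X-Forwarded-For also matters if realIpHeader is set, as in the config shown earlier.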

strfry stream segfaults

I have a flaky internet connection and the process ended like this after hundreds of reconnect attempts:

2023-06-11 10:59:12.972 (52195.498s) [main thread     ]INFO| Attempting to connect to wss://relay.damus.io
2023-06-11 10:59:12.972 (52195.498s) [main thread     ]INFO| Websocket connection error
2023-06-11 10:59:17.972 (52200.498s) [main thread     ]INFO| Attempting to connect to wss://relay.damus.io
2023-06-11 10:59:17.973 (52200.498s) [main thread     ]INFO| Websocket connection error
2023-06-11 10:59:22.973 (52205.498s) [main thread     ]INFO| Attempting to connect to wss://relay.damus.io
Segmentation fault

Delete DMs

Sorry, I'm not sure whether this is related to strfry. I'd like to delete my broken DM, but AFAICS strfry doesn't apply kind-5 deletion events to DMs (kind 4). Could you please support deletion for DMs?

["NOTICE","ERROR: bad req: std::get: wrong index for variant"]

A few days ago I used strfry export to dump events, set up a new strfry server, and then used a combination of strfry import and strfry sync to try to get (most of?) the old events back over.

Everything was working until yesterday when things started to go bad. REQs would hang until I restarted strfry.

I was finally able to get a specific error for a REQ:

➜ echo '["REQ", "neinoienoei", {"authors": "79c2cae114ea28a981e7559b4fe7854a473521a8d22a66bbab9fa248eb820ff6"}]' | nostcat -s wss://relay.mostr.pub
["NOTICE","ERROR: bad req: std::get: wrong index for variant"]

I might just nuke this database.
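
Note that in the request above, "authors" is given as a string, whereas NIP-01 defines it as a list of pubkeys; that mismatch alone can produce this kind of variant error regardless of the DB's state. The same request with the filter corrected would be:

echo '["REQ", "neinoienoei", {"authors": ["79c2cae114ea28a981e7559b4fe7854a473521a8d22a66bbab9fa248eb820ff6"]}]' | nostcat -s wss://relay.mostr.pub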

golpe uses private git path to loguru

$ make setup-golpe
cd golpe && git submodule update --init
Submodule 'external/PEGTL' (https://github.com/taocpp/PEGTL.git) registered for path 'external/PEGTL'
Submodule 'external/config' (https://github.com/taocpp/config.git) registered for path 'external/config'
Submodule 'external/docopt.cpp' (https://github.com/docopt/docopt.cpp.git) registered for path 'external/docopt.cpp'
Submodule 'external/hoytech-cpp' (https://github.com/hoytech/hoytech-cpp.git) registered for path 'external/hoytech-cpp'
Submodule 'external/json' (https://github.com/taocpp/json.git) registered for path 'external/json'
Submodule 'external/lmdbxx' (https://github.com/hoytech/lmdbxx.git) registered for path 'external/lmdbxx'
Submodule 'external/loguru' (git@github.com:emilk/loguru.git) registered for path 'external/loguru'
Submodule 'external/quadrable' (https://github.com/hoytech/quadrable.git) registered for path 'external/quadrable'
Submodule 'external/rasgueadb' (https://github.com/hoytech/rasgueadb.git) registered for path 'external/rasgueadb'
Submodule 'external/uWebSockets' (https://github.com/hoytech/uWebSockets.git) registered for path 'external/uWebSockets'
Cloning into '/home/e24/strfry/golpe/external/PEGTL'...
Cloning into '/home/e24/strfry/golpe/external/config'...
Cloning into '/home/e24/strfry/golpe/external/docopt.cpp'...
Cloning into '/home/e24/strfry/golpe/external/hoytech-cpp'...
Cloning into '/home/e24/strfry/golpe/external/json'...
Cloning into '/home/e24/strfry/golpe/external/lmdbxx'...
Cloning into '/home/e24/strfry/golpe/external/loguru'...
The authenticity of host 'github.com (140.82.121.4)' can't be established.
ECDSA key fingerprint is SHA256:p2QAMXNIC1TJYWeIOttrVc98/R1BUFWu3/LiyKgUfQM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com,140.82.121.4' (ECDSA) to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:emilk/loguru.git' into submodule path '/home/e24/strfry/golpe/external/loguru' failed
Failed to clone 'external/loguru'. Retry scheduled
Cloning into '/home/e24/strfry/golpe/external/quadrable'...
Cloning into '/home/e24/strfry/golpe/external/rasgueadb'...
Cloning into '/home/e24/strfry/golpe/external/uWebSockets'...
Cloning into '/home/e24/strfry/golpe/external/loguru'...
Warning: Permanently added the ECDSA host key for IP address '140.82.121.3' to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:emilk/loguru.git' into submodule path '/home/e24/strfry/golpe/external/loguru' failed
Failed to clone 'external/loguru' a second time, aborting
make: *** [golpe/rules.mk:54: setup-golpe] Error 1

I had to change loguru's url to a public one:

cd golpe
git submodule set-url external/loguru https://github.com/emilk/loguru.git
git submodule update
cd ..
make setup-golpe
