zeromq / zbroker: Elastic pipes
License: Mozilla Public License 2.0
Symptom: server does not get a valid Close command and so does not clean up the pipe.
Solution: configurable ping-pong heartbeating. Eventually, this should be done at the ZMTP (libzmq / libzmtp) level.
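A minimal sketch of the bookkeeping such heartbeating might need, assuming millisecond timestamps supplied by the caller. The `heartbeat_t` type and every function name here are hypothetical and not part of zbroker; the sketch covers only the timing logic, not the PING/PONG wire commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical liveness tracker; all names are illustrative only.
typedef struct {
    int64_t interval_ms;    // send a PING after this much silence
    int64_t timeout_ms;     // presume the peer dead after this much silence
    int64_t last_recv_ms;   // when we last heard anything from the peer
    int64_t last_ping_ms;   // when we last sent a PING
} heartbeat_t;

void heartbeat_init(heartbeat_t *self, int64_t now_ms,
                    int64_t interval_ms, int64_t timeout_ms)
{
    assert(self != NULL);
    self->interval_ms = interval_ms;
    self->timeout_ms = timeout_ms;
    self->last_recv_ms = now_ms;
    self->last_ping_ms = now_ms;
}

// Any traffic from the peer (data or PONG) proves it is alive.
void heartbeat_on_recv(heartbeat_t *self, int64_t now_ms)
{
    assert(self != NULL);
    self->last_recv_ms = now_ms;
}

// Returns true, and records the ping, if a PING should be sent now.
bool heartbeat_ping_due(heartbeat_t *self, int64_t now_ms)
{
    assert(self != NULL);
    if (now_ms - self->last_ping_ms >= self->interval_ms
    &&  now_ms - self->last_recv_ms >= self->interval_ms) {
        self->last_ping_ms = now_ms;
        return true;
    }
    return false;
}

// Returns true if the peer has been silent for the full timeout,
// at which point the server could clean up the pipe as if it had
// received a Close command.
bool heartbeat_expired(const heartbeat_t *self, int64_t now_ms)
{
    assert(self != NULL);
    return now_ms - self->last_recv_ms >= self->timeout_ms;
}
```

Both the interval and the timeout are parameters, which is one way to read "configurable" here.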
Right now, zpipes implements an infinite pipe. But in order to implement true named pipe semantics over zpipes, it is necessary that the client can detect remote writer close and return a zero byte read.
Solution: write command line zbroker.c that instantiates zpipes_server.
Eventually this will be configurable and daemonable.
Solution: at least information, warning, errors
Solution: define these semantics.
One possibility is a loosely-connected pub-sub model that allows N publishers to talk to N subscribers, using the pipe as sync point. Thus:
Solution: detect changed broker config file and reload automatically.
Michael Haberler reports on zeromq-dev,
I notice zlog.c was added to czmq recently
in the machinekit project we used syslog from a cyclic process which at times generates log messages in short succession, but should not block on syslog()
we had all sorts of problems with blocking, funny hangs in futex calls and whatnot with syslog
we got rid of them for good by replacing syslog by syslog_async :
http://thekelleys.org.uk/syslog-async/READ-ME
license is GPL2/GPL3 only
see also: http://lists.sip-router.org/pipermail/users/2008-October/020101.html
We can't use syslog_async; however, the protocol is straightforward, so:
Solution: implement our own syslog client using ZMQ_STREAM socket.
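As a rough illustration of how small the formatting side could be, here is a sketch that builds one message line in the classic BSD syslog layout (RFC 3164), where the priority value is facility * 8 + severity. The function name `syslog_format` is hypothetical, the choice of RFC 3164 framing is an assumption, and sending the line over a ZMQ_STREAM socket is not shown.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: writes "<PRI>TIMESTAMP HOSTNAME TAG: MESSAGE"
// into out. Returns the line length, or -1 if the arguments are out
// of range or the line does not fit.
int syslog_format(char *out, size_t out_size,
                  int facility, int severity,
                  const char *timestamp, const char *hostname,
                  const char *tag, const char *message)
{
    assert(out != NULL && out_size > 0);
    // RFC 3164 defines facilities 0..23 and severities 0..7
    if (facility < 0 || facility > 23 || severity < 0 || severity > 7)
        return -1;

    int priority = facility * 8 + severity;
    int length = snprintf(out, out_size, "<%d>%s %s %s: %s",
                          priority, timestamp, hostname, tag, message);
    if (length < 0 || (size_t) length >= out_size)
        return -1;              // encoding error or truncated output
    return length;
}
```

Because the caller only formats into a buffer and hands it to a socket, nothing here can block the way a direct syslog() call might.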
In hintjens@6d8a106 I pushed a new client stack for zpipes, which uses libzmtp.
The selftest fails in clients/zpipes_client.c:72:
int rc = zmtp_dealer_ipc_connect (self->dealer, endpoint);
assert (rc == 0);
Solution: toggle animation via runtime configuration switch
in zpipes_client.c, zpipes_client_read returns 0 ambiguously.
zpipes_msg_recv returns NULL on both malloc error and on EINTR (blowing the entire message frame, it looks like). Returning a 0 here looks like a graceful pipe shutdown, and should return an error (-1 with errno set to EIO? It's probably not retryable, so EINTR is probably out.)
Same with a timeout condition -- this should return an error as well, so as to distinguish itself from a graceful pipe close.
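One way to make the return value unambiguous is sketched below: 0 is reserved for a graceful close, and everything else that is not data becomes -1 with errno set. The `read_outcome_t` enum and `read_result` function are hypothetical, and the errno choices (EIO for a failed receive, EAGAIN for a timeout) are assumptions rather than settled API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>      // ssize_t

// Hypothetical classification of what happened to a read request.
typedef enum {
    READ_DATA,              // a chunk of nbytes arrived
    READ_PIPE_CLOSED,       // the writer closed the pipe gracefully
    READ_TIMEOUT,           // no reply within the timeout
    READ_FAILED             // receive returned NULL (malloc error, EINTR)
} read_outcome_t;

// Maps an outcome to a read(2)-style result. Only a graceful close
// yields 0, so callers can tell end-of-pipe apart from errors.
ssize_t read_result(read_outcome_t outcome, size_t nbytes)
{
    switch (outcome) {
        case READ_DATA:
            return (ssize_t) nbytes;
        case READ_PIPE_CLOSED:
            return 0;
        case READ_TIMEOUT:
            errno = EAGAIN;     // assumption: caller may retry later
            return -1;
        case READ_FAILED:
            break;
    }
    errno = EIO;                // assumption: not retryable
    return -1;
}
```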
Over time it's going to be annoying to have a stand-alone broker, in terms of installing and configuring and managing. I'd like to start a catch-all zbroker project and make zpipes one of the transports (the first one).
Empty project is at: https://github.com/zeromq/zbroker
open read pipe
open corresponding write pipe
close read pipe
close write pipe
open read pipe (same pipe name as initial open)
segv
Named pipe semantics are to flush the pipe contents when both reader and writer disconnect. I think that's probably the expected behavior here. It's very surprising to connect to an "old" pipe and find data in it.
Solution: a writer should block (with optional timeout) if there are no readers.
Solution: if syslog is available, use it as per configuration files
Solution: add pipe name, if known, to all log data
Not sure how this goes wrong, but...
open reader and writer
write small amount of data
attempt to read a large amount of data (blocks, too small)
close write handle
At this point, the expected behavior is that the reader returns the data in the buffer, and a subsequent read returns 0. Instead, the reader gets dropped, the read request blocks, and the pipe doesn't terminate.
I'll post traces in a bit
s_expect_reply asserts on zpipes_msg_id(reply) == message_id. This likely should instead tear down the pipe and allow the calling read/write function the ability to pass back an error condition.
If the client is blocked in a FETCH, there is no way to cleanly close the zpipe.
Not sure how you actually wanted it formatted. I have a fix that approaches the config sample as supplied here:
https://github.com/rpedde/zproto/compare/zeromq:master...master
But I'm not sure if that's what you were actually shooting for.
Symptom: when doing a blocked read and EINTR occurs, the server state was stuck in expecting chunk, and wouldn't accept a close command.
Duplicate of issue #26, I think.
One symptom is that data will be discarded if a client doesn't provide the right size of buffer.
Solution: expose a classic stream API, and map internally to chunks both on sending and receiving.
Reliable segv:
14-05-01 18:02:48 I: joining cluster as DD561851F4DAA8B14004F9CF9F3C0CB5
14-05-01 18:02:48 N: starting zpipes_server service
14-05-01 18:02:48 N: binding zpipes service to 'ipc://@/zpipes/local'
14-05-01 18:02:48 I: ZPIPES server appeared at 616A81862091A61E1A61C70EC4C77DC9
14-05-01 18:02:53 825: start:
14-05-01 18:02:53 825: INPUT
14-05-01 18:02:53 825: $ lookup or create pipe
14-05-01 18:02:53 825: $ open pipe reader
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff55b3700 (LWP 21517)]
0x00007ffff7bc5dcc in s_client_execute (self=0xffffffffffffffff, event=10)
at zpipes_server_engine.h:521
521 self->next_event = event;
(gdb) bt
#0 0x00007ffff7bc5dcc in s_client_execute (self=0xffffffffffffffff, event=10)
at zpipes_server_engine.h:521
#1 0x00007ffff7bc532c in engine_send_event (client=0xffffffffffffffff,
event=have_reader_event) at zpipes_server_engine.h:340
#2 0x00007ffff7bcb353 in pipe_attach_local_reader (self=0x7fffec001510,
reader=0x7fffec001890) at zpipes_server.c:220
#3 0x00007ffff7bcc44a in open_pipe_reader (self=0x7fffec001890) at zpipes_server.c:481
#4 0x00007ffff7bc60ff in s_client_execute (self=0x7fffec001890, event=2)
at zpipes_server_engine.h:562
#5 0x00007ffff7bca95e in s_server_client_message (loop=0x60fcd0, item=0x61f3b0,
argument=0x60ecd0) at zpipes_server_engine.h:1447
#6 0x00007ffff776dc6c in zloop_start (self=0x60fcd0) at zloop.c:463
#7 0x00007ffff7bcaac7 in s_server_task (args=0x0, ctx=0x608ec0, pipe=0x6091f0)
at zpipes_server_engine.h:1465
#8 0x00007ffff777915f in s_thread_shim (args=0x608e90) at zthread.c:81
#9 0x00007ffff6ce1b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007ffff72550ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb)
Solution: use zbeacon or zyre for local area discovery using UDP broadcasts
When adding a library for the thin client, we'll get 3 different libraries, which seems too complex.
Solution: classic ZeroMQ-based client & server can go into same libzbroker library. Thin libzmtp based client can go into a separate libzbrokercli client library.
It would be nice to be able to change animate logging via configuration rather than compilation.
Honoring SIGHUP for config re-read would be useful as well.
In some kind of far-future super utopian world, a command-line client that talked the management protocol to push a single setting into a running server would be great. :)
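For the SIGHUP part, a common POSIX pattern is to have the handler set a flag and let the main loop do the actual re-read. The sketch below assumes that pattern; `reload_install_handler` and `reload_take_request` are hypothetical names, and the config reload itself is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Set from the signal handler, cleared by the main loop.
static volatile sig_atomic_t s_reload_requested = 0;

static void s_on_sighup(int signum)
{
    (void) signum;
    s_reload_requested = 1;     // only async-signal-safe work here
}

// Installs the SIGHUP handler; returns 0 on success, -1 on failure.
int reload_install_handler(void)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = s_on_sighup;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGHUP, &action, NULL);
}

// Returns true once per received SIGHUP (or burst of them); the
// caller would re-read its config file when this returns true.
bool reload_take_request(void)
{
    if (s_reload_requested) {
        s_reload_requested = 0;
        return true;
    }
    return false;
}
```

The main loop would call `reload_take_request` on each iteration, so the same check could later be reused for switching animation on and off at runtime.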
Animate logging would be sufficient for debugging if it added some extra details:
Daemon assistance for dropping privs after pidfile creation is pretty much de rigueur.
zpipes_client_read returns EAGAIN on timeout, while zpipes_client_write returns EBADF for both error and timeout.
Should return empty read on timeout and end-of-pipe errors
zbroker should be able to be run as either a daemon or a foreground application, so as to interface better with the various init styles.
Solution: redesign as zproto message class.
Currently, zbroker is version tied to zeromq 4.0.4 and czmq 2.2.0. It would be nice to tie to a tagged version of zyre as well.
Solution: tag Zyre, double-check that ZRE is stable.
Necessary to be able to connect zpipes between brokers located on different machines.
They're too long.
Solution: chop to 6 hex chars. Chances of duplicates are insignificant.
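A sketch of the chop, assuming the short form is simply the first 6 characters of the hex string (which end gets kept is an assumption; `short_id` is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHORT_ID_LEN 6

// Copies at most the first SHORT_ID_LEN characters of uuid_hex into
// out and null-terminates it. out must hold SHORT_ID_LEN + 1 bytes.
void short_id(const char *uuid_hex, char *out)
{
    assert(uuid_hex != NULL && out != NULL);
    size_t length = strlen(uuid_hex);
    if (length > SHORT_ID_LEN)
        length = SHORT_ID_LEN;
    memcpy(out, uuid_hex, length);
    out[length] = '\0';
}
```

Six hex characters give 16^6 = 16,777,216 possible values, which is where the low chance of duplicates in a small cluster comes from.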
Repro:
In some cases, the reader and writer don't connect, and the writer blocks indefinitely.
writer animation:
zbroker service/0.0.1
Copyright (c) 2014 the Contributors
This Software is provided under the MPLv2 License on an "as is" basis,
without warranty of any kind, either expressed, implied, or statutory.
14-05-05 12:46:37 I: starting zpipes server using config in 'zbroker-2.cfg'
14-05-05 12:46:37 I: joining cluster as 00D325
14-05-05 12:46:37 N: starting zpipes_server service
14-05-05 12:46:37 N: binding zpipes service to 'ipc://@/zpipes/local2
14-05-05 12:46:37 I: ZPIPES server appeared at 5AB1DE
14-05-05 12:46:37 I: ZPIPES server vanished from 5AB1DE
14-05-05 12:46:37 I: ZPIPES server appeared at 5AB1DE
14-05-05 12:46:38 write_test: attach local writer
14-05-05 12:46:38 write_test: broadcast we are now writer
14-05-05 12:46:38 D: 289:write_test : open local writer
14-05-05 12:46:39 D: 289:write_test : close local writer
interrupted
14-05-05 12:46:39 N: terminating zpipes_server service
reader animation:
zbroker service/0.0.1
Copyright (c) 2014 the Contributors
This Software is provided under the MPLv2 License on an "as is" basis,
without warranty of any kind, either expressed, implied, or statutory.
14-05-05 12:46:37 I: starting zpipes server using config in 'zbroker-1.cfg'
14-05-05 12:46:37 I: joining cluster as 5AB1DE
14-05-05 12:46:37 N: starting zpipes_server service
14-05-05 12:46:37 N: binding zpipes service to 'ipc://@/zpipes/local'
14-05-05 12:46:37 I: ZPIPES server appeared at 002BAA
14-05-05 12:46:37 I: ZPIPES server appeared at 002BAA
14-05-05 12:46:37 write_test: attach local reader
14-05-05 12:46:37 write_test: broadcast we are now reader
14-05-05 12:46:37 D: 289:write_test : open local reader
14-05-05 12:46:38 D: remote=002BAA command=HAVE WRITER pipe=write_test unicast=0
14-05-05 12:46:38 write_test: attach remote writer
14-05-05 12:46:38 write_test: tell peer we are now reader
14-05-05 12:46:39 write_test: tell peer we stopped being reader
14-05-05 12:46:39 D: 289:write_test : close local reader
interrupted
14-05-05 12:46:39 N: terminating zpipes_server service
The writer was stuck in zpipes_client_write
I find this difficult to reproduce on a single node with two brokers, but it evidences itself frequently on true multi-node config.
Trying to repro another problem I ran into this one:
in quick succession:
open writer on hosta
open reader on hostb
open writer on hosta
second writer open asserts:
zpipes_client.c:76: zpipes_client_new: Assertion `0' failed.
Aborted
Animate data:
14-05-02 17:07:01 420: start:
14-05-02 17:07:01 420: OUTPUT
14-05-02 17:07:01 420: $ lookup or create pipe
14-05-02 17:07:01 420: $ open pipe writer
14-05-02 17:07:01 420: > before writing
14-05-02 17:07:01 420: before writing:
14-05-02 17:07:01 420: ok
14-05-02 17:07:01 420: $ send OUTPUT_OK
14-05-02 17:07:01 420: > writing
14-05-02 17:07:01 421: start:
14-05-02 17:07:01 421: INPUT
14-05-02 17:07:01 421: $ lookup or create pipe
14-05-02 17:07:01 421: $ open pipe reader
14-05-02 17:07:01 421: > before reading
14-05-02 17:07:01 421: before reading:
14-05-02 17:07:01 421: error
14-05-02 17:07:01 421: $ send INPUT_FAILED
14-05-02 17:07:01 421: $ terminate
14-05-02 17:07:11 420: writing:
14-05-02 17:07:11 420: expired
14-05-02 17:07:11 420: $ close pipe writer
14-05-02 17:07:11 420: $ terminate
I suspect this comment is wrong:
{
zpipes_msg_send_input (self->dealer, pipe_name);
if (s_expect_reply (self, ZPIPES_MSG_INPUT_OK))
assert (false); // Cannot happen in current use case
}
:)
Solution: use zlog for all logging, consistently.
On a named pipe, writing after the read side has closed its file handle results in SIGPIPE, and -1/EPIPE. zpipes_client_write should probably return ssize_t with -1 on error.
Solution: simulate N cluster nodes doing some work
Perhaps we could use simulation to also measure different kinds of activity, throughput, latency, etc.
This makes it hard to know what the applications are doing.
Solution:
Problem: there's no formal spec for the ZeroMQ ipc:// protocol
Solution: (1) collect the basic info of the protocol frames used, and then (2) write up as a formal ZeroMQ RFC.
Solution: port the zproto codec to generate code for libzmtp, and port the client API to this codec.
For now, this can be in a directory 'clients' in zbroker; it'll build a separate library.
zbroker should use a configuration file to specify the bind endpoint, log level, and log file.
Option for syslog with configurable facility would be a nice-to-have.
Assert: lt-zbroker: zpipes_server.c:516: look_for_pipe_reader: Assertion `self->pipe' failed.
If the buffer provided by a client is smaller than the size of the pending data chunk, the excess data in the chunk is thrown away.
The client read function really needs to keep an offset of the position of the current chunk, and continue to serve from it until the chunk is exhausted.
In a perfect world, we'd never return short reads, either. Reads shorter than the requested size happen, but are surprising unless at end of file, I think. So a single read in excess of the pending chunk size should probably continue to pull chunks until the client-supplied buffer is full.
This might be something better deferred to after the raw client re-write, though. Just noting this for posterity.
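To make the idea concrete, here is a sketch of a reader that keeps an offset into the current chunk and keeps pulling chunks until the caller's buffer is full or the pipe ends. All names are hypothetical: `fetch_fn` stands in for whatever actually fetches the next chunk from the broker, and `demo_fetch` is a toy source that serves two fixed chunks so the sketch can be run on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical callback: points *data at the next chunk and returns
// its size, or returns 0 at end of pipe.
typedef size_t (fetch_fn)(void *ctx, const char **data);

typedef struct {
    const char *chunk;      // current chunk, owned by the fetcher
    size_t size;            // size of current chunk
    size_t offset;          // how much of it has been served
    fetch_fn *fetch;
    void *ctx;
} stream_reader_t;

// Fills buffer with up to want bytes, spanning chunks as needed.
// Returns a short count only when the pipe ends.
size_t stream_read(stream_reader_t *self, char *buffer, size_t want)
{
    assert(self != NULL && buffer != NULL);
    size_t got = 0;
    while (got < want) {
        if (self->offset == self->size) {
            // Current chunk exhausted, so pull the next one
            self->size = self->fetch(self->ctx, &self->chunk);
            self->offset = 0;
            if (self->size == 0)
                break;      // end of pipe
        }
        size_t count = self->size - self->offset;
        if (count > want - got)
            count = want - got;
        memcpy(buffer + got, self->chunk + self->offset, count);
        self->offset += count;
        got += count;
    }
    return got;
}

// Toy chunk source for illustration: "abc", "defgh", then end of pipe.
typedef struct { int next; } demo_source_t;

size_t demo_fetch(void *ctx, const char **data)
{
    static const char *chunks[] = { "abc", "defgh" };
    demo_source_t *source = ctx;
    if (source->next >= 2)
        return 0;
    *data = chunks[source->next++];
    return strlen(*data);
}
```

With the toy source, a 2-byte read followed by a 4-byte read yields "ab" and then "cdef": no data is discarded when the buffer is smaller than the chunk, and the second read spans two chunks instead of returning short.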
This might be related to the other assert, as it's in the same pattern:
open writer on hosta
open corresponding reader on hostb
open new writer on hostb
write to new writer handle, reader asserts:
zpipes_server.c:417: pipe_send_data: Assertion `self->reader' failed.
Aborted
Writer animation:
14-05-02 17:14:23 I: starting zpipes server using config in '/etc/zbroker.cfg'
14-05-02 17:14:23 I: joining cluster as DEC3D1B4C98A5003B10285003C1CD50F
14-05-02 17:14:23 N: starting zpipes_server service
14-05-02 17:14:23 N: binding zpipes service to 'ipc://@/zpipes/local'
14-05-02 17:14:33 I: ZPIPES server appeared at 64ECAC5F2DDC5D91320A6CD4AA3079A4
14-05-02 17:14:42 0: start:
14-05-02 17:14:42 0: OUTPUT
14-05-02 17:14:42 0: $ lookup or create pipe
14-05-02 17:14:42 0: $ open pipe writer
14-05-02 17:14:42 0: > before writing
14-05-02 17:14:42 0: before writing:
14-05-02 17:14:42 0: ok
14-05-02 17:14:42 0: $ send OUTPUT_OK
14-05-02 17:14:42 0: > writing
14-05-02 17:14:42 1: start:
14-05-02 17:14:42 1: OUTPUT
14-05-02 17:14:42 1: $ lookup or create pipe
14-05-02 17:14:42 1: $ open pipe writer
14-05-02 17:14:42 1: > before writing
14-05-02 17:14:42 1: before writing:
14-05-02 17:14:42 1: ok
14-05-02 17:14:42 1: $ send OUTPUT_OK
14-05-02 17:14:42 1: > writing
14-05-02 17:14:42 1: writing:
14-05-02 17:14:42 1: WRITE
14-05-02 17:14:42 1: $ process write request
14-05-02 17:14:42 1: > processing write
14-05-02 17:14:42 1: processing write:
14-05-02 17:14:42 1: have reader
14-05-02 17:14:42 1: $ pass data to reader
14-05-02 17:14:42 1: $ send WRITE_OK
14-05-02 17:14:42 1: > writing
14-05-02 17:14:42 1: writing:
14-05-02 17:14:42 1: CLOSE
14-05-02 17:14:42 1: $ close pipe writer
14-05-02 17:14:42 1: $ send CLOSE_OK
14-05-02 17:14:42 1: > start
14-05-02 17:14:52 0: writing:
14-05-02 17:14:52 0: expired
14-05-02 17:14:52 0: $ close pipe writer
14-05-02 17:14:52 0: $ terminate
14-05-02 17:14:52 1: start:
14-05-02 17:14:52 1: expired
14-05-02 17:14:52 1: $ terminate
14-05-02 17:14:52 I: ZPIPES server vanished from 64ECAC5F2DDC5D91320A6CD4AA3079A4
Reader animation:
14-05-02 17:14:33 I: starting zpipes server using config in 'zbroker.cfg'
14-05-02 17:14:33 I: joining cluster as 64ECAC5F2DDC5D91320A6CD4AA3079A4
14-05-02 17:14:33 N: starting zpipes_server service
14-05-02 17:14:33 N: binding zpipes service to 'ipc://@/zpipes/local'
14-05-02 17:14:33 I: ZPIPES server appeared at DEC3D1B4C98A5003B10285003C1CD50F
14-05-02 17:14:42 643: start:
14-05-02 17:14:42 643: INPUT
14-05-02 17:14:42 643: $ lookup or create pipe
14-05-02 17:14:42 643: $ open pipe reader
14-05-02 17:14:42 643: > before reading
14-05-02 17:14:42 643: before reading:
14-05-02 17:14:42 643: ok
14-05-02 17:14:42 643: $ send INPUT_OK
14-05-02 17:14:42 643: > reading
lt-zbroker: zpipes_server.c:417: pipe_send_data: Assertion `self->reader' failed.
Aborted
Strangely, the pipe that was written to didn't have a reader. I'm thinking some wires got crossed here. The writer should have blocked on no reader, but it sent data.
Let me know if there is more repro data I can provide.
Daemon mode should drop a pidfile to a configurable location (with a default of /var/run/zbroker.pid, perhaps).
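A bare-bones sketch of the pidfile step, assuming the path comes from configuration. `pidfile_write` is a hypothetical name; locking, stale-file detection and dropping privileges afterwards are deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

// Writes the current process ID, followed by a newline, to path.
// Returns 0 on success, -1 if the file could not be written.
int pidfile_write(const char *path)
{
    assert(path != NULL);
    FILE *file = fopen(path, "w");
    if (file == NULL)
        return -1;

    int rc = fprintf(file, "%ld\n", (long) getpid()) > 0 ? 0 : -1;
    if (fclose(file) != 0)
        rc = -1;
    return rc;
}
```

In daemon mode this would have to run after the final fork, so that the recorded PID is that of the long-lived process.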
Pipe semantics are that a request to read zero bytes returns a zero result, which is our end-of-pipe response.
Solution: if client requests 0 bytes, return end-of-pipe.
Symptom: as the "expecting chunk" state deals only with internal events, "fetch" commands are held in a queue until the FSM hits the "reading" state. Adding "close" to the state switches that off (in the current implementation), so pipelined "fetch" commands are held until they can be safely processed.
Solution: don't mix protocol command events and internal events. We could perhaps filter commands, e.g. queue fetch and then accept close. Not sure of the semantics of this, though.
Better solution: document the state machine generator properly to explain this. Second, deal with EINTR differently.