So all was running just fine:
2020-04-07 15:16:28,097 [INFO]: Initialized hash ring of size 3894 (blinded key: b'S6Vj3CykF/tcJdl7Gzz3PORZEsC8sF7JHgeR46G7ogA=')
2020-04-07 15:16:28,098 [INFO]: Getting HS index with TP#18359 for second descriptor (1 replica)
2020-04-07 15:16:28,099 [INFO]: Tried with HS index f89fa7cb8c67b87ec948300504417586f62e287c055525656c4fe099b42cf94a got position 3788
2020-04-07 15:16:28,101 [INFO]: Getting HS index with TP#18359 for second descriptor (2 replica)
2020-04-07 15:16:28,102 [INFO]: Tried with HS index c2771648cf7eaa66e409366dfd19de8e2d8967c0ea682d81b730b30204553c30 got position 2959
2020-04-07 15:16:28,103 [INFO]: HSDir set remained the same
2020-04-07 15:16:28,105 [INFO]: No reason to publish second descriptor for 5tqw3kwwuy7sf3zhdpvz6valtyjnbsuxasx52afnuezr6hkmbsnqrtad.onion
Then I restarted the Tor process (a reload / SIGHUP does not trigger this bug) to downgrade to the info log level, and got:
2020-04-07 15:20:39,391 [INFO]: [*] fetch_instance_descriptors() called [*]
Traceback (most recent call last):
  File "/usr/local/bin/onionbalance", line 11, in <module>
    load_entry_point('OnionBalance==0.1.9', 'console_scripts', 'onionbalance')()
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/manager.py", line 31, in main
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/hs_v3/manager.py", line 43, in main
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/common/scheduler.py", line 102, in run_forever
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/common/scheduler.py", line 79, in _run_job
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/common/scheduler.py", line 45, in run
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/hs_v3/onionbalance.py", line 111, in fetch_instance_descriptors
  File "/usr/local/lib/python3.7/dist-packages/OnionBalance-0.1.9-py3.7.egg/onionbalance/hs_v3/stem_controller.py", line 56, in mark_tor_as_active
  File "/usr/local/lib/python3.7/dist-packages/stem-1.8.0-py3.7.egg/stem/control.py", line 3823, in signal
    response = self.msg('SIGNAL %s' % signal)
  File "/usr/local/lib/python3.7/dist-packages/stem-1.8.0-py3.7.egg/stem/control.py", line 662, in msg
    self._socket.send(message)
  File "/usr/local/lib/python3.7/dist-packages/stem-1.8.0-py3.7.egg/stem/socket.py", line 460, in send
    self._send(message, lambda s, sf, msg: send_message(sf, msg))
  File "/usr/local/lib/python3.7/dist-packages/stem-1.8.0-py3.7.egg/stem/socket.py", line 243, in _send
    raise stem.SocketClosed()
stem.SocketClosed
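For reference, stem already exposes the building blocks for handling this: a status listener reports when the control connection drops, and any command sent afterwards raises the stem.SocketClosed seen above. A minimal sketch of detecting the drop, assuming a ControlPort on 9051 (the handler body is my own illustration, not onionbalance code):

import stem
from stem.control import Controller, State

def on_state_change(controller, state, timestamp):
    # Called by stem whenever the control connection's state changes.
    if state == State.CLOSED:
        print("control socket closed; did Tor restart or go away?")

with Controller.from_port(port=9051) as controller:
    controller.authenticate()
    controller.add_status_listener(on_state_change)
    # After a Tor restart, any further command raises stem.SocketClosed,
    # which is exactly the traceback above.
    controller.signal(stem.Signal.ACTIVE)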
Shouldn't the onionbalance daemon be somewhat resilient to Tor process restarts or reloads? That is, when the Tor process is not running or not responding, it should shout a warning about it, but recover once Tor becomes available or responsive again.
Why do we need to be resilient to restarts / reloads and to Tor not responding?
- the vanguards addon (which changes layer 2 and layer 3 nodes);
- various changes to the torrc of the host running the frontend that need to be applied on the fly;
- too many events on the ControlPort can overwhelm it, so onionbalance should peacefully wait until it works again.
Of course it is suboptimal for onionbalance to start without a working ControlPort at all and just print useless warnings, so the patch should be: at start time, if there is no communication with the Tor daemon, die and log why (we already do this). But if communication with the Tor daemon is lost while running smoothly, wait for it to come back and issue warnings regularly.
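Here is a minimal sketch of that behavior, using only stem calls that exist today (Controller.from_port, authenticate, connect, signal); the function names and the 10-second retry interval are my own illustration, not actual onionbalance code:

import logging
import time

import stem
from stem.control import Controller

logger = logging.getLogger(__name__)

def connect_or_die(port=9051):
    # Start time: if Tor is unreachable, die and log why (current behavior).
    try:
        controller = Controller.from_port(port=port)
        controller.authenticate()
        return controller
    except stem.SocketError as exc:
        logger.error("Cannot reach Tor's ControlPort on %d: %s", port, exc)
        raise SystemExit(1)

def signal_active_resilient(controller, retry_interval=10):
    # Runtime: if the control socket is gone, warn and wait for Tor to return
    # instead of crashing like mark_tor_as_active() does in the traceback above.
    while True:
        try:
            controller.signal(stem.Signal.ACTIVE)
            return
        except stem.SocketClosed:
            logger.warning("Lost connection to Tor's ControlPort; retrying in %ds",
                           retry_interval)
            time.sleep(retry_interval)
            try:
                controller.connect()       # re-open the control socket
                controller.authenticate()  # re-authenticate after reconnecting
            except stem.SocketError:
                pass  # Tor still down; loop, warn again, and keep waiting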