
Comments (19)

daniel-abramov avatar daniel-abramov commented on September 17, 2024

Hi,

Well... tokio-tungstenite does not do anything monumentally complex; it just wraps the underlying tungstenite-rs in futures to make it usable with tokio. At the moment I don't see anything there that could have led to such behavior. I'll try to look at tungstenite-rs again, but we have not had such issues so far (actually, I once experienced something similar, but it was related to scheduling problems in tokio when running CPU-bound tasks within the same reactor).

Some additional information would help to spot the problem (it could be on the tokio side and/or not related to tokio/tungstenite). Can you give a bit more detail about when it gets stuck? As far as I understand, the thread which accepts the incoming WebSocket connections gets stuck and does not make any progress on tasks, and the same thread (and [tokio] reactor) that accepts the connections also performs the WebSocket handshakes and the reading/writing for each WebSocket, right?

If so:

  1. When exactly does it get stuck?
  2. What happens when a new connection comes in after it gets stuck? Have you tried to log each incoming connection and each WebSocket handshake completion?
  3. Do you still receive incoming messages from other clients that are already connected (within the same tokio reactor)?
  4. What does Wireshark show?
  5. Have you tried running it with trace logs for tungstenite-rs/tokio-tungstenite?

Do you have some sample code, so that I can reproduce the problem? (That would be very helpful.)

from tokio-tungstenite.

jq-rs avatar jq-rs commented on September 17, 2024

Hello, thanks for your reply. Here are some answers:

It gets stuck when a client connects to the server over WebSocket. It looks like it could be timing dependent, as adding debugging output may cause it not to trigger as often. Also, tungstenite-rs 0.2.0 was working somewhat better, but it also failed eventually after more extensive testing. I have not tried 0.2.1, so 0.2.0 and 0.2.2 are the versions where this issue has been seen. With 0.2.2 it is fairly easy to reproduce; both debug and release builds have the same symptoms.

When the issue occurs, the connection from the failing client never makes it past accept_async(). New connections cannot be made after this. The WebSocket client's TCP connection will time out after some time. I have not tried enabling logs; I can certainly try them, please advise how to best enable them.

Existing clients cannot receive messages anymore after the connection problem. All existing connections stop working, as the reactor's main handling task fails. These connections will eventually time out.

I can try to reproduce the problem with debug logs enabled and take a Wireshark capture while it happens, if I manage to reproduce it again.

I have not tried traces either, I am happy to try, please advise how to best enable them.

As I am seeing the problem in an interactive setup, I can only provide a similar setup to you for trial runs. I added a sample client to https://github.com/jq-rs/mles-websocket-client; the issue is reproduced with similar connections on the latest Firefox. By default it will connect over WebSocket to port 8076 at the address you define. The Mles server and the Mles client/WebSocket proxy can be set up as follows.

An Mles server on a host:
target/debug/mles

An Mles client/proxy on the same host, listening on port 8076:
target/debug/mles-client 127.0.0.1 --use-websockets

Before 0.9 I had a different WS library in use. I needed Tokio support and changed because of that. Thus, the issue is seen on the most recent version, 0.9.

Usually the problem happens with several concurrent connections, and it may help if some connections are dropped and a new connection is opened immediately after such a drop.

Let me know if you need further guidance regarding Mles.


jq-rs avatar jq-rs commented on September 17, 2024

Hi again,

I was able to reproduce the problem with the proto client quite easily and make a capture. Below is a successful connection in text mode:

``
Origin: null
Sec-WebSocket-Protocol: mles-websocket
Sec-WebSocket-Extensions: permessage-deflate
Sec-WebSocket-Key: m6POfiLaxgrWk5N5DQQ/Zw==
Connection: keep-alive, Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Accept: Se0iutkw4plVW63pViYACYiV4ao=
``

In the failing connection, we no longer get the "HTTP/1.1 101 Switching Protocols" upgrade, although the TCP SYN sequence as such is fine. After a while the WebSocket times out because of this.


agalakhov avatar agalakhov commented on September 17, 2024

Hello,
from your call stack I see that the code is stuck in TcpStream::read() called from InputBuffer::read_from() inside tungstenite. Please check the following:

  1. Is the code stuck in this read() or does it call read() in an infinite loop? If it is stuck, it's a bug in Tokio. If it's an infinite loop, then:
  2. Does read() always return Err(kind=WouldBlock) or something else? What exactly?
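To make the distinction in the two questions above concrete, here is a minimal std-only sketch. The names `ReadOutcome`, `classify_read` and `AlwaysWouldBlock` are hypothetical and are not part of tungstenite or Tokio. On a non-blocking socket, read() reports "no data yet" as an error of kind WouldBlock; a caller that retries at once instead of yielding back to the event loop would spin rather than block.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical classification of a single non-blocking read attempt.
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    /// Some bytes arrived.
    Data(usize),
    /// The peer closed the connection (read returned 0 bytes).
    Eof,
    /// No data yet; the caller should yield to the event loop, not retry at once.
    WouldBlock,
}

/// Perform one read and report what happened; other errors are passed through.
fn classify_read<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<ReadOutcome> {
    match src.read(buf) {
        Ok(0) => Ok(ReadOutcome::Eof),
        Ok(n) => Ok(ReadOutcome::Data(n)),
        Err(ref e) if e.kind() == ErrorKind::WouldBlock => Ok(ReadOutcome::WouldBlock),
        Err(e) => Err(e),
    }
}

/// Stand-in for a non-blocking socket with nothing to read (assumption for the demo).
struct AlwaysWouldBlock;

impl Read for AlwaysWouldBlock {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::Error::new(ErrorKind::WouldBlock, "no data yet"))
    }
}
```

Logging the outcome of each read in the suspicious loop should show which of the two cases applies: a single call that never returns, or an endless sequence of identical outcomes.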


jq-rs avatar jq-rs commented on September 17, 2024

Hello,

I have now tried with a debug build, and here is a more complete call stack:

#0  0x000055555567cdd4 in tungstenite::handshake::server::{{impl}}::try_parse (buf=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tungstenite-0.2.2/src/handshake/server.rs:38
#1  0x00005555555b6ebf in tungstenite::handshake::machine::{{impl}}::single_round (self=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tungstenite-0.2.2/src/handshake/machine.rs:48
#2  0x00005555555b54ef in tungstenite::handshake::{{impl}}::handshake (self=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tungstenite-0.2.2/src/handshake/mod.rs:43
#3  0x00005555555cfd3b in tokio_tungstenite::{{impl}}::poll (self=0x7fffffffa378)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-tungstenite-0.1.2/src/lib.rs:158
#4  0x00005555555c7eec in tokio_tungstenite::{{impl}}::poll (self=0x7fffffffa378)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-tungstenite-0.1.2/src/lib.rs:141
#5  0x00005555555afb25 in futures::future::chain::{{impl}}::poll,futures::future::result_::FutureResult<(), tungstenite::error::Error>,closure,closure> (self=0x7fffffffa370, f=...) at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/future/chain.rs:26
#6  0x0000555555563cec in futures::future::and_then::{{impl}}::poll,core::result::Result<(), tungstenite::error::Error>,closure> (self=0x7fffffffa370) at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/future/and_then.rs:32
#7  0x00005555555d0678 in futures::future::map_err::{{impl}}::poll, core::result::Result<(), tungstenite::error::Error>, closure>,closure> (self=0x7fffffffa370)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/future/map_err.rs:30
#8  0x0000555555567ef3 in futures::stream::for_each::{{impl}}::poll, core::result::Result<(), tungstenite::error::Error>, closure>, closure>> (self=0x7fffffffbfc0)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/stream/for_each.rs:39
#9  0x000055555557fb1c in futures::task_impl::{{impl}}::poll_future::{{closure}}, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>> (
    f=0x7fffffffbfc0) at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:337
#10 0x0000555555580d0e in futures::task_impl::{{impl}}::enter::{{closure}}, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>,closure,core::result::Result, std::io::error::Error>> () at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:484
#11 0x00005555555c5693 in futures::task_impl::set::{{closure}}, std::io::error::Error>> (c=0x7ffff7fe9748)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:61
#12 0x0000555555583bd4 in std::thread::local::{{impl}}::with,closure,core::result::Result, std::io::error::Error>> (self=0x555555988e60 , f=...) at /checkout/src/libstd/thread/local.rs:253
#13 0x00005555555c547f in futures::task_impl::set, std::io::error::Error>> (task=0x7fffffffb798, f=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:54
#14 0x00005555555807a9 in futures::task_impl::{{impl}}::enter, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>,closure,core::result::Result, std::io::error::Error>> (self=0x7fffffffbfc0, unpark=0x7fffffffb870, f=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:484
#15 0x000055555557f877 in futures::task_impl::{{impl}}::poll_future, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>> (self=0x7fffffffbfc0,
    unpark=...) at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.13/src/task_impl/mod.rs:337
#16 0x000055555556b407 in tokio_core::reactor::{{impl}}::run::{{closure}}, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>> ()
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-core-0.1.7/src/reactor/mod.rs:243
#17 0x00005555555797c1 in scoped_tls::{{impl}}::set, std::io::error::Error>> (
    self=0x555555986160 , t=0x7fffffffc7e0, f=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/scoped-tls-0.1.0/src/lib.rs:135
#18 0x000055555556b0af in tokio_core::reactor::{{impl}}::run, core::result::Result<(), tungstenite::error::Error>, closure>, closure>>> (self=0x7fffffffc7e0, f=...)
    at /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-core-0.1.7/src/reactor/mod.rs:242
#19 0x00005555555d4620 in mles_client::ws::process_ws_proxy (raddr=..., keyval=..., keyaddr=...) at /home/ubuntu/mles/mles-rs/mles-client/src/ws.rs:116
#20 0x00005555555d700c in mles_client::main () at /home/ubuntu/mles/mles-rs/mles-client/src/main.rs:114
#21 0x00005555556bfe26 in std::panicking::try::do_call () at /checkout/src/libstd/panicking.rs:454
#22 0x00005555556c6d4b in panic_unwind::__rust_maybe_catch_panic () at /checkout/src/libpanic_unwind/lib.rs:98
#23 0x00005555556c0677 in std::panicking::try<(),fn()> () at /checkout/src/libstd/panicking.rs:433
#24 std::panic::catch_unwind () at /checkout/src/libstd/panic.rs:361
#25 std::rt::lang_start () at /checkout/src/libstd/rt.rs:57
#26 0x00005555555dbd13 in main ()

What happens there now is that we seem to be stuck here:

(gdb)
42              loop {
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
44                      RoundResult::WouldBlock(m) => {
(gdb)
47                      RoundResult::Incomplete(m) => m,
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
55              }
(gdb)
42              loop {
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
44                      RoundResult::WouldBlock(m) => {
(gdb)
47                      RoundResult::Incomplete(m) => m,
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
55              }
(gdb)
42              loop {
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
44                      RoundResult::WouldBlock(m) => {
(gdb)
47                      RoundResult::Incomplete(m) => m,
(gdb)
43                  mach = match mach.single_round()? {
(gdb)
55              }
(gdb)

I hope this helps!
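For readers following along, here is one way a loop like the one in the trace can spin. This is a sketch under an assumption the thread does not confirm at this point: that the read keeps returning zero bytes because the peer has gone away, so every round ends as "incomplete". `Round`, `single_round` and `handshake` below are hypothetical stand-ins written against std only; they are not the tungstenite implementation, and WouldBlock handling is left out for brevity.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical result of one handshake round (names loosely modelled on the trace).
enum Round {
    Incomplete,
    Done,
}

/// Read once and report whether the request head (terminated by a blank line) is complete.
/// A zero-byte read means the peer closed the stream, so waiting for more input is futile.
fn single_round<R: Read>(src: &mut R, head: &mut Vec<u8>) -> io::Result<Round> {
    let mut chunk = [0u8; 256];
    let n = src.read(&mut chunk)?;
    if n == 0 {
        // Without this check the caller would see `Incomplete` forever and spin.
        return Err(io::Error::new(ErrorKind::UnexpectedEof, "handshake not finished"));
    }
    head.extend_from_slice(&chunk[..n]);
    if head.windows(4).any(|w| w == b"\r\n\r\n") {
        Ok(Round::Done)
    } else {
        Ok(Round::Incomplete)
    }
}

/// Drive rounds until the head is complete or an error (including EOF) occurs.
fn handshake<R: Read>(src: &mut R) -> io::Result<Vec<u8>> {
    let mut head = Vec::new();
    loop {
        match single_round(src, &mut head)? {
            Round::Done => return Ok(head),
            Round::Incomplete => continue,
        }
    }
}
```

Feeding `handshake` a request that is cut off before the blank line returns an error after the second read instead of looping; removing the `n == 0` check reproduces the endless `Incomplete` pattern.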


agalakhov avatar agalakhov commented on September 17, 2024

I just pushed a potential fix into tungstenite master, could you please test it?


daniel-abramov avatar daniel-abramov commented on September 17, 2024

> I have not tried traces either, I am happy to try, please advise how to best enable them.

If you want the messages from tungstenite-rs to be displayed, you can enable them via RUST_LOG as usual, e.g.:

$ RUST_LOG=tungstenite ./my_application


jq-rs avatar jq-rs commented on September 17, 2024

When trying to use the new master version 22f7df0, the following compiler errors are seen. I cannot easily figure out how to resolve them as protocol::message is private.

error[E0281]: type mismatch: the type `[closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_]` implements the trait `std::ops::FnMut<(tungstenite::Message,)>`, but the trait `std::ops::FnMut<(tungstenite::protocol::message::Message,)>` is required (expected enum `tungstenite::protocol::message::Message`, found enum `tungstenite::Message`)
  --> src/ws.rs:82:36
   |
82 |             let ws_reader = stream.for_each(move |message: Message| {
   |                                    ^^^^^^^^
error[E0281]: type mismatch: the type `[closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_]` implements the trait `std::ops::FnOnce<(tungstenite::Message,)>`, but the trait `std::ops::FnOnce<(tungstenite::protocol::message::Message,)>` is required (expected enum `tungstenite::protocol::message::Message`, found enum `tungstenite::Message`)
  --> src/ws.rs:82:36
   |
82 |             let ws_reader = stream.for_each(move |message: Message| {
   |                                    ^^^^^^^^

error[E0308]: mismatched types
  --> src/ws.rs:96:41
   |
96 |                 let _ = sink.start_send(Message::binary(msg)).map_err(|err| {
   |                                         ^^^^^^^^^^^^^^^^^^^^ expected enum `tungstenite::protocol::message::Message`, found enum `tungstenite::Message`
   |
   = note: expected type `tungstenite::protocol::message::Message`
              found type `tungstenite::Message`

error: no method named `map` found for type `futures::stream::ForEach>, [closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_], std::result::Result<(), tungstenite::error::Error>>` in the current scope
   --> src/ws.rs:101:40
    |
101 |             let connection = ws_reader.map(|_| ()).map_err(|_| ())
    |                                        ^^^
    |
    = note: the method `map` exists but the following trait bounds were not satisfied: `[closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_] : std::ops::FnMut<(tungstenite::protocol::message::Message,)>`, `futures::stream::ForEach>, [closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_], std::result::Result<(), tungstenite::error::Error>> : futures::Future`, `futures::stream::ForEach>, [closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_], std::result::Result<(), tungstenite::error::Error>> : futures::Stream`, `futures::stream::ForEach>, [closure@src/ws.rs:82:45: 93:14 ws_tx_own:_, mles_tx:_], std::result::Result<(), tungstenite::error::Error>> : std::iter::Iterator`

error: aborting due to 4 previous errors

error: Could not compile `mles-client`.


agalakhov avatar agalakhov commented on September 17, 2024

It looks like you are mixing old and new versions of tokio-tungstenite and tungstenite. Please take the latest version of tokio-tungstenite.


agalakhov avatar agalakhov commented on September 17, 2024

OK, I checked it myself. The problem is caused by a version mismatch. You use different versions of tungstenite in tokio-tungstenite and in mles-client, so the Message type that tokio_tungstenite uses is not the same type as the one mles-client gets from tungstenite. Please point both tokio-tungstenite and mles-client to the same version of tungstenite.
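As an illustration of why the compiler treats the two `Message` types as unrelated, here is a small self-contained sketch. The modules `tungstenite_old` and `tungstenite_new` and the functions `start_send` and `convert` are hypothetical stand-ins for two copies of a crate in one build; they are not real tungstenite code.

```rust
/// Stand-in for `Message` as compiled from one version of a crate (hypothetical).
mod tungstenite_old {
    #[derive(Debug, PartialEq)]
    pub enum Message {
        Text(String),
        Binary(Vec<u8>),
    }
}

/// Stand-in for the same enum as compiled from another version: same name,
/// same variants, yet a distinct type as far as the compiler is concerned.
mod tungstenite_new {
    #[derive(Debug, PartialEq)]
    pub enum Message {
        Text(String),
        Binary(Vec<u8>),
    }
}

/// Plays the role of a sink that was built against the "new" copy.
fn start_send(msg: tungstenite_new::Message) -> usize {
    match msg {
        tungstenite_new::Message::Text(t) => t.len(),
        tungstenite_new::Message::Binary(b) => b.len(),
    }
}

/// Explicit conversion between the two copies; without it the call in the
/// comment below is rejected with "expected enum ..., found enum ...".
fn convert(msg: tungstenite_old::Message) -> tungstenite_new::Message {
    match msg {
        tungstenite_old::Message::Text(t) => tungstenite_new::Message::Text(t),
        tungstenite_old::Message::Binary(b) => tungstenite_new::Message::Binary(b),
    }
}

// start_send(tungstenite_old::Message::Binary(vec![1, 2, 3])); // does not compile: mismatched types
```

In a real project the remedy is the one described above, making every crate depend on the same version, rather than converting between the copies; the conversion function is only there to show that the types really are distinct.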


jq-rs avatar jq-rs commented on September 17, 2024

Something very weird is going on when I point tungstenite-rs to be fetched directly from GitHub. It just fails. I have tried a lot of different combinations, even local files, and I cannot resolve it easily. It does not seem to be because of the changes; it is some generic problem.

Getting the library from crates.io seems to work just fine every time. If you can, you could release version 0.2.3 of tungstenite-rs and I would probably get this tested sooner.


jq-rs avatar jq-rs commented on September 17, 2024

I was finally able to resolve the conflict. I'll let you know the results by tomorrow. Thx!


agalakhov avatar agalakhov commented on September 17, 2024

If it's fixed, I'll release a new version and update tokio-tungstenite accordingly.


jq-rs avatar jq-rs commented on September 17, 2024

Hi! I made a quick test round, which ended with the application closing with the following output:

RUST_LOG=tungstenite target/debug/mles-client 127.0.0.1 --use-websockets
Listening WebSockets on: 0.0.0.0:8076
New WebSocket connection 1: 83.136.45.61:43456
New WebSocket connection 2: 83.136.45.61:43460
New WebSocket connection 3: 83.136.45.61:43458
New WebSocket connection 4: 83.136.45.61:43463
New WebSocket connection 5: 192.130.252.31:51053
Error during the WebSocket handshake occurred: WebSocket protocol error: Handshake not finished
Error: WebSocket protocol error: Handshake not finished

The connection does not get stuck anymore, which is great! However, the application exits completely. I guess the goal here would be that the failing connection is dropped, but existing ones continue to work.

I can now easily pick up new versions from master, so let me know if you are able to craft new versions. Thanks a million!


daniel-abramov avatar daniel-abramov commented on September 17, 2024

> However, the application exits completely. I guess the target here would be that the failing connection would be dropped, but existing ones would continue to work.

Yeah, the server future will resolve if you wrote a combination of futures that fails the server future when an error occurs; it's up to you to decide what to do when the handshake fails. If you don't want to resolve the server future, you can handle the failed handshake (e.g. write a message to the log or take some other action) without failing the whole future. I've noticed that you call map_err() and return Error::new(ErrorKind::Other, e), which will indeed resolve the whole server future to an error. But failing it is not mandatory if you want to continue.
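The difference between propagating a handshake error and handling it in place can be sketched without any futures at all. The following is a plain std analogy, not tokio or tokio-tungstenite code; `Handshake`, `serve_fail_fast` and `serve_tolerant` are hypothetical names, and a `Vec` of results stands in for the stream of incoming connections.

```rust
/// Hypothetical outcome of one connection's handshake: Ok(connection id) or Err(reason).
type Handshake = Result<u32, String>;

/// Fail-fast: the first failed handshake aborts the whole accept loop,
/// which mirrors a server future that resolves to an error.
fn serve_fail_fast(incoming: Vec<Handshake>) -> Result<Vec<u32>, String> {
    let mut accepted = Vec::new();
    for hs in incoming {
        accepted.push(hs?);
    }
    Ok(accepted)
}

/// Tolerant: a failed handshake is logged and dropped; the loop keeps serving.
fn serve_tolerant(incoming: Vec<Handshake>) -> Vec<u32> {
    let mut accepted = Vec::new();
    for hs in incoming {
        match hs {
            Ok(id) => accepted.push(id),
            Err(reason) => eprintln!("handshake failed, dropping connection: {}", reason),
        }
    }
    accepted
}
```

With one failing handshake in the middle of three connections, the first function stops at the error while the second still accepts the other two.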

Or, as an alternative, you can run each incoming connection in a separate task (via handle.spawn()), as described in the tokio documentation at https://tokio.rs.
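The isolation that separate tasks give can be illustrated with plain OS threads. Note the substitution: this sketch uses std::thread::spawn as a stand-in for spawning a task on the reactor, so it shows the idea of per-connection isolation, not the tokio API. `handle_connection` and `serve` are hypothetical names, and the failure of connection 2 is hard-coded for the demo.

```rust
use std::thread;

/// Hypothetical per-connection work: connection 2 fails its handshake.
fn handle_connection(id: u32) -> Result<u32, String> {
    if id == 2 {
        Err(format!("connection {}: handshake not finished", id))
    } else {
        Ok(id)
    }
}

/// Run each connection in its own thread and collect the ids of the
/// connections that completed; a failure only affects its own worker.
fn serve(ids: Vec<u32>) -> Vec<u32> {
    let workers: Vec<_> = ids
        .into_iter()
        .map(|id| thread::spawn(move || handle_connection(id)))
        .collect();
    workers
        .into_iter()
        .filter_map(|w| w.join().ok().and_then(|result| result.ok()))
        .collect()
}
```

Because every connection runs independently, the error from one worker never reaches the accept loop, which is the property wanted here.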


jq-rs avatar jq-rs commented on September 17, 2024

You are right, that is just the original example crate handling, which is not quite right yet. I'll improve it next. I think the fix should now be enough to have the service up and running. It is pretty interesting, though, that the handshake fails that easily. I'll do some test runs tomorrow and close this if everything looks good.


agalakhov avatar agalakhov commented on September 17, 2024

It is, in fact, a "connection reset by peer" and not a handshake failure as such. It may be caused by an ultra-small timeout in the network or something like that. Look at the peer side: why would it want to close the connection in the middle of the handshake?


jq-rs avatar jq-rs commented on September 17, 2024

Hello! Good news: it seems that the problem is resolved with the fix 22f7df0 in tungstenite-rs. Thank you very much for the swift fix!

I was able to verify that a TCP RST is indeed seen by the client in this situation. I do not know why the handshake would fail that way, but a new attempt is made seamlessly after that, so it does not matter much.

If you can release 0.2.3 soon, that would be great. Thanks!


agalakhov avatar agalakhov commented on September 17, 2024

Just released v0.2.3 of tungstenite and v0.2.0 of tokio-tungstenite. The version bump of tokio-tungstenite is due to the new ability to add a custom request (a minor API change, probably non-breaking).

