Giter Club home page Giter Club logo

Comments (20)

ferd avatar ferd commented on July 22, 2024

I'm wondering if this can be related to #56

Do you have the original Erlang data structure so that I can make a test case out of it?

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

sure does look related. Sorry i missed that.

let me see if i can get you the original erlang structure....but the issue
can be seen with any erlang tuple list which includes binary strings....the
resulting json will have the binary strings escaped and its those that
length does not count.

thanks

On Fri, Aug 12, 2011 at 4:36 PM, ferd <
[email protected]>wrote:

I'm wondering if this can be related to
#56

Do you have the original Erlang data structure so that I can make a test
case out of it?

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

I have doubts on the idea of 'any binary string'. The current test suite does have these. Here's a simple counter-example:

1> socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"bud\\\"dy!">>}], json=true})}).
 [#msg{content = [{<<"hi">>,<<"there \"bud\\\"dy!">>}],
       json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of quotation marks in binaries or not. This is why I'm more or less wondering whether there were unicode sequences in there, which might mess up the length of the string, or the original data structure. It doesn't seem like escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there \"buddy">>}]}).                                                    
[126,109,126,49,126,109,126,{<<"hi">>,<<"there \"buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work. The list right there should be caught by the encoder. I'll try to at least make this more strict. It's purely luck (and wanting to implement something simple) that the decoder hasn' caught that one.

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Did you run the bif length on that example ? it will return a count lesser
than that of the string
The issue i was pointing out is not to do with encoding and decoding, i
agree that works fine....

However after you get the encoded json back, the bif length() is called on
that string and its that value which results in the truncating of what is
sent to the client..as length() is not counting escaped sequences.....so the
encoder works fine buts the number of bytes being sent over the wire is
incorrect ( when using firefox ).

I suspect the issue is to do with the decoding, the encoded message is in
the format

?FRAME ++ Length ++ ?FRAME ++ ?JSON_FRAME ++ JSON

That same length is being used in the decode function....so as it has a
value less than the size of the encoded json string, then that is resulting
in the decoder truncating the json...

Hth,

Andy

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
[email protected]>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Sorry that was a rushed response... up to my eyes here

I will be looking at this in more detail on monday and will be able to get
you more solid info then

Thanks

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
[email protected]>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Hi, havent had much time to play around with this but here is a description
of the problem from what i can see:

  1. When the transport is polling and on the server side using
    socketio_transport_polling.erl, JSON data received by the browser fails to
    parse the received msg to valid JSON.

  2. This is due to the length() BIF being used within socket_io_data:encode,
    the encoded JSON string returned from here includes a number of headers /
    frames one of which is the length in bytes of the JSON string. Full
    structure of the encoded msgs being ?FRAME ++ Length ++ ?FRAME ++
    ?JSON_FRAME ++ JSON;

  3. The length returned in the msg header described above does not equal the
    length of the JSON string as the length() function does not count escape
    sequence characters, so calling length(""name"") will resolve to 6,
    whereas it needs to resolve to 8 as we wish to send the entire string
    including escape sequences.

  4. On the client side then the msg received from the server is decoded
    using the ?FRAME headers, and uses the length header included in the
    received msg to construct the data payload. This can be see in
    the Transport.prototype._decode function of socket.io.js. As the length
    header received is less than the size of the data payload, the js at line
    198 of socket.io.js

    this.base._onMessage(JSON.parse(message.substr(3)));

Fails to parse the message, as the msgs it received does not equal the fully
payload sent by the server.

Let me know what you think or if you need any further info. I could be
completely wrong on all of this as i havent got my head around all the code
yet but this is what i am seeing and the theory matches the results i see in
firefox.

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
[email protected]>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Hi, wondering if you have had a chance to look at my last email?

Keen to know what think...

Thanks

Andrew

On Fri, Aug 12, 2011 at 4:52 PM, ferd <
[email protected]>wrote:

I have doubts on the idea of 'any binary string'. The current test suite
does have these. Here's a simple counter-example:

1>
socketio_data:decode(#msg{content=socketio_data:encode(#msg{content=[{<<"hi">>,<<"there
"bud\"dy!">>}], json=true})}).
[#msg{content = [{<<"hi">>,<<"there "bud\"dy!">>}],
json = true,length = 0}]

The decoded copy is the same as the encoded copy, same length, escaping of
quotation marks in binaries or not. This is why I'm more or less wondering
whether there were unicode sequences in there, which might mess up the
length of the string, or the original data structure. It doesn't seem like
escaping is the actual problem.

There's a funny thing that you made me discover though:

2> socketio_data:encode(#msg{content=[{<<"hi">>,<<"there "buddy">>}]}).
[126,109,126,49,126,109,126,{<<"hi">>,<<"there "buddy">>}]

Encoding a tuplelist without the 'json' flag in the struct shouldn't work.
The list right there should be caught by the encoder. I'll try to at least
make this more strict. It's purely luck (and wanting to implement something
simple) that the decoder hasn' caught that one.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

Not yet. I'm having a pretty busy week. I know this is pretty important so I'll try to find some free time ASAP.

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Hi, ok np.

FYI here is a test that will demo the issue:

simple_json_test2() ->
Msg = "{"hello":"world"}",
TestMsg = lists:concat(["~m", integer_to_list(length(Msg)), "m~~j",
Msg]),
[X] = decode(#msg{content=TestMsg}),
?assertMatch(#msg{content=[{<<"hello">>,<<"world">>}], json=true}, X).

Thanks

On Wed, Aug 17, 2011 at 12:16 AM, ferd <
[email protected]>wrote:

Not yet. I'm having a pretty busy week. I know this is pretty important so
I'll try to find some free time ASAP.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

Regarding the test case, that's because the correct length isn't 17. See:

20> socketio_data:encode(#msg{content=[{<<"hello">>,<<"world">>}], json=true}).
"~m~20~m~~j~{\"hello\":\"world\"}"

The 3 characters difference is due to the j framing, which counts towards the total length, but you didn't count it for the test. See https://github.com/LearnBoost/socket.io/blob/0.6.17/lib/socket.io/utils.js#L26 for the correct JS implementation as of 0.6.x.

Using the encoding functions we provide to create your messages seems to work.

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

Note being, I still can't produce any of these problems you had when I start by submitting Erlang Data structures rather than a JSON string and then trying to shove it in the app. Are you using the interface functions or just submitting the JSON?

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

k, i guess i am not communication this properly. Hopefully this example
using real data will help:

Here is the term i trying to send via socket.io

[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function

"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){
this.base._onMessage(JSON.parse(message.substr(3)));
} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works fine
when its websockets.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <
[email protected]>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions or
just submitting the JSON?

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
[email protected]>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.
"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

[email protected]>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string and
then trying to shove it in the app. Are you using the interface functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

yes, but in firefox running jsonp-polling here is what the browser receives
for that same msg

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_

As you can see the backslashes are present....and the string terminates at
the exact same length as what the erlang shell returns for the length of the
json string....

Firefox throws the following error, as it cannot obviously parse the
truncated string...

JSON.parse
[Break On This Error] this.base._onMessage(JSON.parse(message.substr(3)));
socket.io.js (line 199)

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
[email protected]>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
[email protected]>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

[email protected]>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Sorry, just noticed the bottom half of your email....which correlates with
what i just send

So ignore that please. Glad to see that you are able to reproduce the
problem

On Wed, Aug 17, 2011 at 7:47 PM, ferd <
[email protected]>wrote:

On Wed, Aug 17, 2011 at 2:22 PM, andymck <
[email protected]>wrote:

k, i guess i am not communication this properly. Hopefully this example

using real data will help:

Here is the term i trying to send via socket.io


[[{<<"action">>,<<"feedback">>},
{<<"payload">>,
[{<<"context">>,<<"order_ack">>},
{<<"report_id">>,90},
{<<"report_type">>,<<"new">>},
{<<"order_id">>,76},
{<<"user_id">>,<<"stig">>},
{<<"client_order_id">>,6},
{<<"client_head_id">>,6},
{<<"order_type">>,<<"new">>},
{<<"status">>,<<"new">>},
{<<"fill_type">>,<<"day">>},
{<<"book_id">>,<<"usd_rate_outright_2">>},
{<<"side">>,<<"bid">>},
{<<"price">>,9000},
{<<"order_qty">>,120},
{<<"head_qty">>,120},
{<<"display_qty">>,<<"undefined">>},
{<<"fill_qty">>,0},
{<<"left_qty">>,120},
{<<"child_id">>,<<"undefined">>},
{<<"head_id">>,76}]}]]

Here is the resulting JSON returned from the encode function


"[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

Now if you run length() from an erlang shell on that JSON string you will
see that it results in a length of 396 characters, whereas in fact the
length should be 459

The length is 396, including the j frame, yes. The \ you see are
_not_something you want to count. They're only a visual representation
in the
shell for you:

30> "m396m~~j["++[Char0,Char1,Char2|_] = JSON.

"m396m~~j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty":0,"left_qty":120,"child_id":"undefined","head_id":76}}]"

31> {Char0,Char1,Char2}.
{123,34,97}
32> $".
34

The backslash characters are not part of the string and it's normal for
length/1 not to count them. They're just there so it makes sense when you
read it as a string so you get """ and not """.

When sending this data via socket.io api i.e.

socketio_client:send(Client, #msg{ content=Data, json=true })

If the browser is Firefox the JSON.parse in the clientside script i.e.

Transport.prototype._onMessage = function(message){
if (!this.sessionid){
this.sessionid = message;
this._onConnect();
} else if (message.substr(0, 3) == 'h'){
this._onHeartbeat(message.substr(3));
} else if (message.substr(0, 3) == 'j'){

this.base._onMessage(JSON.parse(message.substr(3)));

} else {
this.base._onMessage(message);
}
},

Fails to parse the string, if I look at the length of the string passed
to
here it is 396 characters.

If you use the above term along with a firefox browser on the client ( i
am
using firefox 3.6.17 on debian ), i would expect you will see the same
result. NOTE: the transport method has to be jsonp-polling. All works
fine
when its websockets.

I'm trying it with jsonp-polling and I am able to reproduce the bug, even
with FF6. That sounds like a good way to start looking into it. The error
is
likely not about the length of the string and how we encode the data, but
likely has to do with the transport or the client. More likely the
transport
module.

I inserted some debugging code in the client and got this as the output:

j[{"action":"feedback","payload":{"context":"order_ack","report_id":90,"report_type":"new","order_id":76,"user_id":"stig","client_order_id":6,"client_head_id":6,"order_type":"new","status":"new","fill_type":"day","book_id":"usd_rate_outright_2","side":"bid","price":9000,"order_qty":120,"head_qty":120,"display_qty":"undefined","fill_qty

There we have backslashes added when they need not to, rather than length
not taking them into account.

We'll start to work from there. Thanks for the report and the useful use
case there.

Andrew

On Wed, Aug 17, 2011 at 6:36 PM, ferd <

[email protected]>wrote:

Note being, I still can't produce any of these problems you had when I
start by submitting Erlang Data structures rather than a JSON string
and
then trying to shove it in the app. Are you using the interface
functions
or
just submitting the JSON?

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:

#57 (comment)

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

I'm out of time to check this for now, but the source of the problem seems to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The problem is that if the previous data was already JSON, then it ends up being escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this quick patch ought to do the job until we find a real solution to the problem:

diff --git a/src/socketio_transport_polling.erl b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup) ->

 send_message_1(Headers, Message, Req, Index, ServerModule) ->
     Headers0 = [{"Content-Type", "text/javascript; charset=UTF-8"}|Headers],
-    Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}])),
+    Message0 = escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict, false}]))),
     Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
     apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
     erlang:cancel_timer(TimerRef),
     NewRef = erlang:start_timer(Time, self(), polling),
     {NewRef, Time}.
+
+escape([]) -> [];
+escape("\\\\\\"++Rest) -> "\\"++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky for user-submitted data. Risks of XSS are to be evaluated. I hope this helps in the meantime. As I said, we'll need to find a good solution, not just this toy thing. I've quickly tested the other transports and they seemed to work fine, in any case.

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

Thanks. The patch works for our usage right now, so thats a big help.

Cheers

On Wed, Aug 17, 2011 at 8:29 PM, ferd <
[email protected]>wrote:

I'm out of time to check this for now, but the source of the problem seems
to be

Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
[{strict, false}]))

on line 298 in socketio_transport_polling.erl (send_message_1/5). The
problem is that if the previous data was already JSON, then it ends up being
escaped just a bit too much.

I don't have much time left today, but if this is critical for you, this
quick patch ought to do the job until we find a real solution to the
problem:

diff --git a/src/socketio_transport_polling.erl
b/src/socketio_transport_polling.erl
index e52e0ce..0b76d03 100644
--- a/src/socketio_transport_polling.erl
+++ b/src/socketio_transport_polling.erl
@@ -296,7 +296,7 @@ send_message(Message, Req, Index, ServerModule, Sup)
->

send_message_1(Headers, Message, Req, Index, ServerModule) ->
    Headers0 = [{"Content-Type", "text/javascript;

charset=UTF-8"}|Headers],

  • Message0 = binary_to_list(jsx:term_to_json(list_to_binary(Message),
    [{strict, false}])),
  • Message0 =
    escape(binary_to_list(jsx:term_to_json(list_to_binary(Message), [{strict,
    false}]))),
    Message1 = "io.JSONP["++Index++"]._(" ++ Message0 ++ ");",
    apply(ServerModule, respond, [Req, 200, Headers0, Message1]).

@@ -325,3 +325,7 @@ reset_duration({TimerRef, Time}) ->
erlang:cancel_timer(TimerRef),
NewRef = erlang:start_timer(Time, self(), polling),
{NewRef, Time}.
+
+escape([]) -> [];
+escape("\"++Rest) -> ""++escape(Rest);
+escape([C|Rest]) -> [C|escape(Rest)].

It will only 'unescape' some characters, so I'd consider it somewhat risky
for user-submitted data. Risks of XSS are to be evaluated. I hope this helps
in the meantime. As I said, we'll need to find a good solution, not just
this toy thing. I've quickly tested the other transports and they seemed to
work fine, in any case.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

Should be fixed and safer as of a146778

andymck, if you could update your version, you'd have something safer with regards to XSS, although likely a bit slower. Safety over speed.

from socket.io-erlang.

andymck avatar andymck commented on July 22, 2024

ferd, thanks for that

But can i ask that you dont use the data i sent you as an example. That
data is from a live product and I dont know if my boss would be that happy
to see it out in the open

Appreciate the fix.

Thanks

Andrew

On Fri, Aug 19, 2011 at 4:56 PM, ferd <
[email protected]>wrote:

Should be fixed and safer as of
a146778

andymck, if you could update your version, you'd have something safer with
regards to XSS, although likely a bit slower. Safety over speed.

Reply to this email directly or view it on GitHub:
#57 (comment)

from socket.io-erlang.

ferd avatar ferd commented on July 22, 2024

That should be fixed in the merge. Let me know if this is enough or are looking to have all traces gone (which I am not sure how to do)

from socket.io-erlang.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.