Comments (15)

yrashk commented on August 24, 2024

@ferd, any comments?

from socket.io-erlang.

ferd commented on August 24, 2024

I don't know what the original code seemed to be or why the output fails. Trying this:

8> socketio_data:encode(#msg{content=[126,109,126,50,49,126,109,126,126,106,126,123,34, 116,101,120,116,34,58,34,209,139,209,132,208,178,08,176,208,178,209,139,208,176,34,125], json=true}).   
"~m~139~m~~j~[126,109,126,50,49,126,109,126,126,106,126,123,34,116,101,120,116,34,58,34,209,139,209,132,208,178,8,176,208,178,209,139,208,176,34,125]"

Yields valid output when encoding, which tells me the conversions we build ourselves aren't an issue. However, it appears that the string itself has some invalid characters when we try to input it. A valid JSON string representation stops at character 21-22 of the string:

23> lists:sublist([126,109,126,50,49,126,109,126,126,106,126,123,34, 116,101,120,116,34,58,34,209,139,209,132,208,178,08,176,208,178,209,139,208,176,34,125], 21).
"~m~21~m~~j~{\"text\":\"Ñ"

However, if we print it and let the Erlang drivers figure out whatever encoding:

73> io:format("~s~n",[[126,109,126,50,49,126,109,126,126,106,126,123,34, 116,101,120,116,34,58,34,209,139,209,132,208,178,08,176,208,178,209,139,208,176,34,125]]).  
~m~21~m~~j~{"text":"Ñ�Ñ�в^H°Ð²Ñ�Ð

has a valid string representation (in my shell; it looks broken here). To me, this seems to point to a problem within jsx's JSON conversion when reading from unicode strings, given that the shell is able to figure something out that jsx doesn't seem able to. There is a big HOWEVER, though.

The unicode string given is quite a mess. I'm wondering if it's possible that some of the characters given are control characters (I strongly suspect some are). Such control characters must be encoded using \u and 4 hex digits. So if you're having control characters through that unicode string, it's quite normal for jsx to die on it -- they need to be properly escaped. dvarkin, can you check to see if you have any of these?
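
For reference, a minimal sketch of that check, assuming the string is a flat list of integers as in the shell sessions above. The module and function names (`json_ctrl_sketch`, `has_control/1`, `escape_control/1`) are made up for illustration and are not part of socket.io-erlang or jsx. It only covers the U+0000..U+001F range; the quote and backslash escapes that JSON also requires are left out:

```erlang
-module(json_ctrl_sketch).
-export([has_control/1, escape_control/1]).

%% True if any element is a character that JSON requires to be
%% escaped inside a string (U+0000..U+001F).
has_control(Chars) ->
    lists:any(fun(C) -> C < 16#20 end, Chars).

%% Replace each such control character with its \uXXXX escape;
%% every other element is passed through unchanged.
escape_control(Chars) ->
    lists:flatmap(fun escape_char/1, Chars).

escape_char(C) when C < 16#20 ->
    %% ~4.16.0B prints the integer in base 16, zero-padded to 4 digits.
    lists:flatten(io_lib:format("\\u~4.16.0B", [C]));
escape_char(C) ->
    [C].
```

With the list quoted above, `has_control/1` returns true because of the lone 8 (the `08` in the input), which is the backspace the shell printed as ^H.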

dvarkin commented on August 24, 2024

Hi! Thanks for the reply.

This JSON doesn't have any control symbols, only some Cyrillic.

I think the problem is an incorrectly calculated size for the unicode string, which socketio_data:json/2 receives as its "Length" argument.

Maybe this code has an incorrect match:

header(?FRAME ++ Rest=[_|_], Acc)->
    Length = list_to_integer(lists:reverse(Acc)),
    body(Length, Rest);

%%% I have m21
header([N|Rest], Acc) when N >= $0, N =< $9 ->
    header(Rest, [N|Acc]).

I don't know the best solution in this case, maybe working with unicode as binary?
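
To illustrate the mismatch suspected here, a small sketch that compares the byte count and the character (code point) count of a UTF-8 binary. The module name `length_sketch` and the function `lengths/1` are hypothetical. It assumes the input is valid UTF-8; otherwise `unicode:characters_to_list/2` returns an error tuple and the call crashes:

```erlang
-module(length_sketch).
-export([lengths/1]).

%% Returns {ByteCount, CodePointCount} for a UTF-8 encoded binary.
lengths(Utf8) when is_binary(Utf8) ->
    CodePoints = unicode:characters_to_list(Utf8, utf8),
    {byte_size(Utf8), length(CodePoints)}.
```

For the first three Cyrillic letters of the message above, `lengths(<<209,139,209,132,208,178>>)` gives `{6, 3}`: six bytes for three characters. A Length counted in characters but applied to a list of bytes would cut such a string short.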

ferd commented on August 24, 2024

That could well be related. Given the way code points are seen, it sounds like a very good candidate for the issue, in which case we'd need to read a binary character by character (pattern matching on a utf8 type).
We'd need to test it a good bit more just to make sure.

Sadly, everything related to unicode in Erlang has to be switched to binaries. The server (misultin) as it is uses lists by default and it then becomes rather unclear what we should do. For JSON strings, we can likely switch things to binary and accumulate data until we have the right length. For regular text though, we'd have no way to know what encoding the user had and then we risk interpreting them in a type they didn't intend.

We could force utf8 by default, but I'm not sure it's the best of ideas. In any case, we need to resolve this. Any opinions?

Sidenote: the fix dvarkin posted doesn't seem safe to me as it drops the parsed message size. The issue is that the socket.io client will sometimes concatenate two messages (m3abcm3def or something like that), but it seems to drop the length of the message header and instead just pick whatever's left after that. If two or more messages are appended together, the fix will break the app.
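
A rough sketch of how decoding could look on binaries, assuming the payload is UTF-8 and that the length in the header counts characters rather than bytes (both are assumptions, not something confirmed in this thread). The module `frame_sketch` and its functions are hypothetical and are not the socketio_data code. It keeps the header length and uses it to split concatenated messages; malformed input simply crashes with a function_clause:

```erlang
-module(frame_sketch).
-export([decode/1]).

-define(FRAME, "~m~").

%% Decode every "~m~Len~m~Body" frame found in a UTF-8 binary.
%% Len is interpreted as a number of code points, not bytes.
decode(Bin) when is_binary(Bin) ->
    decode(Bin, []).

decode(<<>>, Acc) ->
    lists:reverse(Acc);
decode(<<?FRAME, Rest/binary>>, Acc) ->
    {Len, AfterLen} = header(Rest, 0),
    {Body, Tail} = take(Len, AfterLen, <<>>),
    decode(Tail, [Body | Acc]).

%% Accumulate decimal digits until the closing "~m~" delimiter.
header(<<?FRAME, Rest/binary>>, Len) ->
    {Len, Rest};
header(<<D, Rest/binary>>, Len) when D >= $0, D =< $9 ->
    header(Rest, Len * 10 + (D - $0)).

%% Take exactly N code points off the front of the binary.
take(0, Rest, Body) ->
    {Body, Rest};
take(N, <<C/utf8, Rest/binary>>, Body) when N > 0 ->
    take(N - 1, Rest, <<Body/binary, C/utf8>>).
```

`decode(<<"~m~3~m~abc~m~3~m~def">>)` returns `[<<"abc">>, <<"def">>]`, and a two-byte character such as `<<209,139>>` counts as one toward the length.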

dvarkin commented on August 24, 2024

Yes, it's not a fix. Don't use it! We are using quoted-printable, but this is unsafe too, because of the client side.

ferd commented on August 24, 2024

I'll try to take some time during my lunch break to generate a good failing test case for the decoding to try and fix the length issue. I'll be waiting for comments, but in the meantime, it feels that assuming UTF-8 is the safest option -- it'll play well with ASCII and ISO-8859-1 users, although it might be problematic for the UCS-2 and UCS-4 (Windows, Python), UTF-16, UTF-32, etc. We'll at least always be safe with JSON parsing.

ferd commented on August 24, 2024

I studied the problem a bit. The character that causes problems is a unicode control character (U+008B => 139). There are dozens of them, and they can be inserted before, between, and after any set of characters to modify them into a single visible character.

To me, this would be the reason as to why the length of the string gets messed up.

If I just take the subsequence [209,139], I get the following reasoning:

The resulting character under a binary representation (ы) is the direct two list entries ([209,139]) as a 2-byte binary (<<209,139>>). However, if I take [209,139] as their literal meaning, it is the "Ñ" character plus a hidden value.

This means that while in both cases I need to get somewhat very clever to solve the issue, I first have to figure out if the faulty input you have was meant to be [209, 139] vs. <<209,139>>. Dvarkin, if you could tell me which one of the two it is, I can know the format in which we do things (choosing between unicode:characters_to_list(list_to_binary(Str)) vs. leaving it as it is vs. list_to_binary(Str) vs. unicode:characters_to_binary(Str), etc.) and try to fix stuff.
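
The two readings can be compared directly. A small sketch, with hypothetical module and function names, using only the standard `unicode` module:

```erlang
-module(interp_sketch).
-export([as_utf8_bytes/1, as_code_points/1]).

%% Reading 1: the list holds UTF-8 *bytes*; decode them to code points.
as_utf8_bytes(List) ->
    unicode:characters_to_list(list_to_binary(List), utf8).

%% Reading 2: the list already holds *code points*; encode them as UTF-8.
as_code_points(List) ->
    unicode:characters_to_binary(List, unicode, utf8).
```

`as_utf8_bytes([209,139])` gives `[1099]`, the single code point for ы, while `as_code_points([209,139])` gives `<<195,145,194,139>>`, the UTF-8 encoding of Ñ followed by code point 139.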

Then I'll need to figure out how the hell we're supposed to find the length of a string based on its control characters. Usually, languages have functions for that, but it seems we don't in Erlang. I'll have to see whatever JavaScript does and try to re-implement it here, given that the length given in the serialized string is likely based on that. This is going to prove challenging for bare messages, but I could just let jsx do some stream parsing if I find a JSON structure, which should be way, way easier.
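
One fact that may help: JavaScript's String length counts UTF-16 code units, so that is presumably what the client puts in the header (an assumption about the client, not verified here). A minimal sketch of the same count in Erlang, for a list of code points; `js_length_sketch` and `js_length/1` are made-up names:

```erlang
-module(js_length_sketch).
-export([js_length/1]).

%% Number of UTF-16 code units needed for a list of code points.
%% Characters above U+FFFF take two units (a surrogate pair).
js_length(CodePoints) ->
    Utf16 = unicode:characters_to_binary(CodePoints, unicode, utf16),
    byte_size(Utf16) div 2.
```

For characters in the Basic Multilingual Plane, which includes Cyrillic, this is the same as the number of code points; it only differs for characters above U+FFFF.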

I'm not sure what the performance impacts might be.

ferd commented on August 24, 2024

Hi @dvarkin, Check out my branch (https://github.com/ferd/socket.io-erlang) (or wait until it is merged) to see if the changes I brought fix your issues. Hopefully they will.

sinnus commented on August 24, 2024

Hi @ferd, I tried your fix by sending the following message via the xhr-polling transport:
socket.send("Привет!");
and got a big exception stack trace.

yrashk commented on August 24, 2024

@sinnus, can you share that stack trace, please?

sinnus commented on August 24, 2024

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
** Generic server <0.86.0> terminating
** Last message in was {'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}
** When Server state == {state,"73a69a1d-8ce3-4843-ba1d-ed818d1bab78",[],
socketio_http_misultin,
{'xhr-polling',connected},
{misultin_req,
{req,#Port<0.5033>,http,
{127,0,0,1},
47761,undefined,keep_alive,undefined,
{1,1},
'GET',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/1317025455364"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"}],
false,<<>>},
<0.83.0>},
{<0.103.0>,#Ref<0.0.0.1534>},
undefined,
{#Ref<0.0.0.1535>,20000},
8000,<0.87.0>,<0.57.0>}
** Reason for termination ==
** {function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}

=CRASH REPORT==== 26-Sep-2011::12:24:16 ===
crasher:
initial call: socketio_transport_polling:init/1
pid: <0.86.0>
registered_name: []
exception exit: {function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
in function gen_server:terminate/6
ancestors: [socketio_client_sup,socketio_listener_sup,
socketio_listener_sup_sup,socketio_sup,<0.54.0>]
messages: []
links: [<0.74.0>,<0.87.0>,<0.60.0>,#Port<0.5033>]
dictionary: [{{grapheme_break_property,grapheme_break_property},
#Fun<ux_unidata_filelist.2.99711654>}]
trap_exit: true
status: running
heap_size: 46368
stack_size: 24
reductions: 86376
neighbours:

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
** Generic server <0.60.0> terminating
** Last message in was {request,'POST',
["send","73a69a1d-8ce3-4843-ba1d-ed818d1bab78",
"xhr-polling","socket.io"],
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}
** When Server state == {state,demo_erl__escript__1317__25445__902576,53276,
<0.58.0>,<0.57.0>,#Ref<0.0.0.158>,
socketio_http_misultin,
["socket.io"]}
** Reason for termination ==
** {{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.86.0>,
{'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}]}}

=CRASH REPORT==== 26-Sep-2011::12:24:16 ===
crasher:
initial call: socketio_http:init/1
pid: <0.60.0>
registered_name: []
exception exit: {{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.86.0>,
{'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}]}}
in function gen_server:terminate/6
ancestors: [socketio_listener_sup,socketio_listener_sup_sup,
socketio_sup,<0.54.0>]
messages: [{'EXIT',<0.86.0>,
{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}}]
links: [<0.61.0>,<0.57.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 4181
stack_size: 24
reductions: 558
neighbours:

=CRASH REPORT==== 26-Sep-2011::12:24:16 ===
crasher:
initial call: gen_event:init_it/6
pid: <0.87.0>
registered_name: []
exception exit: {function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
in function gen_event:terminate_server/4
ancestors: [<0.86.0>,socketio_client_sup,socketio_listener_sup,
socketio_listener_sup_sup,socketio_sup,<0.54.0>]
messages: []
links: []
dictionary: []
trap_exit: true
status: running
heap_size: 377
stack_size: 24
reductions: 136
neighbours:

=SUPERVISOR REPORT==== 26-Sep-2011::12:24:16 ===
Supervisor: {local,socketio_listener_sup}
Context: child_terminated
Reason: {{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.86.0>,
{'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}]}}
Offender: [{pid,<0.60.0>},
{name,socketio_http},
{mfargs,
{socketio_http,start_link,
[socketio_http_misultin,7878,
["socket.io"],
undefined,demo_erl__escript__1317__25445__902576,
<0.57.0>]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,worker}]

=SUPERVISOR REPORT==== 26-Sep-2011::12:24:16 ===
Supervisor: {local,socketio_client_sup}
Context: child_terminated
Reason: {function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
Offender: [{pid,<0.86.0>},
{name,socketio_client},
{mfargs,
{socketio_client,start_link,
[<0.57.0>,socketio_transport_polling,
"73a69a1d-8ce3-4843-ba1d-ed818d1bab78",
socketio_http_misultin,
{'xhr-polling',
{misultin_req,
{req,#Port<0.5033>,http,
{127,0,0,1},
47761,undefined,keep_alive,undefined,
{1,1},
'GET',
{abs_path,"/socket.io/xhr-polling//1317025455177"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"}],
false,<<>>},
<0.83.0>}}]}},
{restart_type,transient},
{shutdown,5000},
{child_type,worker}]

=PROGRESS REPORT==== 26-Sep-2011::12:24:16 ===
supervisor: {<0.109.0>,misultin}
started: [{pid,<0.111.0>},
{name,server},
{mfargs,{misultin_server,start_link,[{1024}]}},
{restart_type,permanent},
{shutdown,60000},
{child_type,worker}]

=CRASH REPORT==== 26-Sep-2011::12:24:16 ===
crasher:
initial call: supervisor:misultin_acceptors_sup/1
pid: <0.112.0>
registered_name: []
exception exit: {bad_return,
{misultin_acceptors_sup,init,{error,eaddrinuse}}}
in function gen_server:init_it/6
ancestors: [<0.109.0>,<0.107.0>,socketio_listener_sup,
socketio_listener_sup_sup,socketio_sup,<0.54.0>]
messages: []
links: [<0.109.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 377
stack_size: 24
reductions: 316
neighbours:

=SUPERVISOR REPORT==== 26-Sep-2011::12:24:16 ===
Supervisor: {<0.109.0>,misultin}
Context: start_error
Reason: {bad_return,{misultin_acceptors_sup,init,{error,eaddrinuse}}}
Offender: [{pid,undefined},
{name,acceptors_sup},
{mfargs,
{misultin_acceptors_sup,start_link,
[<0.109.0>,7878,
[binary,
{packet,raw},
{ip,{0,0,0,0}},
{reuseaddr,true},
{active,false},
{backlog,128},
inet],
10,30000,http,
{custom_opts,4194304,2000,false,
#Fun<socketio_http_misultin.0.67680361>,true,
#Fun<socketio_http_misultin.1.39101188>,false,
false,false}]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]

=CRASH REPORT==== 26-Sep-2011::12:24:16 ===
crasher:
initial call: socketio_http:init/1
pid: <0.107.0>
registered_name: []
exception exit: {{badmatch,{error,shutdown}},
[{socketio_http,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
in function gen_server:init_it/6
ancestors: [socketio_listener_sup,socketio_listener_sup_sup,
socketio_sup,<0.54.0>]
messages: [{'EXIT',<0.109.0>,shutdown}]
links: [<0.57.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 233
stack_size: 24
reductions: 156
neighbours:

=SUPERVISOR REPORT==== 26-Sep-2011::12:24:16 ===
Supervisor: {local,socketio_listener_sup}
Context: start_error
Reason: {{badmatch,{error,shutdown}},
[{socketio_http,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
Offender: [{pid,<0.60.0>},
{name,socketio_http},
{mfargs,
{socketio_http,start_link,
[socketio_http_misultin,7878,
["socket.io"],
undefined,demo_erl__escript__1317__25445__902576,
<0.57.0>]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,worker}]

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
module: misultin_http
line: 543
error in custom loop: {{{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.86.0>,
{'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}]}},
{gen_server,call,
[<0.60.0>,
{request,'GET',
["1317025455364",
"73a69a1d-8ce3-4843-ba1d-ed818d1bab78",
"xhr-polling","socket.io"],
{misultin_req,
{req,#Port<0.5033>,http,
{127,0,0,1},
47761,undefined,keep_alive,undefined,
{1,1},
'GET',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/1317025455364"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"}],
false,<<>>},
<0.83.0>}},
infinity]}} serving request: {req,#Port<0.5033>,
http,
{127,0,0,1},
47761,undefined,
keep_alive,undefined,
{1,1},
'GET',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/1317025455364"},
[],
[{'Host',
"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',
"en-us,en;q=0.5"},
{'Accept-Encoding',
"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',
"keep-alive"},
{'Cookie',
"socketio=xhr-polling"}],
false,<<>>}

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
module: misultin_socket
line: 106
error sending data: closed

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
module: misultin_server
line: 221
http process <0.83.0> has died with reason: kill, removing from references of open connections and websockets

=ERROR REPORT==== 26-Sep-2011::12:24:16 ===
module: misultin_http
line: 543
error in custom loop: {{{function_clause,
[{socketio_data,header,[[178,208,181,209,130,33]]},
{socketio_data,message,3},
{socketio_transport_polling,
'-handle_call/3-lc$^0/1-0-',1},
{socketio_transport_polling,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]},
{gen_server,call,
[<0.86.0>,
{'xhr-polling',data,
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}}]}},
{gen_server,call,
[<0.60.0>,
{request,'POST',
["send","73a69a1d-8ce3-4843-ba1d-ed818d1bab78",
"xhr-polling","socket.io"],
{misultin_req,
{req,#Port<0.5379>,http,
{127,0,0,1},
47763,undefined,keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',"en-us,en;q=0.5"},
{'Accept-Encoding',"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',"keep-alive"},
{'Cookie',"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>},
<0.105.0>}},
infinity]}} serving request: {req,#Port<0.5379>,
http,
{127,0,0,1},
47763,undefined,
keep_alive,59,
{1,1},
'POST',
{abs_path,
"/socket.io/xhr-polling/73a69a1d-8ce3-4843-ba1d-ed818d1bab78/send"},
[],
[{'Host',
"localhost:7878"},
{'User-Agent',
"Mozilla/5.0 (X11; Linux x86_64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2"},
{'Accept',
"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"},
{'Accept-Language',
"en-us,en;q=0.5"},
{'Accept-Encoding',
"gzip, deflate"},
{'Accept-Charset',
"windows-1251,utf-8;q=0.7,*;q=0.7"},
{'Connection',
"keep-alive"},
{'Cookie',
"socketio=xhr-polling"},
{'Content-Type',
"application/x-www-form-urlencoded; charset=utf-8"},
{'Cache-Control',
"no-cache"},
{'Pragma',"no-cache"},
{'Content-Length',
"59"}],
false,
<<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>}

sinnus commented on August 24, 2024

My hotfix for the issue:
sinnus@ec4b305 (sorry, I forgot to remove the logger :)
I think the best solution would be to write our own misultin_req:parse_post function.

ferd commented on August 24, 2024

OK, looking at the stack trace, I see: <<"data=%7Em%7E7%7Em%7E%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82%21">>. Decoding it with misultin, it returns:

3> misultin_utility:parse_qs(S).
[{"data",
   [126,109,126,55,126,109,126,208,159,209,128,208,184,208,178,208,181,209,130,33]}]
4> io:format("~ts~n",[[126,109,126,55,126,109,126,208,159,209,128,208,184,208,178,208,181,209,130,33]]).              
~m~7~m~��иве�!
ok
5> io:format("~ts~n", [list_to_binary([126,109,126,55,126,109,126,208,159,209,128,208,184,208,178,208,181,209,130,33])]).
~m~7~m~Привет!
ok

Socket.io-erlang handles unicode correctly. The problem has to do with how misultin does it. The issue, as far as I can see, is that the unicode is parsed as a binary, which turns all code points into bytes (0..255). Then the binaries are blindly turned into lists, but they don't have the same unicode format in Erlang -- you actually need to convert them to code points greater than 255, and then output them with a ~ts combination instead of just ~s.
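
A sketch of the conversion described here, independent of misultin: percent-decode the value into raw bytes first, then decode those bytes as UTF-8 into code points. `qs_sketch` and `decode_value/1` are hypothetical names, not misultin functions, and the UTF-8 assumption comes from the `charset=utf-8` in the request's Content-Type:

```erlang
-module(qs_sketch).
-export([decode_value/1]).

%% Percent-decode a form value into raw bytes, then interpret those
%% bytes as UTF-8 and return a list of code points.
decode_value(Bin) when is_binary(Bin) ->
    unicode:characters_to_list(unescape(Bin, <<>>), utf8).

unescape(<<>>, Acc) ->
    Acc;
unescape(<<$%, H, L, Rest/binary>>, Acc) ->
    %% "%D0" -> 208: two hex digits make one byte.
    Byte = list_to_integer([H, L], 16),
    unescape(Rest, <<Acc/binary, Byte>>);
unescape(<<$+, Rest/binary>>, Acc) ->
    %% In form encoding, "+" stands for a space.
    unescape(Rest, <<Acc/binary, $\s>>);
unescape(<<C, Rest/binary>>, Acc) ->
    unescape(Rest, <<Acc/binary, C>>).
```

Applied to the part after `data=` in the request above, this yields `"~m~7~m~"` followed by the seven code points of Привет!, so the declared length of 7 matches the number of characters again.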

This is covered on our side, but visibly not on Ostinelli's side (misultin). We can likely hot-patch it in the data parser the way you did, but I'll be filing a bug report with Ostinelli to see if he can make misultin right in the first place instead.

Here's the issue on misultin: ostinelli/misultin#61

ferd commented on August 24, 2024

Yes, this is related to how utf-8 is encoded and the size of bytes it uses.
I'm working with Roberto Ostinelli to fix misultin and make sure things are
fine for unicode in general.

sinnus commented on August 24, 2024

To summarize my fixes:
sinnus@ec4b305
sinnus@c5abb9b
sinnus@e4d01df
sinnus@4991ba5
