Comments (10)
The UTF8ONLY token only exists to let clients detect that the server is UTF-8 only. It is backwards compatible with the existing situation where servers that require UTF-8 silently break with clients which are not configured to use UTF-8.
The spec does not specify any required method for handling clients that send non-UTF-8. It's entirely legal under the spec for implementations to transcode any non-UTF-8 to UTF-8 if they want.
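As an illustration of the detection described above, here is a minimal client-side sketch in Python. It assumes the token is advertised in an `RPL_ISUPPORT` (`005`) line; the function name `server_is_utf8_only` is hypothetical, and message tags are not handled.

```python
def server_is_utf8_only(isupport_line: str) -> bool:
    """Return True if an RPL_ISUPPORT (005) line advertises UTF8ONLY.

    Hypothetical helper for illustration; it ignores message tags.
    """
    parts = isupport_line.rstrip("\r\n").split(" ")
    # Skip an optional ":server.name" prefix.
    if parts and parts[0].startswith(":"):
        parts = parts[1:]
    if len(parts) < 3 or parts[0] != "005":
        return False
    tokens = []
    for part in parts[2:]:  # parts[1] is the client's nickname
        if part.startswith(":"):
            break  # start of the trailing human-readable text
        tokens.append(part)
    return "UTF8ONLY" in tokens
```

A client that sees the token could then switch its outgoing encoding to UTF-8 rather than waiting for the server to reject or mangle its messages.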
from ircv3-specifications.
since clients that do not support the specification will happily send non-UTF8 and be disconnected for violating the protocol.
Ideally such servers would always handle these cases without disconnecting the client. However, given the amount of discussion that'd likely result from trying to specify one specific way of handling these cases, I thought it'd be best to just let the servers handle it in whatever way they find appropriate.
To be backwards-compatible, this should be opt-in with a CAP exchange. Once a client has ACK'd UTF8ONLY, it is reasonable to expect it not to send anything that violates the UTF8ONLY specification.
Unfortunately we can't make this opt-in with a CAP, since servers that only accept UTF-8 traffic already exist and they need to transcode, reject, or in some other way handle non-UTF-8 traffic from clients in line with the definition written in the spec anyway.
ideally it would at least say that servers SHOULD not drop the client for sending non-UTF8, though they may ignore individual protocol messages
Definitely makes sense to discourage disconnecting the client outright. I'll play with the language there and try to PR some alternative language that encourages that only as a last resort. Thanks for the note, much appreciated!
The UTF8ONLY token only exists to let clients detect that the server is UTF-8 only. It is backwards compatible with the existing situation where servers that require UTF-8 silently break with clients which are not configured to use UTF-8.
Such servers aren't really following the spirit of the backwards-compatibility principle, so it seems harmful to endorse that approach in IRCv3. The way it appears now it looks like a desired and encouraged part of the specification - ideally it would at least say that servers SHOULD not drop the client for sending non-UTF8, though they may ignore individual protocol messages.
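The "ignore individual protocol messages" option could look roughly like the following server-side sketch in Python. It assumes the server answers with a standard-replies `FAIL` carrying an `INVALID_UTF8` code; the function name `handle_line` and the reply wording are hypothetical, and lines that begin with a prefix or message tags are not handled.

```python
from typing import Optional, Tuple


def handle_line(raw: bytes) -> Tuple[Optional[str], Optional[bytes]]:
    """Return (decoded_line, reply); exactly one of the two is None.

    Invalid UTF-8 causes this one message to be dropped. The connection
    itself is left open.
    """
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError:
        first = raw.split(b" ", 1)[0]
        # Only echo the command back if it is plain ASCII alphanumerics.
        command = first.decode("ascii").upper() if first.isalnum() else "*"
        reply = f"FAIL {command} INVALID_UTF8 :Message rejected, please use UTF-8\r\n"
        return None, reply.encode("utf-8")
```

The design choice here is that an encoding error costs the client one message, not the session, which is the behaviour this comment argues for.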
Such servers aren't really following the spirit of the backwards-compatibility principle, so it seems harmful to endorse that approach in IRCv3.
It's a tricky issue, yeah. I think a compatibility break is inherent in the intent of the specification --- if a server implements the spec, it's never really going to interoperate acceptably with clients that use non-UTF8 encodings (even if you can robustly transcode input, the server will only emit UTF8, likely violating client expectations that the output encoding will agree with the input encoding).
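The mismatch between transcoded input and UTF-8-only output can be shown with a small Python sketch. The Latin-1 fallback and the name `transcode_to_utf8` are assumptions made for illustration; a real server would have to guess or configure the legacy encoding.

```python
def transcode_to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Pass valid UTF-8 through; otherwise reinterpret using the fallback encoding."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


# A legacy client sends Latin-1 bytes; the server transcodes and relays UTF-8.
sent = "caf\u00e9".encode("latin-1")
relayed = transcode_to_utf8(sent)

# The same legacy client decodes the relayed bytes as Latin-1 and sees mojibake.
seen_by_legacy_client = relayed.decode("latin-1")
```

Here `relayed` holds the correct UTF-8 bytes for the original text, but `seen_by_legacy_client` no longer matches what was typed, because the client still assumes that the output encoding agrees with its input encoding.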
I agree with the suggestion that disconnecting the client altogether is unnecessarily aggressive and should probably be deprecated. (From the comment history on #432, it sounds like we were exploring it as the best way to get the end user's attention.)
Bumping this issue a year later: I agree that disconnecting a client over UTF-8 seems heavy-handed, yet the UTF8ONLY spec appears to suggest it as an option. I would love to see this language removed.
I think this change gives a more accurate explanation of why this spec exists, and also removes the disconnection language entirely. Please let me know watcha think: https://gist.github.com/DanielOaks/02a60498e4be4ecb7d6be387eecb642a/revisions#diff-014869833613b58c7e37f5208548f4e64d8d0deb465a47d1db21da761158f143=
I think the changes improve the document, and appreciate the removal of the language referencing disconnection as a server option.
I'm OK with removing the disconnection language, but I don't like the other changes.
Only allowing this encoding breaks compatibility with the IRC protocol as written
Is this true? I've always thought of UTF8ONLY as being an example of a server's ability to impose a content moderation policy. In this case, non-UTF8 "payloads" (final parameters to PRIVMSG, NOTICE, USER, TOPIC, etc.) are being disallowed.
Only allowing this encoding breaks compatibility with the IRC protocol as written
Is this true? I've always thought of UTF8ONLY as being an example of a server's ability to impose a content moderation policy. In this case, non-UTF8 "payloads" (final parameters to PRIVMSG, NOTICE, USER, TOPIC, etc.) are being disallowed.
Depends on your view of the protocol I guess. Some do see disallowing that as a protocol break, some responses to non-UTF-8 content (e.g. disconnecting the client) would prolly classify as a protocol break, and some don't see it as a protocol break.
I guess in my view of that sentence, I'm kind of conflating the 'decode everything as UTF-8' approach that some software does as not following the 'traditional' treat-everything-as-octets-and-bytes direction, but I guess the token/stdreplies code themselves doesn't necessarily mean that 🤷
some don't see it as a protocol break.
Put me in this camp :-)
I found a better way to phrase my objection: the current spec language implies that non-UTF8 is legacy and UTF8 is preferred. I like this implication and I want to keep it.