draft-ietf-masque-h3-datagram's People

Contributors

afrind, bashi, davidschinazi, ekinnear, gloinul, lpardue, martinduke, martinthomson, mikebishop, tfpauly

draft-ietf-masque-h3-datagram's Issues

Many reasons why a message with Datagram-flow-Id could be delayed

Section 6 makes the keen observation that:

   Since the QUIC STREAM frame that contains the "Datagram-Flow-Id"
   header could be lost or reordered, it is possible that an endpoint
   will receive an HTTP/3 datagram with a flow identifier that it does
   not know as it has not yet received the corresponding "Datagram-Flow-
   Id" header.  Endpoints MUST NOT treat that as an error; they MUST
   either silently discard the datagram or buffer it until they receive
   the "Datagram-Flow-Id" header.

This is accurate but, when thinking about HTTP/3 APIs, does not cover all possible reasons that might cause a request message to be processed after the corresponding DATAGRAM frame is available. For instance, the message might be blocked by QPACK, or the implementation might present DATAGRAMS before stream headers.

We could boil the ocean describing all possible reasons, so I think it might be better to shuffle the text to focus on describing the symptom (DATAGRAMS before HTTP messages) and the treatment. We can then give non-exhaustive example(s) as already done.

Remove context identifiers

The capsules that signal use of different datagram contexts (or the absence thereof) assume that there is a generic need for this form of identification. That has not been established. WebTransport certainly doesn't need it.

As this is an end-to-end signal anyway, it is also a protocol-specific signal. All uses of this framework are perfectly capable of doing their own signaling and negotiation. A one-size-fits-all mechanism is more likely a one-size-fits-few mechanism, but it still comes with costs that every user has to bear (see ietf-wg-webtrans/draft-ietf-webtrans-http3#54 for instance).

For instance, if CONNECT-UDP wanted to signal ECN and DSCP, that might be more efficiently achieved with a single byte (2 + 6 bits) rather than a varint. If CONNECT-IP wanted different compression contexts, that might be suited to a varint, but why would this draft be in a position to decide that?
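
To make the single-byte point concrete, here is a rough sketch (purely illustrative, not anything from the draft) of packing a 6-bit DSCP value and a 2-bit ECN codepoint into one octet, mirroring the IP Traffic Class layout:

def pack_dscp_ecn(dscp: int, ecn: int) -> int:
    """Pack a 6-bit DSCP value and a 2-bit ECN codepoint into a single byte
    (DSCP in the high 6 bits, ECN in the low 2 bits)."""
    assert 0 <= dscp < 64 and 0 <= ecn < 4
    return (dscp << 2) | ecn

def unpack_dscp_ecn(octet: int) -> tuple[int, int]:
    """Inverse of pack_dscp_ecn: return (dscp, ecn)."""
    return octet >> 2, octet & 0x3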

As there is no value in an intermediary looking at these context identifiers - I would argue negative utility in the RFC 8558 sense - these should be removed from the protocol.

The only potential value I see in having some sort of signaling here is to explicitly signal the use of DATAGRAM frames associated with a stream. However, as we have agreed to send DATAGRAM frames without negotiation, that too seems like it won't be needed.

Returning flow identifiers

There is a certain advantage to being able to use small flow identifiers, but there are only 32 of the smallest available for any given usage. Would it be possible for a particular usage to reuse a value?

I ask because CONNECT-UDP has a distinct start and end and could possibly take advantage of this. That would have some small, but likely meaningful, impact on efficiency.

reusing flow identifiers is racy

#20 introduced the following text:

The flow identifier allocation service MAY reuse previously retired flow identifiers once they have ascertained that there are no packets with DATAGRAM frames using that flow identifier still in flight.

There are 2 problems with this, in each direction:

  1. In the send-direction, this can only be ascertained in the case of no packet loss, when all packets have been acknowledged. If any packet is not acknowledged, this could be due to the packet actually having been lost, or just reordered or delayed.
  2. While it's possible to ascertain that all sent packets have been acknowledged, this is not possible for packets sent by the peer.

Reusing a flow identifier opens up an attack vector: An attacker could delay the delivery of a packet containing the old flow identifier until it has been retired and reused. This would lead to a reinterpretation of the DATAGRAM contents in a new application context.

Pick mechanism to register context IDs (header vs message)

During the 2021-04 MASQUE Interim, we discussed multiple options for registering context IDs. Without going into the specifics of the encoding, there are two broad classes of solutions:

  • "Header" design
    • Once-at-setup registration
    • This happens during the HTTP request/response
    • Simplest solution is to use an HTTP header such as Datagram-Flow-Id (encoding TBD)
    • Example use-case: CONNECT-UDP without extensions
  • "Message" design
    • Mid-stream registration
    • This can happen at any point during the lifetime of the request stream
    • This would involve some sort of "register" message (encoding TBD, see separate issue)
    • Example use-case: CONNECT-IP on-the-fly compression (this can't be done at tunnel setup time because the bits to be compressed aren't always known at that time)

It's possible to implement once-at-setup registration using the Message design, but it isn't possible to implement mid-stream registration. Therefore I think we should go with the Message design. It would also be possible to implement both but I don't think that provides much value.

Clarify meaning of SETTINGS parameter

In section 6, it says "A value of 0 indicates that this mechanism is not supported." The draft actually has two mechanisms: the DATAGRAM frame and the CAPSULE frame. It would be useful to clarify that the parameter only refers to support for the DATAGRAM frame.

2-layer design

During the 2021-04 MASQUE Interim, we discussed a design proposal from @bemasc to replace the one-layer flow ID design with a two-layer design. This design replaces the flow ID with two numbers: a stream ID and a context ID. The contents of the QUIC DATAGRAM frame would now look like:

HTTP/3 DATAGRAM Frame {
  Stream ID (i),
  Context ID (i),
  HTTP/3 Datagram Payload (..),
}
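
For concreteness, here is a minimal sketch of building that frame payload, assuming both identifiers are QUIC variable-length integers (RFC 9000, Section 16); the helper names are illustrative only:

def encode_varint(value: int) -> bytes:
    """Encode value as a QUIC variable-length integer (RFC 9000, Section 16)."""
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | (0x1 << 14)).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | (0x2 << 30)).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | (0x3 << 62)).to_bytes(8, "big")
    raise ValueError("value does not fit in a varint")

def build_h3_datagram_payload(stream_id: int, context_id: int, payload: bytes) -> bytes:
    """Content of the QUIC DATAGRAM frame under the two-layer proposal:
    Stream ID, then Context ID, then the HTTP/3 Datagram Payload."""
    return encode_varint(stream_id) + encode_varint(context_id) + payload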

While the flow ID in the draft was per-hop, the proposal has better separation:

  • the stream ID is per-hop, it maps to an HTTP request
  • the context ID is end-to-end, it maps to context information inside that request

Intermediaries now only look at stream IDs and can be blissfully ignorant of context IDs.

In the room at the 2021-04 MASQUE Interim, multiple participants spoke in support of this design and no one raised objections. This issue exists to ensure that we have consensus on this change - please speak up if you disagree.

Merge REGISTER_DATAGRAM_CONTEXT and REGISTER_DATAGRAM_NO_CONTEXT capsules

There are two possible ways to encode the REGISTER_DATAGRAM capsule: using two different capsules (current variant), or using a single capsule with a field indicating whether the context is present. While those two are largely isomorphic, I feel like the latter would help us avoid situations where libraries or intermediaries only support one but not the other.

"0-length" DATAGRAMS

Since DATAGRAMS can have a 0-length payload, and H3 DATAGRAM mandates a flow ID, something should be said about how an endpoint treats this.

My first reaction is that this is a connection error of type H3_FRAME_ERROR.

Remove the ECN example in Datagram-Flow-Id Header Field Definition

Currently, the Datagram-Flow-Id Header Field Definition section contains examples of the Datagram-Flow-Id header, which includes a 4-flow mechanism for conveying ECN information:

Datagram-Flow-Id = 42, 44; ecn-ect0, 46; ecn-ect1, 48; ecn-ce

This example seems to be tripping people up, myself included: it attempts to solve a problem of in-band signaling of non-UDP-payload bits by providing these flows, and it draws attention to that problem rather than to the capability being described.

I propose we remove this example until we have a better one to replace it. If we cannot find a good example, that also lends some evidence that perhaps we don't need the feature of creating multiple Datagram-Flow-Id associations in the same request.

Add examples

I feel the text could use some concrete examples of use cases where application-layer demultiplexing would be useful (or, perhaps more generally, how unreliable data over HTTP/3 would be useful).

Can HTTP server push use h3-datagram?

I would prefer that h3-datagram cannot use server push, because I believe it adds extra complexity and there are no clear use cases.

Additionally, server push is an optimization which allows a server to push a resource up to 1 RTT prior to the client's (expected) request for the resource. h3-datagram is much more analogous to a CONNECT, which is neither idempotent nor cacheable, so I'd assume a client would drop any received datagrams, rendering the optimization useless.

Whatever we decide, we should definitely clarify this. This issue was motivated by #42 (comment)

Describe document structure and/or terms in intro section

There are now several different concepts to wrangle in this document. Reading from the top, we first tackle multiplexing (the major motivation for the document), then the frame, then capsules. Capsules underpin the mechanics of multiplexing. I don't think we can fix this chicken-and-egg problem, but adding a brief overview of the concepts and terms before they are tackled in the prose could help readability.

Quarter Stream ID range is smaller than Stream ID range

Stating the obvious here, but we're using a value whose encodable range is larger than its semantically legal range. RFC 9000 has a similar caveat for MAX_STREAMS: https://www.rfc-editor.org/rfc/rfc9000.html#section-19.11

To address this, the suggestion is to add something like

The largest legal stream ID is 2^62 - 1, so this value cannot exceed (2^62-1) / 4. Receipt of a frame that includes a larger value MUST be treated as a connection error of type FRAME_ENCODING_ERROR
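
A minimal sketch of the receiver-side check that the suggested text would imply (the names are illustrative only):

MAX_STREAM_ID = (1 << 62) - 1  # largest legal QUIC stream ID
MAX_QUARTER_STREAM_ID = MAX_STREAM_ID // 4

def check_quarter_stream_id(value: int) -> None:
    """Reject values outside the semantically legal range; per the suggested
    text this would be a connection error of type FRAME_ENCODING_ERROR."""
    if value > MAX_QUARTER_STREAM_ID:
        raise ValueError("Quarter Stream ID exceeds (2^62 - 1) / 4")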

Are context IDs "negotiated"?

It doesn't appear that the peer has any say in the allocation of context IDs. If not, they're not really "negotiated." Perhaps better to say that they are "declared" or merge with "allocated."

Running out of Flow IDs

https://tools.ietf.org/html/draft-schinazi-masque-h3-datagram-01#section-6 states:

   Note that integer structured fields can only encode values up to
   10^15-1, therefore the maximum possible value of the "Datagram-Flow-
   Id" header is lower then the theoretical maximum value of a flow
   identifier which is 2^62-1 due to the QUIC variable length integer
   encoding.  If the flow identifier allocation service of an endpoint
   runs out of values lower than 10^15-1, the endpoint MUST treat it as
   a connection error of type H3_ID_ERROR.

The text is a little ambiguous. I presume the intention is that the endpoint that is allocating flows should do what HTTP/3 defines as Immediate Application Closure? This doesn't cover the case where an endpoint receives a Datagram Flow ID greater than 10^15-1 (either in a request or a DATAGRAM).

I think it might help to separate these cases. Receiving a value greater than 10^15-1 is always a connection error of type H3_ID_ERROR.
But I'm not so sure about the sender case: if you really are flow-ID exhausted, perhaps it is nicer to just spin up a new connection and leave the current one as is. We might recommend that endpoints initiate an HTTP/3 graceful close followed by an application close - but reusing H3_ID_ERROR for that case seems a bit overkill.

Make Capsule Frame before Headers an H3_FRAME_UNEXPECTED connection error

The HTTP/3 RFC states

Receipt of an invalid sequence of frames MUST be treated
as a connection error of type H3_FRAME_UNEXPECTED;

The current draft instead recommends closing the stream rather than the connection, and using the generic H3_GENERAL_PROTOCOL_ERROR.

According to https://datatracker.ietf.org/doc/html/draft-ietf-quic-http-34#section-8.1 H3_GENERAL_PROTOCOL_ERROR is to be used when a more specific error code isn't available, but here we clearly have a more specific error available.

Also, what is the rationale for closing just the stream and not the whole connection?

Furthermore, a more general question:
right now the above recommendation only exists for REGISTER_DATAGRAM_NO_CONTEXT and REGISTER_DATAGRAM_CONTEXT capsule types. Would it be fair to say that no Capsule frame is allowed before headers instead?

Consider splitting DATAGRAM Capsule into two types

The DATAGRAM capsule has an optional Context ID field. The presence of this field is based on state held in stream objects, not signalled on the wire itself.

IME it's rare that the parsing of an HTTP/3 frame (in this case, the CAPSULE frame that holds the capsule) is dependent on state in this way.

So the suggestion is to define DATAGRAM/DATAGRAM_NO_CONTEXT capsule types which would allow stateless parsing of frames. After a whole frame is read, receivers can then validate ordering rules such as REGISTER_CONTEXT followed by a DATAGRAM capsule, and REGISTER_NO_CONTEXT followed by a DATAGRAM_NO_CONTEXT capsule.

This would also allow an early shortcut exit where the CAPSULE frame length can be used to validate that the DATAGRAM capsule is suitably large.
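
To make the suggestion concrete, here is a rough sketch of stateless parsing with two capsule types; the codepoints and helpers are hypothetical, purely for illustration:

# Hypothetical capsule type codepoints, for illustration only.
CAPSULE_DATAGRAM = 0x00
CAPSULE_DATAGRAM_NO_CONTEXT = 0x01

def decode_varint(buf: bytes) -> tuple[int, int]:
    """Decode a QUIC varint from the start of buf; return (value, bytes consumed)."""
    length = 1 << (buf[0] >> 6)
    value = buf[0] & 0x3F
    for b in buf[1:length]:
        value = (value << 8) | b
    return value, length

def parse_datagram_capsule(capsule_type: int, body: bytes) -> tuple[int | None, bytes]:
    """Parse a DATAGRAM-carrying capsule without consulting per-stream state:
    the capsule type alone says whether a Context ID is present."""
    if capsule_type == CAPSULE_DATAGRAM:
        context_id, consumed = decode_varint(body)
        return context_id, body[consumed:]
    if capsule_type == CAPSULE_DATAGRAM_NO_CONTEXT:
        return None, body
    raise ValueError("not a DATAGRAM capsule")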

Add a note about sticking out to security considerations

As discussed at IETF 110, some MASQUE servers may prefer to avoid sticking out (i.e. they may wish to be indistinguishable from a non-MASQUE-capable HTTP/3 server). The H3_DATAGRAM SETTINGS parameter may stick out. Therefore, we should add a note about this to the Security Considerations section. A simple solution could be to encourage widely deployed HTTP/3 servers to always send this setting.

Datagrams with Unidirectional Streams

In #47, the conclusion was that server push didn't have a need to use datagrams, and I'm fine with that decision. However, it's a subset of a larger question.

I'm slightly concerned that using the "quarter stream ID" as the namespace forecloses the use of datagrams associated to anything except a request. For example, we discussed during HTTP/3 development that it might be reasonable to shift certain control frames to datagrams if that had been an option in the protocol.

I don't see that the quarter ID buys us anything beyond saving a byte or four in certain ranges of stream IDs, and it introduces an element in the protocol where we're presenting the Stream ID in a different format than every other instance. It forecloses options that we might find a use for in the future, even if it's not required for CONNECT-UDP.

Why not simply use the stream ID, like everywhere else, and keep the same rules about the stream needing to be open for sending? Where currently we say client, we could equally refer to stream initiators.

Is it an error to receive a flow ID greater than 10^15-1?

There's a disparity between the largest flow ID that can be carried in the DATAGRAM frame and the largest that we can express in Datagram-Flow-Id.

Draft 00 prevents a flow ID allocation of >= 10^15-1 but doesn't seem to address the case where a DATAGRAM frame contains that value. This is clearly an error, and I'd suggest that it is a connection error; probably H3_ID_ERROR or some new codepoint if we really wanted.

Do context ID closures need more details?

In #52, we introduce the CLOSE_DATAGRAM_CONTEXT capsule which allows endpoints to close a context. That message currently only contains the context ID to close. @bemasc suggests that we may want to add more information there.

  • Should we differentiate between "context close" and "context registration rejected"?
  • Should we add an error code?

In particular, do folks have use cases for these features?

Consider a different error code when the flow ID doesn't fit in the frame

There's some good guidance about parsing the Flow ID out of the HTTP/3 Datagram with respect to datagram payload length. But I'm not sure if the recommended error code of PROTOCOL_VIOLATION is the most appropriate. It implies that the error is a Transport Error, but I'd argue that at this point (parsing of the frame payload) it's an HTTP/3 application error.

Since we're not actually parsing a true HTTP/3 frame, just a view over a transport frame, H3_FRAME_ERROR seems too specific. So I think H3_GENERAL_PROTOCOL_ERROR is the most straightforward.

I wonder what others think.

Flow ID allocation failures

draft 00 section 6 says:

   If the flow identifier allocation service
   of an endpoint runs out of values lower than 10^15-1, the endpoint
   MUST fail the flow identifier allocation.

First, Section 3, which talks about the service, seems like a better place to talk about errors more generally. Specific error conditions probably depend on the encoding - in this case, the maximum depends on the Datagram-Flow-Id header, but we might imagine other uses of H3 DATAGRAM having other limits.

Second, what does failure look like? Is it a "won't vend a flow ID" response, or is it a connection-critical error, etc.?
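
For what it's worth, here is a rough sketch of the "won't vend a flow ID" interpretation, where exhaustion is reported to the caller rather than treated as connection-critical; the class and its behaviour are illustrative assumptions, not anything defined in the draft:

SF_INTEGER_MAX = 10**15 - 1  # largest integer a Structured Fields Integer can carry

class FlowIdAllocator:
    """Vends flow identifiers from one endpoint's half of the shared namespace
    (even for the client, odd for the server) until the encoding limit."""

    def __init__(self, is_client: bool) -> None:
        self._next = 0 if is_client else 1

    def allocate(self) -> int | None:
        if self._next > SF_INTEGER_MAX:
            return None  # "won't vend": the caller decides what to do next
        value = self._next
        self._next += 2
        return value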

How to actually use the Context ID registrations mechanism

Sent to the WG mailing list.

My co-authors on https://datatracker.ietf.org/doc/draft-kuehlewind-masque-connect-ip/ and I have had some discussion around how one actually can and should use the context-ID registration mechanism, as we actually need it to realize some of the functionality in our CONNECT-IP solution.

We have found several ways that appear to practically work from a CONNECT-IP application perspective.

A) Statically define that in the context of CONNECT-IP the application will map certain formats to certain Context IDs, and simply send registrations without any extension headers.
B) Define a Context Extension that is intended to provide format identifiers for the CONNECT-IP method's different formats.
C) Use values or parameters in HTTP headers related to the format to carry the identifiers.
D) Combine A and B, so that we have a set of initially statically defined mappings, and then use Context Extension for any future extensions.

In our discussion we see that C appears to have some potential for failure modes if an HTTP intermediary does things with the headers.

A) is very straightforward for an application initially but can run into trouble when it needs to be extended; HTTP-level negotiation of the general functionality should resolve that.

B) appears to have some overhead from an application perspective, as each application needs to define its own Context Extension to carry identifiers. Is it really intended that each HTTP application that needs to convey some identifier for its Context ID format carries it across in the HTTP Datagram capsule? Would there be a point in defining a Context Extension that carries a format identifier for what one would like to use this context ID for?

D) appears to be simple initially, but leads to the extension mechanism having code paths that would not be exercised by default.

So this was basically to start a discussion to see if we can get a bit clearer guidance towards the HTTP application writers that will attempt to use HTTP Datagrams on how to actually use Context IDs. I noted that the Connect-UDP method is no help as an example yet.

packing DATAGRAMs with the same flow-id in a single QUIC packet

in Section 2 "Flow Identifiers" it reads

If multiple DATAGRAM frames can be packed into a single QUIC packet, the sender SHOULD group them by flow identifier to promote fate-sharing within a specific flow and improve the ability to process batches of datagram messages efficiently on the receiver.

this to me seems to be in conflict with the opening statement

QUIC DATAGRAM frames do not provide a means to demultiplex application contexts

if the flow-id is in the DATAGRAM payload, the transport is not supposed to be peeking at it. While it could be a nice (but a bit hacky) optimization in a real-world implementation, I feel like the RFC having it as a SHOULD is not ideal.

Allowing a single flow ID to be associated with multiple requests is brittle

The only note in the draft I see about this is:
"If an intermediary processes distinct HTTP requests that refer to the same flow ID in their respective "Datagram-Flow-Id" header fields, it MUST ensure that those requests are routed to the same backend."

This is not a requirement our load balancer would be able to enforce, since requests are routed to backends individually. Is there a use case that really needs this? It feels very odd to me, as well as not working for our deployment.

Walkthrough of intended use

The document currently has a lot of tools, but there are different ways of thinking about how you could use them and why they're in the protocol. I feel like readability would improve dramatically if the document started with an overview that described one or two hypothetical flows.

Not fully fleshed out protocols -- just an example of declaring a context associated to a request, request headers that inform the context, context extensions that inform it, exchanging datagrams, and closing the context. The document makes reference to a few possible applications; maybe one of these could be drawn a little bit larger?

Why are flow IDs named?

I spent a while trying to understand this and it's not clear to me from the spec. Lucas posted this on Issue #24, but it's not quite enough for me (and it's not in the document).

My understanding of the ask that led to the "name parameter" being defined in draft-schinazi-masque-h3-datagram-04 was that we want to be able to assign unique handles to flow IDs in a generic way that is part of this document. That would allow a general-purpose implementation to disambiguate flows without having to understand any specific extension.

Originally posted by @LPardue in #24 (comment)

Datagram flows should have their individual max datagram size

Consider a situation where a single HTTP/3 connection is terminated at a CDN and it carries two different WebTransport sessions to different backends. Those backends may have different max datagram sizes; thus the protocol needs to be able to specify a max datagram size limit per flow.

MTU handling

This has been sent to the MASQUE WG mailing list in the hope of getting some discussion there. However, I want an issue that tracks it:

In the work of writing up our Connect-IP proposal (https://datatracker.ietf.org/doc/draft-kuehlewind-masque-connect-ip/) we looked into how to deal with MTU issues effectively. Our conclusion was that this is going to be a general problem for any user of HTTP Datagrams. Thus, we would like to propose that MTU handling is done within HTTP Datagrams.

This email starts by explaining what we see as requirements for an MTU signaling solution for HTTP Datagrams, then proposes a potential solution.

Let's start with a figure that provides us with a framework to discuss the requirements:

+--------+ Path#1A +--------+        +-------+        +--------+
|Client A|<------->| HTTP   | Path#2 | MASQUE| Path#3 |        |
+--------+         | Inter- |<------>| Server|<------>| Target |
+--------+ Path#1B | mediary|        |(proxy)|        |        |
|Client B|<------->|        |        |       |        |        |
+--------+         +--------+        +-------+        +--------+

I think what makes this a bit more complex is the fact that we need to consider HTTP intermediaries, such as a front-end load balancer that terminates a first QUIC and HTTP/3 connection between the client and that intermediary. From that intermediary another HTTP connection is used towards the HTTP server that consumes and produces the HTTP datagrams, and in the case of CONNECT-UDP and CONNECT-IP it also has a third path towards the target to consider. The figure includes two clients to remind us that the HTTP intermediary may actually aggregate the HTTP requests, responses, and HTTP datagrams over one HTTP/2 or HTTP/3 connection on Path #2. In the case of MTU this complicates things further, as the proxy can't assume that all HTTP requests have the same MTU for their datagrams on an HTTP/3 connection. And the intermediary is the entity that has direct knowledge of the client-facing as well as the next-hop MTU over the HTTP connections, which may all differ.

We also have to consider the fact that the underlying transport connection may at any time be subject to an IP MTU change due to a route change for the path between the nodes. In addition, if one has enabled PMTUD in TCP or QUIC, a larger MTU on an individual path could become available and in some cases be desirable to use. Thus, we need to consider dynamic changes during the HTTP connection's lifetime and each HTTP request/response pair's usage of HTTP datagrams.

So when using HTTP/3 datagrams there is a strict MTU limit on individual datagrams for them to be sendable as QUIC datagrams rather than being forced into CAPSULE encapsulation over the reliable stream. The latter is clearly a possibility, but it results in the datagrams being sent reliably and in order for each HTTP request, i.e. Connect-UDP or CONNECT-IP request. Also, in case some end-to-end payloads fit in HTTP datagrams and others don't, there is potential for reordering among the payloads. Thus, to avoid this, the client and the proxy need to determine the lowest currently supported HTTP datagram size on the path.

For each QUIC connection the endpoint will know what the initial MTU value is for this path once the HTTP/3 connection has been successfully completed. However, that knowledge will not be available if one attempts to construct and send the HTTP request before connection establishment has concluded.

So the requirements we see for an MTU handling solution for HTTP Datagrams are the following.

  1. Hop-by-hop signaling across the HTTP entities of the lowest MTU of any sub-path
  2. Needs to be associated with a particular HTTP request or end-to-end path to support aggregation by HTTP intermediaries
  3. Endpoints need to be able to initiate an update of the MTU value upon detecting any changes during the HTTP Datagram stream's lifetime.
  4. HTTP intermediaries need to be able to initiate updates of the MTU value upon detecting MTU changes on the individual HTTP connections.

Solution proposal

A new HTTP Datagram capsule is defined for MTU value exchange. It is intended for HTTP intermediaries that need to interpret, update, or initiate sending of it; thus, it needs to be a fixed registered type so it can be easily processed. It can also be exchanged in parallel with the Register_Datagram_* capsules, as at that point the underlying HTTP connection is established and the initial HTTP Datagram values will be known. A capsule also travels all the way to the end, and an intermediary can initiate one in each direction for request paths. The only downside I see with this is that one capsule is required per open stream when an MTU change occurs. Maybe someone has an idea of how to handle signaling when aggregating multiple endpoints' streams onto one HTTP/3 connection.

To make it more efficient, rather than sending one MTU capsule per stream, an MTU capsule could list all streams it is applicable to. That way the number of MTU capsules would be no more than the number of end-to-end paths actually used.
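
A rough sketch of what such a capsule body might look like, assuming QUIC varints for every field; the name and layout are hypothetical and would need to be properly defined:

def encode_varint(value: int) -> bytes:
    """QUIC variable-length integer (RFC 9000, Section 16)."""
    for bits, prefix, size in ((6, 0x0, 1), (14, 0x1, 2), (30, 0x2, 4), (62, 0x3, 8)):
        if value < 1 << bits:
            return (value | (prefix << (size * 8 - 2))).to_bytes(size, "big")
    raise ValueError("value does not fit in a varint")

def build_mtu_capsule_body(mtu: int, stream_ids: list[int]) -> bytes:
    """Hypothetical MTU capsule body: the MTU value, a stream count, and the
    list of request streams the value applies to, so that one capsule can
    cover many streams on the same end-to-end path."""
    body = encode_varint(mtu) + encode_varint(len(stream_ids))
    for sid in stream_ids:
        body += encode_varint(sid)
    return body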

Expand all capsule figures to include their Type and Length fields

QUIC and HTTP/3 style is to define the general format and then in each instance define the object including all of its fields. This would look like

REGISTER_DATAGRAM_CONTEXT Capsule {
  Type (i) = see Section 8.2,
  Length (i),
  Context ID (i),
  Datagram Format Type (i),
  Datagram Format Additional Data (..),
}

This can later be updated with the actual type value once things settle down.

Discussion of use of flow identifiers should be improved

From @martinthomson (link to email):

I would frame this slightly differently, to the above point: This creates a system of "flow identifiers" to allow for concurrent use of DATAGRAM frames by multiple extensions without creating contention. Except that it doesn't entirely...

You should mention that signaling about flow identifiers is necessary, but that is the responsibility of the protocol that defines the usage. If you don't signal, how does a receiver know that flow ID 1 maps to usage foo and flow ID 2 maps to usage bar?

Also, you should explain that there is no general limit to the flow identifier space, though protocols might want to provide mechanisms to limit usage to prevent resource exhaustion. This will depend on the protocol, of course. Some protocols might not expend resources when creating flow IDs.

Directionality of flow/context IDs

In draft-ietf-masque-h3-datagram-00, flow IDs are bidirectional. During the 2021-04 MASQUE Interim, we discussed the possibility of making them unidirectional. Here are the differences:

  • Bidirectional

    • single shared namespace for both endpoints (even for client, odd for server)
    • it's easier to refer to the peer's IDs, which makes negotiation easier
  • Unidirectional

    • one namespace per direction
    • every bidirectional use needs to associate both directions somehow

I'm personally leaning towards keeping them bidirectional. Unlike QUIC streams, these do not have a state machine or flow control credits, so there is no cost to only using a single direction of a bidirectional flow for unidirectional use-cases.

0-RTT text is a bit obtuse

When servers decide to accept 0-RTT data, they MUST send a
H3_DATAGRAM SETTINGS parameter greater or equal to the value they sent to the
client in the connection where they sent them the NewSessionTicket
message.

In layspeak I think this is saying "if a server sent the value 1 before, it MUST send 1 now. If a server sent 0 before, it MUST send 0 or 1 now".

Could we maybe editorialize this to make it less likely for someone to do value++ and end up sending 2?
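
A minimal sketch of the check that paraphrase implies (the function name and the restriction to the values 0 and 1 are illustrative assumptions):

def valid_0rtt_h3_datagram_setting(remembered: int, current: int) -> bool:
    """The value sent when accepting 0-RTT must be a defined value (0 or 1)
    and must not be lower than the value remembered from the connection that
    issued the session ticket; e.g. a value++ mistake producing 2 fails."""
    return current in (0, 1) and current >= remembered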

Is the context ID optional?

This issue assumes that we decide to go with the two-layer design described in #41. Given that design, some applications might not need the multiplexing provided by context IDs. We have multiple options here:

  • Make context ID mandatory
    • Applications that don't need it waste a byte per datagram
  • Negotiate the presence of context IDs using an HTTP header
    • Context IDs would still be mandatory to implement on servers because the client might send that header
  • Have the method determine whether context IDs are present or not
    • This would prevent extensibility on those methods
  • Create a REGISTER_NO_CONTEXT message
    • This assumes we use the Message design from #44
    • We add a register message that means that this stream does not use context IDs
    • This allows avoiding the byte overhead without sacrificing extensibility

Field:Flow ID cardinality

It's not really clear whether there are any expectations in relation to cardinality of flow identifiers and the field. Can two fields refer to the same flow ID? Can the same request reference multiple flow IDs?

Should there be a way to mass-register multiple context IDs?

It appears that the currently defined registration method is great when one has three formats and needs three different context IDs for the three different formats. It is not suitable if one ends up in any combinatorial cases, like formats that combine different modules where one may suddenly have 128 potential different context IDs. Pre-registering all of them appears to create significant overhead. We had some thoughts of actually carrying the DSCP + ECN values for an IP packet payload in the Context ID. That could require up to 256 Context IDs. Registering each value individually could be done, but it ends up in a large number of registrations, likely with a Context Extension TLV that consumes additional overhead.

So is there a need for mass registration of context IDs?

Ability to accept or reject registration of context IDs

During the 2021-04 MASQUE Interim, we discussed having the ability to accept or reject registration of context IDs. An important motivation for this is the fact that context ID registrations take up memory (for example it could be a compression context which contains the data that is elided) and therefore we need to prevent endpoints from arbitrarily making their peer allocate memory. While we could flow control context ID registration, a much simpler solution would be to have a way for endpoints to close/shutdown a registration.

SF example parameter is confusing

The document currently uses the following example:

     Datagram-Flow-Id = 42; alternate=44

The document motivates this as a way to express multiple flow identifiers.

I think that this example is potentially harmful. It implies a lot about a potential extension mechanism as part of an example, but does not commit to properly defining an extension. For examples like this I would much prefer to see something that is very clearly bogus. Otherwise, people might implement that to the point that they even achieve interoperability of a sort.

For instance, the following requires a lot more guessing about how to interpret it, to the point that I would guess that pseudo-interoperability would be hard to arrive at by accident:

     Datagram-Flow-Id = 42; camels-per-orthodoxy=17.4

See also #9.

Register a setting

In order to use this, it will need to be signaled somehow. It seems like the draft implies that the presence of the transport parameter for DATAGRAM is sufficient, but I don't believe that to be the case.

Take the case where a client offers three protocols for use via ALPN. One of those might always include DATAGRAM support and so the client is obligated to offer the transport parameter. But that does not imply that all three offered protocols support this extension equally. You need a way to unambiguously signal that the combination of DATAGRAM and h3 are supported. That means a setting.

What guarantees are there against re-using flow IDs

QUIC has strong guarantees on not re-using Stream IDs.

There was strong consensus against allowing re-use of Flow IDs (#22). The WG today was also leaning towards creating flow IDs that are not related to the Stream ID, which means preventing Flow ID re-use may be a challenge, since we can't rely on Stream ID non-reuse.

Options for enforcement include best effort (i.e. "MUST close the connection if reuse is detected") or trying to ensure datagram flow IDs are used in order, which could allow enforcement to be mandated.
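
A rough sketch of the in-order option (the class is illustrative only): if each peer registers its flow identifiers in increasing order, tracking the highest identifier seen per peer namespace is enough to detect reuse.

class FlowIdReuseError(Exception):
    pass

class FlowIdReuseDetector:
    """Detects reuse under the assumption that a peer registers its flow
    identifiers in strictly increasing order (one detector per peer)."""

    def __init__(self) -> None:
        self._highest_seen = -1

    def on_registration(self, flow_id: int) -> None:
        """Call when the peer registers a flow identifier, e.g. via a
        Datagram-Flow-Id header field."""
        if flow_id <= self._highest_seen:
            raise FlowIdReuseError("flow ID reused or registered out of order")
        self._highest_seen = flow_id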

Can intermediaries interact with capsules, contexts and context IDs?

In #52, capsules, contexts and context IDs are explicitly defined as end-to-end and there are requirements for intermediaries to not modify them, and to not even parse them. The motivation for this was to ensure that we retain the ability to deploy extensions end-to-end without having to modify intermediaries. @afrind mentioned that these requirements might be too strict, as he would like to have his intermediary be able to parse the context ID, for logging purposes.

DATAGRAM and STREAM+FIN reordering

H3 datagrams are associated with a stream, so it makes sense that it is only valid to send DATAGRAMs before sending the stream FIN. These packets may be received out of order however. What should receivers do with these packets?

The simplest action for the receiver is to drop any datagrams received after FIN, but I can imagine applications where the receiver would prefer to allow them to be processed for some time. Should the draft offer guidance for this?

The answer might be different if we use Flow IDs as connection scoped identifiers or the 2-layer approach (Stream ID+Flow ID) discussed at the interim.
