Internet-Draft | QUIC event definitions for qlog | October 2023 |
Marx, et al. | Expires 25 April 2024 | [Page] |
This document describes concrete qlog event definitions and their metadata for QUIC events. These events can then be embedded in the higher level schema defined in [QLOG-MAIN].¶
Note to RFC editor: Please remove this section before publication.¶
Feedback and discussion are welcome at https://github.com/quicwg/qlog. Readers are advised to refer to the "editor's draft" at that URL for an up-to-date version of this document.¶
Concrete examples of integrations of this schema in various programming languages can be found at https://github.com/quiclog/qlog/.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 25 April 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document describes the values of the qlog name ("category" + "event") and "data" fields and their semantics for the QUIC protocol (see [QUIC-TRANSPORT], [QUIC-RECOVERY], and [QUIC-TLS]) and some of its extensions (see [QUIC-DATAGRAM] and [GREASEBIT]).¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The event and data structure definitions in this document are expressed in the Concise Data Definition Language [CDDL] and its extensions described in [QLOG-MAIN].
The following fields from [QLOG-MAIN] are imported and used: name, category, type, data, group_id, protocol_type, importance, RawInfo, and time-related fields.¶
This document describes how the QUIC protocol can be expressed in qlog using the schema defined in [QLOG-MAIN]. QUIC protocol events are defined with a category, a name (the concatenation of "category" and "event"), an "importance", an optional "trigger", and "data" fields.
Some data fields use complex data structures. These are represented as enums or re-usable definitions, which are grouped together at the bottom of this document for clarity.
When any event from this document is included in a qlog trace, the "protocol_type" qlog array field MUST contain an entry with the value "QUIC".¶
When the qlog "group_id" field is used, it is recommended to use QUIC's Original Destination Connection ID (ODCID, the CID chosen by the client when first contacting the server), as this is the only value that does not change over the course of the connection and can be used to link more advanced QUIC packets (e.g., Retry, Version Negotiation) to a given connection. Similarly, the ODCID should be used as the qlog filename or file identifier, potentially suffixed by the vantagepoint type (for example, abcd1234_server.qlog would contain the server-side trace of the connection with ODCID abcd1234).
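The two recommendations above can be sketched in Python as follows. This is a minimal illustration only: the function names, and the choice to pass the ODCID as a hex string, are assumptions made here and are not defined by qlog.

```python
from typing import List, Optional


def contains_quic(protocol_type: List[str]) -> bool:
    """Check that the "protocol_type" array has an entry with the value "QUIC"."""
    return "QUIC" in protocol_type


def qlog_filename(odcid_hex: str, vantage_point: Optional[str] = None) -> str:
    """Build a filename from the ODCID, optionally suffixed by the vantage point type.

    Hypothetical helper: the ".qlog" extension follows the example in the text.
    """
    stem = odcid_hex
    if vantage_point:
        stem = f"{stem}_{vantage_point}"
    return f"{stem}.qlog"
```

For instance, `qlog_filename("abcd1234", "server")` yields the filename used in the example above.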
QUIC packets always include an AEAD authentication tag at the end. In general, the length of the AEAD tag depends on the TLS cipher suite, although all cipher suites used in QUIC v1 use a 16 byte tag.¶
As QUIC appends an authentication tag after the packet payload, the header_length of a QUIC packet can be calculated as:
header_length = length - payload_length - 16¶
For UDP datagrams, the calculation is simpler:¶
header_length = length - payload_length¶
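The two calculations above can be written as a short Python sketch. The function names are illustrative, and the 16-byte default only reflects the statement that all cipher suites used in QUIC v1 use a 16-byte tag; other tag lengths would have to be passed explicitly.

```python
AEAD_TAG_LENGTH = 16  # bytes; holds for the cipher suites used in QUIC v1


def quic_packet_header_length(length: int, payload_length: int,
                              tag_length: int = AEAD_TAG_LENGTH) -> int:
    """header_length = length - payload_length - tag_length (QUIC packets)."""
    header_length = length - payload_length - tag_length
    if header_length < 0:
        raise ValueError("payload and tag are longer than the whole packet")
    return header_length


def udp_datagram_header_length(length: int, payload_length: int) -> int:
    """header_length = length - payload_length (UDP datagrams carry no AEAD tag)."""
    header_length = length - payload_length
    if header_length < 0:
        raise ValueError("payload is longer than the whole datagram")
    return header_length
```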
In some cases, the length fields are also explicitly reflected inside of packet headers. For example, the QUIC STREAM frame has a "length" field indicating its payload size. Similarly, the QUIC Long Header has a "length" field which is equal to the payload length plus the packet number length. In these cases, those fields are intentionally preserved in the event definitions. Even though this can lead to duplicate data when the full RawInfo is logged, it allows a more direct mapping of the QUIC specifications to qlog, making it easier for users to interpret.¶
A single qlog event trace is typically associated with a single QUIC connection. However, for several types of events (for example, a packet_dropped event (Section 5.7) with a trigger value of "connection_unknown"), it can be impossible to tie them to a specific QUIC connection, especially on the server.
There are various ways to handle these events, each making certain tradeoffs between file size overhead, flexibility, ease of use, or ease of implementation. Some options include:¶
QUIC connections consist of different phases and interaction events. In order to model this, QUIC event types are divided into general categories: connectivity (Section 4), quic (Section 5), security (Section 6), and recovery (Section 7).
As described in Section 3.4.2 of [QLOG-MAIN], the qlog "name" field is the concatenation of category and type.¶
Table 1 summarizes the name value of each event type that is defined in this specification.¶
Name value | Importance | Definition |
---|---|---|
connectivity:server_listening | Extra | Section 4.1 |
connectivity:connection_started | Base | Section 4.2 |
connectivity:connection_closed | Base | Section 4.3 |
connectivity:connection_id_updated | Base | Section 4.4 |
connectivity:spin_bit_updated | Base | Section 4.5 |
connectivity:connection_state_updated | Base | Section 4.6 |
connectivity:mtu_updated | Extra | Section 4.7 |
quic:version_information | Core | Section 5.1 |
quic:alpn_information | Core | Section 5.2 |
quic:parameters_set | Core | Section 5.3 |
quic:parameters_restored | Base | Section 5.4 |
quic:packet_sent | Core | Section 5.5 |
quic:packet_received | Core | Section 5.6 |
quic:packet_dropped | Base | Section 5.7 |
quic:packet_buffered | Base | Section 5.8 |
quic:packets_acked | Extra | Section 5.9 |
quic:datagrams_sent | Extra | Section 5.10 |
quic:datagrams_received | Extra | Section 5.11 |
quic:datagram_dropped | Extra | Section 5.12 |
quic:stream_state_updated | Base | Section 5.13 |
quic:frames_processed | Extra | Section 5.14 |
quic:stream_data_moved | Base | Section 5.15 |
quic:datagram_data_moved | Base | Section 5.16 |
security:key_updated | Base | Section 6.1 |
security:key_discarded | Base | Section 6.2 |
recovery:parameters_set | Base | Section 7.1 |
recovery:metrics_updated | Core | Section 7.2 |
recovery:congestion_state_updated | Base | Section 7.3 |
recovery:loss_timer_updated | Extra | Section 7.4 |
recovery:packet_lost | Core | Section 7.5 |
recovery:marked_for_retransmit | Extra | Section 7.6 |
recovery:ecn_state_updated | Extra | Section 7.7 |
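As a rough Python sketch of how a logger might use Table 1, the snippet below builds the name value from a category and an event type and filters events by importance. Only a handful of rows are copied from the table. The ranking of Core above Base above Extra is taken to follow the importance levels of [QLOG-MAIN], and the helper names are assumptions made for this example.

```python
# Lower rank means more important (assumed ordering: Core, Base, Extra).
IMPORTANCE_RANK = {"core": 0, "base": 1, "extra": 2}

# A small excerpt of Table 1, not the full list.
EVENT_IMPORTANCE = {
    "connectivity:connection_started": "base",
    "connectivity:mtu_updated": "extra",
    "quic:packet_sent": "core",
    "quic:packet_received": "core",
    "quic:packet_dropped": "base",
    "recovery:metrics_updated": "core",
    "recovery:loss_timer_updated": "extra",
}


def event_name(category: str, event_type: str) -> str:
    """The qlog name is the concatenation of category and type."""
    return f"{category}:{event_type}"


def should_log(name: str, least_important: str) -> bool:
    """Return True if the event is at least as important as the given level."""
    importance = EVENT_IMPORTANCE.get(name)
    if importance is None:
        return False  # unknown events are skipped in this sketch
    return IMPORTANCE_RANK[importance] <= IMPORTANCE_RANK[least_important]
```

A logger configured with a level of "base" would then record packet_dropped but skip loss_timer_updated.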
QUIC events extend the $ProtocolEventBody extension point defined in [QLOG-MAIN].
Importance: Extra¶
Emitted when the server starts accepting connections.¶
Definition:¶
Some QUIC stacks do not handle sockets directly and are thus unable to log IP and/or port information.¶
Importance: Base¶
Used for both attempting (client-perspective) and accepting (server-perspective) new connections. Note that this event overlaps with connection_state_updated; it is a separate event mainly because of the additional data that should be logged.
Definition:¶
Some QUIC stacks do not handle sockets directly and are thus unable to log IP and/or port information.¶
Importance: Base¶
Used for logging when a connection was closed, typically when an error or timeout occurred. Note that this event has overlap with connectivity:connection_state_updated, as well as the CONNECTION_CLOSE frame. However, in practice, when analyzing large deployments, it can be useful to have a single event representing a connection_closed event, which also includes an additional reason field to provide additional information. Additionally, it is useful to log closures due to timeouts, which are difficult to reflect using the other options.¶
In QUIC there are two main connection-closing error categories: connection and application errors. They have well-defined error codes and semantics. Besides these, however, internal errors can occur that may or may not get mapped to the official error codes in implementation-specific ways. As such, multiple error codes can be set on the same event to reflect this.
Definition:¶
Importance: Base¶
This event is emitted when either party updates their current Connection ID. As this typically happens only sparingly over the course of a connection, this event allows loggers to be more efficient than logging the observed CID with each packet in the .header field of the "packet_sent" or "packet_received" events.¶
This is viewed from the perspective of the endpoint applying the new id. As such, when the endpoint receives a new connection id from the peer, it will see the dst_ fields are set. When the endpoint updates its own connection id (e.g., NEW_CONNECTION_ID frame), it logs the src_ fields.¶
Definition:¶
Importance: Base¶
To be emitted when the spin bit changes value. It SHOULD NOT be emitted if the spin bit is set without changing its value.¶
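A minimal Python sketch of this rule is shown below. The class name, the "state" data field, and the decision to treat the first observed value as an update are all assumptions made for illustration; the actual event definition governs the field names.

```python
from typing import Dict, List, Optional


class SpinBitLogger:
    """Records a spin_bit_updated event only when the observed value changes."""

    def __init__(self) -> None:
        self._last: Optional[bool] = None  # no value observed yet
        self.events: List[Dict] = []

    def observe(self, spin_bit: bool) -> None:
        if spin_bit == self._last:
            return  # bit was set again without changing value: no event
        self._last = spin_bit
        self.events.append({
            "name": "connectivity:spin_bit_updated",
            "data": {"state": spin_bit},  # hypothetical field name
        })
```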
Definition:¶
Importance: Base¶
This event is used to track progress through QUIC's complex handshake and connection close procedures. It is intended to provide exhaustive options to log each state individually, but also provides a more basic, simpler set for implementations less interested in tracking each smaller state transition. As such, users should not expect to see -all- these states reflected in all qlogs and implementers should focus on support for the SimpleConnectionState set.¶
Definition:¶
These states correspond to the following transitions for both client and server:¶
Client:¶
send initial¶
get initial¶
get first Handshake packet¶
get Handshake packet containing ServerFinished¶
send ClientFinished¶
get HANDSHAKE_DONE¶
Server:¶
get initial¶
send handshake EE, CERT, CV, ...¶
send ServerFinished¶
get first handshake packet / something using a server-issued CID of min length¶
get handshake packet containing ClientFinished¶
send HANDSHAKE_DONE¶
A connection_state_updated event with a new state of "attempted" is the same conceptual event as the connection_started event above from the client's perspective. Similarly, a state of "closing" or "draining" corresponds to the connection_closed event.
Importance: Extra¶
This event indicates that the estimated Path MTU was updated. This happens as part of the Path MTU discovery process.¶
Importance: Core¶
QUIC endpoints each have their own list of QUIC versions they support. The client uses the most likely version in its first Initial packet. If the server does not support that version, it replies with a Version Negotiation packet containing its supported versions. From this, the client selects a version. This event aggregates all this information in a single event type. It also allows logging of supported versions at an endpoint without actual version negotiation needing to happen.
Definition:¶
Intended use:¶
Importance: Core¶
QUIC implementations each have their own list of application-level protocols and versions thereof they support. The client includes a list of its supported options in its first Initial packet as part of the TLS Application-Layer Protocol Negotiation (ALPN) extension. If there are common option(s), the server chooses the most suitable one and communicates this back to the client. If not, the connection is closed.
Definition:¶
Intended use:¶
Importance: Core¶
This event groups settings from several different sources (transport parameters, TLS ciphers, etc.) into a single event. This is done to minimize the number of events and to decouple conceptual setting impacts from their underlying mechanism for easier high-level reasoning.
All these settings are typically set once and never change. However, they are typically set at different times during the connection, so there will typically be several instances of this event with different fields set.¶
Note that some settings have two variations (one set locally, one requested by the remote peer). This is reflected in the "owner" field. As such, this field MUST be correct for all settings included in a single event instance. If settings from both sides need to be logged, two separate event instances MUST be emitted.
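The owner rule above can be illustrated with a small Python sketch that splits a mixed list of settings into one event per owner. The owner values "local" and "remote", the helper name, and the two parameter names used in the usage note are assumptions for this example, not a reproduction of the event definition.

```python
from typing import Dict, Iterable, List, Tuple


def parameters_set_events(settings: Iterable[Tuple[str, str, object]]) -> List[Dict]:
    """Group (owner, name, value) settings into one parameters_set event per owner."""
    by_owner: Dict[str, Dict] = {}
    for owner, name, value in settings:
        if owner not in ("local", "remote"):  # assumed owner values
            raise ValueError(f"unknown owner: {owner}")
        # Each owner gets its own data dictionary, so owners are never mixed.
        by_owner.setdefault(owner, {"owner": owner})[name] = value
    return [{"name": "quic:parameters_set", "data": data}
            for data in by_owner.values()]
```

Passing settings such as `("local", "max_idle_timeout", 30000)` and `("remote", "max_idle_timeout", 60000)` produces two event instances, each with a single, consistent owner.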
In the case of connection resumption and 0-RTT, some of the server's parameters are stored up-front at the client and used for the initial connection startup. They are later updated with the server's reply. In these cases, utilize the separate parameters_restored event to indicate the initial values, and this event to indicate the updated values, as normal.
Definition:¶
Additionally, this event can contain any number of unspecified fields. This is to reflect the setting of, for example, unknown (greased) transport parameters or employed (proprietary) extensions.
Importance: Base¶
When using QUIC 0-RTT, clients are expected to remember and restore the server's transport parameters from the previous connection. This event is used to indicate which parameters were restored and to which values when utilizing 0-RTT. Note that not all transport parameters should be restored (many are even prohibited from being re-utilized). The ones listed here are the ones expected to be useful for correct 0-RTT usage.¶
Definition:¶
Note that, like parameters_set above, this event can contain any number of unspecified fields to allow for additional/custom parameters.¶
Importance: Core¶
Definition:¶
The encryption_level and packet_number_space are not logged explicitly: the header.packet_type specifies this by inference (assuming correct implementation).
For more details on "datagram_id", see Section 5.10. It is only needed when keeping track of packet coalescing.¶
Importance: Core¶
Definition:¶
The encryption_level and packet_number_space are not logged explicitly: the header.packet_type specifies this by inference (assuming correct implementation).
For more details on "datagram_id", see Section 5.10. It is only needed when keeping track of packet coalescing.¶
Importance: Base¶
This event indicates a QUIC-level packet was dropped.¶
The trigger field indicates a general reason category for dropping the packet, while the details field can contain additional implementation-specific information.¶
Definition:¶
Some example situations for each of the trigger categories include:¶
For more details on "datagram_id", see Section 5.10.¶
Importance: Base¶
This event is emitted when a packet is buffered because it cannot be processed yet. Typically, this is because the packet cannot be parsed yet, and thus its full contents can only be logged later, once it has been parsed, in a packet_received event.
Definition:¶
For more details on "datagram_id", see Section 5.10. It is only needed when keeping track of packet coalescing.¶
Importance: Extra¶
This event is emitted when a (group of) sent packet(s) is acknowledged by the remote peer for the first time. This information could also be deduced from the contents of received ACK frames. However, ACK frames require additional processing logic to determine when a given packet is acknowledged for the first time, as QUIC uses ACK ranges which can include repeated ACKs. Additionally, this event can be used by implementations that do not log frame contents.¶
Definition:¶
If packet_number_space is omitted, it assumes the default value of PacketNumberSpace.application_data, as this is by far the most prevalent packet number space a typical QUIC connection will use.¶
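A tool reading this event might apply the default as in the Python sketch below. The helper name and the "packet_numbers" field in the usage note are illustrative assumptions; only the packet_number_space default comes from the text above.

```python
from typing import Dict

DEFAULT_PACKET_NUMBER_SPACE = "application_data"


def packet_number_space_of(data: Dict) -> str:
    """Return the logged packet number space, or the default when it is omitted."""
    return data.get("packet_number_space", DEFAULT_PACKET_NUMBER_SPACE)
```

For example, `packet_number_space_of({"packet_numbers": [7, 8]})` falls back to the application data space, while an event that logs "handshake" explicitly keeps that value.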
Importance: Extra¶
Emitted when one or more UDP-level datagrams are passed to the socket. This is useful for determining how QUIC packet buffers are drained to the OS.
Definition:¶
Since QUIC implementations rarely control UDP logic directly, the raw data excludes UDP-level headers in all fields.¶
The "datagram_id" is a qlog-specific concept to allow tracking of QUIC packet coalescing inside UDP datagrams. Since QUIC generates many UDP datagrams, unique identifiers are required to be able to track them individually in qlog traces. However, neither UDP nor QUIC exchanges datagram identifiers on the wire. Selecting identifier values is thus left to qlog implementations, which should consider how to generate unique values within the scope of their created traces.¶
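One simple way to generate such identifiers is a per-trace counter, sketched below in Python. The allocator class, the helper function, and the packet type strings used in the usage note are assumptions for illustration. Implementations remain free to choose any scheme that yields unique values within their traces.

```python
import itertools
from typing import Dict, List


class DatagramIdAllocator:
    """Hands out identifiers that are unique within one trace."""

    def __init__(self) -> None:
        self._counter = itertools.count()  # 0, 1, 2, ...

    def next_id(self) -> int:
        return next(self._counter)


def packet_sent_events(allocator: DatagramIdAllocator,
                       packet_types: List[str]) -> List[Dict]:
    """Build packet_sent events for packets coalesced into one UDP datagram."""
    datagram_id = allocator.next_id()  # shared by every packet in this datagram
    return [
        {"name": "quic:packet_sent",
         "data": {"header": {"packet_type": packet_type},
                  "datagram_id": datagram_id}}
        for packet_type in packet_types
    ]
```

Two packets coalesced into the same datagram, such as `["initial", "handshake"]`, then share one datagram_id, and the next datagram receives a fresh value.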
Importance: Extra¶
Emitted when one or more UDP-level datagrams are received from the socket. This is useful for determining how datagrams are passed to the user space stack from the OS.
Definition:¶
For more details on "datagram_ids", see Section 5.10.¶
Importance: Extra¶
Emitted when a UDP-level datagram is dropped. This is typically done if it does not contain a valid QUIC packet. If it does, but the QUIC packet is dropped for other reasons, packet_dropped (Section 5.7) should be used instead.
Definition:¶
Importance: Base¶
This event is emitted whenever the internal state of a QUIC stream is updated, as described in Section 3 of [QUIC-TRANSPORT]. Most of this can be inferred from several types of frames going over the wire, but it is much easier to have explicit signals for these state changes.
Definition:¶
QUIC implementations SHOULD mainly log the simplified bidirectional (HTTP/2-alike) stream states (e.g., idle, open, closed) instead of the more fine-grained stream states (e.g., data_sent, reset_received). These latter ones are mainly for more in-depth debugging. Tools SHOULD be able to deal with both types equally.¶
Importance: Extra¶
This event's main goal is to prevent a large proliferation of specific purpose events (e.g., packets_acknowledged, flow_control_updated, stream_data_received). Implementations have the opportunity to (selectively) log this type of signal without having to log packet-level details (e.g., in packet_received). Since for almost all cases, the effects of applying a frame to the internal state of an implementation can be inferred from that frame's contents, these events are aggregated into this single "frames_processed" event.¶
This event can be used to signal internal state change not resulting directly from the actual "parsing" of a frame (e.g., the frame could have been parsed, data put into a buffer, then later processed, then logged with this event).¶
Implementations logging "packet_received" and which include all of the packet's constituent frames therein, are not expected to emit this "frames_processed" event. Rather, implementations not wishing to log full packets or that wish to explicitly convey extra information about when frames are processed (if not directly tied to their reception) can use this event.¶
Note that for some events, this approach will lose some information (e.g., for which encryption level are packets being acknowledged?). If this information is important, the packet_received event can be used instead.¶
In some implementations, it can be difficult to log frames directly, even when using packet_sent and packet_received events. For these cases, this event also contains the packet_numbers field, which can be used to more explicitly link this event to the packet_sent/received events. The field is an array, which supports using a single "frames_processed" event for multiple frames received over multiple packets. To map between frames and packets, the position and order of entries in the "frames" and "packet_numbers" is used. If the optional "packet_numbers" field is used, each frame MUST have a corresponding packet number at the same index.¶
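The index-based mapping between "frames" and "packet_numbers" can be sketched in Python as follows. The function name is an assumption and the sample frames in the usage note are illustrative. The length check mirrors the requirement that each frame has a corresponding packet number at the same index.

```python
from collections import defaultdict
from typing import Dict, List


def frames_by_packet(data: Dict) -> Dict[int, List[Dict]]:
    """Group the frames of a frames_processed event by packet number."""
    frames = data["frames"]
    packet_numbers = data.get("packet_numbers")
    if packet_numbers is None:
        raise ValueError("event does not carry the optional packet_numbers field")
    if len(frames) != len(packet_numbers):
        raise ValueError("each frame needs a packet number at the same index")
    grouped: Dict[int, List[Dict]] = defaultdict(list)
    for frame, packet_number in zip(frames, packet_numbers):
        grouped[packet_number].append(frame)  # same index links frame and packet
    return dict(grouped)
```

With four STREAM frames and a "packet_numbers" value of [1, 1, 2, 2], the first two frames are attributed to packet 1 and the last two to packet 2.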
Definition:¶
For example, an instance of this event that represents four STREAM frames received over two packets would have the fields serialized as:
"frames":[
  {"frame_type":"stream","stream_id":0,"offset":0,"length":500},
  {"frame_type":"stream","stream_id":0,"offset":500,"length":200},
  {"frame_type":"stream","stream_id":1,"offset":0,"length":300},
  {"frame_type":"stream","stream_id":1,"offset":300,"length":50}
],
"packet_numbers":[ 1, 1, 2, 2 ]
Importance: Base¶
Used to indicate when QUIC stream data moves between the different layers (for example passing from the application protocol (e.g., HTTP) to QUIC stream buffers and vice versa) or between the application protocol (e.g., HTTP) and the actual user application on top (for example a browser engine). This helps make clear the flow of data, how long data remains in various buffers and the overheads introduced by individual layers.¶
For example, this helps make clear whether received data on a QUIC stream is moved to the application protocol immediately (for example per received packet) or in larger batches (for example, all QUIC packets are processed first and afterwards the application layer reads from the streams with newly available data). This in turn can help identify bottlenecks, flow control issues or scheduling problems.¶
This event is only for data in QUIC streams. For data in QUIC Datagram Frames, see Section 5.16.¶
Definition:¶
Importance: Base¶
Used to indicate when QUIC Datagram Frame data (see [RFC9221]) moves between the different layers (for example passing from the application protocol (e.g., WebTransport) to QUIC Datagram Frame buffers and vice versa) or between the application protocol and the actual user application on top (for example a gaming engine or media playback software). This helps make clear the flow of data, how long data remains in various buffers and the overheads introduced by individual layers.¶
For example, this helps make clear whether received data in a QUIC Datagram Frame is moved to the application protocol immediately (for example per received packet) or in larger batches (for example, all QUIC packets are processed first and afterwards the application layer reads all Datagrams at once). This in turn can help identify bottlenecks or scheduling problems.¶
This event is only for data in QUIC Datagram Frames. For data in QUIC streams, see Section 5.15.¶
Definition:¶
Importance: Base¶
Definition:¶
Importance: Base¶
Definition:¶
Most of the events in this category are kept generic to support different recovery approaches and various congestion control algorithms. Tool creators SHOULD make an effort to support and visualize even unknown data in these events (e.g., plot unknown congestion states by name on a timeline visualization).¶
Importance: Base¶
This event groups initial parameters from both loss detection and congestion control into a single event. All these settings are typically set once and never change. Implementations that do, for some reason, change these parameters during execution MAY emit the parameters_set event twice.
Definition:¶
Additionally, this event can contain any number of unspecified fields to support different recovery approaches.¶
Importance: Core¶
This event is emitted when one or more of the observable recovery metrics changes value. This event SHOULD group all possible metric updates that happen at or around the same time in a single event (e.g., if min_rtt and smoothed_rtt change at the same time, they should be bundled in a single metrics_updated entry, rather than split out into two). Consequently, a metrics_updated event is only guaranteed to contain at least one of the listed metrics.¶
Definition:¶
In order to make logging easier, implementations MAY log values even if they are the same as previously reported values (e.g., two subsequent RecoveryMetricsUpdated entries can both report the exact same value for min_rtt). However, applications SHOULD try to log only actual updates to values.¶
Additionally, this event can contain any number of unspecified fields to support different recovery approaches.¶
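The grouping and "only actual updates" guidance for metrics_updated can be sketched in Python as below. The function name is an assumption, and congestion_window in the usage note is used purely as an illustrative metric alongside min_rtt and smoothed_rtt.

```python
from typing import Dict, Optional


def metrics_updated_event(previous: Dict[str, float],
                          current: Dict[str, float]) -> Optional[Dict]:
    """Bundle every metric whose value changed into one metrics_updated event.

    Returns None when nothing changed, so no event is emitted in that case.
    """
    changed = {name: value for name, value in current.items()
               if previous.get(name) != value}
    if not changed:
        return None
    return {"name": "recovery:metrics_updated", "data": changed}
```

If min_rtt and smoothed_rtt change together while congestion_window stays the same, the resulting event carries only the two RTT values.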
Importance: Base¶
This event signifies when the congestion controller enters a significant new state and changes its behaviour. This event's definition is kept generic to support different Congestion Control algorithms. For example, for the algorithm defined in [QUIC-RECOVERY] ("enhanced" New Reno), the following states are defined:
Definition:¶
The "trigger" field SHOULD be logged if there are multiple ways in which a state change can occur but MAY be omitted if a given state can only be due to a single event occurring (e.g., slow start is exited only when ssthresh is exceeded).¶
Importance: Extra¶
This event is emitted when a recovery loss timer changes state. The three main event types are:¶
In order to indicate an active timer's timeout update, a new "set" event is used.¶
Definition:¶
Importance: Core¶
This event is emitted when a packet is deemed lost by loss detection. It is RECOMMENDED to populate the optional "trigger" field in order to help disambiguate among the various possible causes of a loss declaration.¶
Definition:¶
Importance: Extra¶
This event indicates which data was marked for retransmit upon detecting a packet loss (see packet_lost). Similar to the reasoning for the "frames_processed" event, in order to keep the number of different events low, this signal is grouped into a single event based on existing QUIC frame definitions for all types of retransmittable data.
Implementations retransmitting full packets or frames directly can just log the constituent frames of the lost packet here (or do away with this event and use the contents of the packet_lost event instead). Conversely, implementations that have more complex logic (e.g., marking ranges in a stream's data buffer as in-flight), or that do not track sent frames in full (e.g., only stream offset + length), can translate their internal behaviour into the appropriate frame instance here even if that frame was never or will never be put on the wire.¶
Much of this data can be inferred if implementations log packet_sent events (e.g., looking at overlapping stream data offsets and length, one can determine when data was retransmitted).¶
Definition:¶
Importance: Extra¶
This event indicates a progression in the ECN state machine as described in Section A.4 of [QUIC-TRANSPORT].
The token carried in an Initial packet can either be a retry token from a Retry packet, or one originally provided by the server in a NEW_TOKEN frame used when resuming a connection (e.g., for address validation purposes). Retry and resumption tokens typically contain encoded metadata to check the token's validity when it is used, but this metadata and its format are implementation specific. For that, this event includes a general-purpose "details" field.
The stateless reset token is carried in stateless reset packets, in transport parameters and in NEW_CONNECTION_ID frames.¶
The generic $QuicFrame is defined here as a CDDL extension point (a "socket" or "plug"). It can be extended to support additional QUIC frame types.
The QUIC frame types defined in this document are as follows:¶
In QUIC, PADDING frames are simply identified as a single byte of value 0. As such, each padding byte could be theoretically interpreted and logged as an individual PaddingFrame.¶
However, as this leads to heavy logging overhead, implementations SHOULD instead emit just a single PaddingFrame and set the payload_length property to the amount of PADDING bytes/frames included in the packet.¶
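A parser could collapse a run of PADDING bytes into a single logged frame roughly as in the Python sketch below. The function name and the "padding" value for frame_type are assumptions modelled on the other frame examples in this document; payload_length is the property named above.

```python
from typing import Dict, Optional, Tuple


def collapse_padding(payload: bytes, offset: int = 0) -> Tuple[Optional[Dict], int]:
    """Collapse consecutive 0x00 bytes starting at offset into one PaddingFrame.

    Returns the frame (or None if there is no padding at offset) and the
    offset of the first byte after the padding run.
    """
    end = offset
    while end < len(payload) and payload[end] == 0x00:
        end += 1
    if end == offset:
        return None, offset
    frame = {"frame_type": "padding", "payload_length": end - offset}
    return frame, end
```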
Note that the packet ranges in AckFrame.acked_ranges do not necessarily have to be ordered (e.g., [[5,9],[1,4]] is a valid value).¶
Note that the two numbers in the packet range can be the same (e.g., [120,120] means that packet with number 120 was ACKed). However, in that case, implementers SHOULD log [120] instead and tools MUST be able to deal with both notations.¶
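Tools that consume acked_ranges therefore need to accept unordered ranges as well as both the one-element and two-element notations. A minimal Python sketch of such normalization, with hypothetical function names, could look like this:

```python
from typing import List, Tuple


def normalize_acked_ranges(acked_ranges: List[List[int]]) -> List[Tuple[int, int]]:
    """Return sorted (low, high) tuples, accepting both [n] and [n, n]."""
    normalized = []
    for entry in acked_ranges:
        if len(entry) == 1:
            low = high = entry[0]
        elif len(entry) == 2:
            low, high = entry
        else:
            raise ValueError(f"malformed range: {entry}")
        if low > high:
            raise ValueError(f"range bounds out of order: {entry}")
        normalized.append((low, high))
    return sorted(normalized)


def to_logged_form(ranges: List[Tuple[int, int]]) -> List[List[int]]:
    """Serialize ranges, using the shorter [n] notation for single packets."""
    return [[low] if low == high else [low, high] for low, high in ranges]
```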
The error_code_value field is the numerical value without VLIE encoding. This is useful because some error types are spread out over a range of codes (e.g., QUIC's crypto_error).¶
The frame_type_value field is the numerical value without VLIE encoding.¶
The QUIC DATAGRAM frame is defined in Section 4 of [RFC9221].¶
An application error is, by definition, specified by the application-level protocol running on top of QUIC (e.g., HTTP/3).
As such, it cannot be defined here directly. Applications MAY use the provided extension point through the use of the CDDL "socket" mechanism.¶
Application-level qlog definitions that wish to define new ApplicationError strings MUST do so by extending the $ApplicationError socket as such:¶
$ApplicationError /= "new_error_name" / "another_new_error_name"¶
These errors are defined in [QUIC-TLS] as follows: "A TLS alert is turned into a QUIC connection error by converting the one-byte alert description into a QUIC error code. The alert description is added to 0x100 to produce a QUIC error code from the range reserved for CRYPTO_ERROR."
This approach maps badly to a pre-defined enum. As such, the crypto_error string is defined as having a dynamic component here, which should include the hex-encoded and zero-padded value of the TLS alert description.¶
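The conversion quoted above, and one possible rendering of the dynamic string, can be sketched in Python as follows. The exact string layout ("crypto_error_0x" followed by three hex digits) is an assumption made for this example and should be checked against the actual type definition.

```python
def crypto_error_code(tls_alert: int) -> int:
    """Add the one-byte TLS alert description to 0x100 to get the QUIC error code."""
    if not 0 <= tls_alert <= 0xFF:
        raise ValueError("a TLS alert description is a single byte")
    return 0x100 + tls_alert


def crypto_error_name(tls_alert: int) -> str:
    """Render the error as a string with a hex-encoded, zero-padded code.

    The "crypto_error_0x..." layout is an assumed format for illustration.
    """
    return f"crypto_error_0x{crypto_error_code(tls_alert):03x}"
```

For example, TLS alert 40 (handshake_failure) maps to QUIC error code 0x128.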
The security and privacy considerations discussed in [QLOG-MAIN] apply to this document as well.¶
There are no IANA considerations.¶
Much of the initial work by Robin Marx was done at the Hasselt and KU Leuven Universities.¶
Thanks to Jana Iyengar, Brian Trammell, Dmitri Tikhonov, Stephen Petrides, Jari Arkko, Marcus Ihlar, Victor Vasiliev, Mirja Kuehlewind, Jeremy Laine, Kazu Yamamoto, and Christian Huitema for their feedback and suggestions.¶
This section is to be removed before publishing as an RFC.¶