Internet-Draft: draft-gruessing-moq-requirements-03
Workgroup: MOQ Mailing List
Published: November 2022
Intended Status: Informational
Expires: 11 May 2023
Authors: J. Gruessing (Nederlandse Publieke Omroep)
         S. Dawkins (Tencent America LLC)

Media Over QUIC - Use Cases and Requirements for Media Transport Protocol Design

Abstract

This document describes use cases and requirements that guide the specification of a simple, low-latency media delivery solution for ingest and distribution, using either the QUIC protocol or WebTransport.

Note to Readers

RFC Editor: please remove this section before publication

Source code and issues for this draft can be found at https://github.com/fiestajetsam/draft-gruessing-moq-requirements.

Discussion of this draft should take place on the IETF Media Over QUIC (MoQ) mailing list, at https://www.ietf.org/mailman/listinfo/moq.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 11 May 2023.

1. Introduction

This document describes use cases and requirements that guide the specification of a simple, low-latency media delivery solution for ingest and distribution [MOQ-charter], using either the QUIC protocol [RFC9000] or WebTransport [WebTrans-charter].

1.1. Note for MOQ Working Group participants

This version of the document is intended to provide the MOQ working group with a starting point for work on the "Use Cases and Requirements document" milestone. The update implements the work plan described in [MOQ-ucr]. The authors intend to request MOQ working group adoption after IETF 115, so the working group can begin to focus on these topics in earnest.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Use Cases Informing This Proposal

Our goal in this section is to understand the range of use cases that are in scope for "Media Over QUIC" [MOQ-charter].

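For each use case in this section, we also describe the number of senders and receivers, whether media flows are bi-directional, and the target latency.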

It is likely that we should add other characteristics, as we come to understand them.

3.1. Interactive Media

The use cases described in this section have one particular attribute in common: the target latency for these cases is on the order of one or two RTTs. To meet those targets, it is not possible to rely on protocol mechanisms that require multiple RTTs to function effectively. For example,

  • When the target latency is on the order of one RTT, it makes sense to use FEC [RFC6363] and codec-level packet loss concealment [RFC6716], rather than selectively retransmitting only lost packets. These mechanisms use more bytes, but do not require multiple RTTs in order to recover from packet loss.
  • When the target latency is on the order of one RTT, it is impossible to use congestion control schemes like BBR [I-D.draft-cardwell-iccrg-bbr-congestion-control], since BBR's probing mechanisms rely on temporarily inducing delay and can only amortize the consequences of that induced delay over multiple RTTs.
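
To illustrate the first point, the following is a minimal sketch of the simplest form of forward error correction: a single XOR parity packet protecting a group of equal-length media packets. It is not the FEC Framework of [RFC6363], nor a scheme proposed for MOQ; the function names, the group size, and the packet contents are assumptions made purely for illustration. The receiver repairs one loss locally, spending extra bytes instead of a retransmission round trip.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet covering a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing packet (marked None) using the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity FEC can repair only one loss per group")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"AAAA", b"BBBB", b"CCCC"]      # hypothetical media packets
parity = make_parity(group)              # sent alongside the group
arrived = [group[0], None, group[2]]     # the second packet is lost in transit
assert recover(arrived, parity) == group
```

The cost is one extra packet per group, and the repair needs no feedback from the receiver to the sender, which is why this class of mechanism fits a one-RTT latency target.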

This may help to explain why interactive use cases have typically relied on protocols such as RTP [RFC3550], which provide low-level control of packetization and transmission, and make no provision for retransmission.

3.1.1. Gaming

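Table 1

+-------------------+------------+
| Attribute         | Value      |
+-------------------+------------+
| Senders/Receivers | One to One |
| Bi-directional    | Yes        |
| Latency           | Ull-50     |
+-------------------+------------+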

Where media is received, and user inputs are sent by the client. This may also include the client receiving other types of signaling, such as triggers for haptic feedback. This may also carry media from the client such as microphone audio for in-game chat with other players.

3.1.2. Remote Desktop

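Table 2

+-------------------+------------+
| Attribute         | Value      |
+-------------------+------------+
| Senders/Receivers | One to One |
| Bi-directional    | Yes        |
| Latency           | Ull-50     |
+-------------------+------------+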

Where media is received, and user inputs are sent by the client. Latency requirements for this use case are marginally different from those of the gaming use case. This may also include signalling and/or transmitting of files or devices connected to the user's computer.

3.1.3. Video Conferencing/Telephony

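Table 3

+-------------------+-------------------+
| Attribute         | Value             |
+-------------------+-------------------+
| Senders/Receivers | Many to Many      |
| Bi-directional    | Yes               |
| Latency           | Ull-50 to Ull-200 |
+-------------------+-------------------+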

Where media is both sent and received. This may include audio from microphone(s) or other inputs, or may include "screen sharing" or inclusion of other content such as slides, documents, or video presentations. This may be done as client/server, or peer to peer with a many to many relationship of both senders and receivers. The target for latency may be as large as Ull-200 for some media types such as audio, but other media types in this use case have much more stringent latency targets.

3.2. Hybrid Interactive and Live Media

For the video conferencing/telephony use case, there can be additional scenarios where the audience greatly outnumbers the concurrent active participants, but any member of the audience could participate. Because this has a much larger total number of participants (as many as Live Media Streaming, Section 3.3.3) but retains the bi-directionality of conferencing, it should be considered a "hybrid".

3.3. Live Media

The use cases in this section, unlike the use cases described in Section 3.1, still have "humans in the loop", but these humans expect media to be "responsive", where the responsiveness is more on the order of 5 to 10 RTTs. This allows the use of protocol mechanisms that require more than one or two RTTs - as noted in Section 3.1, end-to-end recovery from packet loss and congestion avoidance are two such protocol mechanisms that can be used with Live Media.
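
The difference between the two latency regimes can be made concrete with some simple arithmetic. The sketch below counts how many loss-recovery round trips fit into a latency budget expressed in RTTs. The model is our own simplifying assumption, not taken from any specification: the first copy of a packet arrives after half an RTT, and each recovery attempt (a loss report plus a resend) costs one further RTT. The function name is hypothetical.

```python
import math


def retransmission_attempts(budget_rtts: float) -> int:
    """Count loss-recovery round trips that fit in a latency budget.

    Assumed model: the first copy of a packet arrives after half an RTT,
    and every recovery attempt (loss report plus resend) costs one RTT.
    """
    remaining = budget_rtts - 0.5          # budget left after first delivery
    return max(0, math.floor(remaining))


# Interactive budgets (1-2 RTTs) versus Live Media budgets (5-10 RTTs).
for budget in (1, 2, 5, 10):
    print(f"{budget} RTT budget: {retransmission_attempts(budget)} attempt(s)")
```

Under these assumptions a one-RTT budget leaves no room for retransmission at all, while a budget of 5 to 10 RTTs leaves room for several attempts.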

To illustrate the difference, the responsiveness expected with videoconferencing is much greater than that expected when watching a video, even if the video is being produced "live" and sent to a platform for syndication and distribution.

3.3.1. Live Media Ingest

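Table 4

+-------------------+----------------------+
| Attribute         | Value                |
+-------------------+----------------------+
| Senders/Receivers | One to One           |
| Bi-directional    | No                   |
| Latency           | Ull-200 to Ultra-Low |
+-------------------+----------------------+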

Where media is received from a source for onwards handling into a distribution platform. The media may comprise multiple audio and/or video sources. Bitrates may either be static or set dynamically by signaling of connection information (bandwidth, latency) based on data sent by the receiver.
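
As one illustration of setting the bitrate dynamically, the sketch below picks an encoding bitrate from a fixed ladder using the bandwidth reported by the receiver. The function name, the ladder values, and the 0.8 headroom factor are assumptions for illustration only; this document does not define how such feedback is signalled or acted upon.

```python
from typing import Sequence


def select_bitrate(ladder_kbps: Sequence[int],
                   reported_bandwidth_kbps: float,
                   headroom: float = 0.8) -> int:
    """Pick the highest ladder bitrate that fits the receiver-reported bandwidth.

    `headroom` keeps a safety margin below the reported bandwidth. If nothing
    fits, fall back to the lowest bitrate in the ladder.
    """
    usable = reported_bandwidth_kbps * headroom
    candidates = [rate for rate in ladder_kbps if rate <= usable]
    return max(candidates) if candidates else min(ladder_kbps)


ladder = [1000, 2500, 5000, 8000]            # hypothetical encoder settings, kbit/s
assert select_bitrate(ladder, 4000) == 2500  # 3200 kbit/s usable
assert select_bitrate(ladder, 500) == 1000   # nothing fits, use the lowest rung
```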

3.3.2. Live Media Syndication

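Table 5

+-------------------+----------------------+
| Attribute         | Value                |
+-------------------+----------------------+
| Senders/Receivers | One to One           |
| Bi-directional    | No                   |
| Latency           | Ull-200 to Ultra-Low |
+-------------------+----------------------+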

Where media is sent onwards to another platform for further distribution. The media may be compressed down to a bitrate lower than the source, but larger than the final distribution output. Streams may be redundant with failover mechanisms in place.

3.3.3. Live Media Streaming

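Table 6

+-------------------+----------------------+
| Attribute         | Value                |
+-------------------+----------------------+
| Senders/Receivers | One to Many          |
| Bi-directional    | No                   |
| Latency           | Ull-200 to Ultra-Low |
+-------------------+----------------------+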

Where media is received from a live broadcast or stream. This may comprise multiple audio or video outputs with different codecs or bitrates. This may also include other types of media essence such as subtitles or timing signalling information (e.g. markers to indicate change of behaviour in client such as advertisement breaks). This may also include the use of "live rewind", where a window of media behind the live edge is made available for clients to play back, either because the local player falls behind the live edge or because the viewer wishes to play back from a point in the past.
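
A "live rewind" window can be pictured as a bounded buffer of the most recent segments. The sketch below is one possible shape for such a buffer, not a proposed design; the class name, method names, and the use of integer sequence numbers are assumptions for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


class LiveRewindWindow:
    """Keeps the most recent `capacity` segments behind the live edge."""

    def __init__(self, capacity: int) -> None:
        self._segments: Deque[Tuple[int, bytes]] = deque(maxlen=capacity)

    def publish(self, sequence: int, payload: bytes) -> None:
        """Append the newest segment; the oldest is dropped once the window is full."""
        self._segments.append((sequence, payload))

    def live_edge(self) -> int:
        """Sequence number of the newest retained segment."""
        return self._segments[-1][0]

    def play_from(self, sequence: int) -> List[Tuple[int, bytes]]:
        """Return retained segments from `sequence` up to the live edge."""
        return [(seq, data) for seq, data in self._segments if seq >= sequence]


window = LiveRewindWindow(capacity=3)
for seq in range(1, 6):                       # segments 1..5 are published
    window.publish(seq, f"segment-{seq}".encode())

# Segments 1 and 2 have aged out, so a viewer asking for 2 resumes at 3.
assert [seq for seq, _ in window.play_from(2)] == [3, 4, 5]
assert window.live_edge() == 5
```

The same structure serves both cases in the text: a player that has fallen behind resumes from its last sequence number, and a viewer who rewinds asks for an earlier one.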

4. Requirements for Protocol Work

Our goal in this section is to understand the requirements that result from the use cases described in Section 3.

Note: the initial high-level organization for this section is taken from Suhas Nandakumar's presentation, "Progressing MOQ" [Prog-MOQ], at the October 2022 MOQ virtual interim meeting, which was in turn taken from the MOQ working group charter [MOQ-charter]. We think this is a reasonable starting point. We won't be surprised to see the high-level structure change a bit as things develop, but we didn't want to have this section COMPLETELY blank when we request working group adoption.

TODO: Describe overall, high level requirements that we previously stated in earlier versions of this document.

4.1. Common Publication Protocol for Media Ingest and Distribution

Many of the use cases have bi-directional flows of media, with clients both sending and receiving media concurrently. The protocol should therefore have a unified approach to connection negotiation and signalling for sending and receiving media, both at the start of a session and throughout its lifetime. This includes describing when a flow of media is unsupported (e.g. a live media server signalling that it does not support receiving from a given client).

4.2. Client Media Request Protocol

In the initiation of a session both client and server must perform negotiation in order to agree upon a variety of details before media can move in any direction:

  • Is the client authenticated and subsequently authorised to initiate a connection?
  • What media is available, and for each, what are the parameters such as codec, bitrate, and resolution?
  • Is sending of media from a client permitted? If so, what media is accepted?

Re-negotiation within an existing session should be supported to allow changes in what is being sent or received.
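
The three questions above can be pictured as a single request/answer exchange. The following sketch is only an illustration of the information involved; the type names, field names, and the `negotiate` function are hypothetical and do not describe any agreed MOQ message format.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class MediaOption:
    name: str
    codec: str
    bitrate_kbps: int
    resolution: str          # e.g. "1280x720"; empty for audio


@dataclass
class SessionRequest:
    token: str                                        # client credential
    wants_to_send: List[MediaOption] = field(default_factory=list)


@dataclass
class SessionAnswer:
    authorized: bool                                  # question 1
    available: List[MediaOption] = field(default_factory=list)             # question 2
    accepted_from_client: List[MediaOption] = field(default_factory=list)  # question 3


def negotiate(request: SessionRequest,
              valid_tokens: Set[str],
              catalogue: Iterable[MediaOption],
              accepted_codecs: Set[str]) -> SessionAnswer:
    """Answer the three negotiation questions for one client request."""
    if request.token not in valid_tokens:
        return SessionAnswer(authorized=False)
    accepted = [m for m in request.wants_to_send if m.codec in accepted_codecs]
    return SessionAnswer(True, list(catalogue), accepted)


catalogue = [MediaOption("main-video", "h264", 2500, "1280x720"),
             MediaOption("main-audio", "opus", 64, "")]
request = SessionRequest("secret", [MediaOption("mic", "opus", 32, "")])
answer = negotiate(request, {"secret"}, catalogue, {"opus"})
assert answer.authorized and answer.accepted_from_client == request.wants_to_send
```

In this sketch, re-negotiation is simply a second call to `negotiate` with an updated request. A server that does not accept media from clients would pass an empty set of accepted codecs.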

4.3. Naming and Addressing Media Resources

As multiple streams of media may be available for concurrent sending, such as multiple camera views or audio tracks, the protocol should provide a means of identifying both the technical properties of each resource (codec, bitrate, etc.) and a useful identification for playback. A base level of optional metadata, e.g. the known language of an audio track or the name of a participant's camera, should be supported, but further extended metadata about the contents of the media or its ontology should not be supported.
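
As a rough picture of what such a description might contain, the sketch below separates the technical properties of a resource from a small set of optional metadata. The class name, field names, and metadata keys are assumptions for illustration and are not proposed protocol elements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MediaResource:
    resource_id: str                  # identifier used to address the resource
    codec: str                        # technical properties
    bitrate_kbps: int
    label: str                        # human-readable identification for playback
    metadata: Dict[str, str] = field(default_factory=dict)  # optional, base level only


def find_by_metadata(resources: List[MediaResource],
                     key: str, value: str) -> List[MediaResource]:
    """Return resources whose optional metadata has `key` set to `value`."""
    return [r for r in resources if r.metadata.get(key) == value]


resources = [
    MediaResource("camera-1", "h264", 2500, "Main camera"),
    MediaResource("audio-en", "opus", 64, "English commentary", {"language": "en"}),
    MediaResource("audio-nl", "opus", 64, "Dutch commentary", {"language": "nl"}),
]

dutch = find_by_metadata(resources, "language", "nl")
assert [r.resource_id for r in dutch] == ["audio-nl"]
```

Keeping the metadata to a flat set of optional keys reflects the text above: enough to choose a track for playback, without describing the contents of the media.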

4.4. Packaging Media

Packaging of media describes how raw media is encapsulated for transport. At a high level, there are two approaches to this:

  • Within the protocol itself, where the protocol defines, for each media encoding, how the ancillary data required for decoding the media is carried.
  • A common encapsulation format such as ISOBMFF which defines a generic method for all media and handles ancillary decode information.

The working group must agree on which approach should be taken to the packaging of media, taking into consideration the various technical trade-offs that each entails.

4.5. End-to-end Security

End-to-end security describes the use of encryption of the media stream(s) to provide confidentiality in the presence of unauthorized intermediaries or observers, and to prevent or restrict the ability to decrypt the media without authorization. Generally, there are three aspects of end-to-end media security:

  • Media Rights Management, which refers to the authorization of receivers to decode a media stream.
  • Sender-to-Receiver Media Security, which refers to the ability of media senders and receivers to transfer media while protected from authorized intermediaries and observers, and
  • Node-to-node Media Security, which refers to security when authorized intermediaries are needed to transform media into a form acceptable to authorized receivers. For example, this might refer to a video transcoder between the media sender and receiver.

Note: "Node-to-node" refers to a path segment connecting two MOQ nodes that makes up part of the end-to-end path between the MOQ sender and ultimate MOQ receiver.

The working group must agree on a number of details here, and perhaps the first question is whether the MOQ protocol makes any provision for "node-to-node" media security, or simply treats authorized transcoders as MOQ receivers. If that's the decision, all MOQ media security is "sender-to-receiver", but some "ends" may not be either senders or ultimate receivers, from a certain point of view.

5. IANA Considerations

This document makes no requests of IANA.

6. Security Considerations

As this document is intended to guide discussion and consensus, it introduces no security considerations of its own.

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.

7.2. Informative References

[I-D.draft-cardwell-iccrg-bbr-congestion-control]
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. Jacobson, "BBR Congestion Control", Work in Progress, Internet-Draft, draft-cardwell-iccrg-bbr-congestion-control-02, <https://datatracker.ietf.org/doc/html/draft-cardwell-iccrg-bbr-congestion-control-02>.
[I-D.draft-jennings-moq-quicr-arch]
Jennings, C. and S. Nandakumar, "QuicR - Media Delivery Protocol over QUIC", Work in Progress, Internet-Draft, draft-jennings-moq-quicr-arch-01, <https://datatracker.ietf.org/doc/html/draft-jennings-moq-quicr-arch-01>.
[I-D.draft-jennings-moq-quicr-proto]
Jennings, C., Nandakumar, S., and C. Huitema, "QuicR - Media Delivery Protocol over QUIC", Work in Progress, Internet-Draft, draft-jennings-moq-quicr-proto-01, <https://datatracker.ietf.org/doc/html/draft-jennings-moq-quicr-proto-01>.
[I-D.draft-kpugin-rush]
Pugin, K., Frindell, A., Cenzano, J., and J. Weissman, "RUSH - Reliable (unreliable) streaming protocol", Work in Progress, Internet-Draft, draft-kpugin-rush-01, <https://datatracker.ietf.org/doc/html/draft-kpugin-rush-01>.
[I-D.draft-lcurley-warp]
Curley, L., Pugin, K., and S. Nandakumar, "Warp - Segmented Live Media Transport", Work in Progress, Internet-Draft, draft-lcurley-warp-02, <https://datatracker.ietf.org/doc/html/draft-lcurley-warp-02>.
[MOQ-charter]
"Media Over QUIC (moq)", <https://datatracker.ietf.org/wg/moq/about/>.
[MOQ-ucr]
"MOQ Use Cases and Requirements", <https://datatracker.ietf.org/meeting/interim-2022-moq-01/materials/slides-interim-2022-moq-01-sessa-moq-use-cases-and-requirements-individual-draft-working-group-draft-00>.
[Prog-MOQ]
"Progressing MOQ", <https://datatracker.ietf.org/meeting/interim-2022-moq-01/materials/slides-interim-2022-moq-01-sessa-progressing-moq-00.pdf>.
[RFC3550]
Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, <https://www.rfc-editor.org/rfc/rfc3550>.
[RFC6363]
Watson, M., Begen, A., and V. Roca, "Forward Error Correction (FEC) Framework", RFC 6363, DOI 10.17487/RFC6363, <https://www.rfc-editor.org/rfc/rfc6363>.
[RFC6716]
Valin, JM., Vos, K., and T. Terriberry, "Definition of the Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, <https://www.rfc-editor.org/rfc/rfc6716>.
[RFC9000]
Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, <https://www.rfc-editor.org/rfc/rfc9000>.
[WebTrans-charter]
"WebTransport (webtrans)", <https://datatracker.ietf.org/wg/webtrans/about/>.

Appendix A. Acknowledgements

The authors would like to thank several authors of individual drafts that fed into the "Media Over QUIC" charter process:

We would also like to thank Suhas Nandakumar for his presentation, "Progressing MOQ" [Prog-MOQ], at the October 2022 MOQ virtual interim meeting. We used his outline as a starting point for the Requirements section (Section 4).

James Gruessing would also like to thank Francesco Illy and Nicholas Book for their part in providing the needed motivation.

Authors' Addresses

James Gruessing
Nederlandse Publieke Omroep
Netherlands
Spencer Dawkins
Tencent America LLC
United States of America