Internet-Draft | Dream-Pipe or Pipe-Dream | September 2021 |
Morton | Expires 10 March 2022 | [Page] |
This memo addresses the problem of defining relevant properties and metrics with the goal of improving Internet access for all users. Where the fundamental metrics are well-defined, a framework to standardize new metrics exists and has been used with success. Users consider reliability to be important, as well as latency and capacity; it really depends who you ask and their current experiences.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 10 March 2022.¶
Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
This memo addresses the problem of defining relevant properties and metrics with the goal of improving Internet access for all users. Much has already been done, and it's important to have a common foundation as we move forward. There is certainly more to understand about the problem and the approaches to a solution.¶
Part of the motivation for examining metrics in greater detail at this time is the belief that "Internet speed" is no longer the only service dimension that matters to users. A small sample of recent surveys follows.¶
A 2021 UK study by EY [EY-Study] (Decoding the digital home, Chapter 2; summarized in Advanced Television) surveyed 2500 subscribers and found that "Fifty-eight per cent of UK households believe broadband reliability is more important than speed". Also, "the appetite for a consistent connection aligns with perceptions that broadband reliability declined during the pandemic - 29 per cent across all households, rising to 46 per cent in households with children aged up to 11 years." If the family is online more often, they will notice more outages.¶
Reliability is surely important, but other factors come into play: "...nearly half (47 per cent) don't think upgrading to higher-speed packages is worth the cost. Meanwhile, 29 per cent say they don't understand what broadband speed means in practice." [EY-Study] Price trade-off and knowledge about the communication details play a role when discussing speed. Reliability issues are experienced by everyone.¶
All the findings above are predicated on the current "speeds" being delivered. For example, "Fifty-two percent of rural users are frustrated that the fastest speeds are not available in their area, and only 52 per cent believe they are getting value for money from their current broadband package." [EY-Study] The proportion of urban users was 10-15% higher when asked about value. The broadband speed available often defines the development gap in world-wide surveys, and users in under-served areas are likely glad to see increases [N-Africa].¶
Another survey described in [DontKnow] found that 36% of Americans *don't* know their Internet speed. A higher percentage of females did not know (47%) than males (25%). Older Americans (55+) and those in lower income households are other demographics where service speed was an above-overall-average gap in their Internet knowledge. Those who did not know their speed tended to be satisfied with the speed they receive; about 20% of these indicated dissatisfaction.¶
Latency wasn't mentioned in [EY-Study], but "While the survey indicates that most households ultimately want 'the basics' to work well, those that do consider additional features as part of a broadband bundle favour privacy and security (48 per cent), reflecting wider anxieties and concerns relating to data protection experienced during the pandemic." Many users want their service provider to provide all the services.¶
User expectations are greatly influenced by their current experiences, and by what is technologically feasible at the time. Users have a view of the levels of availability, quality, and utility that constitute the overall experience in their situation of use (e.g., stationary in their home).¶
When we insert a communication network as an integral part of an activity (a task or form of entertainment), then the expectation doesn't change or make allowances without a trade-off between the new features and the new situation. For example, take the home but add the ability to travel: a stretch limousine fills some of the need but with reduced capabilities (the refrigerator is smaller, no beds or lavatory; there is less of everything). But we have a TV!¶
Let's assume the TV uses analog transmission for this discussion. The TV is smaller and experiences reception outages, so we are suddenly aware of radio characteristics like fading and multipath (especially with earlier analog transmissions, there is no buffering at all). The limo is really nicely appointed, but it's not all we dreamed-of (or possibly even expected), especially the TV reception because we wanted to watch a play-off game on this trip!¶
Where did they go wrong providing the TV in the Limo? They made the communication channel a much more obvious part of the viewing experience. And the first-time users didn't expect it. They inserted unexpected impairments in the communication channel by adding mobility. The TV designer might not have been aware of the moving use case; their "portable TV" means you can pick-up the TV easily, not watch TV at high speed, so needed features were not provided (a tracking antenna, to start with...).¶
Since the goal of the workshop is to improve Internet Access for *all* users, we have set a difficult task. Some users might want a dedicated pipe to communicate in the ways they choose or access content that they have identified. Other users are willing to share a pool of communication resources to communicate only when and where they want, for the potential benefits of being able to communicate more widely, spontaneously, and possibly for less cost than the dedicated pipe. It seems that we should focus on the subset of performance attributes that will benefit as many users as possible, and over which our constituent organizations have some control.¶
To some, the maximum bit rate remains the primary goal.¶
The Internet Speed record previously held by University College London (178 Tbps) was topped in July 2021 by a team of researchers from the National Institute of Information and Communications Technology (NICT) in Japan. The new world record for Internet speed is 319 Tbps, using 4-core fiber [WorldRecord].¶
Higher link rates and subscriber rates may not be everything to users, but there can be a cross-over dependency to latency performance. Packet serialization time is inversely proportional to the link rate, so it shrinks as link speeds increase. Bursts of large packets arriving for one stream affect the buffer time for packets in other streams that arrive behind them in a single FIFO queue, but again the problem is relieved by using higher link rates. For example, early VoIP used very low bit rate codecs: 8 kbps was common, and so were 10 Mbps LANs. But mixing bursty and periodic traffic meant unreliable delivery and delay variation for the latter. Packet marking, prioritization of multiple queues and queue management methods helped. More rate headroom and adaptation for the network impairments was also welcome.¶
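The dependency between link rate and latency described above can be illustrated with a little arithmetic. The following is a minimal sketch, not a measurement method: the function names, the 1500-byte packet size, and the 10-packet burst are assumptions chosen for illustration, and the model ignores propagation delay and layer-2 overhead.

```python
def serialization_time_s(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the link: bits divided by link rate."""
    return packet_bytes * 8 / link_bps


def fifo_wait_s(burst_packets: int, burst_packet_bytes: int, link_bps: float) -> float:
    """Queueing delay for a packet arriving behind a burst in a single FIFO queue.

    Assumes the whole burst is already queued and must be serialized first.
    """
    return burst_packets * serialization_time_s(burst_packet_bytes, link_bps)


if __name__ == "__main__":
    # Illustrative link rates only: a 10 Mbps LAN and a Gigabit link.
    for name, rate_bps in (("10 Mbps", 10e6), ("1 Gbps", 1e9)):
        one_packet = serialization_time_s(1500, rate_bps)
        behind_burst = fifo_wait_s(10, 1500, rate_bps)
        print(f"{name}: one 1500-byte packet {one_packet * 1e3:.3f} ms, "
              f"wait behind a 10-packet burst {behind_burst * 1e3:.3f} ms")
```

Under these assumptions, a periodic VoIP packet queued behind the burst waits about 12 ms on the 10 Mbps link and about 0.12 ms on the Gigabit link, which is the relief from higher link rates that the paragraph describes.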
Instead of providing more capacity for a single user, today's Gigabit services support more users efficiently on the same access service. So higher rates can improve the other important dimensions of performance.¶
Perhaps the ideal view of point-to-point communications is a pipe that illustrates many fundamental communication properties. A dream-pipe, so to speak.¶
The apparent rigidity of the pipe model helps identify additional properties that are needed:¶
When we say, "delivered in sufficiently unadulterated form", we could mean:¶
So, a short list of network properties that contribute to good user experience are:¶
Networking and geographic reality tell us that we are unlikely to see all properties at once, for all time, AOE (anywhere on Earth); that's the extreme Pipe-Dream.¶
But attaining a good user experience level in different communication activities/scenarios likely implies different demands among the four properties above, and places different relative importance on each of these properties.¶
Author's Note: I'm positing a rigid pipe as the object of an idealistic idea, or "dream pipe". I hope there won't be any confusion about the play on words with "pipe dream"; the dream pipe is a pipe dream, especially when the dream includes zero latency between any two points. I'm not talking about a hose or flexible pipe, either.¶
The IETF has been working the problem of standardizing metrics for the Internet and the communication streams it transports for well over 20 years. Many other organizations have been successfully working in this area as well, and hopefully they will identify their literature and key results for review.¶
The IETF's efforts to define IP-Layer and Transport-Layer performance metrics and methods have largely been carried out in the IPPM working group (IP-Performance Metrics, later called IP-Performance Measurements). Beginning as a somewhat joint effort with the performance-focused Benchmarking Methodology Working Group (BMWG), IPPM was chartered individually in 1997. IPPM has extensive literature relevant to Internet measurement. The IPPM working group has a strong foundation in its Framework [RFC2330] that has been updated over time, with [RFC7312] and [RFC8468]. What we can *standardize and measure* gives us a basis to evaluate and determine whether we have made it better (or not).¶
The problem that initiated the IPPM work turned out to be the most difficult to solve (Bulk Transfer Capacity, BTC [RFC3148]), and has taken the longest. Meanwhile, the standards for fundamental metrics other than BTC turned out to be sufficiently challenging.¶
Here is a list of fundamental packet transfer metrics, specified in RFCs:¶
The metrics and methods above were specified with considerable flexibility, so that they could be applied in a range of specific circumstances.¶
One of the most flexible metrics is IP Packet Delay Variation [RFC3393], which is a "derived metric", in that it requires One-Way Delay measurements for assessment. The powerful feature of [RFC3393] is the selection function, which permits comparing the delays of any pair of packets in the stream. Fortunately, the performance community predominantly uses one of two forms of delay variation, the inter-packet delay variation and the packet delay variation forms. These forms are defined and compared in efficacy for measurement uses in [RFC5481] along with many other considerations and measurement forms/processing.¶
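The two predominant forms can be sketched from a list of one-way delays. This is a minimal illustration, assuming every packet in the stream arrived and that delays are listed in sending order; the function names and sample delays are hypothetical, and the treatment of lost packets discussed in [RFC5481] is omitted.

```python
from typing import List


def ipdv(delays: List[float]) -> List[float]:
    """Inter-packet delay variation: delay of packet i minus delay of packet i-1."""
    return [later - earlier for earlier, later in zip(delays, delays[1:])]


def pdv(delays: List[float]) -> List[float]:
    """Packet delay variation: delay of packet i minus the minimum delay in the stream."""
    d_min = min(delays)
    return [d - d_min for d in delays]


if __name__ == "__main__":
    # Hypothetical one-way delays in milliseconds, in sending order.
    one_way_delays_ms = [20.0, 22.0, 21.0, 30.0, 20.0]
    print("IPDV:", ipdv(one_way_delays_ms))
    print("PDV: ", pdv(one_way_delays_ms))
```

In terms of the [RFC3393] selection function, IPDV selects adjacent packets, while PDV pairs each packet with the minimum-delay packet, so PDV values are never negative.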
It is possible to create new derived metrics at the IP-layer, and to measure similar quantities (loss, delay, reordering) at other layers [ForAll] [RFC6390] [RFC6076].¶
Another example of a derived metric uses loss instead of delay. This metric, which does not have a direct parallel in the IETF literature, is the Stream Block metric found in ITU-T Recommendation Y.1540 [Y.1540] (virtually all the IP-layer metrics are found in one standard). This metric assigns consecutive packets into multi-packet blocks, and assesses the number of lost packets in a block as a surrogate for a higher-layer process's ability to maintain good communication. For example, a Forward Error Correction process might be able to replace 2 lost packets in any order, but not 3. There is a parallel to retransmission rate limits when attempting to maintain a continued loss-free ratio with buffering to allow for the retransmission time.¶
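The block-based idea can be sketched as follows. This is a simplified illustration of the concept, not the normative definition in [Y.1540]: the function names, the block size of 4, and the handling of a trailing partial block (it is dropped) are all assumptions.

```python
from typing import List


def lost_per_block(received: List[bool], block_size: int) -> List[int]:
    """Count lost packets in each complete block of consecutive packets.

    received[i] is True when packet i arrived. A trailing partial block is
    ignored (an assumption of this sketch).
    """
    last_start = len(received) - block_size
    return [
        sum(1 for ok in received[start:start + block_size] if not ok)
        for start in range(0, last_start + 1, block_size)
    ]


def impaired_block_ratio(received: List[bool], block_size: int, max_repairable: int) -> float:
    """Fraction of blocks with more losses than the higher layer can repair."""
    losses = lost_per_block(received, block_size)
    if not losses:
        raise ValueError("need at least one complete block")
    return sum(1 for n in losses if n > max_repairable) / len(losses)


if __name__ == "__main__":
    # Hypothetical arrival record for 12 packets: True = received, False = lost.
    record = [True, False, True, True,
              False, False, True, False,
              True, True, True, True]
    print(lost_per_block(record, 4))            # losses per block
    print(impaired_block_ratio(record, 4, 2))   # FEC assumed to repair up to 2 losses
```

Here `max_repairable=2` mirrors the Forward Error Correction example in the text: a block with 2 losses is still recoverable, a block with 3 is not.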
Work on Network and Bulk Transport Capacity has been chartered and has progressed over twenty years. The performance community has seen development of Informative definitions in [RFC3148] for Framework for Bulk Transport Capacity (BTC), [RFC5136] for Network Capacity, Maximum IP-layer Capacity (in RFC9097-to-be), and the Experimental metric definitions and methods in [RFC8337], Model-Based Metrics for BTC.¶
One quantity that could be measured without too much controversy is the Maximum IP-Layer Capacity (judging by the adoption in standards bodies, RFC9097-to-be) [I-D.ietf-ippm-capacity-metric-method]. This is the basis for many service specifications (and there is a technology or configuration-limited "ground truth" for the measurement), and can be tested simply with minimal reliance on end systems. The method deploys a feedback channel from the receiver to control the sender's transmission rate in near-real-time, and search for the maximum.¶
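The general idea of a receiver-driven search can be sketched with a toy simulation. This is not the search algorithm specified in [I-D.ietf-ippm-capacity-metric-method]; the step sizes, the back-off rule, and the `measure_interval` stand-in for a real sender/receiver exchange are all hypothetical, and the bottleneck is modeled as a fixed rate with no cross traffic.

```python
from typing import Tuple


def measure_interval(send_rate_mbps: float, bottleneck_mbps: float) -> Tuple[float, bool]:
    """Hypothetical stand-in for one measurement interval.

    Returns the rate the receiver observed and whether it saw loss.
    """
    received = min(send_rate_mbps, bottleneck_mbps)
    return received, send_rate_mbps > bottleneck_mbps


def search_max_capacity(bottleneck_mbps: float,
                        start_mbps: float = 1.0,
                        step_mbps: float = 10.0,
                        intervals: int = 50) -> float:
    """Adjust the sending rate from receiver feedback; report the highest received rate."""
    rate = start_mbps
    best = 0.0
    for _ in range(intervals):
        received, loss_seen = measure_interval(rate, bottleneck_mbps)
        best = max(best, received)
        if loss_seen:
            # Feedback reports loss: back off by half a step.
            rate = max(start_mbps, rate - step_mbps / 2)
        else:
            # Feedback reports clean delivery: probe a higher rate.
            rate += step_mbps
    return best


if __name__ == "__main__":
    print(search_max_capacity(bottleneck_mbps=95.0))
```

The reported value is the maximum received rate over all intervals, which is why the search can overshoot the bottleneck briefly and still produce a sensible result.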
The "invisible" radio dream pipe presents a challenge in terms of results variability for all metrics and adds at least one critical input parameter: location. It may be that the variability with location and time are key metrics to help users understand radio coverage (in addition to signal strength portrayed by the number of "bars").¶
A network property that is very high on the list in Section 2 is Availability. There is treatment of this important property as a metric in IETF (Connectivity [RFC2678]) and somewhat different details in an alternate definition (ITU-T Point-to-Point IP Service Availability [Y.1540]), but not much deployment for such an important pre-requisite to the rest of the metrics. Both Connectivity and Availability rely on packet loss measurements; in fact, they can be considered *derived metrics*, adding time constraints and/or loss ratio thresholds to the fundamental loss measurements on a packet stream.¶
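A derived availability metric of this kind can be sketched by applying a loss ratio threshold to fixed-length evaluation intervals. The sketch below is illustrative only: the function names, the interval data, and the threshold of 0.5 are assumptions, not the normative parameters of [RFC2678] or [Y.1540].

```python
from typing import List, Tuple


def interval_available(sent: int, lost: int, loss_ratio_threshold: float) -> bool:
    """An interval counts as available when its loss ratio does not exceed the threshold."""
    if sent <= 0:
        raise ValueError("an interval needs at least one packet sent")
    return lost / sent <= loss_ratio_threshold


def availability_ratio(intervals: List[Tuple[int, int]], loss_ratio_threshold: float) -> float:
    """Fraction of evaluation intervals (sent, lost) judged available."""
    if not intervals:
        raise ValueError("need at least one interval")
    states = [interval_available(sent, lost, loss_ratio_threshold) for sent, lost in intervals]
    return sum(states) / len(states)


if __name__ == "__main__":
    # Hypothetical (sent, lost) counts for four consecutive intervals.
    observed = [(100, 0), (100, 5), (100, 80), (100, 100)]
    print(availability_ratio(observed, loss_ratio_threshold=0.5))
```

The time constraint lives in the choice of interval length, and the loss ratio threshold turns the fundamental loss measurement into an available/unavailable decision.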
Sometimes the end systems decide whether the path is available or not. In one on-line gaming system, reception of ICMP Destination Unreachable caused complete and immediate termination of the session with loss of all progress and resources accumulated. Customer service centers sometimes experienced call-in overload from users seeking to restore their player environment. But the Destination Unreachable condition was temporary and resolved by routing updates in a few seconds. The ultimate fix was to delay the session's reaction to Destination Unreachable for a few seconds, and recovery was automatic. So, when end systems play a role in the definition of connectivity or availability, they must also be cognizant that all automatic failure detection and restoration requires some amount of time. Since failures are inevitable, the dream-pipe heals itself, too (and doesn't confuse end-systems or users with error messages that "sound final").¶
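The fix described above, delaying the reaction to Destination Unreachable, can be sketched as a small grace-period guard. The class and method names are hypothetical, the 5-second grace period is an assumed value, and timestamps are passed in explicitly so the logic is independent of any particular clock or socket API.

```python
from typing import Optional


class SessionGuard:
    """Terminate a session only if unreachability persists beyond a grace period."""

    def __init__(self, grace_s: float) -> None:
        self.grace_s = grace_s
        self.unreachable_since: Optional[float] = None

    def on_unreachable(self, now: float) -> None:
        """Record the first Destination Unreachable of the current episode."""
        if self.unreachable_since is None:
            self.unreachable_since = now

    def on_packet_received(self, now: float) -> None:
        """Any successful reception shows the path has recovered."""
        self.unreachable_since = None

    def should_terminate(self, now: float) -> bool:
        """True only when the path has stayed unreachable for the whole grace period."""
        return (self.unreachable_since is not None
                and now - self.unreachable_since >= self.grace_s)


if __name__ == "__main__":
    guard = SessionGuard(grace_s=5.0)
    guard.on_unreachable(now=0.0)
    print(guard.should_terminate(now=2.0))   # still inside the grace period
    guard.on_packet_received(now=3.0)        # routing update restored the path
    print(guard.should_terminate(now=10.0))  # episode was cleared
```

The design choice is to treat the error message as the start of an episode rather than as a final verdict, leaving time for automatic restoration.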
Most measurement systems begin their process with a Source-Destination packet exchange prior to actual measurements. If this pre-measurement exchange fails, then the test is not conducted (and re-tried later). But the most useful information to assess continuous connectivity/Availability is the record of test set-up success or failure over time. Measurement systems that make this info readily available do a more complete job of network characterization than others.¶
We cannot leave the topic of metrics without mentioning the equally important topic of measurement streams for Active Metrics and Measurements [RFC7799]. When attempting to measure characteristics of VoIP streams, the IETF agreed on ways to produce periodic streams in an acceptable way [RFC3432]. Many measured results completely depend on the stream characteristics, and inter-packet delay variation is a great example. If the stream contains packet bursts, are the bursts preserved in transit? We would ask later-on whether preserving burst spacing matters to the communication quality or to the user's experience. The answer is likely a matter of degree, and dependent on the communications activity itself.¶
When we consider the topic of new test streams in the context of Gigabit and higher access speeds intended to support multiple users, we might consider the notion of a "standard single user's stream set". Then we might measure how many simultaneous standard users can be supported with sufficient network performance: an indicator of each user's experience in the "dream-pipe". Of course, the details of a standard user's traffic would change over time, so we couldn't argue over the current year's definition for a year... The facility to register the new standard user test streams would be a key part of such a solution.¶
If we had measured only the fundamental metrics, we might ask, "we saw a small proportion of packet losses; did this matter to users? Did losses affect their satisfaction during their activity in any way? Did any users experience an outage at this loss level?"¶
The likely role for derived metrics is to improve results interpretation by measuring a new quantity, possibly in the context of a newly-defined packet stream, thereby making the process of results-interpretation easier to perform.¶
For example, Delay Variation metrics had many possible formulations, but two main forms emerged. RFC 5481 [RFC5481] compared the IPDV and PDV forms with tasks that network and application designers were facing at the time. The RFC describes measurement considerations and results interpretation from a purely objective point of view. The most useful result interpretation was to show how PDV (a characterization of the delay distribution from minimum to a high percentile) could be used to determine the size of the de-jitter buffer needed for the tested path.¶
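The de-jitter buffer interpretation can be sketched by taking a high percentile of the PDV distribution. This is a minimal illustration, not the procedure in [RFC5481]: the nearest-rank percentile method, the choice of percentile, and the sample delays are assumptions, and lost packets are not considered.

```python
import math
from typing import List


def pdv_percentile(delays_ms: List[float], percentile: float) -> float:
    """Delay variation (delay minus minimum delay) at the given percentile.

    Uses the nearest-rank method on the sorted PDV values.
    """
    if not delays_ms:
        raise ValueError("need at least one delay sample")
    d_min = min(delays_ms)
    variation = sorted(d - d_min for d in delays_ms)
    rank = max(1, math.ceil(percentile * len(variation) / 100))
    return variation[rank - 1]


if __name__ == "__main__":
    # Hypothetical one-way delays in milliseconds for the tested path.
    samples = [20.0, 21.0, 20.0, 25.0, 22.0, 20.0, 40.0, 23.0, 21.0, 20.0]
    # A buffer sized to the 90th percentile absorbs the variation of about
    # nine packets in ten; later packets would arrive too late to play out.
    print("de-jitter buffer estimate (ms):", pdv_percentile(samples, 90))
```

Choosing a higher percentile trades added playout delay for fewer late packets, which is the kind of interpretation a purely fundamental delay measurement does not provide by itself.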
For other fundamental IP-Layer metrics, there is some amount of discussion of best practices and interpretation in each of the IETF IPPM Metric RFCs.¶
Many researchers (working in ITU-T Study Group 12 and elsewhere) take the information that can be derived from packet-layer measurements, plus higher layers when available, and produce objective estimates of user satisfaction by modeling user Mean Opinion Score (MOS) variation over a range of conditions. The process determines the user MOS through formal subjective testing in laboratories (or, more recently and prompted by COVID-19 conditions, in crowd-sourced scenarios). The corresponding objective models of user satisfaction are often determined through competition among several candidates, where the goal is to seek the most accurate model possible at the time. The modeling efforts often produce new derived metrics that facilitate automated interpretation. The main drawback is that the process described above takes significant time when conducted in the context of an industry standards body. So, user activities that are not particularly demanding of network performance do not receive much attention from researchers doing modeling; their performance is assumed to come along for the ride (but in a system with multiple queues or other categorizations, each activity's requirements need to be quantified). Nevertheless, a process to take in network measurements and produce a measure of user satisfaction is well-understood and used. The output of these objective models can rightly be called Quality of Experience (QoE), because a set of users' opinions is an inherent part of the result (without users, you don't have QoE! That's how QoE is defined and differs from QoS and network performance).¶
"What are the best ways to communicate these properties to service providers and network operators?" Work together with service providers and network operators. Everyone has a stake in our future.¶
There are quite a few networking professionals whose day-job is network operation, and who are participating now.¶
Whether we present a single figure of merit, or a set of relevant measurements on a dashboard summary, each number requires a frame of reference. This is true for everyone, not just everyday users.¶
One example of a solid reference for results comes from the fundamental benchmarking specification, [RFC2544]. The measured Throughput (as defined in [RFC2544]) must be compared to the maximum theoretical frame rate on the layer-2 technology (accounting for frame size, inter-frame gap, preamble, etc.). Tests with small frame size may not achieve the maximum frame rate due to header processing rate limitations, and tabulating the maximum with the results makes this fact very clear. Other reference levels can be made available, such as the capacity required to support 4k video (~25 Mbps), etc.¶
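As a worked example of such a reference, the maximum theoretical frame rate can be computed for Ethernet. The sketch assumes Ethernet as the layer-2 technology, with 20 bytes of per-frame overhead on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) and a frame size that counts the header and frame check sequence; the function name is hypothetical.

```python
# Per-frame overhead on the wire for Ethernet, in bytes:
# 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def max_frame_rate_fps(link_bps: float, frame_bytes: int) -> float:
    """Maximum theoretical frames per second for a given link rate and frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    # Tabulate the reference for a 1 Gbps link at a few common frame sizes.
    for size in (64, 512, 1518):
        print(f"{size:5d} bytes: {max_frame_rate_fps(1e9, size):12.0f} frames/s")
```

Tabulating these maxima next to measured Throughput shows at a glance whether a shortfall at small frame sizes reflects the device under test or simply the layer-2 limit.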
Some people will simply want to know whether the measurement result is good, bad, or somewhere in-between. We can follow common practice here to use colors (green, red, yellow in-between), or present the number on a gauge with suitable color cues. But we need to know the use case or the service specification accurately to do this.¶
In fact, a portion of user testing is prompted by subscribing to a new service provider, a new level of service (higher speed?), or a perception that poor-performance and trouble-shooting may be necessary; even an apparent outage may prompt test attempts using alternate devices and networks.¶
This last testing scenario is the most interesting: how can we help users when they encounter a problem? It's usually most important to isolate the problem in the complex network. When user-to-network-host results are failing, can the next step remove some of the components and check them in isolation (while keeping in mind that the acceptance level for a sub-network is a part of the end-to-end budget for performance)? Is it impossible to reach a particular far-end host? Does the access network appear to be unavailable, or is the problem related to interference on the WiFi radio network? With test hosts placed at strategic points in the path, it may be possible to segment the problem a user is experiencing.¶
When all components that comprise a user's activity work well together, then there are no surprises. Given that a large proportion of user expectations are met at some point in time, the metrics and performance levels that characterize the network's contribution to overall satisfaction are what we want to describe and maintain. Users consider reliability to be important, as well as latency and capacity; it really depends who you ask and their current experiences. End-system designers have a role to play in the process by recognizing the realities of packet networks and compensating for them: the dream-pipe absolutes are still a pipe-dream today.¶
There are many fundamental metrics already defined. But we might find that we need new metrics that make interpreting the results easier! The notion of *derived metrics* has been applied successfully. Test streams with a known bias toward a particular class of user streams can also be a useful basis for performance measurement. Where the fundamental metrics are well-defined, a framework to standardize new metrics and active test streams exists and has been used with success. Metrics can be defined that immediately improve our understanding of the performance presented to users, but to understand user satisfaction requires that user opinions are part of the development process.¶
All users, both knowledgeable and newcomers, need a frame of reference to understand what numerical measurements are telling them. The clues from the expected measurement range, the results from the recent past, or the theoretical maximum value all have their place. If users are willing, measurements should help them isolate their current issue to one or more networks and/or components in the user-to-X path.¶
If we break the problem down by specific communications activities and look for specific metrics for each one, it could take a long time to complete. Perhaps a categorization of the performance metrics, numeric criteria, and reliability of a "pseudo-dream-pipe" for a set of communication activities that have similar needs is a way to move ahead.¶
Active metrics and measurements have a long history of security considerations. The security considerations that apply to any active measurement of live paths are relevant here. See [RFC4656] and [RFC5357].¶
When considering privacy of those involved in measurement or those whose traffic is measured, the sensitive information available to potential observers is greatly reduced when using active techniques which are within this scope of work. Passive observations of user traffic for measurement purposes raise many privacy issues. We refer the reader to the privacy considerations described in the Large Scale Measurement of Broadband Performance (LMAP) Framework [RFC7594], which covers active and passive techniques.¶
This memo makes no requests of IANA.¶
Thanks to all the folks who have worked on performance metric development, many of whom are the authors of the references. Many more have provided their insights along the way. It's rewarding to travel with others (even for a short time), and to meet new people on the journey.¶