Internet-Draft | Inband Flow Learning Framework | March 2022 |
Han, et al. | Expires 4 September 2022 | [Page] |
To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 4 September 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Network telemetry [I-D.ietf-opsawg-ntf] is a technology for gaining network insight by applying means of network data generation, data collection, data correlation, and data consumption. It provides the network visibility to the state and behavior of a network, which is crucial for network operation and network load supervision. From operator's perspective, it is important to monitor live traffic running in the network, including the bandwidth occupied by the traffic, traffic delay, traffic jitter and traffic packet loss. Under this circumstance, inband performance measurement [I-D.ietf-mpls-inband-pm-encapsulation] [I-D.ietf-6man-ipv6-alt-mark] and inband flow information telemetry [I-D.song-opsawg-ifit-framework] work complementary to provide the network traffic supervision.¶
To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment. Note that this document focuses on generating telemetry data object based on inband performance measurement of data packet. Telemetry based on means other than inband performance measurement of data packet is not within the scope of this document.¶
IFL: Inband Flow Learning¶
IFITI: Inband Flow Information Telemetry Instance¶
The framework of Inband Flow Learning (IFL) includes three components of Service Discovery, Inband Flow Information Telemetry Deployment and Inband Flow Information Telemetry Adjustment shown in Figure 1.¶
+---------+---------------------+--------------------+--------------------+ |Component| Service | Inband Flow | Inband Flow | | | Discovery | Information | Information | | | |Telemetry Deployment|Telemetry Adjustment| +---------+---------------------+--------------------+--------------------+ |Function | Flow | Telemetry type | Telemetry instance | | | characteristic +--------------------+ aging | | | acquisition | Telemetry policy | | | | +--------------------+ | | | | Telemetry instance| | +---------+---------------------+--------------------+--------------------+ | Means |Configuration trigger| Controller Deploy | Data plane trigger | | +---------------------+--------------------+--------------------+ | |Live traffic sampling| Device Deploy | Controller trigger | +---------+---------------------+--------------------+--------------------+¶
Figure 1 Framework of Inband Flow Learning¶
Service Discovery: before starting the telemetry on service flows, characteristics of traffic which is currently being forwarded in network should be analyzed. The traffic characteristics can be acquired either from network operations or automatically generated from the sampling of live traffics.¶
Inband Flow Information Telemetry Deployment: after acquiring the traffic characteristics, telemetry of service flows can be planned and deployed. In IFL, telemetry is based on a class of flow characteristic and managed as an Inband Flow Information Telemetry Instance (IFITI). Before the network node starts the telemetry, the IFITI type and policy should be specified.¶
Inband Flow Information Telemetry Adjustment: when the traffic changes, telemetry instance varies as well. This components includes the identification of traffic change and further adjustment of telemetry instances.¶
Service discovery is a process of sampling to the service flow which is being transmitted in network in order to further determine which flow should be monitored. The characteristics of service flow are represented as IP source address, IP destination address, TCP/UDP port number, VRF, incoming/outgoing interface on network node, etc. To target of service discovery is to obtain the flow characteristics. There are two fundamental means to acquire the flow characteristics including configuration triggered and sampling based on live traffics. Regarding the means of triggered by configuration, not only includes the configuration of Interface/IP address/VRF/Route... configured on the network nodes, but also database of planed service flow information stored on the controller and obtained from network operations, such as a table of services between base station and core network elements. On the other hand, sampling on the live traffic means that the network node automatically samples the live traffic in network, and dynamically generate flow characteristics based on live traffic. It relies on the capability of forwarding plane of network node. The comparison of two means are provided in Figure 2.¶
+-----------+-----------------------+----------------------------+ | Means | Configuration trigger | Live traffic sampling | +-----------+-----------------------+----------------------------+ | pros | Easy | real time | +-----------+-----------------------+----------------------------+ | cons | miss of exceptions |extra FP capability required| +-----------+-----------------------+----------------------------+¶
Figure 2 Comparison of Means of Service Discovery¶
Inband flow information telemetry can be categorized into two modes: End-to-End (E2E) and Hop-by-Hop (HbH). For majority of services, E2E telemetry of service flows can meet the requirements from operators. In E2E mode shown in Figure 3, ingress node discovers the traffic characteristics and proceed on-path telemetry on device to report data to data consumer. Ingress node may also encapsulate flow identifier to facilitate the identification of flow information telemetry on egress node. Egress node identifies the flow and alternate marking identifier, proceed the record on packet number and timestamp, and further telemetry the statistics to data consumer. Transit node does not require any detection of flow information or processing of telemetry.¶
+-------------+ |Data Consumer| compute E2E flow info +-------------+ | | ___flow info__| |____flow info____ | telemetry telemetry | | | +---------+ +---------+ +---------+ +---------+ | Ingress |---| Transit | ...| Transit |---| Egress | | Node | | Node | | Node | | Node | +---------+ +---------+ +---------+ +---------+¶
Figure 3 End-to-End Telemetry Type Mode¶
The distinction of HbH mode to E2E mode is that transit node also participates the inband flow information learning and telemetry. In HbH mode shown in Figure 4, telemetry covers the flow information on every node of the forwarding path the flow packet is transmitted, which provides detailed flow information on each hop.¶
+-------------+ |Data Consumer| compute HbH flow info +-------------+ | | | | flow info telemetry ______________| | | |_________________ | ___| |___ | | | | | +---------+ +---------+ +---------+ +---------+ | Ingress |---| Transit | ...| Transit |---| Egress | | Node | | Node | | Node | | Node | +---------+ +---------+ +---------+ +---------+¶
Figure 4 Hop-by-Hop Telemetry Type Mode¶
Telemetry policy is used to determine which flow should be monitored. By configuring telemetry policy, it can increase the priority of learning and telemetry to critical flow and reduce or filter the learning and telemetry of unimportant flows. It is crucial to network deployment for two reasons, one is the number of flows can be huge, another is limited by telemetry processing capability either on the controller or the network node. There might be millions of flows in a large scale network, for example 5G mobile backhaul network. It is important to wisely choose the granularity of inband flow information telemetry. Regarding IP traffics, the telemetry policy can be based on either one of or combination of IP source/destination address, TCP/UDP port number, VRFs, or network device interfaces etc. To use an IP address with a flexible wildcard mask can be used as the telemetry to an aggregation of multiple flows. A flow identifier such as Flow-ID Label Indicator [I-D.ietf-mpls-inband-pm-encapsulation] or FlowMonID [I-D.ietf-6man-ipv6-alt-mark] is also used to identify a flow at transit or egress nodes.¶
In IFL, inband flow information telemetry is based on a class of flow characteristic and managed as an Inband Flow Information Telemetry Instance (IFITI). IFITI can be deployed on either controller or network node. When IFITI is created on controller and deployed from controller to network node. The network nodes including the ingress and egress node in E2E mode, as well as transit node in HbH mode are deployed with separate IFITI. It usually works in the need of an on-demand fault diagnose. When IFITI is created on network node, normally ingress node creates IFITI based on the received flow packets filtered and sampled by the pre-defined telemetry policy. Ingress node can also encode inband monitoring information in the flow packets. Transit or egress node detect the inband monitoring information of packets and automatically create IFITI to deploy the inband flow information telemetry. To create the IFITI on network node can greatly facilitate the dynamic and incremental deployment if needed.¶
The network node discovers the flow characteristic from the obtained service live traffic and sends it to the network controller. According to these flow characteristics, the network controller generates a Telemetry instance for monitoring the service flow. The network node obtains the instance and the corresponding identifier, such as Flow-ID, carries the identifier in the service flow to setup a relationship between the characteristic information, instance and the service flow, and performs Telemetry. The network controller also sends policies for the service discovery. The characteristic information extracting can base on the policy, preset cycle etc.¶
If the service message related to certain characteristic information is not received within the preset time, it is determined that the characteristic information is in an invalid state. And send the failure status information to the controller.¶
When route convergence happens to the network, service flow may switch to other forwarding nodes. To monitor the same flow information, new telemetry instance is required to add on the new transit or egress node. Regarding the IFITI running on the fault path, the aging of IFITI should be supported in order to recycle the network resources. IFITI should be deleted once it becomes stale. Similar to the deployment of IFITI, aging and adjustment of IFITI can be controlled by the central controller or network node. When a specific timer used for flow information telemetry timeout, the IFITI would be deleted to stop the telemetry of the flow.¶
This document has no request to IANA¶
TBD¶