Internet-Draft | Inband Flow Learning Framework | July 2023 |
Han, et al. | Expires 4 January 2024 | [Page] |
To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 4 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Network telemetry [RFC9232] is a technology for gaining network insight by applying means of network data generation, data collection, data correlation, and data consumption. It provides the network visibility to the state and behavior of a network, which is crucial for network operation and network load supervision. From operator's perspective, it is important to monitor live traffic running in the network, including the bandwidth occupied by the traffic, traffic delay, traffic jitter and traffic packet loss. Under this circumstance, inband performance measurement [I-D.ietf-mpls-inband-pm-encapsulation] [I-D.ietf-6man-ipv6-alt-mark] and inband flow information telemetry [I-D.song-opsawg-ifit-framework] work complementary to provide the network traffic supervision.¶
To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment. Note that this document focuses on generating telemetry data object based on inband performance measurement of data packet. Telemetry based on means other than inband performance measurement of data packet is not within the scope of this document.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
IFL: Inband Flow Learning¶
IFITI: Inband Flow Information Telemetry Instance¶
The framework of Inband Flow Learning (IFL) includes three components of Service Discovery, Inband Flow Information Telemetry Deployment and Inband Flow Information Telemetry Adjustment shown in Figure 1.¶
+---------+---------------------+--------------------+--------------------+ |Component| Service | Inband Flow | Inband Flow | | | Discovery | Information | Information | | | |Telemetry Deployment|Telemetry Adjustment| +---------+---------------------+--------------------+--------------------+ |Functions| Sampling policy | Telemetry policy | | | |---------------------+--------------------+ Aging | | | Flow characteristic | Telemetry instance | | | | acquisition | | | +---------+---------------------+--------------------+--------------------+¶
Figure 1 Framework of Inband Flow Learning¶
Before starting the telemetry on service flows, the service should be discovered. Service discovery is a process of sampling to the service flow which is being transmitted in network in order to further determine which flow should be monitored. The target of service discovery function is to obtain the flow characteristics. The characteristics of flows are represented in terms of IP source address, IP destination address, TCP/UDP port number, VRF, incoming/outgoing interface etc.¶
The flow characteristics are automatically obtained from the sampling of live traffics in data plane. It creates the data base of flow characteristics can further be used to generate flow telemetry. The network node discovers the flow characteristic from the obtained service live traffic and sends them to the network controller server, if the characteristics are not included in the characteristic information base. The rules of the sampling to the flows are called sampling policy. For example, a specific priority value of IP packet can be a rule of a sampling policy. Sampling policy can be pre-configured from control and management planes via various protocols, e.g. NETCONF. Network controller also sends policies for the service discovery before the flow characteristics discovery. The characteristic information extracting can be based on the policy, and preset cycle etc.¶
After acquiring the traffic characteristics, telemetry based on the inband flow information can be planned and deployed. There are two modes to deploy inband flow information telemetry: End-to-End (E2E) and Hop-by-Hop (HbH). To deploy the inband flow information telemetry, the telemetry policy and the telemetry instance are also defined in following subsections.¶
For majority of the services, end-to-end telemetry of service flows can meet the requirements from operators. In E2E mode shown in Figure 2, ingress node discovers the traffic characteristics and proceed on-path telemetry on device to report data to data consumer. Ingress node may also encapsulate flow identifier to facilitate the identification of flow information telemetry on egress node. Egress node identifies the flow and alternate marking identifier, proceed the record on packet number and timestamp, and further telemetry the statistics to data consumer. Transit node does not require any detection of flow information or processing of telemetry.¶
+-------------+ |Data Consumer| compute E2E flow info +-------------+ | | ___flow info__| |____flow info____ | telemetry telemetry | | | +---------+ +---------+ +---------+ +---------+ | Ingress |---| Transit | ...| Transit |---| Egress | | Node | | Node | | Node | | Node | +---------+ +---------+ +---------+ +---------+¶
Figure 2 End-to-End Telemetry Mode¶
The distinction of HbH mode to E2E mode is that transit node also participates the inband flow information learning and telemetry. In HbH mode shown in Figure 3, telemetry covers the flow information on every node of the forwarding path the flow packet is transmitted, which provides detailed flow information on each hop. Hop-by-Hop telemetry usually works in the need of an on-demand fault diagnose.¶
+-------------+ |Data Consumer| compute HbH flow info +-------------+ | | | | flow info telemetry ______________| | | |_________________ | ___| |___ | | | | | +---------+ +---------+ +---------+ +---------+ | Ingress |---| Transit | ...| Transit |---| Egress | | Node | | Node | | Node | | Node | +---------+ +---------+ +---------+ +---------+¶
Figure 3 Hop-by-Hop Telemetry Mode¶
The telemetry mode is also indicated by the service packet in data plane to help the transit node to differentiate the needs of telemetry.¶
Telemetry policy is used to determine which flow should be monitored. By configuring telemetry policy, it can increase the priority of learning and telemetry to critical flow and reduce or filter the learning and telemetry of unimportant flows. It is crucial to network deployment for two reasons, one is the number of flows can be huge, another is the limitation of processing capability either on the controller or the network node. There might be millions of flows in a large scale network, for example 5G mobile backhaul network. It is important to wisely choose the granularity of inband flow information telemetry.¶
Regarding IP traffics, the telemetry policy can be based on either one of or combination of flow characteristics, such as IP source/destination address, TCP/UDP port number, VRFs, or network device interfaces etc. An IP address with a flexible wildcard mask can also be used as means to provide telemetry policy to an aggregation of flows. Flow-ID Label Indicator [I-D.ietf-mpls-inband-pm-encapsulation] or FlowMonID [I-D.ietf-6man-ipv6-alt-mark] is also an alternative to identify the telemetry policy at transit or egress nodes.¶
Inband flow learning function manages the inband flow information telemetry based on Inband Flow Information Telemetry Instance (IFITI), in short called telemetry instance.¶
According to the flow characteristics, a telemetry instance for monitoring the service flow is generated by the network control plane in either distributed or centralized way. Ingress node can filter the received flows based on the pre-defined telemetry policy and create telemetry instance by itself. Network node can also obtain the instance and the corresponding identifier such as Flow-ID, encapsulate the identifier in the service flow to setup the relationship between the characteristic information, telemetry instance and the service flow, and perform telemetry.¶
Once the telemetry instance is created, ingress node can start the telemetry of flow information based on the method of alternative marking. At the same time, ingress node encodes inband monitoring information for example the flow ID in the service packets. Transit or egress node detect the inband monitoring information of packets and automatically create telemetry instance to deploy the inband flow information telemetry. The automatic creation of telemetry instance on network node can greatly facilitate the dynamic and incremental deployment.¶
The controller can also initiate the creation of telemetry instance and assign the telemetry instance to the ingress node to start the telemetry.¶
If the service message related to certain characteristic information is not received within the preset time, it is determined that the characteristic information is in an invalid state. And send the failure status information to the controller.¶
If the service message related to certain characteristic information is not received within the preset time, the characteristic information is determined to come into an invalid state. Further the failure status information is sent to the network controller.¶
When route convergence happens to the network, service flow may switch to other forwarding nodes. When the traffic changes, telemetry instance varies as well. Regarding the telemetry instance running on the fault path, the aging of IFITI should be supported in order to recycle the network resources. IFITI should be deleted once it becomes stale. To monitor the same flow information, new telemetry instance is required to add on the new transit or egress node. Note that aging and adjustment of IFITI can be initiated by controller or network node. When a specific timer used for flow information telemetry timeout, the IFITI would be deleted to stop the telemetry of the flow.¶
This document has no request to IANA¶
TBD¶