Internet-Draft Inband Flow Learning Framework July 2023
Han, et al. Expires 4 January 2024 [Page]
Workgroup:
OPSAWG Working Group
Internet-Draft:
draft-hwy-opsawg-ifl-framework-03
Published:
Intended Status:
Informational
Expires:
Authors:
L. Han
China Mobile
M. Wang
China Mobile
X. Wang
Huawei
T. Zhou
Huawei

Inband Flow Learning Framework

Abstract

To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 January 2024.

Table of Contents

1. Introduction

Network telemetry [RFC9232] is a technology for gaining network insight by applying means of network data generation, data collection, data correlation, and data consumption. It provides the network visibility to the state and behavior of a network, which is crucial for network operation and network load supervision. From operator's perspective, it is important to monitor live traffic running in the network, including the bandwidth occupied by the traffic, traffic delay, traffic jitter and traffic packet loss. Under this circumstance, inband performance measurement [I-D.ietf-mpls-inband-pm-encapsulation] [I-D.ietf-6man-ipv6-alt-mark] and inband flow information telemetry [I-D.song-opsawg-ifit-framework] work complementary to provide the network traffic supervision.

To deploy the inband performance measurement and flow information telemetry on live traffic, this document proposes a framework of an inband and flow based flow information learning mechanism called Inband Flow Learning (IFL). This document also provides different deployment approaches and considerations in practical network deployment. Note that this document focuses on generating telemetry data object based on inband performance measurement of data packet. Telemetry based on means other than inband performance measurement of data packet is not within the scope of this document.

2. Terminology and Conventions

2.1. Requirement Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.2. Terminology

IFL: Inband Flow Learning

IFITI: Inband Flow Information Telemetry Instance

3. Framework of Inband Flow Learning

The framework of Inband Flow Learning (IFL) includes three components of Service Discovery, Inband Flow Information Telemetry Deployment and Inband Flow Information Telemetry Adjustment shown in Figure 1.

   +---------+---------------------+--------------------+--------------------+
   |Component|      Service        |   Inband Flow      |   Inband Flow      |
   |         |     Discovery       |   Information      |   Information      |
   |         |                     |Telemetry Deployment|Telemetry Adjustment|
   +---------+---------------------+--------------------+--------------------+
   |Functions|   Sampling policy   |  Telemetry policy  |                    |
   |         |---------------------+--------------------+       Aging        |
   |         | Flow characteristic | Telemetry instance |                    |
   |         |     acquisition     |                    |                    |
   +---------+---------------------+--------------------+--------------------+

Figure 1 Framework of Inband Flow Learning

3.1. Service Discovery

Before starting the telemetry on service flows, the service should be discovered. Service discovery is a process of sampling to the service flow which is being transmitted in network in order to further determine which flow should be monitored. The target of service discovery function is to obtain the flow characteristics. The characteristics of flows are represented in terms of IP source address, IP destination address, TCP/UDP port number, VRF, incoming/outgoing interface etc.

The flow characteristics are automatically obtained from the sampling of live traffics in data plane. It creates the data base of flow characteristics can further be used to generate flow telemetry. The network node discovers the flow characteristic from the obtained service live traffic and sends them to the network controller server, if the characteristics are not included in the characteristic information base. The rules of the sampling to the flows are called sampling policy. For example, a specific priority value of IP packet can be a rule of a sampling policy. Sampling policy can be pre-configured from control and management planes via various protocols, e.g. NETCONF. Network controller also sends policies for the service discovery before the flow characteristics discovery. The characteristic information extracting can be based on the policy, and preset cycle etc.

3.2. Inband Flow Information Telemetry Deployment

After acquiring the traffic characteristics, telemetry based on the inband flow information can be planned and deployed. There are two modes to deploy inband flow information telemetry: End-to-End (E2E) and Hop-by-Hop (HbH). To deploy the inband flow information telemetry, the telemetry policy and the telemetry instance are also defined in following subsections.

3.2.1. Telemetry Mode

For majority of the services, end-to-end telemetry of service flows can meet the requirements from operators. In E2E mode shown in Figure 2, ingress node discovers the traffic characteristics and proceed on-path telemetry on device to report data to data consumer. Ingress node may also encapsulate flow identifier to facilitate the identification of flow information telemetry on egress node. Egress node identifies the flow and alternate marking identifier, proceed the record on packet number and timestamp, and further telemetry the statistics to data consumer. Transit node does not require any detection of flow information or processing of telemetry.

                    +-------------+
                    |Data Consumer| compute E2E flow info
                    +-------------+
                       |        |
         ___flow info__|        |____flow info____
        |   telemetry                telemetry    |
        |                                         |
 +---------+   +---------+    +---------+   +---------+
 | Ingress |---| Transit | ...| Transit |---| Egress  |
 |   Node  |   |   Node  |    |   Node  |   |   Node  |
 +---------+   +---------+    +---------+   +---------+

Figure 2 End-to-End Telemetry Mode

The distinction of HbH mode to E2E mode is that transit node also participates the inband flow information learning and telemetry. In HbH mode shown in Figure 3, telemetry covers the flow information on every node of the forwarding path the flow packet is transmitted, which provides detailed flow information on each hop. Hop-by-Hop telemetry usually works in the need of an on-demand fault diagnose.

                  +-------------+
                  |Data Consumer| compute HbH flow info
                  +-------------+
                    |   |  |   |  flow info telemetry
      ______________|   |  |   |_________________
     |               ___|  |___                  |
     |              |          |                 |
 +---------+   +---------+    +---------+   +---------+
 | Ingress |---| Transit | ...| Transit |---| Egress  |
 |   Node  |   |   Node  |    |   Node  |   |   Node  |
 +---------+   +---------+    +---------+   +---------+

Figure 3 Hop-by-Hop Telemetry Mode

The telemetry mode is also indicated by the service packet in data plane to help the transit node to differentiate the needs of telemetry.

3.2.2. Telemetry Policy

Telemetry policy is used to determine which flow should be monitored. By configuring telemetry policy, it can increase the priority of learning and telemetry to critical flow and reduce or filter the learning and telemetry of unimportant flows. It is crucial to network deployment for two reasons, one is the number of flows can be huge, another is the limitation of processing capability either on the controller or the network node. There might be millions of flows in a large scale network, for example 5G mobile backhaul network. It is important to wisely choose the granularity of inband flow information telemetry.

Regarding IP traffics, the telemetry policy can be based on either one of or combination of flow characteristics, such as IP source/destination address, TCP/UDP port number, VRFs, or network device interfaces etc. An IP address with a flexible wildcard mask can also be used as means to provide telemetry policy to an aggregation of flows. Flow-ID Label Indicator [I-D.ietf-mpls-inband-pm-encapsulation] or FlowMonID [I-D.ietf-6man-ipv6-alt-mark] is also an alternative to identify the telemetry policy at transit or egress nodes.

3.2.3. Telemetry Instance

Inband flow learning function manages the inband flow information telemetry based on Inband Flow Information Telemetry Instance (IFITI), in short called telemetry instance.

According to the flow characteristics, a telemetry instance for monitoring the service flow is generated by the network control plane in either distributed or centralized way. Ingress node can filter the received flows based on the pre-defined telemetry policy and create telemetry instance by itself. Network node can also obtain the instance and the corresponding identifier such as Flow-ID, encapsulate the identifier in the service flow to setup the relationship between the characteristic information, telemetry instance and the service flow, and perform telemetry.

Once the telemetry instance is created, ingress node can start the telemetry of flow information based on the method of alternative marking. At the same time, ingress node encodes inband monitoring information for example the flow ID in the service packets. Transit or egress node detect the inband monitoring information of packets and automatically create telemetry instance to deploy the inband flow information telemetry. The automatic creation of telemetry instance on network node can greatly facilitate the dynamic and incremental deployment.

The controller can also initiate the creation of telemetry instance and assign the telemetry instance to the ingress node to start the telemetry.

If the service message related to certain characteristic information is not received within the preset time, it is determined that the characteristic information is in an invalid state. And send the failure status information to the controller.

4. Inband Flow Information Telemetry Adjustment

If the service message related to certain characteristic information is not received within the preset time, the characteristic information is determined to come into an invalid state. Further the failure status information is sent to the network controller.

When route convergence happens to the network, service flow may switch to other forwarding nodes. When the traffic changes, telemetry instance varies as well. Regarding the telemetry instance running on the fault path, the aging of IFITI should be supported in order to recycle the network resources. IFITI should be deleted once it becomes stale. To monitor the same flow information, new telemetry instance is required to add on the new transit or egress node. Note that aging and adjustment of IFITI can be initiated by controller or network node. When a specific timer used for flow information telemetry timeout, the IFITI would be deleted to stop the telemetry of the flow.

5. IANA Considerations

This document has no request to IANA

6. Security Considerations

TBD

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.

7.2. Informative References

[I-D.hwyh-ippm-ps-inband-flow-learning]
Han, L., Wang, M., Yang, F., and J. Huang, "Problem Statement and Requirement for Inband Flow Learning", Work in Progress, Internet-Draft, draft-hwyh-ippm-ps-inband-flow-learning-01, , <https://datatracker.ietf.org/doc/html/draft-hwyh-ippm-ps-inband-flow-learning-01>.
[I-D.ietf-6man-ipv6-alt-mark]
Fioccola, G., Zhou, T., Cociglio, M., Qin, F., and R. Pang, "IPv6 Application of the Alternate-Marking Method", Work in Progress, Internet-Draft, draft-ietf-6man-ipv6-alt-mark-17, , <https://datatracker.ietf.org/doc/html/draft-ietf-6man-ipv6-alt-mark-17>.
[I-D.ietf-mpls-inband-pm-encapsulation]
Cheng, W., Min, X., Zhou, T., Dai, J., and Y. Peleg, "Encapsulation For MPLS Performance Measurement with Alternate Marking Method", Work in Progress, Internet-Draft, draft-ietf-mpls-inband-pm-encapsulation-06, , <https://datatracker.ietf.org/doc/html/draft-ietf-mpls-inband-pm-encapsulation-06>.
[I-D.song-opsawg-ifit-framework]
Song, H., Qin, F., Chen, H., Jin, J., and J. Shin, "Framework for In-situ Flow Information Telemetry", Work in Progress, Internet-Draft, draft-song-opsawg-ifit-framework-20, , <https://datatracker.ietf.org/doc/html/draft-song-opsawg-ifit-framework-20>.
[RFC9232]
Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and A. Wang, "Network Telemetry Framework", RFC 9232, DOI 10.17487/RFC9232, , <https://www.rfc-editor.org/info/rfc9232>.

Authors' Addresses

Liuyan Han
China Mobile
Beijing
China
Minxue Wang
China Mobile
Beijing
China
Xuanxuan Wang
Huawei
Nanjing
China
Tianran Zhou
Huawei
Beijing
China