Internet-Draft Cloud Optical Problem Statement March 2022
Liu, et al. Expires 8 September 2022
Workgroup:
CCAMP Working Group
Internet-Draft:
draft-liu-ccamp-optical2cloud-problem-statement-00
Published:
Intended Status:
Standards Track
Expires:
Authors:
S. Liu
China Mobile
H. Zheng
Huawei Technologies
A. Guo
Futurewei Technologies
Y. Zhao
China Mobile

Problem Statement and Requirements of Accessing Cloud via Optical Network

Abstract

This document describes the problem statement and requirements for accessing cloud services via optical networks. The scenarios covered include multi-cloud access, high-quality leased lines, and cloud virtual reality (VR).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 8 September 2022.

1. Introduction

Cloud-related applications are becoming popular and widely deployed in enterprises and vertical industries. Companies with multiple campuses interconnect those campuses with each other and with remote clouds for storage and computing. Such cloud services require a high-quality experience, including high availability, low latency, and on-demand bandwidth adjustment.

Optical networks are playing an increasingly important role in carrying cloud traffic because of their large bandwidth and low latency. Because they use TDM switching, optical networks need no queuing or scheduling, unlike IP-based networks, and this can substantially improve the service quality that users experience.

Optical networks using OTN (Optical Transport Network) or wavelength switching provide TDM-based connections with an access bandwidth granularity of 1.25 Gbps, i.e. ODU0 (Optical Data Unit 0), and above. This is usually more than a typical user demands, so user traffic is usually aggregated before it is carried into the network. However, recent ITU-T work items aim to enable OTN to support small-granularity services of 2 Mbps to 1 Gbps through the introduction of the Optical Service Unit (OSU). This potentially allows L2/L3 services to be carried directly over optical networks and transported end to end, making optical networks an even more suitable solution for carrying cloud traffic.
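
The effect of container granularity on efficiency can be illustrated with a short calculation. The following is a minimal sketch, not taken from any OTN specification: it uses the nominal 1.25 Gbps ODU0 figure quoted above and assumes, purely for illustration, that OSU capacity is allocated in 2 Mbps steps (the lower bound of the range quoted above). The function names are hypothetical.

```python
import math

# Nominal ODU0 granularity as quoted in the text; used only for illustration.
ODU0_RATE_MBPS = 1250.0
# Assumption: OSU capacity is allocated in 2 Mbps steps (hypothetical value,
# taken from the lower bound of the range quoted in the text).
OSU_STEP_MBPS = 2.0


def reserved_mbps(demand_mbps: float, step_mbps: float) -> float:
    """Capacity reserved when the demand is rounded up to whole steps."""
    if demand_mbps <= 0 or step_mbps <= 0:
        raise ValueError("demand and step must be positive")
    return math.ceil(demand_mbps / step_mbps) * step_mbps


def utilization(demand_mbps: float, step_mbps: float) -> float:
    """Fraction of the reserved capacity that the demand actually uses."""
    return demand_mbps / reserved_mbps(demand_mbps, step_mbps)
```

Under these assumptions, a 100 Mbps service uses 8% of a dedicated ODU0 (utilization(100.0, ODU0_RATE_MBPS)), whereas the same service fills its OSU allocation completely (utilization(100.0, OSU_STEP_MBPS)).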

[I-D.ietf-rtgwg-net2cloud-problem-statement] and [I-D.ietf-rtgwg-net2cloud-gap-analysis] give a detailed description of the coordination requirements between the network and the cloud, assuming that the network is IP-based. This document complements that analysis by further examining the requirements from an optical network perspective.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Scenarios

With the prevalence of cloud services, enterprise services, and home services such as AR/VR, accessing clouds through optical networks is increasingly attractive and is becoming an option for users. The following scenarios describe a few typical applications.

2.1. Multi-cloud access

Cloud services are usually supported by multiple interconnected data centers (DCs). Besides the on-demand, scalable, highly available, and usage-based billing properties mentioned in [I-D.ietf-rtgwg-net2cloud-problem-statement], Data Center Interconnect (DCI) also places high requirements on capacity, latency, and flexible scheduling. This use case requires specific capabilities of advanced OTN for DCI.

      //------\\                                               /----\
    ||Enterprise|\\                                          |Vertical|
    ||   CPE    || \\        ------------          +-----+   /|Cloud |
      \\------//     \ +---*/            \*---+    |Cloud| //  \----/
                       |O-A|              |O-E|----+ GW  |/
                       +---+              +---+    +-----+
                      |      OTN Networks      |
      //-----\\       ++---+              +---+    +-----+     /-----\
    || Vertical||-----+ O-A|              |O-E|----+Cloud|---||Private||
     |   CPE   |      +----*\            /*---+    | GW  |    | Cloud |
      \\-----//              ------------          +-----+     \-----/

Figure 1: Cloud Accessing through Optical Network

A data center is a physical facility consisting of multiple bays of interconnected servers that performs the computing, storage, and communication needed for cloud services. Infrastructure-as-a-service may be deployed in both public and private clouds, where virtual servers and other virtual resources are made available to users on demand and by self-service.

One typical scenario is intra-city DCs, which communicate with each other via an intra-city DCI network to meet high availability requirements. The intra-city DCI network carries active-active and Virtual Machine (VM) migration services, which require low latency. It supports public and/or private cloud services, such as video, games, desktop cloud, and cloud Internet cafe services. To ensure low latency, the DCs connected by an intra-city DCI network are located in the same city or in adjacent cities. The distance is typically less than 100 km and more likely less than 50 km. One city may have several large DCs.
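
The link between the distance bound and latency can be made concrete with a rough estimate of fiber propagation delay. The sketch below assumes a delay of about 5 microseconds per kilometre, a commonly used approximation for standard single-mode fiber; it ignores equipment delay and assumes that the fiber route length equals the stated distance. The constant and function names are illustrative only.

```python
# Assumption: light in standard single-mode fiber travels at roughly
# 200,000 km/s, i.e. about 5 microseconds per kilometre.
FIBER_DELAY_US_PER_KM = 5.0


def one_way_delay_ms(distance_km: float) -> float:
    """Approximate one-way propagation delay over fiber, in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance must not be negative")
    return distance_km * FIBER_DELAY_US_PER_KM / 1000.0


def round_trip_delay_ms(distance_km: float) -> float:
    """Approximate round-trip propagation delay over fiber, in milliseconds."""
    return 2.0 * one_way_delay_ms(distance_km)
```

With this approximation, a 100 km span contributes about 0.5 ms of one-way propagation delay, and a 50 km span about 0.5 ms round trip.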

DCs are ideally interconnected through Layer 2/3 switches or routers with full mesh connectivity. However, to improve interaction efficiency as well as service experience, OTN is also evaluated as an option to be used for DC interconnection.

There are three kinds of connection relationship: point-to-point access, point-to-multipoint access, and multipoint-to-multipoint access. These correspond to three access shapes, respectively: a single point accessing a single cloud, a single point accessing multiple clouds, and multiple points accessing multiple clouds.
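
The three connection relationships can be expressed as a simple classification over the number of access points and the number of clouds. This is an illustrative sketch only; the type and function names are hypothetical. The case of several access points reaching a single cloud is not described in the text, and the sketch assumes it can be treated as the mirror image of the point-to-multipoint case.

```python
from enum import Enum


class AccessShape(Enum):
    P2P = "single point accessing a single cloud"
    P2MP = "single point accessing multiple clouds"
    MP2MP = "multiple points accessing multiple clouds"


def classify_access(num_access_points: int, num_clouds: int) -> AccessShape:
    """Map endpoint counts onto one of the three connection relationships."""
    if num_access_points < 1 or num_clouds < 1:
        raise ValueError("at least one access point and one cloud are needed")
    if num_access_points == 1 and num_clouds == 1:
        return AccessShape.P2P
    if num_access_points == 1 or num_clouds == 1:
        # Assumption: many access points to one cloud is handled like the
        # mirrored single-to-multiple case; the text does not cover it.
        return AccessShape.P2MP
    return AccessShape.MP2MP
```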

2.2. High-quality leased line

The high-quality private line provides high security and reliability and is suitable for ensuring the end-to-end user experience of large enterprise customers, such as financial institutions, medical centers, and educational institutions. The main advantages and drivers of the high-quality private line are as follows.

  • High-quality private lines provide large bandwidth, low latency, security, and reliability for any type of connection.
  • Accelerate the deployment of cloud services. The high quality and high security of the private line connecting to the cloud enable enterprises to move more core assets to the cloud and to use low-latency services on the cloud. Cloud-based deployment helps enterprises reduce heavy asset allocation and save energy, so that they can focus on their core business.
  • Reduce the operator's CAPEX and OPEX. The end-to-end service provisioning system enables quick provisioning of private line services and improves user experience. Fault management can be performed at the device level, which reduces the complexity of fault localization.
  • Enable operators to develop value-added services by providing enterprise users with latency maps, availability maps, comprehensive SLA reports, customized latency levels, and dynamic bandwidth adjustment packages.

2.3. Cloud virtual reality (VR)

Cloud VR offloads computing and cloud rendering in VR services from local dedicated hardware to a shared cloud infrastructure. Cloud rendered video and audio outputs are encoded, compressed, and transmitted to user terminals through fast and stable networks. In contrast to current VR services, where good user experience primarily relies on the end user purchasing expensive high-end PCs for local rendering, cloud VR promotes the popularization of VR services by allowing users to enjoy various VR services where rendering is carried out in the cloud.

Cloud VR service experience depends on several factors that influence the achieved sense of reality, interaction, and immersion. These factors are related to network properties: performance indicators such as bandwidth, latency, and packet loss rate need to meet the requirements in order to deliver a pleasurable experience.

The current network may be able to support early versions of cloud VR (e.g. 4K VR) with limited user experience, but it will not meet the requirements for large-scale deployment of cloud VR with enhanced experience (e.g. interactive VR applications, cloud games). To support more applications and ensure a high-quality experience, much higher available and guaranteed bandwidth (e.g. larger than 1 Gbps), lower latency (e.g. less than 10 ms), and lower jitter (e.g. less than 5 ms) are required.
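
The example figures above can be turned into a simple acceptance check for a candidate network path. The sketch below is illustrative only: the thresholds are the example values quoted in the text rather than normative limits, and the PathMetrics type and unmet_requirements function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

# Example thresholds quoted in the text; not normative limits.
MIN_BANDWIDTH_GBPS = 1.0
MAX_LATENCY_MS = 10.0
MAX_JITTER_MS = 5.0


@dataclass(frozen=True)
class PathMetrics:
    bandwidth_gbps: float  # guaranteed bandwidth
    latency_ms: float      # end-to-end latency
    jitter_ms: float       # delay variation


def unmet_requirements(metrics: PathMetrics) -> List[str]:
    """Return the names of the example cloud VR thresholds that are not met."""
    unmet = []
    if metrics.bandwidth_gbps <= MIN_BANDWIDTH_GBPS:
        unmet.append("bandwidth")
    if metrics.latency_ms >= MAX_LATENCY_MS:
        unmet.append("latency")
    if metrics.jitter_ms >= MAX_JITTER_MS:
        unmet.append("jitter")
    return unmet
```

An empty result means that the path satisfies all three example thresholds.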

3. Requirements and Problem Statement

3.1. LxVPN over optical networks for multiple-to-multiple access

L2VPN or L3VPN services are used as overlay services on an optical network to support multi-cloud access. Optical networks acting as the underlay are therefore required to support multipoint-to-multipoint (MP2MP) connections.
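
The text does not say how MP2MP connectivity is to be realized in the underlay. One possible approach, assumed here purely for illustration, is a full mesh of point-to-point optical connections. The sketch below enumerates the site pairs such a mesh would need, which shows how quickly the number of connections grows with the number of sites; the function name is hypothetical.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def full_mesh_pairs(sites: Iterable[str]) -> List[Tuple[str, str]]:
    """List every unordered pair of sites that needs a point-to-point connection."""
    return list(combinations(sites, 2))
```

For n sites this yields n(n-1)/2 connections, for example 6 connections for 4 sites and 45 connections for 10 sites.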

3.2. Service-awareness

Overlay packet-based services are usually configured separately from the underlay connections in optical networks. The connections in optical networks are treated as static connections for packet routing. This usually results in suboptimal routing of traffic and inefficient use of network resources at both the packet and optical layers, and it leaves the network unable to adapt to dynamic changes in traffic.

To carry dynamic cloud traffic, an optical network should be capable of understanding the traffic type and patterns, as well as the bandwidth and QoS requirements of the traffic, and of mapping the traffic onto the best feasible connections in the optical network. This requires both the control and management planes of optical networks to be able to sense the traffic and to exchange the feasible QoS of underlay optical connections with the packet layer, so that the packet layer can make the best route selection.
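
The mapping step described above can be sketched as a selection over candidate underlay connections. The example below is a minimal illustration, not a proposed mechanism: it assumes that each connection advertises its available bandwidth and latency, and that "best feasible" means the lowest-latency connection that satisfies the demand. All type, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OpticalConnection:
    name: str
    available_mbps: float
    latency_ms: float


@dataclass(frozen=True)
class TrafficDemand:
    bandwidth_mbps: float
    max_latency_ms: float


def select_connection(
    demand: TrafficDemand, candidates: Iterable[OpticalConnection]
) -> Optional[OpticalConnection]:
    """Pick the lowest-latency connection that meets the demand, if any."""
    feasible = [
        c
        for c in candidates
        if c.available_mbps >= demand.bandwidth_mbps
        and c.latency_ms <= demand.max_latency_ms
    ]
    if not feasible:
        return None
    # Ties on latency are broken by name so that the result is deterministic.
    return min(feasible, key=lambda c: (c.latency_ms, c.name))
```

Other selection policies, such as preferring the connection with the least spare capacity, fit the same structure by changing the sort key.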

3.3. Deterministic performance

Accessing cloud-based services requires deterministic performance from the underlay optical networks in order to achieve a good user experience. Connections built on optical networks need to be deterministic in many quality factors, such as end-to-end latency, delay jitter, bandwidth, and availability supported by end-to-end protection and restoration. Such deterministic performance is hard to achieve on shared resources but is relatively easy to achieve on TDM-based optical networks.

Traditionally, connections in an optical network are pre-configured, and dynamic restoration and reconfiguration of connections take on the order of several hundred milliseconds to several minutes. The control and management planes of the optical network should be enhanced to significantly improve the speed of connection operations and to convey an accurate estimate of the performance to the upper layer, in order to achieve end-to-end deterministic performance. Extensions to existing control plane and management interfaces are likely needed to support this capability.

3.4. High performance and high reliability

To support the above-mentioned applications, some network properties are critical to guaranteeing Quality of Service (QoS). For instance, high bandwidth (e.g. larger than 1 Gbps), low latency (e.g. no more than 10 ms), and low jitter (e.g. no more than 5 ms) are required for cloud VR. In addition, small-granularity containers are required to improve network efficiency.

It is also critical to support highly reliable DCI for cloud services. With advanced optical transport network protection and automatic recovery technologies, services can continue to run properly even if fiber cuts occur in the DCI network. Specific protection and restoration schemes are required to provide high reliability for the networks.
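
The benefit of protection can be estimated with a standard availability calculation. The sketch below assumes a working path and a protection path that fail independently, so that the service is down only when both are down; real networks may share risk between paths, in which case the result is optimistic. The function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def protected_availability(working: float, protection: float) -> float:
    """Availability of a service with two independently failing paths."""
    for value in (working, protection):
        if not 0.0 <= value <= 1.0:
            raise ValueError("availability must be between 0 and 1")
    # The service is unavailable only when both paths are unavailable.
    return 1.0 - (1.0 - working) * (1.0 - protection)


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year implied by a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR
```

Under the independence assumption, two paths that are each 99% available combine to roughly 99.99% availability.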

4. Manageability Considerations

TBD

5. Security Considerations

TBD

6. IANA Considerations

This document requires no IANA actions.

7. References

7.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

7.2. Informative References

[I-D.ietf-rtgwg-net2cloud-gap-analysis]
Dunbar, L., Malis, A. G., and C. Jacquenet, "Networks Connecting to Hybrid Cloud DCs: Gap Analysis", Work in Progress, Internet-Draft, draft-ietf-rtgwg-net2cloud-gap-analysis-07, <https://www.ietf.org/archive/id/draft-ietf-rtgwg-net2cloud-gap-analysis-07.txt>.
[I-D.ietf-rtgwg-net2cloud-problem-statement]
Dunbar, L., Malis, A. G., Jacquenet, C., and M. Toy, "Dynamic Networks to Hybrid Cloud DCs Problem Statement", Work in Progress, Internet-Draft, draft-ietf-rtgwg-net2cloud-problem-statement-11, <https://www.ietf.org/archive/id/draft-ietf-rtgwg-net2cloud-problem-statement-11.txt>.

Acknowledgments

TBD

Authors' Addresses

Sheng Liu
China Mobile
Haomian Zheng
Huawei Technologies
Aihua Guo
Futurewei Technologies
Yang Zhao
China Mobile