Internet-Draft | COIN Use Cases | March 2022 |
Kunze, et al. | Expires 8 September 2022 | [Page] |
Computing in the Network (COIN) comes with the prospect of deploying processing functionality on networking devices, such as switches and network interface cards. While such functionality can be beneficial in several contexts, it has to be carefully placed into the context of the general Internet communication.¶
This document discusses some use cases to demonstrate how real applications can benefit from COIN and to showcase essential requirements that have to be fulfilled by COIN applications.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 8 September 2022.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Internet was designed as a best-effort packet network that offers limited guarantees regarding the timely and successful transmission of packets. Data manipulation, computation, and more complex protocol functionality are generally provided by the end-hosts while network nodes are kept simple and only offer a "store and forward" packet facility. This design choice has proven suitable for a wide variety of applications and has helped in the rapid growth of the Internet.¶
However, with the expansion of the Internet, more and more fields require more than best-effort forwarding, including strict performance guarantees or closed-loop integration to manage data flows. In this context, a tighter integration of computing and networking resources that enables a more flexible distribution of computation tasks across the network, e.g., beyond 'just' endpoints, may help to achieve the desired guarantees and behaviors as well as increase overall performance. The vision of 'in-network computing' and the provisioning of such capabilities that capitalize on joint computation and communication resource usage throughout the network is core to the efforts in the COIN RG; we refer to those capabilities as 'COIN capabilities' in the remainder of the document.¶
We believe that such a vision of 'in-network computing' can be best outlined along four dimensions of use cases, namely those that (i) provide new user experiences through the utilization of COIN capabilities (referred to as 'COIN experiences'), (ii) enable new COIN systems, e.g., through new interactions between communication and compute providers, (iii) improve on already existing COIN capabilities, and (iv) enable new COIN capabilities. Sections 3 through 6 capture those categories of use cases and provide the main structure of this document. The goal is to present how the presence of computing resources inside the network impacts existing services and applications or allows for innovation in emerging fields.¶
Through delving into some individual examples within each of the above categories, we aim to outline opportunities and propose possible research questions for consideration by the wider community when pushing forward the 'in-network computing' vision. Furthermore, the individual use case descriptions aim to provide insights into possible requirements for an evolving solution space of collected COIN capabilities. This results in the following taxonomy used to describe each of the use cases:¶
In Section 7, we summarize the key research questions and identify the key requirements across all use cases. This will provide useful input into future roadmapping on what COIN capabilities may emerge and what solutions for such capabilities may look like. It will also identify what open questions remain for these use cases to materialize as well as define requirements to steer future (COIN) research work.¶
The following terminology has been partly aligned with [I-D.draft-kutscher-coinrg-dir]:¶
(COIN) Program: a set of computations requested by a user¶
(COIN) Program Instance: one currently executing instance of a program¶
(COIN) Function: a specific computation that can be invoked as part of a program¶
COIN Capability: a feature enabled through the joint processing of computation and communication resources in the network¶
COIN Experience: a new user experience brought about through the utilization of COIN capabilities¶
Programmable Network Devices (PNDs): network devices, such as network interface cards and switches, which are programmable, e.g., using P4 or other languages.¶
(COIN) Execution Environment: a class of target environments for function execution, for example, a JVM-based execution environment that can run functions represented in JVM byte code¶
COIN System: the PNDs (and end systems) and their execution environments, together with the communication resources interconnecting them, operated by a single provider or through interactions between multiple providers that jointly offer COIN capabilities¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
The scenario can be exemplified in an immersive gaming application, where a single user plays a game using a VR headset. The headset hosts functions that "display" frames to the user, as well as the functions for VR content processing and frame rendering, which incorporate input data received from sensors in the VR headset.¶
Once this application is partitioned into constituent (COIN) programs and deployed throughout a COIN system, utilizing the COIN execution environment, only the "display" (COIN) programs may be left in the headset, while the compute-intensive real-time VR content processing (COIN) programs can be offloaded to a nearby resource-rich home PC or a PND in the operator's access network for better execution (faster and possibly higher-resolution generation).¶
Partitioning a mobile application into several constituent (COIN) programs allows for denoting the application as a collection of (COIN) functions for a flexible composition and a distributed execution. In our example above, most functions of a mobile application can be categorized into one of three function groups: "receiving", "processing", and "displaying".¶
Any device may realize one or more of the (COIN) programs of a mobile application and expose them to the (COIN) system and its constituent (COIN) execution environments. When the (COIN) program sequence is executed on a single device, the outcome is what you see today as applications running on mobile devices.¶
However, the execution of (COIN) functions may be moved to other (e.g., more suitable) devices, including PNDs, which have exposed the corresponding (COIN) programs as individual (COIN) program instances to the (COIN) system by means of a 'service identifier'. The result of the latter is equivalent to 'mobile function offloading', for possible reduction of power consumption (e.g., offloading CPU-intensive process functions to a remote server) or for improved end user experience (e.g., moving display functions to a nearby smart TV) by selecting more suitably placed (COIN) program instances in the overall (COIN) system.¶
Figure 1 shows one realization of the above scenario, where a 'DPR app' is running on a mobile device (containing the partitioned Display(D), Process(P) and Receive(R) COIN programs) over an SDN network. The packaged applications are made available through a localized 'playstore server'. The mobile application installation is realized as a 'service deployment' process, combining the local app installation with a distributed (COIN) program deployment (and orchestration) on most suitable end systems or PNDs ('processing server').¶
Such localized deployment could, for instance, be provided by a visiting site, such as a hotel or a theme park. Once the 'processing' (COIN) program is terminated on the mobile device, the 'service routing' (SR) elements in the network route (service) requests instead to the (previously deployed) 'processing' (COIN) program running on the processing server over an existing SDN network. Here, capabilities and other constraints for selecting the appropriate (COIN) program, in case of having deployed more than one, may be provided both in the advertisement of the (COIN) program and the service request itself.¶
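The selection step described above can be illustrated with a minimal sketch, assuming that each advertisement carries a service identifier plus capability attributes and that each service request carries constraints. All names used here (`Advertisement`, `select_instance`, the `latency_ms` attribute, and the capability labels) are hypothetical and are not defined by this document.¶

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    """Hypothetical advertisement of one (COIN) program instance."""
    service_id: str                 # e.g., "processing"
    host: str                       # device exposing the instance
    latency_ms: float               # assumed advertised latency towards the client
    capabilities: set = field(default_factory=set)


def select_instance(ads, service_id, required=frozenset(), max_latency_ms=None):
    """Return the lowest-latency instance that satisfies the request, or None."""
    candidates = [
        ad for ad in ads
        if ad.service_id == service_id
        and set(required) <= ad.capabilities
        and (max_latency_ms is None or ad.latency_ms <= max_latency_ms)
    ]
    return min(candidates, key=lambda ad: ad.latency_ms, default=None)


# Illustrative advertisements; hosts and values are made up for this sketch.
ads = [
    Advertisement("processing", "mobile-device", 0.5, {"cpu"}),
    Advertisement("processing", "processing-server", 4.0, {"cpu", "gpu"}),
    Advertisement("display", "smart-tv", 2.0, {"4k"}),
]

# A request that constrains the choice to GPU-capable instances.
chosen = select_instance(ads, "processing", required={"gpu"}, max_latency_ms=10)
print(chosen.host)
```

In this sketch, constraints from the request are matched against attributes in the advertisement; an actual service routing design may split this logic differently between the SR elements and the end systems.¶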
As an extension to the above scenarios, we can also envision that content from one processing (COIN) program may be distributed to more than one display (COIN) program, e.g., for multi/many-viewing scenarios, thereby realizing a service-level multicast capability towards more than one (COIN) program.¶
NOTE: material on solutions like ETSI MEC will be added here later¶
Virtual Reality (VR), Augmented Reality (AR) and immersive media (the metaverse) taken together as Extended Reality (XR) are the drivers of a number of advances in interactive technologies. XR is one example of the Multisource-Multidestination Problem that combines video, haptics, and tactile experiences in interactive or networked multi-party and social interactions. While initially associated with gaming and entertainment, XR applications now include remote diagnosis, maintenance, telemedicine, manufacturing and assembly, autonomous systems, smart cities, and immersive classrooms.¶
XR requirements include real-time interactivity for immersive and increasingly mobile applications with tactile and time-sensitive data, high bandwidth for high-resolution images, and local rendering for 3D images and holograms. These requirements make XR applications difficult to run over traditional networks; in consequence, innovation is needed to deploy the full potential of the applications.¶
Collaborative XR experiences are difficult to deliver with a client-server cloud-based solution as they require a combination of: stream synchronization, low delays and delay variations, means to recover from losses, and optimized caching and rendering as close as possible to the user at the network edge. XR deals with personal information and potentially protected content; thus, an XR application must also provide a secure environment and ensure user privacy. Additionally, given the sheer amount of data needed for and generated by XR applications, recent trend analysis mechanisms, including machine learning, can be used to find trends in the data and reduce the size of the data sets. Video holography and haptics require very low delay or generate large amounts of data, both requiring a careful look at data filtering and reduction, functional distribution and partitioning.¶
The operation of XR over networks requires some computing in the nodes from content source to destination. However, much of this remains in the realm of research, in particular resolving the resource allocation problem and providing adequate quality of experience. Open issues include multi-variate and heterogeneous goal optimization problems at merging nodes, which require advanced analysis. Image rendering and video processing in XR leverage different combinations of hardware capabilities, such as CPUs and GPUs, at the edge (even at the mobile edge) and in the fog network where the content is consumed. It is important to note that the use of in-network computing for XR does not imply a specific protocol but targets an architecture enabling the deployment of the services.¶
In-network computing for XR profits from the heritage of extensive research in the past years on Information Centric Networking, Machine Learning, network telemetry, imaging and IoT as well as distributed security and in-network coding.¶
This use case covers live productions of the performing arts where the performers and audience are in different physical locations. The performance is conveyed to the audience through multiple networked streams which may be tailored to the requirements of individual audience members; and the performers receive live feedback from the audience.¶
There are two main aspects: i) to emulate as closely as possible the experience of live performances where the performers and audience are co-located in the same physical space, such as a theatre; and ii) to enhance traditional physical performances with features such as personalisation of the experience according to the preferences or needs of the audience members.¶
Examples of personalisation include:¶
There are several chained functional entities which are candidates for being deployed as (COIN) Programs.¶
Personalisation functions¶
These are candidates for deployment as (COIN) Programs in PNDs rather than being located in end-systems (at the performers' site, the audience members' premises or in a central cloud location) for several reasons:¶
Note: Existing solutions for some aspects of this use case are covered in the Mobile Application Offloading, Extended Reality, and Content Delivery Networks use cases.¶
While the best-effort nature of the Internet enables a wide variety of applications, there are several domains whose requirements are hard to satisfy over regular best-effort networks.¶
Consequently, there is a large number of specialized appliances and protocols designed to provide the required strict performance guarantees, e.g., regarding real-time capabilities.¶
Time-Sensitive Networking [TSN], for example, is an enhancement to standard Ethernet that tries to meet these requirements on the link layer by statically reserving shares of the bandwidth. However, solutions on the link layer alone are not always sufficient.¶
The industrial domain, for example, is currently evolving towards increasingly interconnected systems, which in turn increases the complexity of the underlying networks, makes them more dynamic, and creates more diverse sets of requirements. Concepts satisfying the dynamic performance requirements of modern industrial applications thus become harder to develop. In this context, COIN offers new possibilities as it allows computation tasks to be flexibly distributed across the network and enables novel forms of interaction between communication and computation providers.¶
This document illustrates the potential for new COIN systems using the example of the industrial domain by characterizing and analyzing specific scenarios to showcase potential requirements, as specifying general requirements is difficult due to the domain's mentioned diversity.¶
Common components of industrial networks can be divided into three categories as illustrated in Figure 2. Following [I-D.mcbride-edge-data-discovery-overview], EDGE DEVICES, such as sensors and actuators, constitute the boundary between the physical and digital world. They communicate the current state of the physical world to the digital world by transmitting sensor data or let the digital world interact with the physical world by executing actions after receiving (simple) control information. The processing of the sensor data and the creation of the control information is done on COMPUTING DEVICES. They range from low-powered controllers close to the EDGE DEVICES to more powerful edge or remote clouds at greater distances. The connection between the EDGE and COMPUTING DEVICES is established by NETWORKING DEVICES. In the industrial domain, they range from standard devices, e.g., typical Ethernet switches, which can interconnect all Ethernet-capable hosts, to proprietary equipment with proprietary protocols only supporting hosts of specific vendors.¶
The control of physical processes and components of a production line is essential for the growing automation of production and ideally allows for a consistent quality level. Traditionally, the control has been exercised by control software running on programmable logic controllers (PLCs) located directly next to the controlled process or component. This approach is best-suited for settings with a simple model that is focused on a single or few controlled components.¶
Modern production lines and shop floors are characterized by an increasing number of involved devices and sensors, a growing level of dependency between the different components, and more complex control models. A centralized control is desirable to manage the large amount of available information, which often has to be pre-processed or aggregated with other information before it can be used. PLCs are not designed for this array of tasks, and computations could theoretically be moved to more powerful devices. However, these devices are no longer close to the controlled objects and induce additional latency. Moving compute functionality onto COIN execution environments inside the network offers a new solution space to these challenges.¶
A control process consists of two main components as illustrated in Figure 3: a system under control and a controller.¶
In feedback control, the current state of the system is monitored, e.g., using sensors, and the controller influences the system based on the difference between the current and the reference state to keep it close to this reference state.¶
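As a minimal illustration of this loop, the following sketch uses a proportional controller, which is only one of many possible control laws and is chosen here purely for brevity. The toy system model (the state changes by exactly the applied actuation in each step), the gain, and all function names are assumptions made for this example.¶

```python
def control_step(reference, measured, gain):
    """Proportional control: actuation is proportional to the state error."""
    return gain * (reference - measured)


def simulate(reference, initial_state, gain, steps):
    """Run the loop on a toy system whose state shifts by the applied actuation."""
    state = initial_state
    history = [state]
    for _ in range(steps):
        # Sensor feedback (state) goes to the controller, actuation goes back.
        state += control_step(reference, state, gain)
        history.append(state)
    return history


trace = simulate(reference=100.0, initial_state=20.0, gain=0.5, steps=5)
print(trace)
```

With a gain of 0.5, the remaining error halves in every iteration, so the state approaches the reference state step by step.¶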
Apart from the control model, the quality of the control primarily depends on the timely reception of the sensor feedback which can be subject to tight latency constraints, often in the single-digit millisecond range. While low latencies are essential, there is an even greater need for stable and deterministic levels of latency, because controllers can generally cope with different levels of latency, if they are designed for them, but they are significantly challenged by dynamically changing or unstable latencies. The unpredictable latency of the Internet exemplifies this problem if, e.g., off-premise cloud platforms are included.¶
Control functionality is traditionally executed on PLCs close to the machinery. These PLCs typically require vendor-specific implementations and are often hard to upgrade and update which makes such control processes inflexible and difficult to manage. Moving computations to more freely programmable devices thus has the potential of significantly improving the flexibility. In this context, directly moving control functionality to (central) cloud environments is generally possible, yet only feasible if latency constraints are lenient.¶
COIN offers the possibility of bringing the system and the controller closer together, thus possibly satisfying the latency requirements, by performing simple control logic on PNDs and/or in COIN execution environments.¶
While control models, in general, can become involved, there is a variety of control algorithms that are composed of simple computations such as matrix multiplication. These are supported by some PNDs and it is thus possible to compose simplified approximations of the more complex algorithms and deploy them in the network. While the simplified versions induce a more inaccurate control, they allow for a quicker response and might be sufficient to operate a basic tight control loop while the overall control can still be exercised from the cloud.¶
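One conceivable way to express such a simple computation on a device that only offers integer arithmetic is fixed-point scaling. The sketch below shows a matrix-vector product that uses only integer multiplication, addition, and bit shifts. The choice of 8 fractional bits, the example gain matrix, and all names are assumptions for illustration and do not describe any particular PND or the approaches cited in this document.¶

```python
SCALE_BITS = 8                  # assumed fixed-point format: 8 fractional bits
SCALE = 1 << SCALE_BITS


def to_fixed(x):
    """Quantize a real number to a scaled integer (done offline, not on the PND)."""
    return int(round(x * SCALE))


def from_fixed(v):
    """Convert a scaled integer back to a real number (for inspection only)."""
    return v / SCALE


def fixed_matvec(matrix_fx, vector_fx):
    """Matrix-vector product using only integer multiply, add, and shift."""
    result = []
    for row in matrix_fx:
        acc = 0
        for a, b in zip(row, vector_fx):
            acc += a * b                    # product has 2 * SCALE_BITS fractional bits
        result.append(acc >> SCALE_BITS)    # rescale to SCALE_BITS fractional bits
    return result


# Hypothetical gain matrix K and state vector x of a simplified controller.
K = [[0.5, -0.25], [1.5, 0.75]]
x = [2.0, 4.0]

K_fx = [[to_fixed(v) for v in row] for row in K]
x_fx = [to_fixed(v) for v in x]
u = [from_fixed(v) for v in fixed_matvec(K_fx, x_fx)]
print(u)
```

The values in this example are exactly representable with 8 fractional bits; in general, quantization introduces an error that adds to the inaccuracy of the simplified control.¶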
Opportunities:¶
Bringing the required computations to PNDs is challenging as these devices typically only allow for integer precision computation while floating-point precision is needed by most control algorithms. Additionally, computational capabilities vary for different available PNDs [KUNZE]. Yet, early approaches like [RUETH] and [VESTIN] have already shown the general applicability of such ideas, but there are still a lot of open research questions not limited to the following:¶
Research Questions:¶
RQ 4.2.1: How to derive simplified versions of the global (control) function?¶
RQ 4.2.2: How to distribute the simplified versions in the network?¶
In modern industrial networks, processes and machines can be monitored closely resulting in large volumes of available information. This data can be used to find previously unknown correlations between different parts of the value chain, e.g., by deploying machine learning (ML) techniques, which in turn helps to improve the overall production system. Newly gained knowledge can be shared between different sites of the same company or even between different companies [PENNEKAMP].¶
Traditional company infrastructure is neither equipped for the management and storage of such large amounts of data nor for the computationally expensive training of ML approaches. Off-premise cloud platforms offer cost-effective solutions with a high degree of flexibility and scalability; however, moving all data to off-premise locations poses infrastructural challenges. Pre-processing or filtering the data already in COIN execution environments can be a new solution to this challenge.¶
Processes in the industrial domain are monitored by distributed sensors which range from simple binary (e.g., light barriers) to sophisticated sensors measuring the system with varying degrees of resolution. Sensors can further serve different purposes, as some might be used for time-critical process control while others are only used as redundant fallback platforms. Overall, there is a high level of heterogeneity which makes managing the sensor output a challenging task.¶
Depending on the deployed sensors and the complexity of the observed system, the resulting overall data volume can easily be in the range of several Gbit/s [GLEBKE]. Using off-premise clouds for managing the data requires uploading or streaming the growing volume of sensor data using the companies' Internet access, which is typically limited to a few hundred Mbit/s. While large networking companies can simply upgrade their infrastructure, most industrial companies rely on traditional ISPs for their Internet access. Higher access speeds are hence tied to higher costs and, above all, subject to the supply of the ISPs and consequently not always available. A major challenge is thus to devise a methodology that is able to handle such amounts of data over limited access links.¶
Another aspect is that business data leaving the premise and control of the company further comes with security concerns, as sensitive information or valuable business secrets might be contained in it. Typical security measures such as encrypting the data make COIN techniques hardly applicable as they typically work on unencrypted data. Adding security to COIN approaches, either by adding functionality for handling encrypted data or devising general security measures, is thus an auspicious field for research which we describe in more detail in Section 8.¶
Sensors are often set up redundantly, i.e., part of the collected data might also be redundant. Moreover, they are often hard to configure or not configurable at all which is why their resolution or sampling frequency is often larger than required. Consequently, it is likely that more data is transmitted than is needed or desired.¶
Current approaches for handling such large amounts of information typically build upon stream processing frameworks such as Apache Flink. While they allow for handling large volume applications, they are tied to performant server machines and upscaling the information density also requires a corresponding upscaling of the compute infrastructure.¶
PNDs and COIN execution environments are in a unique position to reduce the data rates due to their line-rate packet processing capabilities. Using these capabilities, it is possible to filter out redundant or undesired data before it leaves the premise using simple traffic filters that are deployed in the on-premise network. There are different approaches to how this topic can be tackled.¶
A first step could be to scale down the available sensor data to the data rate that is needed. For example, if a sensor transmits with a frequency of 5 kHz, but the control entity only needs 1 kHz, only every fifth packet containing sensor data is let through. Alternatively, sensor data could be filtered down to a lower frequency while the sensor value is in an uninteresting range, but let through with higher resolution once the sensor value range becomes interesting.¶
While the former variant is oblivious to the semantics of the sensor data, the latter variant requires an understanding of the current sensor levels. In any case, it is important that end-hosts are informed about the filtering so that they can distinguish between data loss and data filtered out on purpose.¶
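Both variants can be sketched as follows. The first filter forwards every n-th packet regardless of content, while the second forwards everything inside a configured value range and decimates outside of it. The class names, the one-value-per-packet model, and the example range are assumptions made for this illustration.¶

```python
class DecimatingFilter:
    """Semantics-oblivious filter: forwards every n-th packet."""

    def __init__(self, keep_every):
        self.keep_every = keep_every
        self.count = 0

    def forward(self, value):
        keep = self.count % self.keep_every == 0
        self.count += 1
        return keep


class RangeAwareFilter:
    """Semantics-aware filter: full rate inside the interesting range,
    decimated rate outside of it."""

    def __init__(self, low, high, keep_every):
        self.low, self.high = low, high
        self.decimator = DecimatingFilter(keep_every)

    def forward(self, value):
        if self.low <= value <= self.high:
            return True
        return self.decimator.forward(value)


samples = list(range(20))       # stand-in for sensor readings, one per packet

# 5 kHz source, 1 kHz needed: let every fifth packet through.
plain = DecimatingFilter(keep_every=5)
passed_plain = [s for s in samples if plain.forward(s)]

# Values 8..11 are assumed to be the interesting range.
aware = RangeAwareFilter(low=8, high=11, keep_every=5)
passed_aware = [s for s in samples if aware.forward(s)]

print(passed_plain)
print(passed_aware)
```

The sketch does not cover how end-hosts are informed about the filtering; a deployment would additionally need some signal, for instance a count of intentionally dropped packets, to let receivers distinguish filtering from loss.¶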
Opportunities:¶
There are manifold computations that can be performed on the sensor data in the cloud. Some of them are very complex or need the complete sensor data during the computation, but there are also simpler operations which can be done on subsets of the overall dataset or earlier on the communication path as soon as all data is available. One example is finding the maximum of all sensor values which can either be done iteratively at each intermediate hop or at the first hop, where all data is available.¶
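The iterative variant of the maximum example can be sketched as follows, assuming that each hop on the path sees some local sensor values and the running maximum received from the previous hop. The function name and the example values are hypothetical.¶

```python
def hop_max(upstream_max, local_values):
    """Combine the maximum received from upstream with the values seen at this hop."""
    candidates = list(local_values)
    if upstream_max is not None:
        candidates.append(upstream_max)
    return max(candidates)


# Sensor values that become available at three consecutive hops.
hops = [[3, 17, 5], [42, 8], [11]]

running = None
for values in hops:
    # Only a single value needs to be forwarded to the next hop.
    running = hop_max(running, values)

print(running)
```

Because the maximum is associative and commutative, the result equals the maximum computed over the complete dataset in the cloud, while each hop forwards only one value instead of all values it has seen.¶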
Using expert knowledge about the exact computation steps and the concrete transmission path of the sensor data, simple computation steps can be deployed in the on-premise network to reduce the overall data volume and potentially speed up the processing time in the cloud.¶
Related work has already shown that in-network aggregation can help to improve the performance of distributed ML applications [SAPIO]. Investigating the applicability of stream data processing techniques to PNDs is also interesting, because sensor data is usually streamed.¶
Opportunities:¶
RQ 4.4.1: Which kinds of COIN programs can be leveraged for (pre-)processing steps?¶
Despite an increasing automation in production processes, human workers are still often necessary. Consequently, safety measures have a high priority to ensure that no human life is endangered. In traditional factories, the regions of contact between humans and machines are well-defined and interactions are simple. Simple safety measures like emergency switches at the working positions are enough to provide a decent level of safety.¶
Modern factories are characterized by increasingly dynamic and complex environments with new interaction scenarios between humans and robots. Robots can either directly assist humans or perform tasks autonomously. The overlap between the human working area and the robots grows, and it is harder for human workers to fully observe the complete environment. Additional safety measures are essential to prevent accidents and support humans in observing the environment.¶
Industrial safety measures are typically hardware solutions because they have to pass rigorous testing before they are certified and deployment-ready. Standard measures include safety switches and light barriers. Additionally, the working area can be explicitly divided into 'contact' and 'safe' areas, indicating when workers have to watch out for interactions with machinery.¶
These measures are static solutions, potentially relying on specialized hardware, and are challenged by the increased dynamics of modern factories where the factory configuration can be changed on demand. Software solutions offer higher flexibility as they can dynamically respect new information gathered by the sensor systems, but in most cases they cannot give guaranteed safety.¶
Due to the importance of safety, there is a wide range of software-based approaches aiming at enhancing safety. One example is tag-based systems, e.g., using RFID, where drivers of forklifts can be warned if pedestrian workers carrying tags are nearby. Such solutions, however, require setting up an additional system and do not leverage existing sensor data.¶
COIN systems could leverage the increased availability of sensor data and the detailed monitoring of the factories to enable additional safety measures. Different safety indicators within the production hall can be combined within the network so that PNDs can give early responses if a potential safety breach is detected.¶
One possibility could be to track the positions of human workers and robots. Whenever a robot gets too close to a human in a non-working area or if a human enters a defined safety zone, robots are stopped to prevent injuries. More advanced concepts could also include image data or combine arbitrary sensor data.¶
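The proximity part of this idea can be sketched as a simple distance check, assuming that two-dimensional positions of robots and humans are available from the sensor data. The function names, the coordinates, and the 2 m threshold are assumptions for illustration; the check for defined safety zones is omitted.¶

```python
import math


def too_close(robot_xy, human_xy, min_distance_m):
    """True if the robot is closer to the human than the allowed distance."""
    return math.dist(robot_xy, human_xy) < min_distance_m


def robots_to_stop(robots, humans, min_distance_m):
    """Return the sorted names of robots that are too close to any human."""
    return sorted(
        name for name, pos in robots.items()
        if any(too_close(pos, h, min_distance_m) for h in humans)
    )


# Hypothetical tracked positions in meters.
robots = {"r1": (0.0, 0.0), "r2": (10.0, 0.0)}
humans = [(1.0, 1.0)]

stop = robots_to_stop(robots, humans, min_distance_m=2.0)
print(stop)
```

A floating-point distance is used here for readability; on a PND, comparing squared integer distances would avoid both the square root and the floating-point arithmetic.¶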
Opportunities:¶
Delivery of content to end users often relies on Content Delivery Networks (CDNs) storing said content closer to end users for latency-reduced delivery, with DNS-based indirection being utilized to serve the request on behalf of the origin server.¶
From the perspective of this draft, a CDN can be interpreted as a (network service level) set of (COIN) programs, implementing a distributed logic for distributing content from the origin server to the CDN ingress and further to the CDN replication points which ultimately serve the user-facing content requests.¶
NOTE: material on solutions will be added here later¶
Studies such as those in [FCDN] have shown that content distribution at the level of named content, utilizing efficient (e.g., Layer 2) multicast for replication towards edge CDN nodes, can significantly increase the overall network and server efficiency. It also reduces indirection latency for content retrieval as well as reduces required edge storage capacity by benefiting from the increased network efficiency to renew edge content more quickly against changing demand.¶
In addition to the research questions for Section 3.1:¶
Requirements 3.1.1 through 3.1.6 also apply for CDN service access. In addition:¶
Layer 2 connected compute resources, e.g., in regional or edge data centres, base stations and even end-user devices, provide the opportunity for infrastructure providers to offer Compute-Fabric-as-a-Service (CFaaS) offerings to application providers. App and service providers may utilize the compute fabric exposed by this CFaaS offering for the purposes defined through their applications and services. In other words, the compute resources can be utilized to execute the desired (COIN) programs of which the application is composed, while utilizing the inter-connection between those compute resources to do so in a distributed manner.¶
We foresee those CFaaS offerings to be tenant-specific, a tenant here defined as the provider of at least one application. For this, we foresee an interaction between CFaaS provider and tenant to dynamically select the appropriate resources to define the demand side of the fabric. Conversely, we also foresee the supply side of the fabric to be highly dynamic with resources being offered to the fabric through, e.g., user-provided resources (whose supply might depend on highly context-specific supply policies) or infrastructure resources of intermittent availability such as those provided through road-side infrastructure in vehicular scenarios.¶
The resulting dynamic demand-supply matching establishes a dynamic nature of the compute fabric that in turn requires trust relationships to be built dynamically between the resource provider(s) and the CFaaS provider. This also requires the communication resources to be dynamically adjusted to interconnect all resources suitably into the (tenant-specific) fabric exposed as CFaaS.¶
NOTE: material on solutions will be added here later¶
Similar to those for Section 3.1. In addition:¶
For the provisioning of services atop the CFaaS, requirements 3.1.1 through 3.1.6 should be addressed, too. In addition:¶
The term "virtual network programming" is proposed to describe mechanisms by which tenants deploy and operate COIN programs in their virtual network. Such COIN programs can for example be P4 programs, OpenFlow rules, or higher layer programs. This feature can enable other use cases described in this draft to be deployed using virtual networks services, over underlying networks such as datacenters, mobile networks, or other fixed or wireless networks.¶
For example COIN programs could perform the following on a tenant's virtual network:¶
To provide a concrete example of virtual COIN programming, we consider a use case using a 5G underlying network, the 5GLAN virtualization technology, and the P4 programming language and environment. Section 5.1 of [I-D.ravi-icnrg-5gc-icn] provides a description of the 5G network functions and interfaces relevant to 5GLAN, which are otherwise specified in [TS23.501] and [TS23.502]. From the 5GLAN service customer/tenant standpoint, the 5G network operates as a switch.¶
In the use case depicted in Figure 4, the tenant operates a network including a 5GLAN network segment (seen as a single logical switch), as well as fixed segments. This can be in a plant or enterprise network, using, for example, a 5G Non-Public Network (NPN). The tenant uses P4 programs to determine the operation of the fixed and 5GLAN switches. The tenant provisions a 5GLAN P4 program into the mobile network, and can also operate a controller. The mobile devices (or User Equipment nodes) UE1, UE2, UE3 and UE4 are in the same 5GLAN, as well as Device1 and Device2 (through UE4).¶
Looking in more detail at Figure 5, the 5GLAN P4 program can be split between multiple data plane nodes (PDU Session Anchor (PSA) User Plane Functions (UPF), other UPFs, or even mobile devices), although in some cases the P4 program may be hosted on a single node. In the most general case, a distributed deployment is useful to keep traffic on optimal paths because, except in simple cases, not all traffic within a 5GLAN will pass through a single node. In this example, P4 programs could be deployed in UPF1, UPF2, UPF3, UE3 and UE4. UE1-UE2 traffic uses a local switch on PSA UPF1, UE1-UE3 traffic is tunneled between PSA UPF1 and PSA UPF2 through the N19 interface, and UE1-UE4 traffic is forwarded through an external Data Network (DN). Traffic between Device1 and Device2 is forwarded through UE4.¶
Research has been conducted, for example by [Stoyanov], to enable P4 network programming of individual virtual switches. To our knowledge, no complete solution has been developed for deploying virtual COIN programs over mobile or datacenter networks.¶
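The three forwarding cases in this example can be summarized in a small sketch. The anchoring of UE4 at UPF3 and the set of N19 tunnels are assumptions made for illustration; the text above only states which forwarding mode each UE pair uses. The names `PSA_OF`, `N19_TUNNELS`, and `forwarding_mode` are hypothetical.¶

```python
# Assumed anchoring of each UE at a PSA UPF (UE4 -> UPF3 is an assumption).
PSA_OF = {"UE1": "UPF1", "UE2": "UPF1", "UE3": "UPF2", "UE4": "UPF3"}

# Assumed set of PSA UPF pairs that are connected by an N19 tunnel.
N19_TUNNELS = {frozenset({"UPF1", "UPF2"})}


def forwarding_mode(src, dst):
    """Return how traffic between two UEs of the same 5GLAN is forwarded."""
    a, b = PSA_OF[src], PSA_OF[dst]
    if a == b:
        # Both UEs share a PSA UPF, so traffic is switched locally.
        return f"local switch on {a}"
    if frozenset({a, b}) in N19_TUNNELS:
        first, second = sorted((a, b))
        return f"N19 tunnel between {first} and {second}"
    # No shared anchor and no tunnel: fall back to the external DN.
    return "via external Data Network (DN)"


for pair in [("UE1", "UE2"), ("UE1", "UE3"), ("UE1", "UE4")]:
    print(pair, "->", forwarding_mode(*pair))
```

A distributed 5GLAN P4 program would need an equivalent decision on each participating data plane node, which is one reason why hosting the program on a single node does not suffice in the general case.¶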
Virtual network programming by tenants could bring benefits such as:¶
There is a growing range of use cases demanding the realization of AI capabilities among distributed endpoints. Such demand may be driven by the need to increase overall computational power for large-scale problems. From a COIN perspective, those capabilities may be realized as (COIN) programs and executed throughout the COIN system, including in PNDs.¶
Some solutions may desire the localization of reasoning logic, e.g., for deriving attributes that better preserve privacy of the utilized raw input data. Quickly establishing (COIN) program instances in nearby compute resources, including PNDs, may even satisfy such localization demands on-the-fly (e.g., when a particular use is being realized, then terminated after a given time).¶
Examples of large-scale AI problems include biotechnology- and astronomy-related reasoning over massive amounts of observational input data. Examples of localizing input data for privacy reasons include radar-like applications for the development of topological mapping data based on (distributed) radio measurements at base stations (and possibly end devices). Processing within radio access networks (RAN) already constitutes a distributed AI problem to a certain extent, albeit with little flexibility in distributing the execution of the AI logic.¶
Reasoning frameworks, such as TensorFlow, may be utilized for the realization of the (distributed) AI logic, building on remote service invocation through protocols such as gRPC [GRPC] or MPI [MPI] with the intention of providing an on-chip NPU (neural processing unit) like abstraction to the AI framework.¶
NOTE: material on solutions like ETSI MEC and 3GPP work will be added here later¶
Requirements 3.1.1 through 3.1.6 also apply for general distributed AI capabilities. In addition:¶
The goal of this analysis is to identify aspects that are relevant across all use cases to help in shaping the research agenda of COINRG. For this purpose, this section will condense the opportunities, research questions, as well as requirements of the different presented use cases and analyze these for similarities across the use cases.¶
Through this, we intend to identify cross-cutting opportunities, research questions as well as requirements (for COIN system solutions) that may aid the future work of COINRG as well as the larger research community.¶
To be added later.¶
After carefully considering the different use cases along with their research questions, we propose the following layered categorization to structure the content of the research questions which we illustrate in Figure 6.¶
Three categories deal with concretizing fundamental building blocks of COIN and COIN itself.¶
Additionally, there are use-case near research questions that are heavily influenced by the specific constraints and goals of the use cases. We call this category "applicability areas" and refine it into the following subgroups:¶
The following research questions presented in the use cases belong to this category:¶
3.1.8, 3.2.1, 3.3.5, 3.3.6, 3.3.7, 5.3.3, 6.1.2, 6.1.4¶
The research questions centering around the COIN VISION dig into what is considered COIN and what scope COIN functionality should have. In contrast to the ENABLING TECHNOLOGIES, this section looks at the problem from a more philosophical perspective.¶
The first aspect of this is where/on which devices COIN programs will/should be executed (3.3.5). In particular, it is debatable whether COIN programs will/should only be executed in PNDs or whether other "adjacent" computational nodes are also in scope. In case of the latter, an arising question is whether such computations are still to be considered as "in-network processing" and where the exact line is between "in-network processing" and "routing to end systems" (3.3.7). In this context, it is also interesting to reason about the desired feature sets of PNDs (and other COIN execution environments) as these will shift the line between "in-network processing" and "routing to end systems" (3.1.8).¶
Digging deeper into the desired feature sets, some research questions address the question of which domains are to be considered of interest/relevant to COIN. For example, whether computationally-intensive tasks are suitable candidates for (COIN) Programs (3.3.6).¶
Turning the previous aspect around, some questions try to reason whether COIN can be sensibly used for specific tasks. For example, it is a question of whether current PNDs are fast and expressive enough for complex filtering operations (3.2.1).¶
There are also more general notions of this question, e.g., what "in-network capabilities" might be used to address certain problem patterns (6.1.4) and what new patterns might be supported (6.1.2). What is interesting about these different questions is that the former raises the question of whether COIN can be used for specific tasks while the latter asks which tasks in a larger domain COIN might be suitable for.¶
The final topic addressed in this part deals with the deployment vision for COIN programs (5.3.3).¶
In general, multiple programs can be deployed on a single PND/COIN element. However, to date, multi-tenancy concepts are, above all, available for "end-host-based" platforms, and, as such, there are manifold questions centering around (1) whether multi-tenancy is desirable for PNDs/COIN elements and (2) how exactly such functionality should be shaped out, e.g., which (new forms of) hardware support needs to be provided by PNDs/COIN elements.¶
The following research questions presented in the use cases belong to this category:¶
3.1.7, 3.1.8, 3.2.2, 4.3.4, 4.4.4, 5.1.1, 5.1.2, 5.1.6, 5.3.1, 6.1.3, 6.1.4¶
The research questions centering around the ENABLING TECHNOLOGIES for COIN dig into what technologies are needed to enable COIN, which of the existing technologies can be reused for COIN and what might be needed to make the VISION(S) for COIN a reality. In contrast to the VISION(S), this section looks at the problem from a practical perspective.¶
Picking up on the topics discussed in Section 7.2.2.1.1 and Section 7.2.2.1.2, this category deals with how such technologies might be realized in PNDs and with which functionality should even be realized (3.1.8).¶
Another group of research questions focuses on "traditional" networking tasks, i.e., L2/L3 switching and routing decisions.¶
For example, one question is how COIN-powered routing decisions can be provided at line-rate (3.1.7). Similar questions are how (L2) multicast can be used for COIN and vice versa (5.1.1), which (new) forwarding capabilities might be required within PNDs to support the concepts (5.1.2), and how scalability limits of existing multicast capabilities might be overcome using COIN (5.1.6).¶
In this context, it is also interesting how these technologies can be used to address quickly changing receiver sets (6.1.3), especially in the context of collective communication (6.1.4).¶
Some research questions deal with questions around how COIN (functionality) can be included in existing systems.¶
For example, if COIN is used to perform traffic filtering, one question is how end-hosts can be made aware that data/information/traffic is deliberately withheld (4.3.4). Similarly, if data is pre-processed by COIN, the question is how the new semantics of the received data can be signaled to end-hosts (4.4.4).¶
In particular, these are not only questions concerning the functionality scope of PNDs or protocols but might also depend on how programming frameworks for COIN are designed. Overall, this category deals with how to handle knowledge and action imbalances between different nodes within COIN networks (5.3.1).¶
Finally, the increasing diversity of devices within COIN raises interesting questions of how the capabilities of the different devices can be combined and optimized (3.2.2).¶
The following research questions presented in the use cases belong to this category:¶
3.1.1, 3.2.3, 3.3.1, 3.3.2, 3.3.3, 3.3.5, 4.2.1, 4.2.2, 4.3.2/4.4.2, 4.3.3/4.4.3, 4.3.4, 4.4.4, 5.2.1, 5.2.2, 5.2.3, 5.3.1, 5.3.2, 5.3.3, 5.3.4, 5.3.5¶
This category mostly deals with how COIN programs can be deployed and orchestrated.¶
One aspect of this topic is how the exact functional scope of COIN programs can/should be defined. For example, it might be an idea to define an "overall" program that then needs to be deployed to several devices (5.3.2). In that case, how should this composition be done: manually or automatically? Further aspects to consider here are how the different computational capabilities of the available devices can be taken into account and how these can be leveraged to obtain suitable distributed versions of the overall program (4.2.1).¶
In particular, it is an open question of how "service-level" frameworks can be combined with "app-level" packaging methods (3.1.1) or whether virtual network models can help facilitate the composition of COIN programs (5.3.5). This topic also again includes the considerations regarding multi-tenancy support (5.3.3, cf. Section 7.2.2.1.4) as such function distribution might necessitate deploying functions of several entities on a single device.¶
In this context, another interesting aspect is where exactly functions should be placed and who should influence these decisions. Such function placement could, e.g., be guided by the available devices (3.3.5, cf. Section 7.2.2.1.1) and their position with regard to the communicating entities (3.3.1), and it could also be specified in terms of the "distance" from the "direct" network path (3.3.2).¶
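One possible reading of "distance" from the "direct" path is the hop count between a candidate device and the nearest node on a shortest path between the communicating entities. The following sketch illustrates that reading only; the topology, the hop-count metric, and the names `shortest_path`, `bfs_distances`, and `placement_candidates` are assumptions and not part of any COIN specification.¶

```python
from collections import deque


def shortest_path(graph, src, dst):
    """One shortest path from src to dst (breadth-first search), or []."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        return []
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def bfs_distances(graph, sources):
    """Hop count from the nearest node in `sources` to each reachable node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def placement_candidates(graph, src, dst, devices, max_detour):
    """Devices within `max_detour` hops of the direct path, nearest first."""
    dist = bfs_distances(graph, shortest_path(graph, src, dst))
    eligible = [d for d in devices if d in dist and dist[d] <= max_detour]
    return sorted(eligible, key=lambda d: (dist[d], d))


# Hypothetical topology: client - sw1 - sw2 - server, with off-path devices.
graph = {
    "client": ["sw1"],
    "sw1": ["client", "sw2", "edge"],
    "sw2": ["sw1", "server", "nic"],
    "server": ["sw2"],
    "edge": ["sw1", "cloud"],
    "nic": ["sw2"],
    "cloud": ["edge"],
}
devices = ["sw1", "edge", "cloud", "nic"]
print(placement_candidates(graph, "client", "server", devices, max_detour=1))
```

With `max_detour=0`, only on-path devices qualify, which corresponds to a strict interpretation of in-network processing. Larger values admit "adjacent" computational nodes, which relates to the scoping question raised in Section 7.2.2.1.1.¶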
However, it might also be an option to leave the decision to users or at least provide means to express requirements/constraints (3.3.3). Here, the main question is how tenant-specific requirements can actually be conveyed (5.2.1).¶
Once the position for deployment is fixed, the next problem that arises is how the functions can actually be deployed (4.3.2, 4.4.2). Here, the first relevant questions are how COIN programs/program instances can be identified (3.1.4) and how preferences for specific COIN program instances can be noted (3.1.5). It is then interesting to define how different COIN programs can be coordinated (4.3.2, 4.4.2), especially if there are program dependencies (4.2.2, cf. Section 7.2.2.3.1).¶
In addition to static solutions to the described problems, the increasing dynamics of today's networks will also require dynamic solutions. For example, it might be necessary to dynamically change COIN programs at run-time (4.3.3, 4.4.3) or to include new resources, especially if service-specific constraints or tenant requirements change (5.2.2). It will be interesting to see if COIN frameworks can actually support the sometimes required dynamic changes (3.2.4). In this context, providing availability and accountability of resources can also be an important aspect.¶
COIN systems will potentially not only exist in isolation, but will have to interact with existing systems. Thus, there are also several questions addressing the integration of COIN systems into existing ones. As already described in Section 7.2.2.2.3, the semantics of changes made by COIN programs, e.g., filtering packets or changing payload, will have to be communicated to end-hosts (4.3.4, 4.4.4). Overall, there has to be a common middle ground so that COIN systems can provide new functionality while not breaking "legacy" systems. How to bridge different levels of "network awareness" (5.3.1) in an explicit and general manner might be a crucial aspect to investigate.¶
A final category deals with meta objectives that should be tackled while thinking about how to realize the new concepts. In particular, devising strategies for achieving an optimal function allocation/placement is important to effectively handle the high heterogeneity of the involved devices (3.2.3).¶
On another note, security in all its facets needs to be considered as well, e.g., how to protect against misuse of the systems, unauthorized traffic and more (5.3.4). We acknowledge that these issues are not yet discussed in detail in this document.¶
The following research questions presented in the use cases belong to this category:¶
3.1.2¶
Further research questions concerning transport solutions are discussed in more detail in [TRANSPORT].¶
Today's transport protocols are generally intended for end-to-end communications. Thus, one important question is how COIN program interactions should be handled, especially if the deployment locations of the program instances change (quickly) (3.1.2).¶
The following research questions presented in the use cases belong to this category:¶
4.3.1, 5.1.1, 5.1.3, 5.1.5¶
The possibility of incorporating COIN resources into application programs increases the scope for how applications can be designed and implemented. In this context, the general question of how the applications can be designed and which (low-level) triggers could be included in the program logic comes up (4.3.1). Similarly, providing sensible constraints to route between compute and network capabilities (when both kinds of capabilities are included) is also important (5.1.3). Many of these considerations boil down to a question of trade-off, e.g., between storage and frequent updates (5.1.5), and how (new) COIN capabilities can be sensibly used for novel application design (5.1.1).¶
The following research questions presented in the use cases belong to this category:¶
3.2.3, 4.4.1, 4.5.2¶
Many of the use cases deal with novel ways of processing data using COIN. Interesting questions in this context are which types of COIN programs can be used to (pre-)process data (4.4.1) and which parts of packet information can be used for these processing steps, e.g., payload vs. header information (4.5.2). Additionally, data processing within COIN might even be used to support a better localization of the COIN functionality (3.2.3).¶
The following research questions presented in the use cases belong to this category:¶
3.1.2, 3.1.3, 3.1.4, 3.1.5, 3.1.6, 5.1.2, 5.1.3, 5.1.4, 6.1.5¶
Being a central functionality of traditional networking devices, routing and forwarding are also prime candidates to profit from enhanced COIN capabilities. In this context, a central question, also raised as part of the framework in Section 7.2.2.3.3, is how different COIN entities can be identified (3.1.4) and how the choice for a specific instance can be signalled (3.1.5). Building upon this, next questions are which constraints could be used to make the forwarding/routing decisions (5.1.3), how these constraints can be signalled in a scalable manner (3.1.3), and how quickly changing COIN program locations can be included in these concepts, too (3.1.2).¶
Once specific instances are chosen, higher-level questions revolve around "affinity". In particular, how affinity on service-level can be provided (3.1.6), whether traffic steering should actually be performed on this level of granularity or rather on a lower level (5.1.4) and how invocation for arbitrary application-level protocols, e.g., beyond HTTP, can be supported (6.1.5). Overall, a question is what specific forwarding methods should or can be supported using COIN (5.1.2).¶
The following research questions presented in the use cases belong to this category:¶
3.2.4, 3.3.1, 3.3.4, 4.2.1, 4.4.1, 4.5.1¶
The final applicability area deals with use cases exercising some kind of control functionality. These processes, above all, require low latencies and might thus especially profit from COIN functionality. Consequently, the aforementioned question of function placement (cf. Section 7.2.2.3.2), e.g., close to one of the end-points or deep in the network, is also very relevant for this category of applications (3.3.1).¶
Focusing more explicitly on control processes, one idea is to deploy different controllers with different control granularities within a COIN system. On the one hand, it is an interesting question how these controllers with different granularities can be derived based on one original controller (4.2.1). On the other hand, how to achieve synchronisation between these controllers or, more generally, between different entities or flows/streams within the COIN system is also a relevant problem (3.3.4). Finally, it is still to be found out whether using COIN for such control processes indeed improves the existing systems, e.g., in terms of safety (4.5.1) or in terms of performance (3.2.4).¶
To be added later.¶
Note: This section will need consolidation once new use cases are added to the draft. Current in-network computing approaches typically work on unencrypted plain text data because today's networking devices usually do not have crypto capabilities.¶
As already mentioned in Section 4.3.2, this above all poses problems when business data, potentially containing business secrets, is streamed into remote computing facilities and consequently leaves the control of the company. Insecure on-premise communication within the company and on the shop-floor is also a problem as machines could be compromised from the outside.¶
It is thus crucial to deploy security and authentication functionality on on-premise and outgoing communication although this might interfere with in-network computing approaches. Ways to implement and combine security measures with in-network computing are described in more detail in [I-D.fink-coin-sec-priv].¶
This draft presented use cases gathered from several fields that can profit from in-network and, more generally, distributed compute capabilities. We distinguished between use cases in which COIN may (i) enable new experiences, (ii) expose new features or (iii) improve on existing system capabilities, and (iv) other use cases where COIN capabilities enable totally new applications, for example, in industrial networking.¶
Beyond the mere description and characterization of those use cases, we identified opportunities arising from utilizing COIN capabilities as well as research questions that may need to be addressed to reap those opportunities. We also outlined possible requirements for realizing a COIN system addressing these use cases.¶
But of course this is only a snapshot of the potential COIN use cases. In fact, the decomposition of many current client-server applications into node-by-node transit could identify other opportunities for adding computing to forwarding, notably in supply-chain, health care, intelligent cities and transportation, and even financial services (amongst others). As these become better defined, they will be added to the list presented here. We are, however, confident that our analysis across all use cases in those dimensions of opportunities, research questions, and requirements has identified commonalities that will support future work in this space. Hence, the use cases presented are directly positioned as input into the milestones of the COIN RG in terms of required functionalities.¶