DTN Management Architecture

Internet-Draft	DTNMA	October 2022
Birrane, et al.	Expires 8 April 2023

Abstract

This document describes the motivation for, and services required of, the management of devices deployed in a Delay-Tolerant Networking (DTN) environment. Together, this set of information outlines a conceptual DTN Management Architecture (DTNMA) suitable for deployment in any of the challenged and constrained DTN operational environments.¶

The DTNMA is supported by two types of asynchronous behavior. First, the DTNMA does not presuppose any synchronized transport behavior between managed and managing devices. Second, the DTNMA does not support any query-response semantics. In this way, the DTNMA allows for operation in extremely challenging conditions, including over uni-directional links and in cases where delays or disruptions prevent operation over traditional transport layers.¶

1. Introduction

The Delay-Tolerant Networking (DTN) architecture (as described in [RFC4838]) has been designed to cope with data exchange in challenged networks. Just as the DTN architecture requires new capabilities for transport and transport security, special consideration must be given for the management of DTN devices.¶

This document describes the DTN Management Architecture (DTNMA) designed to provide configuration, monitoring, and local control of both application and network services on a managed device operating either within or across a challenged network.¶

The structure of the DTNMA is derived from the unique properties of challenged networks as defined in [RFC7228]. These properties include cases where an end-to-end transport path may not exist at any moment in time and cases where delivery delays may prevent timely communications between a network operator and a managed device. These challenges may be caused by physical impairments such as long signal propagations and frequent link disruptions, or by other factors such as quality-of-service prioritizations, service-level agreements, and other consequences of traffic management and scheduling.¶

Device management in these environments must occur without human interactivity, without system-in-the-loop synchronous function, and without requiring a synchronous underlying transport layer. This means that managed devices need to determine their own schedules for data reporting, determine their own operational configuration, and perform their own error discovery and mitigation. Importantly, these capabilities must be designed and implemented in a way that results in outcomes that are determinable by an outside observer, as such observers may need to connect with a managed device after significant periods of disconnectivity.¶

The desire to define asynchronous and autonomous device management is not new. However, challenged networks (in general) and the DTN environment (in particular) represent unique deployment scenarios and impose unique design constraints. To the extent that these environments differ from more traditional, enterprise networks, their management may also differ from the management of enterprise networks. Therefore, existing techniques may need to be adapted to operate in the DTN environment or new techniques may need to be created.¶

Ultimately, the DTNMA is designed to leverage any transport, network, and security solutions designed for challenged networks. However, the DTNMA is designed to be usable in any environment in which the Bundle Protocol (BPv7) [RFC9171] may be deployed.¶

1.1. Scope

This document describes the motivation, services, desirable properties, roles/responsibilities, logical data model, and system model that form the DTNMA. These descriptions comprise a concept of operations for management of challenged networks.¶

This document is not a normative standardization of a physical data model or any individual protocol. Instead, it serves as informative guidance to authors and users of such models and protocols.¶

The DTNMA is independent of transport and network layers. It does not, for example, require the use of BP, TCP, or UDP. Similarly, it does not pre-suppose the use of IPv4 or IPv6.¶

The DTNMA is not bound to a particular security solution and does not presume that transport layers can exchange messages in a timely manner. It is assumed that any network using this architecture supports services such as naming, addressing, routing, and security that are required to communicate DTNMA messages as would be the case with any other messages in the network.¶

While it is possible that a challenged network may interface with an unchallenged network, this document does not specifically address compatibility with other management approaches.¶

1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].¶

1.3. Organization

The remainder of this document is organized into the following sections.¶

Terminology - This section identifies those terms critical to understanding DTNMA concepts. Whenever possible, these terms align in both word selection and meaning with their analogs from other management protocols.¶
Challenged Management Characteristics¶
Desirable Properties - This section identifies the properties that guide the definition of the system and logical models that comprise the DTNMA.¶
Current Management Approaches¶
Motivation - This section provides an overall motivation for this work, to include explaining why this approach is a useful alternative to existing network management approaches.¶
Management Concept of Operations¶
Reference Model - This section defines a reference model that can be used to reason about the DTNMA operational concept absent a given network management implementation. This model identifies the logical elements of the system and the high-level relationships amongst those elements.¶
Desired Services - This section identifies and defines the DTNMA services provided to network and mission operators.¶
Logical Autonomy Model - This section provides an exemplar data model that can be used to reason about DTNMA control and data flows. This model is intentionally abstracted from both any specific implementation and any specific modeling approach.¶
Use Cases - This section presents multiple use cases accommodated by the DTNMA architecture. Each use case is presented as a set of control and data flows.¶

2. Terminology

Actor - A software service running on either managed or managing devices for the purpose of implementing management protocols between such devices. Actors may implement the "Manager" role, "Agent" role, or both.¶
Agent Role (or Agent) - A role associated with a managed device, responsible for reporting performance data, accepting/performing controls, error handling and validation, and executing any autonomous behaviors. DTNMA Agents exchange information with DTNMA Managers operating either on the same device or on a remote managing device.¶
DTN Management - Management that does not depend on stateful connections or real time delivery of management messages. Such management allows for asynchronous commanding to autonomous managers running on managed devices. This management is designed to run in any environment conformant to the DTN architecture and/or in any environment deploying a BPv7 network.¶
Externally Defined Data (EDD) - Information made available to a DTNMA Agent by a managed device, but not computed directly by the DTNMA Agent itself.¶
Variables (VARs) - Typed information that is computed by a DTNMA Agent, typically as a function of EDD values and/or other Variables.¶
Constants (CONST) - A Constant represents a typed, immutable value that is referred to by a semantic name. Constants are used in situations where substituting a name for a fixed value provides useful semantic information. For example, the named constant PI may be used rather than the literal value 3.14159.¶
Controls (CTRLs) - Procedures run by a DTNMA Actor to change the behavior, configuration, or state of an application or protocol being managed within a DTN. Controls may also be used to request data from an Agent and define the rules associated with generation and delivery.¶
Literals (LITs) - A Literal represents a typed value without a semantic name. Literals are used in cases where adding a semantic name to a fixed value provides no useful semantic information. For example, the number 4 is a Literal value.¶
Macros (MACROs) - A named, ordered collection of Controls and/or other Macros.¶
Manager Role (or Manager) - A role associated with a managing device responsible for configuring the behavior of, and eventually receiving information from, DTNMA Agents. DTNMA Managers interact with one or more DTNMA Agents located on the same device and/or on remote devices in the network.¶
Operator (OP) - The enumeration and specification of a mathematical function used to calculate variable values and construct expressions to evaluate DTNMA Agent state.¶
Report (RPT) - A typed, ordered collection of data values gathered by one or more DTNMA Agents and provided to one or more DTNMA Managers. Reports only contain typed data values and the identity of the Report Template (RPTT) to which they conform.¶
Report Template (RPTT) - A named, typed, ordered collection of data types that represent the schema of a Report. This template is generated by a DTNMA Manager and communicated to one or more other DTNMA Managers and DTNMA Agents.¶
Rule - A unit of autonomous specification that provides a stimulus-response relationship between time or state on a DTNMA Agent and the actions or operations to be run as a result of that time or state. A Rule might trigger actions such as updating a Variable, producing a Report or a Table, and running a Control.¶
State-Based Rule (SBR) - Any Rule triggered by the calculable internal state of the DTNMA Agent.¶
Synchronous Management - Management that assumes messages will be delivered and acted upon in real or near-real-time. Synchronous management often involves immediate replies of acknowledgment or error status. Synchronous management is often bound to underlying transport protocols and network protocols to ensure reliability of source and sender identification.¶
Table (TBL) - A typed collection of data values organized in a tabular way in which columns represent homogeneous types of data and rows represent unique sets of data values conforming to column types. Tables only contain typed data values and the identity of the Table Template (TBLT) to which they conform.¶
Table Template (TBLT) - A named, typed, ordered collection of columns that comprise the structure for representing tabular data values. This template forms the structure of a table (TBL).¶
Time-Based Rule (TBR) - A time-based rule is a specialization, and simplification, of a state-based rule in which the rule stimulus is triggered by relative or absolute time on a DTNMA Agent.¶
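
The relationships among several of these terms can be sketched in code. The following Python fragment is illustrative only: the class names, field names, and example values (such as bundles_sent and failure_ratio) are assumptions made for this sketch and are not defined by the DTNMA.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Const:
    """CONST: a typed, immutable value referred to by a semantic name."""
    name: str
    value: float


@dataclass
class Agent:
    # EDD: values supplied by the managed device, not computed by the Agent.
    edd: Dict[str, float]
    # VAR: values the Agent computes from EDDs and/or other VARs.
    var_defs: Dict[str, Callable[["Agent"], float]]

    def var(self, name: str) -> float:
        """Evaluate a Variable on demand from the Agent's current state."""
        return self.var_defs[name](self)


@dataclass
class ReportTemplate:
    """RPTT: a named, ordered collection of the items a Report carries."""
    name: str
    items: List[str]


def make_report(agent: Agent, template: ReportTemplate) -> dict:
    """RPT: ordered data values plus the identity of the template."""
    values = [
        agent.edd[item] if item in agent.edd else agent.var(item)
        for item in template.items
    ]
    return {"template": template.name, "values": values}


# Hypothetical Agent state: two EDDs and one VAR derived from them.
agent = Agent(
    edd={"bundles_sent": 120.0, "bundles_failed": 6.0},
    var_defs={
        "failure_ratio": lambda a: a.edd["bundles_failed"] / a.edd["bundles_sent"]
    },
)
rptt = ReportTemplate("link_health", ["bundles_sent", "failure_ratio"])
report = make_report(agent, rptt)
```

In this sketch the Report carries only values and a template name, so a Manager that already holds the same template can interpret the values by position.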

3. Defining DTN Network Management

This section describes those design properties that are desirable when defining an architecture that must operate across challenged links in a network. These properties ensure that network management capabilities are retained even as delays and disruptions in the network scale. Ultimately, these properties are the driving design principles for the DTNMA.¶

Early work on the rationale and motivation for specialized management for the DTN architecture was captured in [BIRRANE1], [BIRRANE2], and [BIRRANE3]. Prototyping work done in accordance with the DTN Research Group within the IRTF as documented in [I-D.irtf-dtnrg-dtnmp] provides some of the desirable properties and necessary adaptations for this proposed management system for challenged networks.¶

The unique nature and constraints that characterize challenged networks require the development of new network capabilities to deliver expected network functions. For example, the distinctive constraints of the DTN architecture required the development of BPv7 [RFC9171] for transport functions and the Bundle Protocol Security (BPSec) Extensions [RFC9172] to provide end-to-end security. Similarly, a new approach to network management and the associated capabilities is necessary for operation in these challenged environments and when using these new transport and security mechanisms.¶

This section discusses the characteristics of challenged networks and how they may violate the assumptions made by non-DTNMA approaches about the operating environment.¶

3.1. Challenged Networks

Constrained networks are defined as networks where "some of the characteristics pretty much taken for granted with link layers in common use in the Internet at the time of writing are not attainable" [RFC7228]. This broad definition captures a variety of potential issues relating to physical, technical, and regulatory constraints on message transmission. Constrained networks typically include nodes that regularly reboot or are otherwise turned off for long periods of time, transmit at low or asymmetric bitrates, or have very limited computational resources [RFC7228].¶

Separately, a challenged network is defined as one that "has serious trouble maintaining what an application would today expect of the end-to-end IP model" [RFC7228]. This definition includes networks where there is never simultaneous end-to-end connectivity, where such connectivity is interrupted at planned or unplanned intervals, or where delays exceed those that could be accommodated by IP-based transport. Links in such networks are often unavailable due to attenuation, propagation delays, mobility, occultation, and other limitations imposed by energy and mass considerations.¶

3.1.1. Properties of Challenged Networks

Challenged networks exhibit the following properties that impact the way in which the function of network management is considered. These properties can make the establishment of sessions, synchronous data exchange, and the transmission of larger payloads in these networking environments difficult or impossible.¶

No end-to-end path is guaranteed to exist at any given time between any two nodes.¶
Round-trip communications between any two nodes within any given time window may be impossible.¶
Latencies on the order of seconds, hours, or days must be tolerated.¶
Links may be uni-directional.¶
Bi-directional links may have asymmetric data rates.¶
Dependence on external infrastructure, software, systems, or processes such as the Domain Name System (DNS) or Certificate Authorities (CAs) cannot be guaranteed.¶

Finally, it is noted that "all challenged networks are constrained networks ... but not all constrained networks are challenged networks ... Delay-Tolerant Networking (DTN) has been designed to cope with challenged networks" [RFC7228].¶

Challenged networks differ from other kinds of constrained networks, in part, in the way that the topology and roles and responsibilities of the network may evolve over time. From the time at which data is generated to the time at which that data is delivered, the topology of the network and the roles assigned to various nodes, devices, and other actors may have changed several times. In certain circumstances, the physical node receiving messages for a given logical destination may have also changed.¶

Challenged networks cannot guarantee that a timely data exchange can be maintained between managing and managed devices. The topological changes characteristic of these networks can impact the path of messages, requiring the transport to wait to establish the incremental connectivity necessary to advance messages along their expected route. The BPv7 transport protocol implements this store-and-forward operation for DTNs.¶

3.1.1.1. Management of Challenged Networks

When topological change impacts the semantic roles and responsibilities of nodes in the network, local configuration and autonomy must be present at the node to determine and execute time-variant changes. For example, the BPSec protocol does not encode security destinations and, instead, requires nodes in a network to identify themselves as security verifiers or acceptors when receiving secured messages.¶

When applied to network management, the semantic roles of Agent and Manager may also change with the evolving topology of the network. Individual nodes must implement desirable behavior without relying on a single configuration oracle or other coordinating function such as an operator-in-the-loop and/or supporting infrastructure. These mechanisms cannot be supported by an asynchronous, challenged network.¶

The support for changing roles implies that there must not be a defined relationship between a particular managing and managed device in a network. A network management architecture for challenged networks must support the association of multiple managing devices with a single managed device, allow "control from" and "reporting to" managing devices to function independently of one another, and allow the logical role of a managing device to be physically shared among assets and to change over time.¶

Together, this means that a network management architecture suitable for challenged environments must account for certain operational situations.¶

Managed devices that are only accessible via a uni-directional link, or via a link whose duration is shorter than a single round-trip propagation time.¶
Links that may be significantly constrained by capacity or reliability, but at (predictable or unpredictable) times may offer significant throughput.¶
Multi-hop challenged networks that interconnect two or more unchallenged networks such that managed and managing devices exist in different networks.¶
Networks unable to support session-based transport, for example when propagation delays exceed the Maximum Segment Lifetime (MSL) of the Transmission Control Protocol (TCP).¶

In these and related scenarios, managed devices need to operate with local autonomy because managing devices may not be available within operationally-relevant timeframes. Managing devices deliver instruction sets that govern the local, autonomous behavior of the managed device. These behaviors include (but are not limited to) collecting performance data, state, and error conditions, and applying pre-determined responses to pre-determined events. The goal is asynchronous and autonomous communication between the device being managed and the manager, at times never expecting a reply, and with knowledge that commands and queries may be delivered much later than the initial request.¶
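
The TCP example above can be made concrete with a short calculation. The sketch below assumes the 2-minute MSL value given in the TCP specification and uses approximate Earth-Moon and Earth-Mars distances; the distances and function names are assumptions chosen for illustration, not values taken from this document.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact, by definition of the metre
TCP_MSL_S = 120.0                  # MSL of 2 minutes, per the TCP specification

# Approximate distances, assumed for illustration only.
EARTH_MOON_KM = 384_400
EARTH_MARS_MIN_KM = 54_600_000
EARTH_MARS_MAX_KM = 401_000_000


def one_way_delay_s(distance_km: float) -> float:
    """One-way signal propagation delay in seconds over the given distance."""
    return distance_km / SPEED_OF_LIGHT_KM_S


def exceeds_msl(distance_km: float) -> bool:
    """True when the one-way propagation delay alone exceeds the TCP MSL."""
    return one_way_delay_s(distance_km) > TCP_MSL_S


for label, km in [
    ("Earth-Moon", EARTH_MOON_KM),
    ("Earth-Mars (closest)", EARTH_MARS_MIN_KM),
    ("Earth-Mars (farthest)", EARTH_MARS_MAX_KM),
]:
    print(f"{label}: {one_way_delay_s(km):.1f} s, exceeds MSL: {exceeds_msl(km)}")
```

Under these assumptions a lunar link stays well within the MSL, while even the shortest Earth-Mars path exceeds it before any queuing or disruption is considered.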

4. Desirable Properties

4.1. Asynchronous, Dynamic, and Highly Logical Architecture

A DTNMA built to support DTN must be agnostic of the underlying physical topology, transport protocols, security solutions, and supporting infrastructure. The DTNMA shall be limited to only the network management protocols, message structure, and information content, including but not limited to the type of objects to manage and the expected behavior and interaction upon access or execution of those objects. There shall be no prescribed association between a manager and an agent other than those defined in the responsibilities associated with each in this document. There should be no limitation to the number of managers that can control an agent, the number of managers that an agent should report to, or any requirement that a manager and agent relationship implies a pair.¶

4.2. Model-derived and Hierarchically Organized Definition of Information

A model to define a shared contract between agent and manager has long been an approach to network management solutions. A model is a schema that defines this contract and defines all sources of information that can be retrieved, configured, or executed, as well as the various functions for parameterization, filtering, or event-driven behavior. A model enables concise representation of information, intelligent suffixing, and patterning. The DTNMA model shall be designed with a limited set of object and data types and be organized hierarchically to allow for highly compressible and concise encoding. This allows the agents and managers to infer context with the limited link utilization necessary in DTNs.¶

4.3. Intelligent Push of Information

Pull management mechanisms require that a Manager send a query to an Agent and then wait for the response to that query. This practice implies a control session between entities and increases the overall message traffic in the network. Challenged networks cannot guarantee that the round-trip data exchange will occur in a timely fashion. In extreme cases, networks may be comprised solely of uni-directional links, which drastically increases the amount of time needed for a round-trip data exchange. Therefore, pull mechanisms must be avoided in favor of push mechanisms.¶

Push mechanisms, in this context, refer to the ability of Agents to leverage rule-based criteria to determine when and what information should be sent to Managers. This could be based solely on logic applied to existing VARs or EDDs, based on operations applied to data elements, or triggered as a function of relative time.¶

Push mechanisms do not require round-trip communications as Managers do not request each reporting instance; Managers need only request once, in advance, that information be produced in accordance with a predetermined schedule or in response to a predefined state on the Agent. In this way information is "pushed" from Agents to Managers and the push is "intelligent" because it is based on some internal evaluation performed by the Agent.¶
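
The push behavior described in this section can be sketched as a small rule evaluation loop. This is a minimal illustration under stated assumptions: the class names, the tick-based clock, and the data items queue_depth and link_up are hypothetical and are not part of the DTNMA.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]


@dataclass
class TimeBasedRule:
    """Fires on a fixed period of Agent-local time."""
    period_s: int
    report_items: List[str]

    def fires(self, now_s: int, state: State) -> bool:
        return now_s > 0 and now_s % self.period_s == 0


@dataclass
class StateBasedRule:
    """Fires when a predicate over Agent-local state is true."""
    predicate: Callable[[State], bool]
    report_items: List[str]

    def fires(self, now_s: int, state: State) -> bool:
        return self.predicate(state)


@dataclass
class PushAgent:
    """Evaluates pre-configured rules; no Manager query is ever awaited."""
    rules: list = field(default_factory=list)
    outbox: List[Tuple[int, State]] = field(default_factory=list)

    def tick(self, now_s: int, state: State) -> None:
        for rule in self.rules:
            if rule.fires(now_s, state):
                report = {name: state[name] for name in rule.report_items}
                self.outbox.append((now_s, report))


# The Manager configures the rules once, in advance.
agent = PushAgent(rules=[
    TimeBasedRule(period_s=60, report_items=["queue_depth"]),
    StateBasedRule(lambda s: s["queue_depth"] > 100, ["queue_depth", "link_up"]),
])

# The Agent then evaluates them locally as time and state evolve.
for now_s, depth in [(30, 10.0), (60, 20.0), (90, 150.0)]:
    agent.tick(now_s, {"queue_depth": depth, "link_up": 1.0})
```

Here the outbox receives one periodic report at 60 seconds and one state-triggered report at 90 seconds. Both were produced without any request from a Manager after the initial configuration.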

4.4. Minimize Message Size Not Node Processing

Protocol designers must balance message size versus message processing time at sending and receiving nodes. Verbose representations of data simplify node processing whereas compact representations require additional activities to generate and parse the compacted message. There is no asynchronous management advantage to minimizing node processing time in a challenged network. However, there is a significant advantage to smaller message sizes in such networks. Compact messages require shorter periods of viable transmission for communication, incur less re-transmission cost, and consume fewer resources when persistently stored en route in the network. A DTN Management Protocol (DTNMP) should minimize the size of Protocol Data Units (PDUs) whenever practical, including through packing and unpacking binary data, variable-length fields, and pre-configured data definitions.¶
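
The trade-off between verbose and compact encodings can be illustrated with the Python standard library. The field names and the fixed binary layout below are assumptions made for this sketch; they do not represent any encoding defined for the DTNMA.

```python
import json
import struct

# Hypothetical performance sample gathered on a managed device.
FIELDS = ["node_id", "bundles_sent", "bundles_failed", "uptime_s"]
sample = {"node_id": 7, "bundles_sent": 1200, "bundles_failed": 3, "uptime_s": 86400}

# Verbose, self-describing encoding: simple to parse, but field names
# travel with every message.
verbose = json.dumps(sample).encode("utf-8")

# Compact encoding: both sides are pre-configured with the layout, so only
# values are sent. "!" selects network byte order with no padding; "B" is an
# unsigned byte and "I" is an unsigned 32-bit integer.
LAYOUT = struct.Struct("!BIII")
compact = LAYOUT.pack(*(sample[name] for name in FIELDS))

# The receiver spends extra processing to rebuild the named values.
decoded = dict(zip(FIELDS, LAYOUT.unpack(compact)))

print(len(verbose), "bytes verbose;", len(compact), "bytes compact")
```

The compact form costs both nodes some packing and unpacking work and requires a shared, pre-configured definition, which is the trade this section argues is worthwhile in a challenged network.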

4.5. Absolute Data Identification

Elements within the management system must be uniquely identifiable so that they can be individually manipulated. Identification schemes that are relative to system configuration make data exchange between Agents and Managers difficult as system configurations may change faster than nodes can communicate.¶

Consider the following common technique for approximating an associative array lookup. A manager wishing to do an associative lookup for some key K1 will (1) query a list of array keys from the agent, (2) find the key that matches K1 and infer the index of K1 from the returned key list, and (3) query the discovered index on the agent to retrieve the desired data.¶

Ignoring the inefficiency of two pull requests, this mechanism fails when the Agent changes its key-index mapping between the first and second query. Rather than constructing an artificial mapping from K1 to an index, an Asynchronous Management Protocol (AMP) must provide an absolute mechanism to look up the value K1 without an abstraction between the Agent and Manager.¶
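
The failure mode described above can be reproduced in a few lines. The following sketch is illustrative only: the IndexedAgent class and its method names are hypothetical and are not drawn from any management protocol.

```python
class IndexedAgent:
    """Toy Agent holding an ordered list of (key, value) entries."""

    def __init__(self, entries):
        self._entries = list(entries)

    def list_keys(self):
        """Query 1 of the relative scheme: return keys in index order."""
        return [key for key, _ in self._entries]

    def get_by_index(self, index):
        """Query 2 of the relative scheme: fetch by position."""
        return self._entries[index][1]

    def get_by_key(self, key):
        """Absolute scheme: fetch by the key itself."""
        return dict(self._entries)[key]

    def insert_first(self, key, value):
        """A local configuration change that shifts every index."""
        self._entries.insert(0, (key, value))


agent = IndexedAgent([("K0", "alpha"), ("K1", "bravo")])

# The Manager infers the index of K1 from the returned key list.
index = agent.list_keys().index("K1")

# While the second query is in flight, the Agent's mapping changes.
agent.insert_first("K9", "zulu")

stale = agent.get_by_index(index)   # now refers to K0, not K1
absolute = agent.get_by_key("K1")   # unaffected by the change

print("by index:", stale, "| by key:", absolute)
```

The index-based request silently returns the value for a different key, while the key-based request remains correct regardless of how long the message was delayed.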

4.6. Custom Data Definition

Custom definition of new data from existing data (such as through data fusion, averaging, sampling, or other mechanisms) provides the ability to communicate desired information in as compact a form as possible. Specifically, an Agent should not be required to transmit a large data set for a Manager that only wishes to calculate a smaller, inferred data set. These newly defined data elements could be calculated and used as parameters for local stimulus-response rule-based criteria, or could simply serve to populate custom reports and tables. Since the identification of custom data sets is likely to occur in the context of a specific network deployment, AMPs must provide a mechanism for their definition.¶

Aggregation of controls and custom formatting of reports and tables are equally important. Custom reporting gives the manager the flexibility to define the desired format of all information to be sent over the challenged network from the agents, serving both to save link capacity and to increase the value of returned information. Aggregation of controls allows a Manager to specify a set of controls to execute, specifying both the order and criteria of execution. This aggregate set of controls can be sent as a single command rather than as a series of sequential operations. In this case it is additionally possible to use the output of one command as an input to the next at the Agent.¶
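
Both ideas in this section, deriving a small value from a larger data set and chaining controls so that one output feeds the next, can be sketched briefly. The function names, the sample values, and the two example controls are assumptions made for illustration.

```python
from statistics import mean
from typing import Callable, List


def define_average(samples: List[int]) -> float:
    """A custom data definition: one derived value instead of many samples."""
    return mean(samples)


def run_macro(controls: List[Callable[[float], float]], initial: float) -> float:
    """Run Controls in order, feeding each output to the next Control."""
    value = initial
    for control in controls:
        value = control(value)
    return value


# Hypothetical queue-depth samples collected locally by the Agent.
queue_depth_samples = [10, 20, 30, 40]
average_depth = define_average(queue_depth_samples)  # one value to report


def halve_rate(rate_kbps: float) -> float:
    """Hypothetical Control: reduce the transmit rate by half."""
    return rate_kbps / 2


def clamp_rate(rate_kbps: float) -> float:
    """Hypothetical Control: never drop below a 16 kbps floor."""
    return max(rate_kbps, 16.0)


# One aggregate command delivered once, executed in order at the Agent.
new_rate = run_macro([halve_rate, clamp_rate], initial=24.0)
```

In this sketch the Agent would report a single average rather than four samples, and the two controls run back to back without a Manager round trip between them.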

4.7. Autonomous Operation

DTNMA network functions must be achievable using only knowledge local to the Agent. Rather than directly controlling an Agent, a Manager configures an engine of the Agent to take its own action under the appropriate conditions in accordance with the Agent's notion of local state and time.¶

Such an engine may be used for simple automation of predefined tasks or to support semi-autonomous behavior in determining when to run tasks and how to configure or parameterize tasks when they are run. Wholly autonomous operations may be supported where required. Generally, autonomous operations should provide the following benefits.¶

Distributed Operation - The concept of pre-configuration allows the Agent to operate without regular contact with Managers in the system. The initial configuration (and periodic update) of the system remains difficult in a challenged network, but an initial synchronization on stimuli and responses drastically reduces needs for centralized operations.¶
Deterministic Behavior - Such behavior is necessary in critical operational systems where the actions of a platform must be well understood even in the absence of an operator in the loop. Depending on the types of stimuli and responses, these systems may be considered to be maintaining simple automation or semi-autonomous behavior. In either case, this preserves the ability of a frequently-out-of-contact Manager to predict the state of an Agent with more reliability than cases where Agents implement independent and fully autonomous systems.¶
Engine-Based Behavior - Several operational systems are unable to deploy "mobile code" based solutions due to network bandwidth, memory or processor loading, or security concerns. Engine-based approaches provide configurable behavior without incurring these types of concerns associated with mobile code.¶
Intelligent Authentication, Authorization, Accounting (AAA), and Error Checking - A means of autonomous AAA, error checking, and validation of data and controls will be required in all cases where agents or managers are disconnected from the rest of the network. In addition, there is a need to handle conflicts, including messages that arrive out of order, or at the same time, from different managers whose controls would otherwise conflict. The need to perform these operations still exists; however, they will need to be performed with context provided with the controls sent or in accordance with pre-defined behavior and policy.¶
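
One way an Agent might resolve conflicting controls deterministically, using only local knowledge and a pre-defined policy, is sketched below. The policy shown (order by issue time, then by a configured manager priority) is purely an assumption for illustration; this document does not prescribe any particular conflict-resolution policy, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Control:
    manager: str      # identity of the issuing Manager
    issued_at_s: int  # time the Manager issued the Control
    target: str       # configuration item to change
    value: int        # new value for that item


def resolve(controls: List[Control], priority: Dict[str, int]) -> Dict[str, int]:
    """Apply Controls in a fixed order that is independent of arrival order.

    Assumed policy: earlier-issued Controls are applied first; when issue
    times tie, the Manager with the larger priority is applied last and
    therefore wins.
    """
    ordered = sorted(controls, key=lambda c: (c.issued_at_s, priority[c.manager]))
    config: Dict[str, int] = {}
    for control in ordered:
        config[control.target] = control.value
    return config


priority = {"mgr_a": 1, "mgr_b": 2}
older = Control("mgr_a", 100, "tx_power", 9)
newer = Control("mgr_b", 200, "tx_power", 5)
tie_a = Control("mgr_a", 300, "rate", 1)
tie_b = Control("mgr_b", 300, "rate", 2)

# The same result is reached whatever order the messages arrive in.
in_order = resolve([older, newer, tie_a, tie_b], priority)
out_of_order = resolve([tie_b, newer, tie_a, older], priority)
```

Because the outcome depends only on message content and pre-configured policy, a Manager that has been out of contact can still predict the Agent's resulting configuration.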

5. Current Network Management Approaches and Limitations

Several network management solutions have been developed for both local-area and wide-area networks. Their capabilities range from simple configuration and report generation to complex modeling of device settings, state, and behavior. Each of these approaches is successful in the domains for which it has been built, but not all are equally functional when deployed in a challenged network.¶

Generally, network management solutions that require managing and managed devices to push and pull large sets of data may fail to operate in a challenged (and thus, constrained) environment as a function of transmit power, bitrates, and the ability of the network to store and forward large data volumes over long periods of time.¶

Newer network management approaches are exploring the application of more efficient message-based management, less reliance on end-to-end transport sessions, and increased levels of autonomy on managed devices. These approaches focus on problems different from those described above for challenged networks. For example, much of the autonomous network management work currently undertaken focuses more on well-resourced, unchallenged networks where devices self-configure, self-heal, and self-optimize with other nodes in their vicinity. While an important and transformational capability, such solutions will not be deployable in a challenged network environment.¶

This section describes some of the well-known, standardized protocols for network management and contrasts their purposes with the needs of challenged network management solutions.¶

5.1. Simple Network Management Protocol (SNMP)

Early network management tools designed for unchallenged networks provide synchronous mechanisms for communicating locally-collected data from devices to operators. Applications are managed using a "pull" mechanism, requiring a managing device to explicitly request the data to be produced and transmitted by a managed device.¶

The de facto example of this architecture is the Simple Network Management Protocol (SNMP) [RFC3416]. SNMP utilizes a request/response model to set and retrieve data values such as host identifiers, link utilizations, error rates, and counters between application software on managing and managed devices. Data may be directly sampled or consolidated into representative statistics. Additionally, SNMP supports a model for unidirectional push notification messages, called traps, based on predefined triggering events.¶

SNMP managing devices can query agents for status information, send new configurations, and request to be informed when specific events have occurred. Traps and queryable data are defined in data models known as Management Information Bases (MIBs), which define the information for a particular data standard, protocol, device, or application.¶

While there is a large installation base for SNMP, several aspects of the protocol make it inappropriate for use in a challenged network. SNMP's "pull" model relies on sessions with low round-trip latency, which challenged networks cannot maintain. Complex management can be achieved, but only through careful orchestration of a series of real-time, end-to-end query-and-response exchanges generated by the managing device, which is not possible in challenged networks.¶
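
The cost of orchestrating management through sequential queries grows with each dependent exchange. The short calculation below is a sketch under assumed numbers (a 600-second one-way delay and three dependent queries); neither value comes from this document or from SNMP itself.

```python
def pull_elapsed_s(num_queries: int, one_way_delay_s: float) -> float:
    """Each dependent query must wait for the previous full round trip."""
    return num_queries * 2 * one_way_delay_s


def push_elapsed_s(one_way_delay_s: float) -> float:
    """A pre-configured report needs only the one-way trip to the Manager."""
    return one_way_delay_s


ONE_WAY_DELAY_S = 600.0  # assumed for illustration
NUM_QUERIES = 3          # assumed number of dependent queries

pull_total = pull_elapsed_s(NUM_QUERIES, ONE_WAY_DELAY_S)
push_total = push_elapsed_s(ONE_WAY_DELAY_S)

print(f"pull: {pull_total:.0f} s, push: {push_total:.0f} s")
```

This simple model ignores disruption entirely; on links that are only intermittently available, each additional round trip may also have to wait for the next contact.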

The SNMP trap model provides some low-fidelity Agent-side processing. Traps are typically used for alerting purposes, as they do not support an agent response to the event occurrence. In a challenged network where the delay between a managing device receiving an alert and sending a response can be significant, the SNMP trap model is insufficient for event handling.¶

Adaptive modifications to SNMP to support challenged networks and more complex application-level management would alter the basic function of the protocol (data models, control flows, and syntax) so as to be functionally incompatible with existing SNMP installations. This approach is therefore not suitable for use in challenged networks.¶

5.2. YANG Data Model and NETCONF, RESTCONF, and CORECONF

5.2.1. The YANG Data Model

Yet Another Next Generation (YANG) [RFC6020] is a data modeling language used to model configuration and state data of managed devices and applications. The YANG model defines a schema for organizing and accessing a device's configuration or operational information. Once a model is developed, it is loaded onto both the client and server, and serves as a contract between the two. A YANG model can be complex, describing many containers of managed elements, each providing methods for device configuration or reporting of operational state.¶

YANG supports the definition of parameterized Remote Procedure Calls (RPCs) to be executed on managed nodes as well as the definition of push notifications within the model. The RPCs are used to execute commands on a device, generating an expected, structured response. However, RPC execution is strictly limited to those issued by the client. Commands are executed immediately and sequentially as they are received by the server, and there is no method to autonomously execute RPCs triggered by specific events or conditions.¶

YANG defines the schema for data used by network management protocols such as NETCONF [RFC6241], RESTCONF [RFC8040], and CORECONF [I-D.ietf-core-comi]. These protocols provide the mechanisms to install, manipulate, and delete the configuration of network devices.¶

5.2.2. YANG-Based Management Protocols

NETCONF is a stateful, XML-based protocol that provides an RPC syntax to retrieve, edit, copy, or delete any data nodes or exposed functionality on the server. It requires that underlying transport protocols support long-lived, reliable, low-latency, sequenced data delivery sessions. NETCONF connections are required to provide authentication, data integrity, confidentiality, and replay protection through secure transport protocols such as SSH or TLS. A bi-directional NETCONF session must be established before any data transfer can occur.¶

NETCONF uses verbose XML files to provide the ability to update and fetch multiple data elements simultaneously. These XML files are not easily or efficiently compressed, which is an important consideration for challenged networks.¶

RESTCONF is a stateless RESTful protocol based on HTTP. RESTCONF configures or retrieves individual data elements or containers within YANG data models by passing JSON over REST. This JSON encoding is used to GET, POST, PUT, PATCH, or DELETE data nodes within YANG modules. RESTCONF requires the use of a secure transport such as TLS.¶

Unlike NETCONF, RESTCONF is stateless. However, the transfer of large data sets, such as configuration changes of many data elements, or the collection of information, depends greatly on the support of synchronous communication.¶

CORECONF is stateless, as RESTCONF is, and is built atop the Constrained Application Protocol (CoAP) [RFC7252], which defines a messaging construct developed to operate specifically on constrained devices and networks by limiting message size and fragmentation. CORECONF requires the use of DTLS or Object Security for Constrained RESTful Environments (OSCORE) [RFC8613] to fulfill its security requirements. CoAP supports a store-and-forward operation similar to DTN; however, it operates strictly at the application layer and requires specification of pre-determined proxies and moments of bi-directional communication.¶

CORECONF leverages the Concise Binary Object Representation (CBOR) [RFC8949] of YANG modules [I-D.ietf-core-yang-cbor] and provides further compressibility through the use of YANG Schema Item iDentifiers (SIDs) [I-D.ietf-core-sid]. While these design choices offer reductions in encoded data size, data compressibility is still dependent on underlying transport protocols and limited by the organization of the YANG schema.¶

5.2.3. Limitations of YANG-Based Approaches

YANG notifications, available through subscriptions to both YANG notifications [RFC8639] and YANG PUSH notifications [RFC8641], are promising for challenged network management. In this model, a client may subscribe to the delivery of specific containers or data nodes defined in the model, either on a periodic or "on change" basis. The notification events can be filtered according to XPath [xpath] or subtree [RFC6241] filtering as described in [RFC8639] Section 2.2.¶

While the YANG model provides great flexibility for configuring a homogeneous network of devices, it becomes a burden in challenged networks where concise encoding is necessary. The YANG schema provides flexibility in the organization of data to the model developer. The YANG schema supports a broad range of data types noted in [RFC6991]. All the data nodes within a YANG model are referenced by a verbose, string-based path of the module, sub-module, container, and any data nodes such as lists, leaf-lists, or leaves, without any explicit hierarchical organization based on data or object type.¶

Recent efforts for compression of the YANG model have used CBOR [RFC9254] and SIDs [I-D.ietf-core-sid] to address YANG data nodes through integer identifiers. However, these compression strategies lack a formal hierarchical structure. The manual mapping of SIDs to YANG modules and data nodes limits the portability of these models and further increases the size of any encoding scheme.¶

5.3. Takeaways from Existing Network Management Protocols

While the protocols described above are useful and well-realized for different applications and networking environments, they do not meet the requirements for the management of challenged networks. However, that does not preclude features of each from contributing to the design of the DTNMA.¶

The concept of a data model for describing network configuration elements has been used by many protocols to ensure compliance between managing and managed devices. A data model provides error checking and bounds operations, which is necessary when controlling mission critical devices.¶

The SNMP MIBs provide well-organized, hierarchical OIDs which support the compressibility necessary for challenged DTNs. YANG, NETCONF, and RESTCONF support notification abilities needed for DTN network management, but have limited features for describing autonomous execution and behavior.¶

CORECONF provides CBOR encoding and concise reference abilities using SIDs, but lacks a hierarchical structure or authoritative planning for their allocation. While this approach will become too verbose and prove limiting in the future, the encoding considerations from CORECONF can be used to inform the design of the DTNMA.¶

8. Reference Model

There are a multitude of ways in which both existing and emerging network management protocols, APIs, and applications can be integrated for use in challenged environments. However, expressing the needed behaviors of the DTNMA in the context of any of these pre-existing elements risks conflating systems requirements, operational assumptions, and implementation design constraints.¶

One way to avoid such conflation is to, instead, develop a reference model that can be used to reason about a system independent of implementation. Such a DTNMA reference model is provided in Figure 1 below.¶

DTNMA Reference Model¶

         Managed Device                             Managing Device
 +----------------------------+             +-----------------------------+
 | +------------------------+ |             | +-------------------------+ |
 | |Applications & Services | |             | | Applications & Services | |
 | +----------^-------------+ |             | +-----------^-------------+ |
 |            |               |             |             |               |
 | +----------v-------------+ |             | +-----------v-------------+ |
 | | DTNMA  +-------------+ | |             | | +-----------+   DTNMA   | |
 | | AGENT  | Monitor and | | |  Controls   | | |  Policy   |  MANAGER  | |
 | |        |   Control   | | |<============| | | Encoding  |           | |
 | | +------+-------------+ | |             | | +-----------+-------+   | |
 | | |Admin | Data Fusion | | |============>| | | Reporting | Admin |   | |
 | | +------+-------------+ | |    Reports  | | +-----------+-------+   | |
 | +------------------------+ |             | +-------------------------+ |
 +----------------------------+             +-----------------------------+
                 ^                                        ^
                 |          Pre-Shared Definitions        |
                 |      +---------------------------+     |
                 +------| - Autonomy Model          |-----+
                        | - Application Data Models |
                        | - Runtime Data Stores     |
                        +---------------------------+

Figure 1

In this reference model, applications and services on a managing device communicate with a DTNMA Manager (DM) which uses pre-shared definitions to create a set of directives that can be sent to a managed device's DTNMA Agent (DA). The DA provides local monitoring and control of the applications and services resident on the managed device. The DA also performs local data fusion as necessary to synthesize data products (such as reports) that can be sent back to the DM when appropriate.¶

This model preserves the familiar concept of "managers" resident on managing devices and "agents" resident on managed devices. However, the DTNMA model is unique in how the DM and DA operate. The DM is used to pre-configure DAs in the network with management policies. It is expected that the DAs themselves perform monitoring and control functions on their own. In this way, a properly configured DA may operate without a timely, reliable connection back to a DM.¶

8.1. Functional Elements

The reference model illustrated in Figure 1 implies the existence of certain logical elements whose roles and responsibilities are discussed in this section.¶

8.1.1. Managed Applications and Services

By definition, managed applications and services reside on a managed device. These software entities can be controlled through some interface by the DA and their state can be sampled as part of periodic monitoring. It is presumed that the DA on the managed device has the proper data model, control interface, and permissions to alter the configuration and behavior of these software applications.¶

8.1.2. DTNMA Agent

A DTNMA Agent resides on a managed device. As is the case with other network management approaches, this agent is responsible for the monitoring and control of the applications local to that device. Unlike other network management approaches, the agent accomplishes this task without a regular connection to a DTNMA Manager.¶

The DTNMA Agent performs three major functions on a managed device: the monitoring and control of local applications, production of data analytics, and the administrative control of the agent itself.¶

8.1.2.1. Monitoring and Control

DTNMA Agents monitor the status of applications running on their managed device and selectively control those applications as a function of that monitoring. The following components are used to perform monitoring and control on an agent.¶

Rules Database: A DTNMA Agent monitors the state of the managed device looking for pre-defined stimuli and, when one is encountered, issues a pre-defined response. The stimulus-response tuple is termed a "rule". Within the DTNMA, these rules are the embodiment of policy expressions received from managers and evaluated at regular intervals by the autonomy engine. The rules database is the collection of active rules known to the DA.¶
Autonomy Engine: The DA autonomy engine is configured with policy expressions describing expected reactions to potential events. This engine is configured by managers during periods of connectivity. Once configured, the engine may function without other access to any managing device. This engine may also reconfigure itself as a function of policy.¶
Application Control Interfaces: DTNMA Agents must support control interfaces for all managed applications. Control interfaces are used to alter the configuration and behavior of an application. These interfaces may be custom for each application, or may be provided through a common framework, such as one provided by an operating system.¶

8.1.2.2. Data Fusion

DTNMA Agents generate new data elements as a function of the current state of the managed device and its applications. These new data products may take the form of individual data values, or new collections of data used for reporting. The logical components responsible for these behaviors are as follows.¶

Application Data Interfaces: DAs must support mechanisms by which important state is retrieved from various applications resident on the managed device. These data interfaces may be custom for each application, or may be provided through a common framework, such as one provided by an operating system.¶
Data Value Generators: DAs may support the generation of new data values as a function of other values collected from the managed device. These data generators may be configured with descriptions of data values and the data values they generate may be included in the overall monitoring and reporting associated with the managed device.¶
Report Generators: DAs may, as appropriate, generate collections of data values for transmission to managers. Reports can be generated as a matter of policy or in response to the handling of critical events (such as errors), or other logging needs. The generation of a report is independent of whether there exists any connectivity between a DA and a DM. It is assumed that reports are queued on an agent pending transmit opportunities.¶
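
The queuing behavior described for report generators can be sketched as follows. This is only an illustration under stated assumptions, not a DTNMA-defined interface: the names `Report`, `ReportQueue`, and `on_transmit_opportunity` are hypothetical, and the choice of a bounded queue that discards the oldest report when full is an assumption made for a resource-constrained device.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    generated_at: float  # agent-local generation time, in seconds
    values: dict         # data values collected for this report


class ReportQueue:
    """Hypothetical agent-side queue holding reports until a link is available."""

    def __init__(self, max_reports: int = 100):
        # Assumption: a bounded queue that drops the oldest report when full.
        self._pending = deque(maxlen=max_reports)

    def generate(self, generated_at: float, values: dict) -> None:
        # Generation never checks connectivity; the report is simply queued.
        self._pending.append(Report(generated_at, dict(values)))

    def on_transmit_opportunity(self, send) -> int:
        # Drain the queue through the caller-supplied send function.
        sent = 0
        while self._pending:
            send(self._pending.popleft())
            sent += 1
        return sent


queue = ReportQueue(max_reports=2)
queue.generate(10.0, {"bundles_forwarded": 4})
queue.generate(20.0, {"bundles_forwarded": 9})
queue.generate(30.0, {"bundles_forwarded": 15})  # the oldest report is dropped

delivered = []
count = queue.on_transmit_opportunity(delivered.append)
```

Report generation and report transmission are deliberately separate calls here, which mirrors the statement that generating a report is independent of DA-to-DM connectivity.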

8.1.2.3. Administration

Agents in the DTNMA must perform a variety of administrative services in support of their configuration. The most significant of these administrative services are as follows.¶

Manager Mapping: The DTNMA allows for a many-to-many relationship amongst DTNMA Agents and Managers. A single DM may configure multiple DAs, and a single DA may be configured by multiple DMs. Multiple managers may exist in a network for at least two reasons. First, different managers may exist to control different applications on a device. Second, multiple managers increase the likelihood of an agent encountering a manager when operating in a sparse or challenged environment.¶
Data Validators: DAs might handle large amounts of data produced by various sources, to include data from local managed applications, remote managers, and self-calculated values. DAs should ensure that externally generated data values are both verified and validated. DAs should also verify, at a minimum, the integrity and confidentiality of data values.¶
Access Controllers: DAs support authorized access to the management of individual applications, to include the administrative management of the agent itself. This means that a manager may only set policy on the agent pursuant to verifying that the manager is authorized to do so.¶

8.1.3. Managing Applications and Services

Managing applications and services reside on a managing device and serve as both the source of DA policy statements and the target of DA reporting. They may operate with or without an operator in the loop.¶

Unlike management applications in unchallenged networks, these applications cannot exert closed-loop control over any managed device application. Instead, these applications must be built to exercise open-loop control by producing policies that can be configured and enforced on managed devices by DAs.¶

8.1.4. DTNMA Manager

A DTNMA Manager resides on a managing device. This manager provides an interface between various managing applications and services and the DTNMA Agents that enforce their policies. In providing this interface, DMs translate between whatever native interface exists to various managing applications and the autonomy models used to encode management policy.¶

The DTNMA Manager performs three major functions on a managing device: policy encoding, reporting, and administration.¶

8.1.4.1. Policy Encoding

DTNMA Managers translate policy directives from managing applications and services into standardized policy expressions that can be recognized by DTNMA Agents. The following logical components are used to perform this policy encoding.¶

Application Control Interfaces: DTNMA Managers must support control interfaces for managing applications. These control interfaces are used to receive desired policy statements from applications. These interfaces may be custom for each application, or as provided through a common framework, protocol, or operating system.¶
Policy Encoders: DTNMA Agents implement a standardized autonomy model comprising standardized data elements. The open-loop control structures provided by managing applications must be represented in this common language. Policy encoders perform this encoding function.¶
Policy Aggregators: DTNMA Managers must collect multiple encoded policies into messages that can be sent to DAs over the network. This implies the proper addressing of agents and the creation of messages that support store-and-forward operation. It is recommended that control messages be packaged using BPv7 when there may be intermittent connectivity between DMs and DAs.¶

8.1.4.2. Reporting

DTNMA Managers receive reports on the status of managed devices during periods of connectivity with the DTNMA Agents on those devices. The following logical components are needed to implement reporting capabilities on a manager.¶

Report Collectors: DTNMA Managers receive reports from DTNMA Agents in an asynchronous manner. This means that reports may be received out of chronological order and in ways that are difficult or impossible to associate with a specific policy from a managing application. DMs collect these reports and extract their data in support of subsequent data analytics.¶
Data Analyzers: DTNMA Managers review sets of data reports from DTNMA Agents with the purpose of extracting relevant data to communicate with managing applications. This may include simple data extraction or may include more complex processing such as data conversion, data fusion, and appropriate data analytics.¶
Application Data Interfaces: DMs must support mechanisms by which data retrieved from agents may be provided back to managing devices. These interfaces may be custom for each application, or as provided through a common framework, protocol, or operating system.¶

8.1.4.3. Administration

Managers in the DTNMA must perform a variety of administrative services in support of their proper configuration and operation. This includes the following logical components.¶

Agent Mappings: The DTNMA allows DMs to communicate with multiple DAs. However, not every agent in a network is expected to support the same set of Application Data Models or otherwise have the same set of managed applications running. For this reason, DMs must determine individual DA capabilities to ensure that only appropriate controls are sent to a DA.¶
Data Validators: DMs handle large amounts of data produced by various sources, to include data from managing applications and DAs. Managers must ensure that all data values are both verified and validated. In particular, managers must verify, at a minimum, the integrity and confidentiality of data values received from agents over a network.¶
Access Controllers: DMs should only send controls to agents when the manager is configured with appropriate access to both the agent and the applications being managed.¶
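
As a rough sketch of the Agent Mappings component above, a manager might keep a record of the Application Data Models each agent supports and filter outgoing controls against it. Everything named here (`agent_models`, `controls_for_agent`, the agent identifiers, and the model and control names) is hypothetical and chosen only for illustration; how a DM actually learns agent capabilities is not specified by this sketch.

```python
# Hypothetical record of which Application Data Models each agent supports.
agent_models = {
    "agent-a": {"bp-agent", "radio"},
    "agent-b": {"bp-agent"},
}


def controls_for_agent(agent_id, controls):
    """Return only the controls whose data model the agent is known to support."""
    supported = agent_models.get(agent_id, set())
    return [c for c in controls if c["model"] in supported]


pending = [
    {"model": "bp-agent", "name": "reset_counters"},
    {"model": "radio", "name": "set_tx_power"},
]

# agent-b does not run the radio application, so that control is withheld.
to_agent_b = controls_for_agent("agent-b", pending)
```

An agent that is unknown to the manager receives no controls at all in this sketch, which is the conservative choice when capabilities cannot be negotiated in real time.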

8.1.5. Pre-Shared Definitions

A consequence of operating in a challenged environment is the potential inability to negotiate information in real-time. For this reason, the DTNMA requires that managed and managing devices operate using pre-shared definitions rather than relying on data definition negotiation.¶

The three types of pre-shared definitions in the DTNMA are the DTNMA Agent autonomy model, managed application data models, and any runtime data shared by managers and agents.¶

Autonomy Model: A DTNMA autonomy model represents the data elements and associated autonomy structures that define the behavior of the agent autonomy engine. A standardized autonomy model allows individual implementations of DTNMA Agents and DTNMA Managers to interoperate. A standardized model also provides guidance to the design and implementation of both managed and managing applications.¶

NOTE: A standardized autonomy model is required for the interoperable encoding of policy statements. However, the DTNMA does not standardize a specific transport of those policy statements between agents and managers. The DTNMA also does not specify any transport-related encoding.¶
Application Data Models: As with other network management architectures, the DTNMA pre-supposes that managed applications (and services) define their own data models. These data models include the data produced by, and controls implemented by, the application. These models are expected to be static for individual applications and standardized for applications implementing standard protocols.¶
Runtime Data Stores: Runtime data stores, by definition, include data that is defined at runtime. As such, the data is not pre-shared prior to the deployment of managers and agents. Pre-sharing in this context means that managers and agents are able to define and synchronize data elements prior to their operational use in the system. This synchronization happens during periods of connectivity between managers and agents.¶

9. Desired Services

This section provides a description of the services provided by DTNMA elements on both managing and managed devices. These service descriptions differ from other management descriptions because of the unique characteristics of the DTNMA operating environment.¶

9.1. Local Monitoring and Control

DTNMA monitoring is associated with the agent autonomy engine. The term monitoring implies timely and regular access to information such that state changes may be acted upon within some response time period. Within the DTNMA, connections between a managed and managing device are unable to provide such access and, thus, monitoring functions must be handled on the managed device.¶

Predicate autonomy on a managed device should collect state associated with the device at regular intervals and evaluate that collected state for any changes that require a preventative or corrective action. Similarly, this monitoring may cause the device to generate one or more reports destined for the managing device.¶

Similar to monitoring, DTNMA control results in actions by the agent to change the state or behavior of the managed device. All control in the DTNMA is local control. In cases where there exists a timely connection to a manager, received controls are still run through the autonomy engine. In this case, the stimulus is the direct receipt of the control and the response is to immediately run the control. In this way, there is never a dependency on a session or other stateful exchange with any remote entity.¶

9.2. Local Data Fusion

DTNMA fusion services produce new data products from existing state on the managed device. These fusion products can be anything from simple summations of sampled counters to complex calculations of behavior over time.¶

Fusion is an important service in the DTNMA because fusion products are part of the overall state of a managed device. Complete knowledge of this overall state is important for the management of the device, particularly in a stimulus-response system whose stimuli are evaluated against this state.¶

While some fusion is performed in any management system, the DTNMA requires fusion to occur on the managed device itself. If the network is partitioned such that no connection to a managing device is available, fusion must happen locally. Similarly, connections to a managing device might not remain active long enough for round-trip data exchange or may not have the bandwidth to send all sampled data.¶
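
A simple instance of local fusion can be sketched as below. The function `fuse`, the sample values, and the particular derived values (total, mean, peak) are assumptions chosen for illustration; an actual agent would derive whatever products its configured definitions call for.

```python
def fuse(samples):
    """Derive new data values from raw counter samples held on the device."""
    total = sum(samples)
    return {
        "total": total,                # simple summation of sampled counters
        "mean": total / len(samples),  # behavior summarized over the window
        "peak": max(samples),
    }


# Hypothetical per-interval counts of dropped bundles sampled by the agent.
dropped_per_interval = [0, 2, 1, 7, 0]
fusion_product = fuse(dropped_per_interval)
```

Sending the three fused values instead of every raw sample is one way a managed device can stay within the limited bandwidth described above.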

9.3. Remote Configuration

DTNMA configuration services must update the local configuration of a managed device with the intent to impact the behavior and capabilities of that device. The change of device configurations is a common service provided by many network management systems. The DTNMA has a unique approach to configuration for the following reasons.¶

The DTNMA configuration service is unique in that the selection of managed device configurations must occur, itself, as a function of the state of the device. This implies that management proxies on the device store multiple configuration functions that can be applied as needed without consultation from a managing device.¶

When detecting stimuli, the agent autonomy engine must support a mechanism for evaluating whether application monitoring data or runtime data values are recent enough to indicate a change of state. In cases where data has not been updated recently, it may be considered stale and not used to reliably indicate that some stimulus has occurred.¶
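
One possible form of such a staleness check is sketched here. The `Sample` type, the `max_age` parameter, and the threshold-style stimulus are all hypothetical; the sketch only assumes that each data value carries the time at which it was last refreshed.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    value: float
    updated_at: float  # time of last refresh, in seconds


def is_fresh(sample: Sample, now: float, max_age: float) -> bool:
    """A value is fresh if it was refreshed within max_age seconds of now."""
    return (now - sample.updated_at) <= max_age


def stimulus_detected(sample: Sample, threshold: float, now: float, max_age: float) -> bool:
    # Stale data is never allowed to indicate that a stimulus has occurred.
    return is_fresh(sample, now, max_age) and sample.value > threshold


temperature = Sample(value=91.0, updated_at=100.0)
recent = stimulus_detected(temperature, threshold=85.0, now=105.0, max_age=30.0)
stale = stimulus_detected(temperature, threshold=85.0, now=500.0, max_age=30.0)
```

The same over-threshold value triggers the stimulus when it is five seconds old but not when it is four hundred seconds old.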

9.4. Remote Reporting

DTNMA reporting services collect information known to the managed device and prepare it for eventual transmission to one or more managing devices. The creation of these reports is intelligent in that the contents and frequency of this reporting are determined as a function of the state of the managed device, independent of the managing device.¶

Once generated, it is expected that reports might be queued pending a connection back to a managing device. Therefore, reports must be differentiable as a function of the time they were generated.¶

When reports are sent to a managing device over a challenged network, they may arrive out of order due to taking different paths through the network or being delayed due to retransmissions. A managing device should not infer meaning from the order in which reports are received, nor should a given report be associated with a specific control or autonomy action on a given managed device.¶
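
The following sketch shows one way a managing device could order reports by the time they were generated rather than by the order of arrival. The `ReceivedReport` type and its fields are hypothetical; the only assumption taken from the text is that each report can be differentiated by its generation time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedReport:
    agent_id: str
    generated_at: float  # timestamp assigned by the agent at generation
    payload: str


def chronological(reports):
    """Order reports by agent and generation time, ignoring arrival order."""
    return sorted(reports, key=lambda r: (r.agent_id, r.generated_at))


# Arrival order differs from generation order.
arrivals = [
    ReceivedReport("agent-a", 300.0, "third"),
    ReceivedReport("agent-a", 100.0, "first"),
    ReceivedReport("agent-a", 200.0, "second"),
]
ordered = [r.payload for r in chronological(arrivals)]
```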

9.5. Authorization

Both local and remote services provided by the DTNMA affect the behavior of multiple applications on a managed device and may interface with multiple managing devices. It is expected that transport protocols used in any DTNMA implementation support security services such as integrity and confidentiality.¶

Authorization services enforce the potentially complex mapping of other DTNMA services amongst managed and managing devices in the network. For example, fine-grained access control can determine which managing devices receive which reports, and what controls can be used to alter which managed applications.¶

This is particularly beneficial in networks that either deal with multiple administrative entities or overlay networks that cross administrative boundaries. Whitelists, blacklists, key-based infrastructures, or other schemes may be used for this purpose.¶
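
A minimal allow-list sketch of this kind of fine-grained access control is given below. The tables `permitted_controls` and `permitted_reports`, the manager and application names, and the helper functions are all hypothetical; a deployment could equally use key-based infrastructures or another scheme.

```python
# Hypothetical policy: (manager, application) -> controls the manager may run.
permitted_controls = {
    ("manager-1", "bp-agent"): {"reset_counters", "set_lifetime"},
    ("manager-2", "radio"): {"set_tx_power"},
}

# Hypothetical policy: manager -> names of the reports it may receive.
permitted_reports = {
    "manager-1": {"bp-status"},
    "manager-2": {"radio-health", "bp-status"},
}


def may_run(manager, application, control):
    """True only if the manager is explicitly allowed to run this control."""
    return control in permitted_controls.get((manager, application), set())


def recipients(report_name):
    """Managers that are allowed to receive the named report."""
    return sorted(m for m, names in permitted_reports.items() if report_name in names)
```

For example, `may_run("manager-1", "radio", "set_tx_power")` is false because no entry grants that manager access to the radio application, while `recipients("bp-status")` names both managers.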

10. Logical Autonomy Model

An important characteristic of the DTNMA is the shift in the role of a managing device. In the DTNMA, managers configure the autonomy engines on agents, and it is the agents that provide local device management. One way to describe the behavior of the agent autonomy engine is to describe the characteristics of the autonomy model it implements.¶

This section describes a logical autonomy model in terms of the abstract data elements that would comprise the model. Defining abstract data elements allows for an unambiguous discussion of the behavior of an autonomy model without mandating a particular design, encoding, or transport associated with that model.¶

10.1. Overview

Managing autonomy on a potentially disconnected device must behave in both an expressive and deterministic way. Expressivity allows for the model to be configured for a wide range of future situations. Determinism allows for the forensic reconstruction of device behavior as part of debugging or recovery efforts.¶

The DTNMA autonomy model is built on a stimulus-response model in which the autonomy system responds to pre-identified stimuli with pre-configured responses. Stimuli are identified using simple predicate logic that examines aspects of the state of the managed device. Responses are implemented by running one or more procedures on the managed device.¶

As with many such systems, behavior can be captured using the construct:¶

IF stimulus THEN response¶
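
The construct above can be sketched as a small rule evaluator. This is an illustration under stated assumptions, not the DTNMA autonomy model itself: the `Rule` type, the `evaluate` function, the state keys, and the `shed_load` control are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Rule:
    name: str
    stimulus: Callable[[State], bool]   # predicate over device state
    response: Callable[[State], None]   # procedure run when the predicate holds


def evaluate(rules: List[Rule], state: State) -> List[str]:
    """Run one evaluation pass and return the names of the rules that fired."""
    fired = []
    for rule in rules:
        if rule.stimulus(state):
            rule.response(state)
            fired.append(rule.name)
    return fired


def shed_load(state: State) -> None:
    # Hypothetical control: halve the transmit rate on the managed device.
    state["tx_rate"] = state["tx_rate"] / 2


rules = [
    Rule("storage_high", lambda s: s["storage_used"] > 0.9, shed_load),
    Rule("battery_low", lambda s: s["battery"] < 0.2, shed_load),
]

state = {"storage_used": 0.95, "battery": 0.8, "tx_rate": 100.0}
fired = evaluate(rules, state)
```

Returning the names of the rules that fired gives a simple record of what the engine did in each pass, which supports the forensic reconstruction of device behavior mentioned earlier.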

DTNMA Autonomy Model¶


   Managed Applications  |           DTNMA Agent           |   DTNMA Manager
-------------------------+---------------------------------+-----------------+
                         |   +---------+                   |
                         |   |  Local  |                   |   Encoded
                         |   | Rule DB |<--------------------- Policy
                         |   +---------+                   |   Expressions
                         |        ^                        |
                         |        |                        |
                         |        v                        |
                         |   +----------+     +---------+  |
      Monitoring Data------->|   Agent  |     | Runtime |  |
                         |   | Autonomy |<--->|  Data   |<---- Definitions
  Application Control<-------|  Engine  |     |  Store  |  |
                         |   +----------+     +---------+  |
                         |         |                       |
                         |         +--------------------------> Reports
                         |                                 |

Figure 2

The flow of data into and out of the agent autonomy engine is illustrated in Figure 2. In this model, the autonomy engine stores the combination of stimulus conditions and associated responses as a set of "rules" in a rules database. This database is updated through the execution of the autonomy engine and as configured from policy statements received from managers.¶

Stimuli are detected by examining the state of applications as reported through application monitoring interfaces and through any locally-derived data. Local data is calculated in accordance with definitions also provided by managers as part of the runtime data store.¶

Responses to stimuli may take the form of updates to the rules database, updates to the runtime data store, controls sent to applications, and the generation of reports.¶

10.2. Model Characteristics

There are a number of ways to represent data values, and many data modeling languages exist for this purpose. When considering how to model data in the context of the DTNMA autonomy model there are some modeling features that should be present to enable functionality. There are also some modeling features that should be prevented to avoid ambiguity.¶

Traditional network management approaches favor flexibility in their data models. The DTNMA stresses deterministic behavior that supports forensic analysis of agent activities "after the fact". As such, the following statements should be true of all data representations relating to DTNMA autonomy.¶

Strong Typing - The predicates and expressions that comprise the autonomy services in the DTNMA should require strict data typing. This avoids errors associated with implicit data conversions and helps detect misconfiguration.¶
Acyclic Dependency - Many dependencies exist in an autonomy model, particularly when combining individual expressions or results to create complex behaviors. Implementations that conform to the DTNMA must prevent circular dependencies.¶
Fresh Data - Autonomy models operating on data values presume that their data inputs represent the actionable state of the managed device. If a data value has failed to be refreshed within a time period, autonomy might incorrectly infer an operational state. Regardless of whether a data value has changed, DTNMA implementations must provide some indicator of whether the data value is "fresh", meaning that it still represents the current state of the device.¶
Pervasive Parameterization - Where possible, autonomy model objects should support parameterization to allow for flexibility in the specification. Parameterization allows for the definition of fewer unique model objects and also can support the substitution of local device state when exercising device control or data reporting.¶
Configurable Cardinality - The number of data values that can be supported in a given implementation is finite. For devices operating in challenged environments, the number of supported objects may be far fewer than that which can be supported by devices in well-resourced environments. DTNMA implementations should define limits to the number of supported objects that can be active in a system at one time, as a function of the resources available to the implementation.¶
Control-Based Updates - The agent autonomy engine changes the state of the managed device by running controls on the device. This is different from other approaches where the behavior of a managed device is updated only by updated configuration values, such as in a table or datastore. Altering behavior via one or more controls allows checking all pre-conditions before making changes as well as providing more granularity in the way in which the device is updated. Where necessary, controls can be defined to perform bulk updates of configuration data so as not to lose that update modality.¶
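
As an illustration of the Acyclic Dependency property, the following Python sketch shows one way an implementation might reject a definition that would introduce a circular dependency. The `deps` mapping and the `would_create_cycle` helper are hypothetical names introduced for this example; the DTNMA does not prescribe any particular algorithm or data structure.¶

```python
from typing import Dict, Set


def would_create_cycle(deps: Dict[str, Set[str]], name: str, uses: Set[str]) -> bool:
    """Return True if defining `name` in terms of `uses` creates a cycle.

    `deps` maps each already-defined object to the objects it reads.
    """
    stack = list(uses)
    seen: Set[str] = set()
    while stack:
        current = stack.pop()
        if current == name:
            return True  # `name` is reachable from its own inputs
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, set()))
    return False


# Hypothetical model: V1 reads EDD1, and V2 reads V1.
deps = {"V1": {"EDD1"}, "V2": {"V1"}}
ok_definition = would_create_cycle(deps, "V3", {"V2", "EDD2"})  # no cycle
bad_definition = would_create_cycle(deps, "V1", {"V2"})  # V1 -> V2 -> V1
```

Running the check when a definition is received, rather than when it is evaluated, keeps a misconfiguration from ever becoming part of the active model.¶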

10.3. Data Value Representation

The expressive representation of data values is fundamental to the successful construction and evaluation of predicates in the DTNMA autonomy model. This section describes the characteristics of data representation for this model, both as individual data values and ways to aggregate these values into collections.¶

There is a useful distinction that can be made regarding the way in which data values are assigned in the context of an autonomy system. This section discusses four categories of assigning strategies and proposes mnemonics to differentiate each.¶

The four categories of value assignment can be derived by determining whether values are calculated internal or external to the autonomy model and whether, once calculated, these values can be changed.¶

Table 1: Data Value Categories and Mnemonics
	Immutable	Mutable
Internally Defined	CONST	VAR
Externally Defined	LIT	EDD

Constants (CONST) - Constant data values are named values that are defined in the context of the autonomy model. Both the name and the value of the constant are fixed and cannot be changed. An example of a constant would be defining the numerical value PI to 2 digits of precision (PI_2_DIGITS = 3.14).¶

Literals (LIT) - Literal data values are those whose name and value are the same. These values are used to represent atomic values that are too simple to be represented as a constant. For example, the number 4 is a literal value. The name "4" and the value 4 are the same and inseparable. Literal values cannot change ("4" could not be used to mean 5) and they are defined external to the autonomy model (the autonomy model is not expected to redefine what 4 means).¶

Variables (VAR) - Variables are named data values defined by the autonomy model itself. They can be added and removed as the autonomy model operates, and the autonomy model is the sole determiner of their value. An example of a variable in an autonomy model would be the number of times that a particular predicate evaluated to true.¶

Externally-Defined Data (EDD) - External data values are those provided to the autonomy model from its hosting environment. These values are the foundation of state-based autonomy as they capture the state of the managed device. The autonomy model treats these values as read-only inputs. Examples of externally defined values include temperature sensor readings and the instantaneous data rate from a radio.¶
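
The four definitions above can be summarized by two properties: where a value is defined and whether it can change. The following Python sketch encodes that classification. The `Origin` and `Category` types and the `categorize` function are hypothetical names used only for illustration and are not part of the DTNMA.¶

```python
from enum import Enum


class Origin(Enum):
    INTERNAL = "defined by the autonomy model"
    EXTERNAL = "defined outside the autonomy model"


class Category(Enum):
    # Each member is an (origin, mutable) pair taken from the definitions above.
    CONST = (Origin.INTERNAL, False)
    VAR = (Origin.INTERNAL, True)
    LIT = (Origin.EXTERNAL, False)
    EDD = (Origin.EXTERNAL, True)  # changes over time; read-only to the model

    @property
    def origin(self) -> Origin:
        return self.value[0]

    @property
    def mutable(self) -> bool:
        return self.value[1]


def categorize(origin: Origin, mutable: bool) -> Category:
    """Look up the mnemonic for an (origin, mutability) pair."""
    for category in Category:
        if category.origin is origin and category.mutable is mutable:
            return category
    raise ValueError("no such category")


counter_kind = categorize(Origin.INTERNAL, True)  # e.g., a predicate hit counter
sensor_kind = categorize(Origin.EXTERNAL, True)  # e.g., a temperature reading
```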

10.4. Data Reporting

The DTNMA autonomy model should, as required, report on the state of its managed device (to include the state of the model itself). This reporting should be done as a function of the changing state of the managed device, independent of the connection to any managing device. Queuing reports allows for later forensic analysis of device behavior, which is a desirable property of DTNMA management.¶

There are at least four useful categories of reporting mechanism that should be present in the DTNMA. These categories can be distinguished by whether the reported data share a common structure or not, and whether the report mechanism represents a schema or data adherent to that schema.¶

Table 2: Data Reporting Mechanisms and Mnemonics
	Schema	Values
Common Structure	TBLT	TBL
Mixed Structure	RPTT	RPT

10.4.1. Tabular Reports (TBLs) and Tabular Report Templates (TBLTs)

Relational database tables provide collection, filtering, and reporting efficiencies when representing series of data collections that share a common syntactic structure and semantic meaning. Tables have a fixed structure identified by one or more vertical columns. They are populated by zero or more data collections, with one row per represented data collection.¶

To the extent that DTNMA reporting includes data collections similarly adhering to a common structure, these reports can be modeled similarly to tables. Such reports are called tabular reports (TBLs).¶

Every TBL is populated in accordance with a pre-defined schema, which is termed the Tabular Report Template (TBLT). This template defines the columns that comprise the TBL and associated constraints on data values for those columns.¶

Unlike relational database tables, TBLs are reporting mechanisms. They represent a report generated at a specific moment in time. Therefore, a managed device may produce and queue for transmission multiple TBLs for the same TBLT.¶
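
The relationship between a TBLT and the TBLs generated from it might be sketched in Python as follows. The class names, the `add_row` method, and the `link_stats` example columns are assumptions made for this illustration; the DTNMA does not define a concrete representation.¶

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass(frozen=True)
class TabularReportTemplate:
    """TBLT: ordered (column name, column type) pairs."""
    name: str
    columns: Tuple[Tuple[str, type], ...]


@dataclass
class TabularReport:
    """TBL: rows captured at one moment, all conforming to one TBLT."""
    template: TabularReportTemplate
    generated_at: float
    rows: List[Tuple[Any, ...]] = field(default_factory=list)

    def add_row(self, *values: Any) -> None:
        if len(values) != len(self.template.columns):
            raise ValueError("row does not match template column count")
        for value, (col, col_type) in zip(values, self.template.columns):
            if type(value) is not col_type:  # strict typing, no implicit conversion
                raise TypeError(f"column {col!r} expects {col_type.__name__}")
        self.rows.append(values)


links = TabularReportTemplate("link_stats", (("peer", str), ("bytes_sent", int)))
first = TabularReport(links, generated_at=100.0)
first.add_row("node-a", 1024)
second = TabularReport(links, generated_at=160.0)  # same TBLT, later snapshot
second.add_row("node-a", 4096)
```

Both `first` and `second` can sit in a transmission queue at the same time, since each is a distinct snapshot built from the same template.¶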

10.4.2. Reports (RPT) and Report Templates (RPTT)

Not all reportable data collections are efficiently represented in a tabular structure. In cases where there is no processing or encoding advantage to a tabular report, a non-tabular representation is needed. This representation is termed the DTNMA report (RPT).¶

An RPT is a snapshot of a collection of data values at a given moment in time. The type, number, order, and other details of these data values are given by a schema called the Report Template (RPTT).¶

Separating the structure (RPTT) and content (RPT) of a general purpose reporting mechanism reduces the size of generated traffic, which is an important property of the DTNMA.¶
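
The size benefit of separating RPTT from RPT can be illustrated with a small Python sketch. The template identifier `rptt_health`, the field names, and the use of JSON are all assumptions made for this example; JSON is used only to make the two sizes comparable and is not an encoding proposed by this document.¶

```python
import json
from typing import Dict, List, Sequence

# RPTT: the schema, known to both agent and manager. Names are hypothetical.
TEMPLATES: Dict[str, List[str]] = {
    "rptt_health": ["temperature_c", "battery_pct", "radio_rate_bps"],
}


def make_report(template_id: str, values: Sequence[float], timestamp: int) -> dict:
    """RPT: a template reference, a timestamp, and values in template order."""
    if len(values) != len(TEMPLATES[template_id]):
        raise ValueError("value count does not match template")
    return {"t": template_id, "ts": timestamp, "v": list(values)}


def expand(report: dict) -> dict:
    """Manager side: reattach names using the shared template."""
    return dict(zip(TEMPLATES[report["t"]], report["v"]))


rpt = make_report("rptt_health", [21.5, 87, 250000], timestamp=1000)
self_describing = {"ts": 1000, **expand(rpt)}  # what a schema-less report would carry
compact_size = len(json.dumps(rpt))
verbose_size = len(json.dumps(self_describing))
```

In this sketch the compact form is smaller than the self-describing form, and the difference grows with the number and length of the field names, because the names are carried once in the template rather than in every report.¶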

10.5. Command Execution

The agent autonomy engine requires that managed devices issue commands on themselves as if they were otherwise being controlled by a managing device. The ability to support this type of commanding in the autonomy model is one of the unique requirements of the DTNMA. This approach is not dissimilar to the concept of Remote Procedure Calls (RPCs) that are sometimes used in low-latency, high-availability approaches to network management mechanisms.¶

Command execution in the DTNMA happens through the use of controls and macros.¶

Controls (CTRL) - A control represents a parameterized, predefined procedure that is run by the agent autonomy engine. CTRLs are conceptually similar to RPCs in that they represent parameterized functions run on the managed device. However, they are conceptually dissimilar from RPCs in that they do not have a concept of a return code as they must operate over an asynchronous transport. The concept of return code in an RPC implies a synchronous relationship between the caller of the procedure and the procedure being called, which might not be possible within the DTNMA.¶

NOTE: The use of the term Control in the DTNMA is derived in part from the concept of Command and Control (C2) where control implies the operational instructions that must be undertaken to implement (or maintain) a commanded objective. The agent autonomy engine controls a managed device to allow it to fulfill some purpose as commanded by a (possibly disconnected) managing device.¶

For example, attempting to maintain a safe internal thermal environment for a spacecraft is considered "thermal control" (not "thermal commanding") even though thermal control involves sending commands to heaters, louvers, radiators, and other temperature-affecting components.¶

Even when CTRLs are received from a managing device with the intent to be run immediately, the control-vs-command distinction still applies. The CTRL run on the managed device is in service of the command received from the managing device to immediately change the local state of the device.¶

The success or failure of a CTRL may be handled locally by the agent autonomy engine. Otherwise, the externally observable impact of a CTRL can be understood through the generation and eventual examination of data reports produced by the managed device.¶

Macros (MACRO) - A Macro represents the ordered execution of a sequence of CTRLs. It may be implemented as a set of CTRLs, or as a mixed set of both MACRO and CTRL objects. Similar to CTRLs, a MACRO object should support parameterization and should not support a return code back to a caller.¶
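
The following Python sketch illustrates CTRLs as parameterized procedures without return codes and MACROs as ordered sequences that may nest. The `Ctrl`, `Macro`, and `run` names, the per-step parameter lists, and the heater example are hypothetical and are offered only as one possible reading of the definitions above.¶

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Ctrl:
    """A parameterized, predefined procedure. It returns nothing to a caller."""
    name: str
    procedure: Callable[..., None]


@dataclass(frozen=True)
class Macro:
    """An ordered sequence of (CTRL or MACRO, parameter names) steps."""
    name: str
    steps: Tuple[Tuple[Union[Ctrl, "Macro"], Tuple[str, ...]], ...]


def run(obj: Union[Ctrl, Macro], params: Dict[str, object]) -> None:
    """Run a CTRL or MACRO. Effects are observed via device state, not a return code."""
    if isinstance(obj, Ctrl):
        obj.procedure(**params)
        return
    for step, names in obj.steps:
        run(step, {n: params[n] for n in names})


heater_mode: List[str] = ["off"]  # stands in for managed device state
event_log: List[str] = []


def set_heater(mode: str) -> None:
    heater_mode[0] = mode


def log_event(text: str) -> None:
    event_log.append(text)


heater = Ctrl("set_heater", set_heater)
note = Ctrl("log_event", log_event)
warm_up = Macro("warm_up", ((heater, ("mode",)), (note, ("text",))))
safe_mode = Macro("safe_mode", ((warm_up, ("mode", "text")),))  # MACRO holding a MACRO
run(safe_mode, {"mode": "on", "text": "heater enabled"})
```

Because `run` returns nothing, the only way to learn the outcome in this sketch is to inspect `heater_mode` and `event_log`, which mirrors the way CTRL results are observed through reports on device state.¶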

10.6. Predicate Autonomy

The core function of the agent autonomy engine is to apply predetermined responses to predetermined state on a managed device. This involves the ability to calculate predicate expressions and the ability to associate the positive evaluation of these expressions with command execution.¶

10.6.1. Expressions

There are a few instances within the DTNMA autonomy model where a value must be calculated by the model itself, to include the following.¶

Calculating the value of a VAR.¶
Evaluating a predicate to see if it is true.¶

In cases such as these, the DTNMA must support an efficient, configurable syntax for defining expressions, calculating the value of these expressions based on the local state of the managed device, and using the calculated value in an appropriate way.¶

Expression (EXPR) - An Expression is a combination of operators and operands used to construct a numerical value from a series of other data values in the autonomy model.¶

Operator (OP) - An Operator represents an operation performed on at least one operand and returning a single result that, itself, can be used as an operand to some other operator. OPs may represent simple (+, -) or complex (sin, avg) mathematical functions or custom functions defined for the managed device.¶

Operands may be built from any autonomy model object that can be associated with a data value, to include the CONST, LIT, VAR, and EDD types, the result of an OP, and the result of a fully evaluated EXPR.¶

Predicate Expression (PRED) - A Predicate Expression is an EXPR whose evaluated data value is interpreted in a logical way as being either true or false.¶
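
One way to picture EXPR, OP, and PRED together is a small postfix evaluator, sketched below in Python. Postfix ordering, the `OPS` table, the two-operand `avg`, and the rule that any non-zero result is true are assumptions chosen for brevity; the DTNMA does not mandate an expression syntax.¶

```python
import operator
from typing import Callable, Dict, List, Sequence, Tuple, Union

Number = Union[int, float]

# OP: name -> (operand count, function). The set shown is illustrative.
OPS: Dict[str, Tuple[int, Callable[..., Number]]] = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    ">": (2, lambda a, b: int(a > b)),
    "avg": (2, lambda a, b: (a + b) / 2),
}


def evaluate(expr: Sequence[Union[str, Number]], values: Dict[str, Number]) -> Number:
    """Evaluate a postfix EXPR. Strings name an OP or a data value; numbers are LITs."""
    stack: List[Number] = []
    for token in expr:
        if isinstance(token, str) and token in OPS:
            arity, func = OPS[token]
            args = stack[-arity:]
            if len(args) != arity:
                raise ValueError(f"operator {token!r} is missing operands")
            del stack[-arity:]
            stack.append(func(*args))  # an OP result becomes an operand
        elif isinstance(token, str):
            stack.append(values[token])  # CONST, VAR, or EDD looked up by name
        else:
            stack.append(token)  # LIT
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def is_true(pred: Sequence[Union[str, Number]], values: Dict[str, Number]) -> bool:
    """PRED: an EXPR whose result is interpreted as true or false."""
    return evaluate(pred, values) != 0


state = {"EDD1": 12, "EDD2": 3}
v1 = evaluate(["EDD1", "EDD2", "+"], state)  # EDD1 + EDD2
over_limit = is_true(["EDD1", 10, ">"], state)  # EDD1 > 10
```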

10.6.2. Rules

A stimulus-response system associates stimulus detection with a commanded response. In the DTNMA, this relationship is captured through the definition of rules. These rules may be defined as focused on either the state of the managed device or optimized to only examine how time has passed on the managed device.¶

State-Based Rules (SBRs) - A state-based rule is one whose stimulus is indicated when a given PRED evaluates to true. Since the PRED is a combination of sampled and calculated data values on the managed device, evaluation of the PRED is evaluating the relevant state of the device. An SBR is one of the form:¶

IF PRED THEN MACRO¶

Time-Based Rules (TBRs) - A time-based rule is a specialization of an SBR that is optimized to only consider the passage of time on the managed device. A TBR is one of the form:¶

EVERY interval THEN MACRO¶
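
The two rule forms can be sketched in Python as follows. The class names, the `tick` method, integer time steps, and the choice to fire a TBR on the first tick are assumptions made for this illustration only.¶

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class StateBasedRule:
    """IF PRED THEN MACRO."""
    pred: Callable[[State], bool]
    macro: Callable[[State], None]

    def tick(self, now: int, state: State) -> None:
        if self.pred(state):
            self.macro(state)


@dataclass
class TimeBasedRule:
    """EVERY interval THEN MACRO. No predicate over device state is evaluated."""
    interval: int
    macro: Callable[[State], None]
    next_due: int = 0

    def tick(self, now: int, state: State) -> None:
        if now >= self.next_due:
            self.macro(state)
            self.next_due = now + self.interval


fired: List[str] = []
rules = [
    StateBasedRule(lambda s: s["temp_c"] > 40, lambda s: fired.append("cool")),
    TimeBasedRule(2, lambda s: fired.append("report")),
]
state: State = {"temp_c": 35.0}
for now in range(4):
    if now == 3:
        state["temp_c"] = 45.0  # the device state changes at t=3
    for rule in rules:
        rule.tick(now, state)
```

In this run the TBR fires at t=0 and t=2 regardless of temperature, while the SBR fires only at t=3 once its predicate becomes true.¶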

11. Use Cases

Using the autonomy model mnemonics defined in Section 10, this section describes flows through sample configurations conforming to the DTNMA. These use cases illustrate remote configuration, local monitoring and control, multiple manager support, and data fusion.¶

11.1. Notation

The use cases presented in this section are documented with a shorthand notation to describe the types of data sent between managers and agents. This notation, outlined in Table 3, leverages the mnemonic definitions of autonomy model elements defined in Section 10.¶

Table 3: Terminology
Term	Definition	Example
EDD#	Enumerated EDD definition.	EDD1
V#	Enumerated VAR definition.	V1 = EDD1 + V0.
ACL#	Enumerated Access Control List.	ACL1
DEF([ACL],ID,EXPR)	Define ID from expression. Allow managers in ACL to see this ID.	DEF([ACL1], V1, EDD1 + EDD2)
PROD(P,ID)	Produce ID according to predicate P. P may be a time period (1s) or an expression (EDD1 > 10).	PROD(1s, EDD1)
RPT(ID)	A report containing data named ID.	RPT(EDD1)

These notations do not imply any implementation approach. They only provide a succinct syntax for expressing the data flows in the use case diagrams in the remainder of this section.¶

11.2. Serialized Management

This is the nominal configuration of network management where a Manager interacts with a set of Agents. The control flows for this are outlined in Figure 3.¶

Serialized Management Control Flow¶

           +-----------+           +---------+           +---------+
           | Manager A |           | Agent A |           | Agent B |
           +----+------+           +----+----+           +----+----+
                |                       |                     |
                |-----PROD(1s, EDD1)--->|                     | (1)
                |----------------------------PROD(1s, EDD1)-->|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     | (2)
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |

Figure 3

In a simple network, a Manager interacts with multiple Agents.¶

In this figure, Manager A sends a policy to Agents A and B to report the value of an EDD (EDD1) every second (step 1). Each agent receives this policy and configures its autonomy engine for this production. Thereafter (step 2), each agent produces a report containing data element EDD1 and sends those reports back to the manager.¶

This behavior continues without any additional communications from the manager and without requiring that there exist a connection back to the manager.¶

11.3. Intermittent Connectivity

This is a challenged configuration of network management where connectivity between Agent B and the Manager is temporarily lost. Flows in this case are outlined in Figure 4.¶

Challenged Management Control Flow¶

           +-----------+           +---------+           +---------+
           | Manager A |           | Agent A |           | Agent B |
           +----+------+           +----+----+           +----+----+
                |                       |                     |
                |-----PROD(1s, EDD1)--->|                     | (1)
                |----------------------------PROD(1s, EDD1)-->|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     | (2)
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |                       |            RPT(EDD1)| (3)
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |                       |            RPT(EDD1)| (4)
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<----------------RPT(EDD1), RPT(EDD1)--------| (5)
                |                       |                     |

Figure 4

In a challenged network, agents store reports pending a transmit opportunity.¶

In this figure, Manager A sends a policy to Agents A and B to produce an EDD (EDD1) every second (step 1). Each agent receives this policy and configures its autonomy engine for this production. Reports are transmitted when produced (step 2).¶

At some point, Agent B loses the ability to transmit in the network (steps 3 and 4). During this time period, reports continue to be produced, but queued. This queuing might be done by the agent itself or by a supporting transport such as BPv7. Eventually, Agent B is able to transmit in the network again (step 5) and all queued reports are sent at that time.¶
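
The queuing behavior of Agent B can be sketched in Python as follows. The `ReportingAgent` class, its method names, and the `delivered` list that stands in for the network are hypothetical; as noted above, a real deployment might instead leave queuing to a transport such as BPv7.¶

```python
from collections import deque
from typing import Deque, List, Tuple

Report = Tuple[int, str]  # (production time, data name)


class ReportingAgent:
    def __init__(self) -> None:
        self.queue: Deque[Report] = deque()
        self.delivered: List[List[Report]] = []  # stands in for the network

    def produce(self, now: int) -> None:
        """Production continues whether or not the link is available."""
        self.queue.append((now, "EDD1"))

    def transmit(self) -> None:
        """Send every queued report in production order, then empty the queue."""
        if self.queue:
            self.delivered.append(list(self.queue))
            self.queue.clear()


agent_b = ReportingAgent()
for now, link_up in enumerate([True, True, False, False]):
    agent_b.produce(now)
    if link_up:
        agent_b.transmit()
queued_during_outage = len(agent_b.queue)
agent_b.transmit()  # the link returns and the backlog is sent together
```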

11.4. Open-Loop Reporting

The open-loop control paradigm of the DTNMA does not support a one-to-one relationship between a manager's expression of policy and an agent's reporting of the state of its managed device. This use case illustrates the concept of open-loop control. In this paradigm, agents in the network manage themselves in accordance with policies and build consolidated reports of their state.¶

This flow is shown in Figure 5, where multiple policies configured by a manager are represented in a single reporting activity from an agent.¶

Consolidated Management Control Flow¶

           +-----------+           +---------+           +---------+
           | Manager A |           | Agent A |           | Agent B |
           +----+------+           +----+----+           +----+----+
                |                       |                     |
                |-----PROD(1s, EDD1)--->|                     | (1)
                |----------------------------PROD(1s, EDD1)-->|
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     | (2)
                |<----------------------------RPT(EDD1)-------|
                |                       |                     |
                |                       |                     |
                |----------------------------PROD(1s, EDD2)-->| (3)
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<--------------------------RPT(EDD1,EDD2)----| (4)
                |                       |                     |
                |                       |                     |
                |<-------RPT(EDD1)------|                     |
                |<--------------------------RPT(EDD1,EDD2)----|
                |                       |                     |

Figure 5

There is not a one-to-one mapping between management policy and device state reporting.¶

In this figure, Manager A sends a policy to Agents A and B to produce an EDD (EDD1) every second (step 1). Each agent receives this policy and configures their respective autonomy engines for this production. Reports are transmitted when produced (step 2).¶

At a later time (step 3) Manager A sends an additional policy to Agent B to also produce an EDD (EDD2) every second. This policy is received and configured on the autonomy engine on Agent B.¶

Thereafter (step 4) Agent A will continue to produce EDD1 and Agent B will produce both EDD1 and EDD2. However, Agent B may produce these values together in a single report rather than 2 independent reports. In this way, there is no direct mapping between the single consolidated report sent by Agent B (step 4) and the two different policies sent to Agent B that caused that report to be generated (steps 1 and 3).¶
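
A minimal Python sketch of this consolidation follows. The `ConsolidatingAgent` class, its methods, and the sample values are assumptions introduced for illustration; an agent is free to consolidate reports in other ways or not at all.¶

```python
from typing import Dict, List, Tuple


class ConsolidatingAgent:
    def __init__(self) -> None:
        self.policies: List[Tuple[int, str]] = []  # (period in seconds, data name)

    def configure(self, period: int, name: str) -> None:
        """Accept a PROD(period, name) policy from a manager."""
        self.policies.append((period, name))

    def report_at(self, now: int, samples: Dict[str, float]) -> Dict[str, float]:
        """Build one report covering every policy that is due at time `now`."""
        due = [name for period, name in self.policies if now % period == 0]
        return {name: samples[name] for name in due}


samples = {"EDD1": 7.0, "EDD2": 0.5}  # placeholder device readings
agent_b = ConsolidatingAgent()
agent_b.configure(1, "EDD1")  # step 1
first = agent_b.report_at(1, samples)  # step 2: RPT(EDD1)
agent_b.configure(1, "EDD2")  # step 3
second = agent_b.report_at(2, samples)  # step 4: RPT(EDD1,EDD2)
```

The second report carries no indication of which policy caused each value to be included, which is the loss of one-to-one mapping described above.¶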

11.5. Multiple Administrative Domains

The managed applications on an agent may be controlled by different administrative entities in a network. The DTNMA allows agents to communicate with multiple managers in the network, such as cases where there exists one manager per administrative domain.¶

Whenever a manager sends a policy expression to an agent, that policy expression may be annotated with authorization information. One method of representing this is an ACL.¶

The ability for one manager to access the results of policy expressions configured by some other manager will be limited to the authorization annotations of those policy expressions.¶

An example of multi-manager authorization is illustrated in Figure 6.¶

Multiplexed Management Control Flow¶

   +-----------+               +---------+                 +-----------+
   | Manager A |               | Agent A |                 | Manager B |
   +-----+-----+               +----+----+                 +-----+-----+
         |                          |                            |
         |---DEF(ACL1,V1,EDD1*2)--->|<---DEF(ACL2, V2, EDD2*2)---| (1)
         |                          |                            |
         |---PROD(1s, V1)---------->|<---PROD(1s, V2)------------| (2)
         |                          |                            |
         |<--------RPT(V1)----------|                            | (3)
         |                          |--------RPT(V2)------------>|
         |<--------RPT(V1)----------|                            |
         |                          |--------RPT(V2)------------>|
         |                          |                            |
         |                          |<---PROD(1s, V1)------------| (4)
         |                          |                            |
         |                          |----ERR(V1 no perm.)------->|
         |                          |                            |
         |--DEF(NULL,V3,EDD3*3)---->|                            | (5)
         |                          |                            |
         |---PROD(1s, V3)---------->|                            | (6)
         |                          |                            |
         |                          |<----PROD(1s, V3)-----------|
         |                          |                            |
         |<--------RPT(V3)----------|--------RPT(V3)------------>| (7)
         |<--------RPT(V1)----------|                            |
         |                          |--------RPT(V2)------------>|
         |<-------RPT(V3)-----------|--------RPT(V3)------------>|
         |<-------RPT(V1)-----------|                            |
         |                          |--------RPT(V2)------------>|

Figure 6

Complex networks require multiple managers interfacing with agents.¶

In this figure, both Managers A and B send policies to Agent A (step 1). Manager A defines a VAR (V1) whose value is given by the mathematical expression (EDD1 * 2) and provides an ACL (ACL1) that restricts access to V1 to Manager A. Similarly, Manager B defines a VAR (V2) whose value is given by the mathematical expression (EDD2 * 2) and provides an ACL (ACL2) that restricts access to V2 to Manager B.¶

Both Managers A and B also send policies to Agent A to report on the values of their VARs at 1 second intervals (step 2). Since Manager A can access V1 and Manager B can access V2, there is no authorization issue with these policies and they are both accepted by the autonomy engine on Agent A. Agent A produces reports as expected, sending them to their respective managers (step 3).¶

Later (step 4) Manager B attempts to configure Agent A to also report to it the value of V1. Since Manager B does not have authorization to view this VAR, Agent A does not include this in the configuration of its autonomy engine and, instead, some indication of permission error is included in any regular reporting back to Manager B.¶

Manager A also sends a policy to Agent A (step 5) that defines a VAR (V3) whose value is given by the mathematical expression (EDD3 * 3) and provides no ACL, indicating that any manager can access V3. In this instance, both Manager A and Manager B can then send policies to Agent A to report the value of V3 (step 6). Since there is no authorization restriction on V3, these policies are accepted by the autonomy engine on Agent A and reports are generated to both Manager A and B over time (step 7).¶
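
The authorization decisions in this use case might be sketched in Python as follows. The `AclAgent` class, the manager identifiers, and the convention that an absent ACL means unrestricted access are assumptions for this example; the VAR expressions themselves are omitted to keep the focus on authorization.¶

```python
from typing import Dict, FrozenSet, Optional, Set


class AclAgent:
    def __init__(self) -> None:
        # VAR name -> managers allowed to see it; None means no restriction.
        self.acls: Dict[str, Optional[FrozenSet[str]]] = {}
        self.productions: Dict[str, Set[str]] = {}  # VAR name -> receiving managers

    def define(self, var: str, acl: Optional[Set[str]]) -> None:
        """DEF(ACL, var, ...): record the definition and its authorization."""
        self.acls[var] = None if acl is None else frozenset(acl)

    def produce(self, manager: str, var: str) -> bool:
        """PROD(..., var) from `manager`: accept only if the ACL allows it."""
        allowed = self.acls[var]
        if allowed is not None and manager not in allowed:
            return False  # the agent would report a permission error instead
        self.productions.setdefault(var, set()).add(manager)
        return True


agent_a = AclAgent()
agent_a.define("V1", {"manager-a"})  # step 1, ACL1
agent_a.define("V2", {"manager-b"})  # step 1, ACL2
accepted_own = agent_a.produce("manager-a", "V1")  # step 2
rejected = agent_a.produce("manager-b", "V1")  # step 4: no permission
agent_a.define("V3", None)  # step 5: no ACL
shared = [agent_a.produce(m, "V3") for m in ("manager-a", "manager-b")]  # step 6
```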

11.6. Cascading Management

There are times where a single network device may serve as both a manager for other agents in the network and, itself, as a device managed by someone else. This may be the case on nodes serving as gateways or proxies. The DTNMA accommodates this case by allowing a single device to run both an Agent and a Manager.¶

An example of this configuration is illustrated in Figure 7.¶

Data Fusion Control Flow¶

                 ---------------------------------------
                 |                 Node B              |
                 |                                     |
  +-----------+  |    +-----------+      +---------+   |    +---------+
  | Manager A |  |    | Manager B |      | Agent B |   |    | Agent C |
  +---+-------+  |    +-----+-----+      +----+----+   |    +----+----+
      |          |          |                 |        |         |
      |---------------DEF(NULL,V0,EDD1+EDD2)->|        |         | (1)
      |------------------PROD(EDD1&EDD2,V0)-->|        |         |
      |          |          |                 |        |         |
      |          |          |                 |        |         |
      |          |          |--------------------PROD(1s, EDD2)->| (2)
      |          |          |                 |        |         |
      |          |          |                 |        |         |
      |          |          |<--------------------RPT(EDD2)------| (3)
      |          |          |                 |        |         |
      |<------------------RPT(V0)-------------|        |         | (4)
      |          |          |                 |        |         |
      |          |          |                 |        |         |
                 |                                     |
                 |                                     |
                 ---------------------------------------

Figure 7

A device can house both a Manager and an Agent.¶

In this example, we presume that Agent B is able to sample a given EDD (EDD1) and that Agent C is able to sample a different EDD (EDD2). Node B houses Manager B controlling Agent C, and also Agent B, which is controlled by Manager A. Manager A must periodically receive some new value that is calculated as a function of both EDD1 and EDD2.¶

The sequence of events that can enable this scenario is as follows. Manager A sends a policy to Agent B to define a VAR (V0) whose value is given by the mathematical expression (EDD1 + EDD2) without a restricting ACL. Further, Manager A sends a policy to Agent B to report on the value of V0 every second (step 1).¶

Agent B requires the ability to monitor both EDD1 and EDD2. However, the only way to receive EDD2 values is to have them reported back to Node B and included in the Node B runtime data stores. Therefore, Manager B sends a policy to Agent C to report on the value of EDD2 (step 2).¶

Agent C receives the policy in its autonomy engine and produces reports on the value of EDD2 every second (step 3).¶

Agent B may locally sample EDD1 and EDD2 and use them to compute values of V0, reporting on those values at regular intervals as well (step 4).¶

While a trivial example, the mechanism of associating fusion with the Manager function rather than the Agent function scales with fusion complexity. Within the DTNMA, Agents and Managers are not required to be separate software implementations. There may be a single software application running on Node B implementing both Manager B and Agent B roles.¶
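
The data flow on Node B can be sketched in Python as a single object that plays both roles over one runtime data store. The `Node` class and its method names are hypothetical, and the choice to withhold V0 until both inputs are present is an assumption of this sketch.¶

```python
from typing import Dict, Optional


class Node:
    """A node hosting a Manager role and an Agent role over one data store."""

    def __init__(self) -> None:
        self.data_store: Dict[str, float] = {}

    def manager_receive_report(self, name: str, value: float) -> None:
        """Manager role: store a value reported by a downstream agent."""
        self.data_store[name] = value

    def agent_sample_local(self, name: str, value: float) -> None:
        """Agent role: store a value sampled on this node."""
        self.data_store[name] = value

    def agent_compute_v0(self) -> Optional[float]:
        """Agent role: V0 = EDD1 + EDD2, or None until both inputs are known."""
        if "EDD1" in self.data_store and "EDD2" in self.data_store:
            return self.data_store["EDD1"] + self.data_store["EDD2"]
        return None


node_b = Node()
node_b.agent_sample_local("EDD1", 4.0)
before = node_b.agent_compute_v0()  # EDD2 has not been reported yet
node_b.manager_receive_report("EDD2", 6.0)  # RPT(EDD2) arrives from Agent C
after = node_b.agent_compute_v0()  # value carried in RPT(V0) to Manager A
```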

14. Informative References

[BIRRANE1]: Birrane, E.B. and R.C. Cole, "Management of Disruption-Tolerant Networks: A Systems Engineering Approach", 2010.
[BIRRANE2]: Birrane, E.B., Burleigh, S.B., and V.C. Cerf, "Defining Tolerance: Impacts of Delay and Disruption when Managing Challenged Networks", 2011.
[BIRRANE3]: Birrane, E.B. and H.K. Kruse, "Delay-Tolerant Network Management: The Definition and Exchange of Infrastructure Information in High Delay Environments", 2011.
[I-D.ietf-core-comi]: Veillette, M., Van der Stok, P., Pelov, A., Bierman, A., and I. Petrov, "CoAP Management Interface (CORECONF)", Work in Progress, Internet-Draft, draft-ietf-core-comi-11, 17 January 2021, <https://www.ietf.org/archive/id/draft-ietf-core-comi-11.txt>.
[I-D.ietf-core-sid]: Veillette, M., Pelov, A., Petrov, I., and C. Bormann, "YANG Schema Item iDentifier (YANG SID)", Work in Progress, Internet-Draft, draft-ietf-core-sid-16, 24 June 2021, <https://www.ietf.org/archive/id/draft-ietf-core-sid-16.txt>.
[I-D.ietf-core-yang-cbor]: Veillette, M., Petrov, I., Pelov, A., and C. Bormann, "CBOR Encoding of Data Modeled with YANG", Work in Progress, Internet-Draft, draft-ietf-core-yang-cbor-16, 24 June 2021, <https://www.ietf.org/archive/id/draft-ietf-core-yang-cbor-16.txt>.
[I-D.irtf-dtnrg-dtnmp]: Birrane, E. and V. Ramachandran, "Delay Tolerant Network Management Protocol", Work in Progress, Internet-Draft, draft-irtf-dtnrg-dtnmp-01, 31 December 2014, <http://www.ietf.org/internet-drafts/draft-irtf-dtnrg-dtnmp-01.txt>.
[RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC3416]: Presuhn, R., Ed., "Version 2 of the Protocol Operations for the Simple Network Management Protocol (SNMP)", STD 62, RFC 3416, DOI 10.17487/RFC3416, December 2002, <https://www.rfc-editor.org/info/rfc3416>.
[RFC4838]: Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., Durst, R., Scott, K., Fall, K., and H. Weiss, "Delay-Tolerant Networking Architecture", RFC 4838, DOI 10.17487/RFC4838, April 2007, <https://www.rfc-editor.org/info/rfc4838>.
[RFC6020]: Bjorklund, M., Ed., "YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)", RFC 6020, DOI 10.17487/RFC6020, October 2010, <https://www.rfc-editor.org/info/rfc6020>.
[RFC6241]: Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, <https://www.rfc-editor.org/info/rfc6241>.
[RFC6991]: Schoenwaelder, J., Ed., "Common YANG Data Types", RFC 6991, DOI 10.17487/RFC6991, July 2013, <https://www.rfc-editor.org/info/rfc6991>.
[RFC7228]: Bormann, C., Ersue, M., and A. Keranen, "Terminology for Constrained-Node Networks", RFC 7228, DOI 10.17487/RFC7228, May 2014, <https://www.rfc-editor.org/info/rfc7228>.
[RFC7252]: Shelby, Z., Hartke, K., and C. Bormann, "The Constrained Application Protocol (CoAP)", RFC 7252, DOI 10.17487/RFC7252, June 2014, <https://www.rfc-editor.org/info/rfc7252>.
[RFC7575]: Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A., Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic Networking: Definitions and Design Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015, <https://www.rfc-editor.org/info/rfc7575>.
[RFC7576]: Jiang, S., Carpenter, B., and M. Behringer, "General Gap Analysis for Autonomic Networking", RFC 7576, DOI 10.17487/RFC7576, June 2015, <https://www.rfc-editor.org/info/rfc7576>.
[RFC8040]: Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, <https://www.rfc-editor.org/info/rfc8040>.
[RFC8199]: Bogdanovic, D., Claise, B., and C. Moberg, "YANG Module Classification", RFC 8199, DOI 10.17487/RFC8199, July 2017, <https://www.rfc-editor.org/info/rfc8199>.
[RFC8613]: Selander, G., Mattsson, J., Palombini, F., and L. Seitz, "Object Security for Constrained RESTful Environments (OSCORE)", RFC 8613, DOI 10.17487/RFC8613, July 2019, <https://www.rfc-editor.org/info/rfc8613>.
[RFC8639]: Voit, E., Clemm, A., Gonzalez Prieto, A., Nilsen-Nygaard, E., and A. Tripathy, "Subscription to YANG Notifications", RFC 8639, DOI 10.17487/RFC8639, September 2019, <https://www.rfc-editor.org/info/rfc8639>.
[RFC8641]: Clemm, A. and E. Voit, "Subscription to YANG Notifications for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641, September 2019, <https://www.rfc-editor.org/info/rfc8641>.
[RFC8949]: Bormann, C. and P. Hoffman, "Concise Binary Object Representation (CBOR)", STD 94, RFC 8949, DOI 10.17487/RFC8949, December 2020, <https://www.rfc-editor.org/info/rfc8949>.
[RFC8993]: Behringer, M., Ed., Carpenter, B., Eckert, T., Ciavaglia, L., and J. Nobre, "A Reference Model for Autonomic Networking", RFC 8993, DOI 10.17487/RFC8993, May 2021, <https://www.rfc-editor.org/info/rfc8993>.
[RFC9171]: Burleigh, S., Fall, K., and E. Birrane, III, "Bundle Protocol Version 7", RFC 9171, DOI 10.17487/RFC9171, January 2022, <https://www.rfc-editor.org/info/rfc9171>.
[RFC9172]: Birrane, E., III, and K. McKeever, "Bundle Protocol Security (BPSec)", RFC 9172, DOI 10.17487/RFC9172, January 2022, <https://www.rfc-editor.org/info/rfc9172>.
[RFC9254]: Veillette, M., Petrov, I., Pelov, A., Bormann, C., and M. Richardson, "Encoding of Data Modeled with YANG in the Concise Binary Object Representation (CBOR)", RFC 9254, DOI 10.17487/RFC9254, July 2022, <https://www.rfc-editor.org/info/rfc9254>.
[xpath]: Clark, J.C. and R.D. DeRose, "XML Path Language (XPath) Version 1.0", 1999.