Internet-Draft CoAP Simple Management Protocol August 2022
Duffy (ed), et al. Expires 18 February 2023
Workgroup:
Independent Submission
Internet-Draft:
draft-duffy-csmp-00
Published:
Intended Status:
Informational
Expires:
Authors:
P. Duffy (ed)
Cisco Systems, Inc.
J. Bhasin
Cisco Systems, Inc.
K. Leung
Cisco Systems, Inc.
H. She
Cisco Systems, Inc.
L. Zhou
Cisco Systems, Inc.

CoAP Simple Management Protocol

Abstract

CoAP Simple Management Protocol (CSMP) provides lifecycle management for resource constrained IoT devices deployed within large-scale, bandwidth constrained IoT networks. This document describes the design and operation of CSMP.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 18 February 2023.

1. Introduction

Low Power Wide Area Network (LPWAN) technologies provide long range, low power connectivity for Internet of Things (IoT) applications. LPWANs typically operate over distances of several kilometers with link bandwidths as low as 10s of Kbps. LPWAN devices are often compute, storage and power constrained (optimized to operate for years on a single battery charge).

A large LPWAN may contain millions of devices requiring a Network Management System (NMS) able to provide at-scale lifecycle management. The management protocol must be able to operate within the constrained performance envelope of an LPWAN. The management protocol must offer an efficient message encoding, be optimized for efficient and secure messaging flows across the LPWAN, and support classic NMS functions such as device on-boarding, device configuration, device status reporting, securing the network, etc.

This document describes the design and operation of the CoAP Simple Management Protocol (CSMP), which provides management capabilities for constrained IoT devices deployed within large scale LPWANs. Features include:

  1. Onboarding. Device startup registration and capabilities announcement with an NMS.
  2. Configuration management. Device acquisition of configuration from the NMS. Subsequent NMS configuration reads and updates to the device.
  3. Metrics reporting. Periodic device metrics reporting to the NMS. NMS on-demand metrics requests to a device.
  4. NMS commanded device operations. NMS command issuance to a single device or group of devices.
  5. Secure device firmware update.

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Protocol Specification

CSMP is a usage profile of the Constrained Application Protocol [RFC7252], which is designed for implementing RESTful messaging for resource constrained devices. It is fair to view CoAP as a binary encoded functional subset of HTTP operating over UDP which also supports multicast messaging. Resources (addressable objects) transported within CSMP message payloads are implemented using the Protocol Buffers compact binary encoding [PB].

It is assumed the reader is familiar with:

  1. The basic concepts of RESTful architecture.
  2. The operation, message formats, and terminology of CoAP [RFC7252].
  3. Protocol Buffers [PB], which is used to describe and encode CSMP message payloads.
  4. OpenAPI [OPENAPI], the interface definition language used to define the CSMP Device and NMS interfaces.

3.1. CoAP Usage Profile

The NMS and devices communicate directly using CoAP. Acting as a CoAP client, a device issues RESTful requests to read or modify specific resources (objects) exposed by the NMS server. Likewise, the device serves requests from the NMS to manipulate resources exposed by the device.

CSMP specializes the usage of CoAP in the following ways:

  1. CSMP does not use the Token capabilities described in [RFC7252]. Request/response messaging MUST be implemented using the "synchronous" form of a CON request with response piggybacked in the subsequent ACK. A client MUST elide the Token field from the request message. The server SHOULD ignore the Token if received from a client request.
  2. Due to high latencies typical of LPWAN technologies, a client MUST NOT use the CoAP retransmission model when sending a CON message to a server. After sending a CON message, the client MUST accept a response from the server at any time up until the device sends its next CON message (containing a new CoAP Message ID).

3.2. Interface Specification

CSMP defines a CSMP Device interface and a CSMP NMS interface. Each of these interfaces defines a set of resources (objects), the addresses at which these resources exist (URLs), and the methods (GET, PUT, POST, DELETE) which may be used to manipulate the resources to implement device management operations. The CSMP Device and CSMP NMS interfaces are expressed using the OpenAPI interface definition language.

OpenAPI's heritage is the expression of interfaces typically constructed with HTTP and JSON. This document defines a few conventions required to express a CoAP interface in OpenAPI:

  1. The generic RESTful verb designators GET, PUT, POST, and DELETE are used along with the text presentation of a resource's URL. This is done for developer readability. An actual implementation must properly encode the CoAP message Code field along with the Uri-Host, Uri-Port, Uri-Path, and Uri-Query options as described in [RFC7252].
  2. CoAP Response codes of the form X.YZ are expressed in the OpenAPI as XYZ (the actual response code is scaled by 100 in the OpenAPI rendering).
  3. Placeholder objects are defined in OpenAPI to represent the Protocol Buffer binary objects which will be transported in CoAP messaging payloads. The actual binary objects are defined in a separate Protocol Buffer file [CSMPMSG].
  4. CoAP supports the NON messaging pattern, but OpenAPI syntax always requires a response to be defined. The CSMP interface definitions note when a response is defined for a CoAP NON request and is not an actual CoAP response (no response is sent).
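
As an illustration of convention 2, the following sketch converts between the CoAP Code octet (a 3-bit class and a 5-bit detail, per [RFC7252]) and the scaled integer used in the OpenAPI rendering. The function names are illustrative and do not appear in [CSMPDEV] or [CSMPNMS].

```python
def coap_code_to_openapi(code_octet: int) -> int:
    """Render a CoAP Code octet (class.detail) as the OpenAPI-style integer XYZ."""
    code_class = (code_octet >> 5) & 0x07  # upper 3 bits: the "X" in X.YZ
    detail = code_octet & 0x1F             # lower 5 bits: the "YZ" in X.YZ
    return code_class * 100 + detail


def openapi_to_coap_code(status: int) -> int:
    """Inverse mapping: OpenAPI-style XYZ back to the CoAP Code octet."""
    code_class, detail = divmod(status, 100)
    if not (0 <= code_class <= 7 and 0 <= detail <= 31):
        raise ValueError(f"{status} is not representable as a CoAP code")
    return (code_class << 5) | detail
```

For example, response code 2.03 (Valid) is carried as the octet 0x43 and appears as 203 in the OpenAPI definitions.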

3.2.1. CSMP Device Interface

A CSMP device MUST implement the interface specified within [CSMPDEV]. Various forms of the GET method are used for retrieving registration information, device information, and monitoring information from the devices. Various forms of the POST method are used to deliver configuration and commands to devices. Usage of this interface is detailed in the sections which follow.

3.2.2. CSMP NMS Interface

A CSMP NMS MUST implement the interfaces specified within [CSMPNMS]. Various forms of the POST method are used for device registration, device metrics reporting, and asynchronous GET responses (resulting from a request including the "a" option and default response URL). Usage of this interface is detailed in the sections which follow.

3.3. Resources

CSMP devices and CSMP NMS use the CoAP GET, POST, and DELETE methods to manipulate the resources specified in [CSMPMSG], which are the Protocol Buffer object payloads contained in the various CoAP requests and responses. Usage of these payloads is detailed in the sections which follow.

3.3.1. Base URL

A CSMP server is located at a <base-url> of the form coap://hostname:port/<base-path>

It is RECOMMENDED that a default port of 61628 be used.

The <base-path> for all CSMP resources at a particular hostname:port MUST be identical.

It is RECOMMENDED that a default <base-path> of "/." be used.

Because an NMS CSMP request message may be multicast to a large number of devices, all CSMP devices within a multicast domain MUST have identical port and <base-path>.

The following <base-path> are reserved for future use:

  1. /o

Full details of the NMS and Device service URLs are defined in [CSMPDEV] and [CSMPNMS].

3.3.2. Resource Encoding

3.3.2.1. Standard TLVs

The message payloads of CSMP requests and responses MUST be formatted as a sequence of Type-Length-Value objects. Each TLV object has the following format:

| Type | Length | Value |

The Type field is an unsigned integer identifying a specific CSMP TLV ID and MUST be encoded as a Protocol Buffers varint.

The Length field is an unsigned integer containing the number of octets occupied by the Value field. The Length field MUST be encoded as a Protocol Buffers varint.

The Value field MUST contain the Protocol Buffers encoded TLV corresponding to the indicated Type.

The set of objects defined by CSMP and their Type (TLV ID) are specified in [CSMPMSG].
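
The TLV framing described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the Value octets are treated as opaque (already Protocol Buffers encoded), and the function names are illustrative.

```python
from typing import List, Tuple


def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as a Protocol Buffers varint."""
    if value < 0:
        raise ValueError("varint value must be unsigned")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more octets follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        octet = buf[pos]
        pos += 1
        result |= (octet & 0x7F) << shift
        if not octet & 0x80:
            return result, pos
        shift += 7


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Frame one TLV: varint Type, varint Length, then the Value octets."""
    return encode_varint(tlv_type) + encode_varint(len(value)) + value


def decode_tlvs(payload: bytes) -> List[Tuple[int, bytes]]:
    """Split a message payload into (Type, Value) pairs in payload order."""
    tlvs, pos = [], 0
    while pos < len(payload):
        tlv_type, pos = decode_varint(payload, pos)
        length, pos = decode_varint(payload, pos)
        if pos + length > len(payload):
            raise ValueError("TLV length exceeds payload")
        tlvs.append((tlv_type, payload[pos:pos + length]))
        pos += length
    return tlvs
```

A Type or Length below 128 occupies a single octet, which keeps the per-TLV framing overhead at two octets for small objects.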

3.3.3. Large Requests

A single CSMP TLV MUST NOT be larger than the space available in a single CoAP request message payload, minus the space occupied by mandatory TLVs. CSMP requests containing large TLVs or many TLVs may exceed available space within a CoAP request / UDP datagram.

If a POST request is larger than the UDP MTU, the request MUST be split into multiple POST requests with the TLVs spread across the message bodies. The server MUST be prepared to handle the TLVs in any order.
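
One possible way to split a large POST is a greedy packing of already-encoded TLVs, sketched below. The max_payload and reserved parameters are assumptions of this sketch: max_payload is the payload space available per CoAP message in the deployment, and reserved is the space set aside for mandatory TLVs that accompany every message.

```python
from typing import List


def split_into_payloads(encoded_tlvs: List[bytes], max_payload: int,
                        reserved: int = 0) -> List[bytes]:
    """Greedily pack encoded TLVs into payloads of at most max_payload - reserved octets."""
    budget = max_payload - reserved
    if budget <= 0:
        raise ValueError("no payload space left after reserved octets")
    payloads, current = [], b""
    for tlv in encoded_tlvs:
        if len(tlv) > budget:
            # A single TLV MUST NOT exceed the space of one request.
            raise ValueError("TLV does not fit in a single request")
        if current and len(current) + len(tlv) > budget:
            payloads.append(current)  # close the current request body
            current = b""
        current += tlv
    if current:
        payloads.append(current)
    return payloads
```

Because the server handles TLVs in any order, the packing order chosen by the sender does not affect correctness.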

If a GET request exceeds the UDP MTU because the max length of the "q" option is exceeded, the request MUST be split into multiple GET requests, each with a subset of the query option.

The GET response from a server may not be able to fit all requested TLVs into the response. The server will respond with only the TLVs it is able to fit within the message body. A client SHOULD issue additional GET requests to obtain the missing TLVs.

Recommended network MTU will be deployment / technology dependent. For example, an MTU of 1024 is often used for large scale IEEE 802.15.4 mesh networks.

3.4. CSMP Security Model

The NMS signs outgoing device messaging. Devices verify the signature to confirm source and integrity of incoming NMS messages. NMS-Device trust is established with an NMS certificate/public key programmed into the device at time of manufacture. Signing TLVs included in the message payload enable signature verification by a device. The Signing TLVs are:

  1. Signature TLV. When included, the Signature TLV MUST be the last TLV in a message payload. The signature is calculated over the first byte of the message payload up to but not including the Signature TLV itself. Unless otherwise specified, the signature MUST be calculated as ECDSA with SHA-256 signature cipher using the signer's (NMS) private key. If the message signature is incorrect, the device MUST ignore the message.
  2. SignatureValidity TLV. The SignatureValidity TLV defines the validity period for the message. The SignatureValidity TLV MUST be included when the Signature TLV is included. If the message is received outside the defined validity period, the device MUST ignore the message.

If either of the Signing TLVs is missing from a message payload, the device MUST ignore the message.

Additional layer 2, 3, or 4 security mechanisms may be utilized to meet the requirements of specific deployment models (Wi-SUN layer 2 security, VPN at layer 3, DTLS at layer 4, etc.). Details of these additional security mechanisms are out of scope of this specification.

3.4.1. Signature Exemption

For situations in which a request payload signature adds overhead without improving security, the Signing TLVs may be elided for certain payloads. For example, the overhead of signing each block of a firmware update may be unnecessary as a full image integrity check is performed over the entire file and reported to the NMS.

The signature exemptible TLVs are:

  1. ImageBlock
  2. DescriptionRequest

The NMS MAY elide the Signing TLVs provided the request body contains only exemptible TLVs. Otherwise, the Signing TLVs MUST be included.

A device MAY accept incoming message payloads without Signing TLVs provided the payload contains only exemptible TLVs.
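
The structural part of these rules (which payloads may arrive unsigned) can be sketched as below. The sketch works on TLV names rather than the numeric TLV IDs of [CSMPMSG], performs no cryptographic verification, and treats an empty unsigned payload as unacceptable; all three are assumptions made for illustration. Figures 3 and 4 show a GroupMatch TLV accompanying ImageBlock in multicast requests; whether GroupMatch counts as exemptible is not stated here, so the exemptible set is left as a parameter.

```python
from typing import FrozenSet, Iterable

# Names taken from the text; real payloads carry numeric TLV IDs.
SIGNING_TLVS = frozenset({"Signature", "SignatureValidity"})
EXEMPTIBLE_TLVS = frozenset({"ImageBlock", "DescriptionRequest"})


def passes_signing_structure(tlv_names: Iterable[str],
                             exemptible: FrozenSet[str] = EXEMPTIBLE_TLVS) -> bool:
    """Structural check only: are the Signing TLVs present where required?"""
    names = list(tlv_names)
    present = SIGNING_TLVS.intersection(names)
    if present == SIGNING_TLVS:
        # Both Signing TLVs present: the Signature TLV MUST be last.
        return names[-1] == "Signature"
    if present:
        # Only one of the two Signing TLVs: treated as missing (assumption).
        return False
    # Unsigned payload: acceptable only if every TLV is exemptible.
    return bool(names) and all(name in exemptible for name in names)
```

A payload that passes this check still requires signature and validity-period verification when the Signing TLVs are present.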

3.5. Device Groups

CSMP groups are used to support multicast messaging to devices.

A group is uniquely defined by a group-type/group-id pair. A device MAY be a member of multiple group-types, but MUST be a member of only one group-id within a group-type. A device MUST support membership in at least two group types.

The NMS assigns a device to a group using the GroupAssign TLV. On initial boot, a device has no group assignments. To be assigned to a device group, a GroupAssign TLV MUST be sent to the device either by a POST request from the NMS or within the response to the device's registration request to the NMS.

The NMS removes a device from a group by POST-ing a GroupEvict TLV to the device.

If a device's group assignment is changed at the NMS, upon receipt of the next metrics report from the device, the NMS MUST POST a new GroupAssign TLV to the device.

Group assignments are not additive. Assignments MUST be replaced upon receipt of a subsequent GroupAssign TLV.

A GroupAssign TLV MUST NOT be sent within a multicast message.

A GroupEvict TLV MUST NOT be sent within a multicast message.

Devices MUST maintain group assignments in durable storage (across power cyclings / reboots).

CSMP multicast messages MUST contain a GroupMatch TLV. Upon receipt of a multicast CSMP message:

  1. A device MUST process the message if the contained GroupMatch TLV matches a group to which the device is assigned.
  2. A device MUST ignore the message if the message does not contain a GroupMatch TLV.
  3. A device MUST ignore the message if the GroupMatch TLV does not match a device group assignment.
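
A device-side view of these group rules might look like the following sketch. The class and method names are illustrative, group assignments are held in memory rather than durable storage, and GroupEvict is assumed to identify the group-type to clear; the actual TLV fields are defined in [CSMPMSG].

```python
from typing import Dict, Optional, Tuple


class GroupTable:
    """Holds at most one group-id per group-type."""

    def __init__(self) -> None:
        self._assignments: Dict[int, int] = {}  # group-type -> group-id

    def assign(self, group_type: int, group_id: int) -> None:
        # Assignments are not additive: a new group-id replaces the old one.
        self._assignments[group_type] = group_id

    def evict(self, group_type: int) -> None:
        self._assignments.pop(group_type, None)

    def should_process_multicast(self, group_match: Optional[Tuple[int, int]]) -> bool:
        """group_match is (group-type, group-id), or None if the TLV is absent."""
        if group_match is None:
            return False  # multicast without GroupMatch is ignored
        group_type, group_id = group_match
        return self._assignments.get(group_type) == group_id
```

For example, after assign(1, 10) a multicast carrying GroupMatch (1, 10) is processed, while one carrying (1, 11) or no GroupMatch at all is ignored.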

3.5.1. Reserved Group Types

Group type 1 is reserved for configuration.

Group type 2 is reserved for firmware.

3.6. Device TLV Processing Order

A device processes message payload TLVs in the following order:

  1. If present, the Signature and SignatureValidity TLVs MUST be processed first.
  2. If present, the GroupMatch TLV MUST be processed next.
  3. The remaining payload TLVs MUST be processed in the order they appear in the payload.
  4. TLVs within a payload SHOULD NOT be duplicated. In the case of a duplicate TLV, the last payload instance of TLV MUST be used.
  5. The index field of a TLV table entry is used to determine uniqueness of the TLV. TLVs with identical index values MUST be considered to be duplicates (table entry TLVs are identified in [CSMPCOMP]).
  6. TLV specific error handling is described in the OpenAPI definitions.
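
The ordering rules can be illustrated with the sketch below. It identifies TLVs by name, treats two TLVs with the same name as duplicates, and keeps a retained duplicate at the position of its last instance; table-entry TLVs, whose uniqueness depends on the index field, are not modelled. These simplifications are assumptions of the sketch.

```python
from typing import List, Tuple

SIGNING_TLVS = ("Signature", "SignatureValidity")


def processing_order(tlvs: List[Tuple[str, bytes]]) -> List[Tuple[str, bytes]]:
    """Return (name, value) pairs in the order a device would process them."""
    # Keep only the last payload instance of each duplicated TLV.
    last_index = {name: i for i, (name, _) in enumerate(tlvs)}
    unique = [tlv for i, tlv in enumerate(tlvs) if last_index[tlv[0]] == i]

    def rank(tlv: Tuple[str, bytes]) -> int:
        name = tlv[0]
        if name in SIGNING_TLVS:
            return 0  # Signing TLVs first
        if name == "GroupMatch":
            return 1  # then GroupMatch
        return 2      # then everything else

    # sorted() is stable, so the remaining TLVs keep their payload order.
    return sorted(unique, key=rank)
```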

4. Functional Description

This section describes the major operational flows of the CSMP protocol.

4.1. Device Lifecycle States

For understanding of CSMP device behavior, it is helpful to consider the NMS' view of device states and state transitions (presented below).

The NMS views a device as transitioning through the following states:

[Figure: state diagram. States: Unheard, Registering, Registered, Down. Transition labels: "Device added to NMS inventory"; "Device issues Registration Request to NMS" (appears on three transitions); "Registration ACK and first metrics received from Device"; "Device metrics not received for mark-down-threshold"; "Metrics report received from Device".]

Figure 1: NMS View of Device State

  1. Devices pre-populated into the NMS prior to deployment exist in the Unheard state.
  2. Upon receipt of a device registration request, the NMS records the device's presence on the network and the device enters the Registering state.
  3. NMS responds with a registration ACK payload containing configuration for the device.
  4. The device transitions to the Up state upon NMS receipt of a device metrics report.
  5. If subsequent metrics reports are lost (poor network conditions), the device transitions to the Down state. NMS receipt of a new metrics report transitions the device back to Up state.
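
The NMS view of device state can be sketched as a small transition table. The event names are illustrative. The sketch assumes that a registration request moves a device to Registering from the Unheard, Up, and Down states, and that unknown events leave the state unchanged; neither assumption is stated normatively in this section. The state called Up in the list above is labelled Registered in Figure 1.

```python
from enum import Enum


class DeviceState(Enum):
    UNHEARD = "Unheard"
    REGISTERING = "Registering"
    UP = "Up"
    DOWN = "Down"


# (current state, event) -> next state
TRANSITIONS = {
    (DeviceState.UNHEARD, "registration_request"): DeviceState.REGISTERING,
    (DeviceState.UP, "registration_request"): DeviceState.REGISTERING,
    (DeviceState.DOWN, "registration_request"): DeviceState.REGISTERING,
    (DeviceState.REGISTERING, "first_metrics_report"): DeviceState.UP,
    (DeviceState.UP, "metrics_timeout"): DeviceState.DOWN,
    (DeviceState.DOWN, "metrics_report"): DeviceState.UP,
}


def next_state(state: DeviceState, event: str) -> DeviceState:
    """Apply one event; events with no defined transition leave the state as is."""
    return TRANSITIONS.get((state, event), state)
```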

4.2. NMS Discovery

A device requires the <nms-base-url> of its NMS. Acquisition of the NMS URL may be accomplished via a variety of means including a DHCP option, pre-deployment administrative configuration setting, etc. The specific mechanism to be used is beyond the scope of this specification.

For devices using DHCPv6 address assignment, a device MAY request DHCPv6 option 26484 sub-option 1 to obtain the URL of its NMS.

4.3. Device Registration and Configuration

Registration is the messaging flow via which a device announces its entry onto the network and provides a means for the NMS to push configuration information to the device.

A device registers with an NMS by issuing a registration request to an NMS. The NMS subsequently responds to reject or accept the registration, with device configuration included in a successful registration response. A device issues a registration request for a variety of reasons:

  1. A device MUST register when the device reboots (power cycled or receipt of RebootRequest TLV from the NMS).
  2. A device MUST register when its IP address has changed (usually indicating a network re-join).
  3. A device MUST register if its mesh parent has changed (mesh networks only).
  4. A device MUST register if it detects the NMS IP address has changed.
  5. A device MUST register upon receipt of an NMSRedirectRequest TLV from the NMS. This can occur when a device has been removed from the NMS inventory but continues to communicate using a Session ID now unknown to the NMS.

A device and NMS implement the registration messaging flow depicted in Figure 2.

    Participants: Device, NMS

    Registration Begin
    loop (continue until ACK receipt success and no redirect)
      [1] Device: Wait random time tBackoff between 0 and tInterval
      [2] Device -> NMS: CON POST msgId = X++ to /r ...
          1. MUST contain DeviceID TLV
          2. MUST contain CurrentTime TLV
          3. MUST contain Device Information TLVs
          4. MAY contain SessionID TLV
          5. MAY contain GroupInfo TLV
          6. MAY contain ReportSubscribe TLV
      [3] Device: Wait remaining time between tBackoff and tInterval.
          Double tInterval up to tIntervalMax.
      alt [NMS redirects]
        [4] NMS -> Device: ACK msgId = X ...
            1. Contains NMSRedirectRequest TLV
            2. MUST contain Signing TLVs
            Continue loop.
      alt [NMS no redirect]
        [5] NMS -> Device: ACK msgId = X ...
            1. MAY contain SessionID TLV
            2. MAY contain GroupAssign TLV
            3. MAY contain ReportSubscribe TLV
            4. MUST contain the Signing TLVs
            Break out of loop.

    First Metrics Report
      [6] Device -> NMS: NON POST msgId = X++ to /c ...
          1. MUST contain SessionID TLV
          2. MUST contain CurrentTime TLV
          3. MUST contain the list of TLVs specified by ReportSubscribe TLV
    Registration Complete

    Continuing Metrics Reports
    loop
      [7] Device: Wait random time tMetricsBackoff between 0 and
          tMetricsInterval defined by ReportSubscribe
      [8] Device -> NMS: NON POST msgId = X++ to /c ...
          1. MUST contain SessionID TLV
          2. MUST contain CurrentTime TLV
          3. MUST contain the list of TLVs specified by previous
             ReportSubscribe TLV
      [9] Device: Wait remaining time between tMetricsBackoff and
          tMetricsInterval

Figure 2: Device Registration, Configuration, and Metrics

4.3.1. Device Registration Request

A device MUST implement two configurable parameters that control the registration process. These parameters are initially set at manufacture time and MUST be maintained in durable storage.

  1. tIntervalMin defaults to 300 seconds (5 minutes). Also configurable via TLV 42/regIntervalMin field.
  2. tIntervalMax defaults to 3600 seconds (1 hour). Also configurable via TLV 42/regIntervalMax field.

A device MUST issue registration requests using the following algorithm.

    Set tInterval = tIntervalMin.
    Wait an initial period between 0 and tInterval seconds.
    Do {
      1. Wait tBackoff seconds, where tBackoff is a random
        interval between tInterval/2 and tInterval seconds.
        A tBackoff value in the latter half of the interval
        ensures a minimum time between successive registration
        attempts.
      2. Send a new CoAP CON POST request message to NMS <nms-base-url>/r.
        The message payload MUST contain the registration TLVs
        described in Section 4.3.2.
      3. Wait the remaining (tInterval – tBackoff) seconds.
      4. Set tInterval to 2 * tInterval.
        If tInterval is greater than tIntervalMax,
        set tInterval to tIntervalMax.
    } While the device has not received an ACK to its registration POST.

An example execution of a full registration POST retry sequence is presented in Section 8.
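
The timing of the loop above can be sketched as a generator that yields, for each registration attempt, the current tInterval, the wait before sending (tBackoff), and the wait after sending. This is an illustration under stated assumptions: the generator name and the injectable random source are not part of CSMP, the initial wait between 0 and tInterval is left to the caller, and the caller stops iterating when an ACK arrives.

```python
import random
from typing import Iterator, Optional, Tuple


def registration_schedule(t_interval_min: float = 300.0,
                          t_interval_max: float = 3600.0,
                          rng: Optional[random.Random] = None
                          ) -> Iterator[Tuple[float, float, float]]:
    """Yield (tInterval, wait before POST, wait after POST) per attempt."""
    rng = rng or random.Random()
    t_interval = t_interval_min
    while True:
        # tBackoff falls in the latter half of the interval.
        t_backoff = rng.uniform(t_interval / 2, t_interval)
        yield t_interval, t_backoff, t_interval - t_backoff
        # Double the interval, capped at tIntervalMax.
        t_interval = min(2 * t_interval, t_interval_max)
```

With the default parameters, successive attempts use tInterval values of 300, 600, 1200, 2400, 3600, 3600, and so on.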

If the device receives an ACK message with CoAP response code 2.03 (Valid) from the NMS at any time before the device's next registration POST, the TLVs within the ACK message MUST be processed.

4.3.2. Registration POST Payload

The following TLVs MUST be included in the device registration POST to the NMS:

  1. DeviceID (primary identifier of the device).
  2. CurrentTime (used to validate device local time).

Previously registered devices SHOULD already have (durably stored) values for the SessionID, GroupInfo, and ReportSubscribe TLVs and MUST include these TLVs in the device registration POST to the NMS:

  1. SessionID
  2. GroupInfo (one per group)
  3. ReportSubscribe

The following Device Information TLVs MUST be included in the registration POST:

  1. HardwareDesc
  2. InterfaceDesc (one per interface)
  3. IPAddress (one per address)
  4. NMSStatus (reason for the registration operation)
  5. WPANStatus
  6. RPLSettings

Note that the SessionID, GroupAssign, and ReportSubscribe TLV set is considered to be generic device Configuration. The Configuration TLV set is technology specific and MAY be extended with additional technology specific TLVs (beyond the scope of this specification).

4.3.3. NMS Registration Response

Upon receipt of a registration request, the NMS looks up the information for the device identified by DeviceID to confirm correctness of the request. See [CSMPNMS] for details of DeviceID and SessionID validation and related error response codes.

If the device is found in inventory, authorized to register, and all other registration request content is confirmed to be valid, the NMS MUST send an ACK response message with response code 2.03 (Valid) to the device. The ACK takes one of two forms, depending upon whether or not the NMS will redirect the device to another NMS.

In the case where the NMS is not redirecting, the response body to a valid registration request contains the following TLVs:

  1. SessionID MUST be elided from the ACK if the SessionID contained in the registration request is correct, otherwise the correct SessionID TLV MUST be included in the ACK.
  2. GroupAssign MUST be elided from the ACK if the GroupAssign contained in the registration request is correct, otherwise the correct GroupAssign TLV MUST be included in the ACK.
  3. ReportSubscribe MUST be elided from the ACK if the ReportSubscribe contained in the registration request is correct, otherwise the correct ReportSubscribe TLV MUST be included in the ACK.
  4. Signing TLVs MUST be included.
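
The elision rule for the non-redirect ACK can be sketched as a comparison between what the device reported and what the NMS holds. The dictionary representation and function name are assumptions of this sketch; values are treated as opaque comparable objects, and any mapping between the GroupInfo TLVs a device reports and the GroupAssign values held by the NMS is left to the caller. The Signing TLVs are appended separately.

```python
from typing import Any, Dict

CONFIG_TLVS = ("SessionID", "GroupAssign", "ReportSubscribe")


def ack_config_tlvs(reported: Dict[str, Any], expected: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the configuration TLVs the NMS needs to correct in the ACK."""
    return {
        name: expected[name]
        for name in CONFIG_TLVS
        # Elide a TLV when the device already reported the correct value.
        if name in expected and reported.get(name) != expected[name]
    }
```

A device whose stored configuration is already correct therefore receives an ACK carrying only the Signing TLVs.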

In the case where the NMS is redirecting, the response body to a valid registration request contains the following TLVs:

  1. NMSRedirectRequest MUST be included.
  2. Signing TLVs MUST be included.

When the response contains a SessionID TLV, the device MUST durably store this TLV, and the SessionID TLV MUST be included in all future CSMP requests to the NMS.

When the response contains a GroupAssign TLV, the device MUST durably store the group-id (overwriting any other stored group-id for the same group-type) and MUST use the new GroupAssign values for comparison with all future receipt of GroupMatch TLVs.

When the response contains a ReportSubscribe TLV, the device MUST begin reporting the indicated metrics TLVs (ignoring any TLVs requested by the ReportSubscribe which are unknown to the device).

The NMS includes the NMSRedirectRequest TLV in a registration response to request the device register with an alternate NMS instance (load balancing, etc.). Upon receipt of this TLV, the device MUST cease registration attempts with the original NMS and start the registration process with the NMS indicated in the NMSRedirectRequest TLV. If registration succeeds with this new NMS, all subsequent device CSMP messaging MUST be directed to this new NMS. Note that device receipt of NMSRedirectRequest TLV is a one-time redirect and not persisted across device restarts.

4.3.4. Registration Complete

The NMS considers device registration to be complete when all of the following conditions are met:

  1. The registration ACK to the device indicates code 2.03 (Valid) and message ID match is confirmed as described in CoAP.
  2. The ACK response body does not contain an NMSRedirectRequest TLV.
  3. The first metrics report from the device is received by the NMS.

4.4. Device Metrics Reporting

Upon receipt of a ReportSubscribe TLV, a device configures as many as two metrics reports:

  1. A primary metrics report using the interval and TLV ID set.
  2. A secondary metrics report using the heartbeat interval and heartbeat TLV ID set.

The primary metrics report MUST be used for mains powered devices (with the secondary report disabled). To conserve power, metrics reporting for low power devices MAY be split across primary and secondary reports, with the primary report configured to provide TLVs needed at a more frequent interval and the secondary report configured for TLVs required at a more relaxed interval.

A device configures metrics parameters to control the device's primary metrics report as follows:

  1. tMetricsInterval (how often a device MUST report its metrics) is typically set between 5 minutes and 8 hours (depends on application requirements). Designated by the interval field of the ReportSubscribe TLV.
  2. tlvList (the list of TLVs the device MUST report) is designated by the tlvId field of the ReportSubscribe TLV.

A device configures metrics parameters to control a device's secondary metrics report as follows:

  1. tMetricsInterval is designated by the intervalHeartBeat field of the ReportSubscribe TLV.
  2. tlvList is designated by the tlvIdHeartBeat field of the ReportSubscribe TLV.

The TLV content of the primary and secondary metrics reports is deployment and application specific. For example, the primary metrics report for a 6LoWPAN, mains-powered mesh node might be configured as:

  1. InterfaceMetrics
  2. GroupInfo
  3. FirmwareImageInfo
  4. Uptime
  5. LowpanPhyStats
  6. DifServMetrics
  7. ReportSubscribe

TLV content of the secondary metrics report is similarly application specific and beyond the scope of this specification.

For each configured metrics report, a device MUST commence reporting immediately after receipt of a successful registration ACK by sending a NON POST to <nms-url>/c containing the following TLVs:

  1. SessionID
  2. CurrentTime
  3. The TLVs from tlvList. The entirety of all table entries MUST be included.

Following the initial metrics report, the device MUST implement the following algorithm for subsequent metrics reports (for both primary and secondary reports):

    Send a new CoAP NON POST to <nms-url>/c with the required metrics TLVs.
    Wait an initial random interval between 0 and tMetricsInterval seconds.
    Do {
        1. Wait tMetricsBackoff seconds, where tMetricsBackoff is random
          value between tMetricsInterval/2 and tMetricsInterval seconds.
        2. Send a new CoAP NON POST message to <nms-url>/c
          with the required metrics TLVs
        3. Wait the remaining (tMetricsInterval – tMetricsBackoff) seconds
          so that the full tMetricsInterval has expired.
    } While the device has a valid IP address.
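
To illustrate the resulting timing, the sketch below computes the times (in seconds from the initial report) at which a device would send metrics reports for one configured report. The function name, the fixed report count, and the injectable random source are assumptions made for illustration.

```python
import random
from typing import List, Optional


def metrics_send_times(t_metrics_interval: float, count: int,
                       rng: Optional[random.Random] = None) -> List[float]:
    """Return the initial send time followed by `count` periodic send times."""
    rng = rng or random.Random()
    times = [0.0]  # the initial report is sent immediately
    # Initial random wait between 0 and tMetricsInterval.
    window_start = rng.uniform(0, t_metrics_interval)
    for _ in range(count):
        # Each report is sent in the latter half of its interval window.
        t_backoff = rng.uniform(t_metrics_interval / 2, t_metrics_interval)
        times.append(window_start + t_backoff)
        window_start += t_metrics_interval  # wait out the rest of the window
    return times
```

Because each report falls in the latter half of its window, consecutive periodic reports are separated by at least tMetricsInterval/2 and at most 1.5 * tMetricsInterval, while the long-run rate remains one report per tMetricsInterval.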

4.5. Device Firmware Update

CSMP defines a device firmware update process optimized for LPWANs. A key aspect of this process is the separation of image placement on a device from activation (execution) of the image on the device.

The device firmware update process consists of three sub-flows:

  1. Firmware download. A new image is placed on one or more members of a device group.
  2. Image load. Activate an image on one or more members of a device group at a scheduled time.
  3. Set backup image. Optional designation of an image to be used when load of all other images fails.

A device should implement the following mechanisms in support of firmware update:

  1. It is RECOMMENDED that vendors implement digital signing of images prior to image release to production.
  2. It is RECOMMENDED that a device implement a secure bootloader which (upon device receipt of a LoadRequest TLV or at device power-up) validates the activated image's signature, loads the image from durable storage into operating memory, and transfers execution to the image. Specific details of image signing and the secure bootloader are left to the vendor and are beyond the scope of this specification.
  3. A device MUST be capable of storing at least two firmware images: the running image and at least one additional image.
  4. It is RECOMMENDED that a device also support a backup image. A device boots into the backup image when the device is unable to boot into any other image.
  5. A device MUST be capable of scheduling an image load (activation) at a specific future time (i.e. the device must maintain a time source).

4.5.1. Firmware Image Format

A CSMP firmware image file consists of three main parts: a CSMP defined image header, the vendor defined image binary, and the vendor defined image signature (as depicted below).

Table 1: Firmware Image Format

| Field | Size (octets) | Description |
| Begin Header |
| Header Version | 4 | 32 bit unsigned integer which MUST be set to 2. |
| Header Length | 4 | 32 bit unsigned integer which MUST be set to 256. |
| App Rev Major | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the major revision number of the application image. |
| App Rev Minor | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the minor revision number of the application image. |
| App Build | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the build of the application image. |
| App Length | 4 | 32 bit unsigned integer which MUST be set to the octet length of the Header + Image Binary field. |
| App Name | 32 | Vendor specific 32 octet string which is set to indicate the name of the application. |
| App SCC Branch | 32 | Vendor specific 32 octet string which is set to indicate the source code control system branch ID. |
| App SCC Commit | 8 | Vendor specific 8 octet string which is set to indicate the source code control system commit ID. |
| App SCC Flags | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the source code control system build flags. |
| App Build Date | 16 | Vendor specific 16 octet string which is set to indicate the build date and time of the application image. |
| hwid | 32 | 32 octet field which is RECOMMENDED to be set to a concatenation of a unique manufacturer ID and product model identifier. |
| sub_hwid | 32 | 32 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
| kernel_rev | 16 | 16 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
| sub_kernel_rev | 16 | 16 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
| Reserved | 44 | Pad to 256 octets, octets MUST be set to 0. |
| End Header, Begin Image |
| Image Binary | Variable | Vendor specific image data. |
| End Image, Begin Signature |
| Signature | Variable | Vendor specific image signature. Calculated over entire content of this structure except the signature and pad fields. |
| End Signature, Begin Pad |
| Pad | Variable | Optional pad field to enable image to fill vendor specific flash memory boundary. When present, octets MUST be set to 0xFF. |

All multi-octet fields are encoded as little-endian.
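
As an illustration of the little-endian layout, the following minimal Python sketch unpacks the header fields listed above from App Rev Major through Reserved, which total 248 octets. It assumes the header is exactly 256 octets, so that these fields begin at offset 8; the fields occupying the first 8 octets are treated as opaque here. The names TAIL_OFFSET, TAIL_FORMAT, TAIL_NAMES and parse_header_tail are illustrative and are not defined by CSMP.

```python
import struct

# Assumption: the header is 256 octets and the fields below are its last 248.
TAIL_OFFSET = 8
# "<" selects little-endian byte order with no alignment padding.
TAIL_FORMAT = "<4I32s32s8sI16s32s32s16s16s44s"
TAIL_NAMES = (
    "app_rev_major", "app_rev_minor", "app_build", "app_length",
    "app_name", "app_scc_branch", "app_scc_commit", "app_scc_flags",
    "app_build_date", "hwid", "sub_hwid", "kernel_rev", "sub_kernel_rev",
    "reserved",
)


def parse_header_tail(header: bytes) -> dict:
    """Return the header fields from App Rev Major through Reserved."""
    if len(header) < 256:
        raise ValueError("header must be at least 256 octets")
    values = struct.unpack_from(TAIL_FORMAT, header, TAIL_OFFSET)
    return dict(zip(TAIL_NAMES, values))
```

String fields are returned as raw octets; a caller would strip any 0x20 fill from elided fields such as sub_hwid before display.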

4.5.2. Firmware Download

An NMS implements the messaging flow depicted in Figure 3 and Figure 4 to download an image to a group of devices. Unicast distribution is also supported, with the difference being use of unicast addresses, omission of the GroupMatch TLV and omission of the 'a' query option.

Participants: Device1, Device2, NMS

Disseminate Firmware Details
  [1] NON POST to /c with options a, r
      1. MUST contain GroupMatch TLV
      2. MUST contain TransferRequest TLV
      3. MUST contain Signing TLVs
  [2]
  [3] Store file meta information.
  [4] Store file meta information.
  Wait random delay based on the "a" option.
  [5] NON POST to /c
      1. MUST contain TransferResponse TLV
  [6] NON POST to /c
      1. MUST contain TransferResponse TLV

Transfer Image Blocks
  [7] NON POST to /c
      1. MUST contain GroupMatch TLV
      2. MUST contain ImageBlock TLV
  [8]
  [9] Store image block
  [10] Store image block

Figure 3: Firmware Download

Participants: Device1, Device2, NMS

Transfer Image Blocks
  [11] NON POST to /c
       1. MUST contain GroupMatch TLV
       2. MUST contain ImageBlock TLV
       3. MUST contain DescriptionRequest TLV (75)
  [12]
  [13] Store image block
  [14] Store image block
  Wait random delay based on the "a" option.
  [15] NON POST to /c
       1. MUST contain FirmwareImageInfo TLV
  [16] NON POST to /c
       1. MUST contain FirmwareImageInfo TLV
  [17] NON POST to /c
       1. MUST contain GroupMatch TLV
       2. MUST contain ImageBlock TLV
  [18]
  [19] Store image block
  [20] Store image block

Transfer Final Image Block, Request Image Status
  [21] NON POST to /c with options a, r
       1. MUST contain GroupMatch TLV
       2. MUST contain ImageBlock TLV
       3. MUST contain DescriptionRequest TLV (75)
  [22]
  [23] Store image block
  [24] Store image block
  Wait random delay based on the "a" option.
  [25] NON POST to /c
       1. MUST contain FirmwareImageInfo TLV
  [26] NON POST to /c
       1. MUST contain FirmwareImageInfo TLV

Figure 4: Firmware Download (cont)

A firmware download begins with the NMS informing the device group of the meta-data of an image the NMS wishes to download, and the devices subsequently informing the NMS of the images already loaded on the devices.

The NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. A GroupMatch TLV MUST be included for the desired group.
  2. The 'a' option MUST be specified to randomize subsequent device responses.
  3. The request MAY contain an 'r' option to redirect the subsequent TransferResponse.
  4. The request MUST contain a TransferRequest TLV (meta-data for the file to be downloaded).
  5. The request MUST contain the Signing TLVs.

Devices receiving a TransferRequest message:

  1. MUST process the Signing TLVs and the GroupMatch TLV as described in section 2.6.
  2. MUST wait the specified period when the 'a' option is included.
  3. MUST issue a NON POST request to <nms-url>/c which MUST contain the TransferResponse TLV. The <nms-url> MUST be overridden by the 'r' option if specified.

The NMS MUST proceed with image block transfer when at least one member of the target device group indicates a Response Code of OK in the TransferResponse TLV. Otherwise, the transfer MUST be aborted.
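
The proceed-or-abort rule above can be sketched as follows. This is a minimal illustration, assuming the NMS has collected one Response Code per responding device. The ResponseCode values and the should_transfer helper are hypothetical; the actual codes are defined in [CSMPMSG].

```python
from enum import Enum


class ResponseCode(Enum):
    # Hypothetical values for illustration only.
    OK = "ok"
    ERROR = "error"


def should_transfer(responses: dict) -> bool:
    """Return True when at least one device reported a Response Code of OK.

    `responses` maps a device identifier to the ResponseCode taken from
    that device's TransferResponse TLV.
    """
    return any(code is ResponseCode.OK for code in responses.values())
```

With no responses at all, should_transfer returns False and the transfer is aborted.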

Images are often hundreds of kilobytes in size and are likely to require fragmentation into multiple blocks (N) for transfer to a device.

For the transfer of each image block, the NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. The request MUST contain the GroupMatch TLV for the desired group.
  2. The request MUST contain an ImageBlock TLV.
  3. The request MAY contain the Signing TLVs.

To determine image download progress, the NMS MUST periodically include a DescriptionRequest TLV, requesting the FirmwareImageInfo TLV, in the image block request. It is RECOMMENDED that the NMS request the FirmwareImageInfo TLV at 10% increments of the image download. The NMS MUST request the FirmwareImageInfo TLV with the transfer of the last block of the image.
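
One way an NMS might plan the block transfer is sketched below: the image is split into fixed-size blocks, and a DescriptionRequest TLV is attached at roughly every 10% of progress and always on the last block. The block size, the 1-based block numbering and both function names are assumptions made for illustration; they are not taken from this specification.

```python
def split_image(image: bytes, block_size: int) -> list:
    """Split an image into consecutive blocks of at most block_size octets."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def status_request_blocks(n_blocks: int, step_percent: int = 10) -> list:
    """Return the 1-based block numbers whose request also carries a
    DescriptionRequest TLV asking for the FirmwareImageInfo TLV."""
    if n_blocks <= 0:
        return []
    marks = set()
    for percent in range(step_percent, 100, step_percent):
        # Integer ceiling of n_blocks * percent / 100.
        marks.add((n_blocks * percent + 99) // 100)
    marks.add(n_blocks)  # the last block MUST request the status
    return sorted(marks)
```

For a 20-block image this selects every second block; for very small images every block ends up carrying the request.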

Devices receiving the image block request message:

  1. MUST process the Signing TLVs and GroupMatch TLVs as described in section 2.6 (if present).
  2. MUST cache the image block for final image assembly.
  3. When the FirmwareImageInfo TLV is requested, the device MUST wait the specified period when the 'a' option is included and MUST issue a NON POST request to <nms-url>/c which MUST contain the FirmwareImageInfo TLV (meta-data for files loaded on the device). The <nms-url> MUST be overridden by the 'r' option if specified.
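
A device-side block cache could look like the following sketch. It assumes each ImageBlock TLV identifies its block by a zero-based index and that the total number of blocks is known from the earlier TransferRequest; both are assumptions, since the TLV fields are defined in [CSMPMSG]. The BlockCache class is hypothetical.

```python
class BlockCache:
    """Caches image blocks until the complete image can be assembled."""

    def __init__(self, n_blocks: int):
        self.n_blocks = n_blocks
        self.blocks = {}

    def store(self, index: int, data: bytes) -> None:
        if not 0 <= index < self.n_blocks:
            raise ValueError("block index out of range")
        # A block that was already cached is left unchanged.
        self.blocks.setdefault(index, data)

    def missing(self) -> list:
        """Return the indices of blocks not yet received."""
        return [i for i in range(self.n_blocks) if i not in self.blocks]

    def assemble(self) -> bytes:
        if self.missing():
            raise ValueError("image incomplete")
        return b"".join(self.blocks[i] for i in range(self.n_blocks))
```

Because NON messages are not acknowledged, blocks may arrive out of order or not at all; keying the cache by index lets the device tolerate reordering and report what is still missing.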

Receipt of the final FirmwareImageInfo TLV enables the NMS to confirm the integrity of the completely downloaded image.

4.5.3. Image Activation

The messaging flows for scheduling image activation across a group of devices and for cancelling a scheduled image activation are depicted in Figure 5 and described below. Unicast activation is also supported, with the difference being use of unicast addresses, omission of the GroupMatch TLV and omission of the 'a' query option.

Participants: Device1, Device2, NMS

Schedule FW Reload
  [1] NON POST to /c with options a, r
      1. MUST contain GroupMatch TLV
      2. MUST contain LoadRequest TLV
      3. MUST contain Signing TLVs
  [2]
  [3] Schedule the load at the specified time.
  [4] Schedule the load at the specified time.
  Wait random delay based on the "a" option.
  [5] NON POST to /c
      1. MUST contain LoadResponse TLV
  [6] NON POST to /c
      1. MUST contain LoadResponse TLV

Cancel Scheduled Reload
  [7] NON POST to /c with options a, r
      1. MUST contain GroupMatch TLV
      2. MUST contain CancelLoadRequest TLV
      3. MUST contain Signing TLVs
  [8]
  [9] Cancel scheduled load.
  [10] Cancel scheduled load.
  Wait random delay based on the "a" option.
  [11] NON POST to /c
       1. MUST contain CancelLoadResponse TLV
  [12] NON POST to /c
       1. MUST contain CancelLoadResponse TLV

Figure 5: Image Activation
4.5.3.1. Image Load

An NMS implements the following message flow to command devices to designate an image as the active executable and the time at which the image is to be activated.

The NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. A GroupMatch TLV MUST be included for the desired group.
  2. The 'a' option MUST be specified to randomize subsequent device responses.
  3. The request MAY contain an 'r' option to redirect the subsequent LoadResponse.
  4. The request MUST contain a LoadRequest TLV (designating the image and time at which the image is to be activated).
  5. The request MUST contain the Signing TLVs.

Devices receiving a LoadRequest message:

  1. MUST process the Signing TLVs and GroupMatch TLVs as described in section 2.6.
  2. MUST wait the specified period when the 'a' option is included.
  3. MUST issue a NON POST request to <nms-url>/c which MUST contain the LoadResponse TLV (indicating success or reason for failure). The <nms-url> MUST be overridden by the 'r' option if specified.
4.5.3.2. Cancel Image Load

An NMS implements the following message flow to cancel a previously scheduled image activation.

The NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. A GroupMatch TLV MUST be included for the desired group.
  2. The 'a' option MUST be specified to randomize subsequent device responses.
  3. The request MAY contain an 'r' option to redirect the subsequent CancelLoadResponse.
  4. The request MUST contain a CancelLoadRequest TLV (designating the image load to be cancelled).
  5. The request MUST contain the Signing TLVs.

Devices receiving a CancelLoadRequest message:

  1. MUST process the Signing TLVs and GroupMatch TLVs as described in section 2.6.
  2. MUST wait the specified period when the 'a' option is included.
  3. MUST issue a NON POST request to <nms-url>/c which MUST contain the CancelLoadResponse TLV (indicating success or reason for failure). The <nms-url> MUST be overridden by the 'r' option if specified.
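
The composition rules for LoadRequest and CancelLoadRequest messages, including the unicast variant, can be checked with a small helper such as the one below. This is a sketch only: the Request type, the TLV names used as plain strings and check_request are hypothetical and do not reflect an actual CSMP implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    tlvs: set                                   # names of TLVs in the payload
    options: set = field(default_factory=set)   # query options, e.g. {"a", "r"}
    multicast: bool = True


def check_request(req: Request, command_tlv: str) -> list:
    """Return a list of rule violations; an empty list means well-formed."""
    problems = []
    if command_tlv not in req.tlvs:
        problems.append(f"missing {command_tlv} TLV")
    if "Signing" not in req.tlvs:
        problems.append("missing Signing TLVs")
    if req.multicast:
        if "GroupMatch" not in req.tlvs:
            problems.append("missing GroupMatch TLV")
        if "a" not in req.options:
            problems.append("missing 'a' option")
    else:
        if "GroupMatch" in req.tlvs:
            problems.append("GroupMatch TLV not used for unicast")
        if "a" in req.options:
            problems.append("'a' option not used for unicast")
    return problems
```

The 'r' option is not checked because it is optional in both the group and unicast cases.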

4.5.4. Set Backup Image

An NMS implements the following message flow to command a device to designate a stored image as the backup image.

Participants: Device1, Device2, NMS

  [1] NON POST to /c with options a, r
      1. MUST contain GroupMatch TLV
      2. MUST contain SetBackupRequest TLV
      3. MUST contain Signing TLVs
  [2]
  [3] Set the specified image as the backup image
  [4] Set the specified image as the backup image
  Wait random delay based on the "a" option.
  [5] NON POST to /c
      1. MUST contain SetBackupResponse TLV
  [6] NON POST to /c
      1. MUST contain SetBackupResponse TLV

Figure 6: Set Backup Image

The NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. A GroupMatch TLV MUST be included for the desired group.
  2. The 'a' option MUST be specified to randomize subsequent device responses.
  3. The request MAY contain an 'r' option to redirect the subsequent SetBackupResponse.
  4. The request MUST contain a SetBackupRequest TLV (designating the image to be set as the backup image).
  5. The request MUST contain the Signing TLVs.

Devices receiving a SetBackupRequest message:

  1. MUST process the Signing TLVs and GroupMatch TLVs as described in section 2.6.
  2. MUST wait the specified period when the 'a' option is included.
  3. MUST issue a NON POST request to <nms-url>/c which MUST contain the SetBackupResponse TLV (indicating success or reason for failure). The <nms-url> MUST be overridden by the 'r' option if specified.

4.6. Device Commands

Many TLVs served by a device are used by the NMS to interrogate the device for configuration state and operational status. There are, however, several TLVs which direct a device to execute internal actions. Examples of these TLVs are PingRequest and RebootRequest.

Usage of a command TLV is illustrated with the following PingRequest example directed at a single device.

The NMS MUST issue a NON POST to <device-url>/c configured as follows:

  1. A GroupMatch TLV MUST NOT be included, because the request is unicast to the designated device.
  2. The request MAY contain an 'r' option to redirect the subsequent PingResponse.
  3. The request MUST contain a PingRequest TLV (describing the Ping action to be performed).
  4. The request MUST contain the Signing TLVs.

Devices receiving a PingRequest message:

  1. MUST process the Signing TLVs as described in section 2.6.
  2. MUST begin the requested Ping operation.

The NMS subsequently interrogates the device with one or more GET requests for the PingResponse TLV.

Note that the specific messaging exchanges vary per the definition of each command. Details are provided within [CSMPMSG].

5. Security Considerations

As discussed in previous sections, a CSMP NMS signs outgoing Device messaging using an NMS private key. Signing TLVs included in the message payload enable signature verification by a device using an NMS signing certificate/public key, thereby providing source authentication and an integrity check of messages incoming to the Device (without confidentiality).

Additional layer 2, 3, or 4 security mechanisms may be utilized to meet additional security requirements of specific deployment models. Examples include:

  1. Layer 2 802.1X/EAP-TLS may be used to provide mutual authentication of Device and NMS as well as distribution of a unique shared key that is subsequently used to encrypt and source-authenticate communication.
  2. Layer 2 802.11i mechanisms may be used to distribute group keys useful for securing group-wide (multicast) messaging.
  3. Layer 3 VPN may be used to secure messaging between Device and NMS.
  4. Layer 4 DTLS may be used to secure application specific messaging.

Specific details of the usage profile for these additional security mechanisms are highly specific to the LPWAN deployment and are thus out of scope of this specification.

6. IANA Considerations

This document requires no IANA actions.

7. Implementation Status

This specification documents the technical details of CSMP as it has been deployed by Cisco and partners since 2012. Today, CSMP deployments manage many millions of LPWAN endpoints across a wide variety of energy utility and smart city applications.

As this information is time dependent, the RFC Editor is requested to remove this section before publication.

8. Appendix A Registration Retry Example

  1. Device powers up.
  2. Device sets an interval of 0 to 5 minutes.
  3. Device waits a random time between 2.5 and 5 minutes.
  4. Device sends a confirmable registration POST message.
  5. Device waits the rest of the interval until 5 minutes have passed.
  6. Device waits a random time between 5 and 10 minutes.
  7. Device sends the registration message again, with a new CoAP message ID.
  8. Device waits the rest of the interval until 10 minutes have passed.
  9. Device waits a random time between 10 and 20 minutes.
  10. Device sends the registration message again, with a new CoAP message ID.
  11. Device waits the rest of the interval until 20 minutes have passed.
  12. Device waits a random time between 20 and 40 minutes.
  13. Device sends the registration message again, with a new CoAP message ID.
  14. Device waits the rest of the interval until 40 minutes have passed.
  15. Device waits a random time between 40 and 60 minutes.
  16. Device sends the registration message again, with a new CoAP message ID.
  17. Device waits the rest of the interval until 60 minutes have passed.
  18. Repeat steps 15, 16, and 17 forever.
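
The schedule above can be expressed compactly as a generator of (interval, send offset) pairs, where the send offset is the randomly chosen number of minutes into the interval at which the registration message is sent. This is an illustrative sketch: the names SCHEDULE, STEADY and registration_offsets are hypothetical, and the uniform distribution is an assumption, as the steps do not name one.

```python
import itertools
import random

# (interval length, earliest send offset) in minutes, following the steps above.
SCHEDULE = [(5, 2.5), (10, 5), (20, 10), (40, 20)]
# Steps 15 to 17 repeat forever with a 60 minute interval.
STEADY = (60, 40)


def registration_offsets(rng=random):
    """Yield (interval, send_offset) pairs for successive attempts.

    The device sends at send_offset minutes into the interval, then waits
    for the remainder of the interval before starting the next one.
    """
    plan = itertools.chain(SCHEDULE, itertools.repeat(STEADY))
    for interval, earliest in plan:
        yield interval, rng.uniform(earliest, interval)
```

Passing a seeded random.Random instance as rng makes the schedule reproducible, which is convenient when testing a device implementation.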

9. Normative References

[CSMPCOMP]
"CSMP Components", n.d., <https://github.com/woobagooba/draft-ietf-is-csmp/blob/e0be5a31906eb3e9983e7cc28b24ed9482543784/CsmpComponents-1.0.yaml>.
[CSMPDEV]
"CSMP Device Interface", n.d., <https://github.com/woobagooba/draft-ietf-is-csmp/blob/e0be5a31906eb3e9983e7cc28b24ed9482543784/CsmpDevice-1.0.1.yaml>.
[CSMPMSG]
"CSMP Payload Definitions", n.d., <https://github.com/woobagooba/draft-ietf-is-csmp/blob/e0be5a31906eb3e9983e7cc28b24ed9482543784/CsmpTLVsPublic.proto>.
[CSMPNMS]
"CSMP NMS Interface", n.d., <https://github.com/woobagooba/draft-ietf-is-csmp/blob/e0be5a31906eb3e9983e7cc28b24ed9482543784/CsmpNms-1.0.1.yaml>.
[OPENAPI]
"OpenAPI Initiative", n.d., <https://www.openapis.org/>.
[PB]
"Protocol Buffers", n.d., <https://developers.google.com/protocol-buffers>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC7252]
Shelby, Z., Hartke, K., and C. Bormann, "The Constrained Application Protocol (CoAP)", RFC 7252, DOI 10.17487/RFC7252, June 2014, <https://www.rfc-editor.org/info/rfc7252>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

Acknowledgments

The authors would like to express their gratitude to reviewers and early implementors, including but not limited to Chris Hett, Klaus Hueske, Hideki Tanaka, and Johannes van der Horst.

Authors' Addresses

Paul Duffy
Cisco Systems, Inc.
Jasvinder Bhasin
Cisco Systems, Inc.
Kit-Mui Leung
Cisco Systems, Inc.
Huimin She
Cisco Systems, Inc.
Li Zhou
Cisco Systems, Inc.