Internet-Draft | CoAP Simple Management Protocol | August 2022 |
Duffy (ed), et al. | Expires 18 February 2023 |
CoAP Simple Management Protocol (CSMP) provides lifecycle management for resource constrained IoT devices deployed within large-scale, bandwidth constrained IoT networks. This document describes the design and operation of CSMP.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 18 February 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Low Power Wide Area Network (LPWAN) technologies provide long range, low power connectivity for Internet of Things (IoT) applications. LPWANs typically operate over distances of several kilometers with link bandwidths as low as tens of kbps. LPWAN devices are often compute, storage, and power constrained (optimized to operate for years on a single battery charge).¶
A large LPWAN may contain millions of devices requiring a Network Management System (NMS) able to provide at-scale lifecycle management. The management protocol must be able to operate within the constrained performance envelope of an LPWAN. The management protocol must offer an efficient message encoding, be optimized for efficient and secure messaging flows across the LPWAN, and support classic NMS functions such as device on-boarding, device configuration, device status reporting, securing the network, etc.¶
This document describes the design and operation of the CoAP Simple Management Protocol (CSMP), which provides management capabilities for constrained IoT devices deployed within large scale LPWANs. Features include:¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
CSMP is a usage profile of the Constrained Application Protocol [RFC7252], which is designed for implementing RESTful messaging for resource constrained devices. It is fair to view CoAP as a binary encoded functional subset of HTTP operating over UDP which also supports multicast messaging. Resources (addressable objects) transported within CSMP message payloads are implemented using the Protocol Buffers compact binary encoding [PB].¶
It is assumed the reader is familiar with:¶
The NMS and devices communicate directly using CoAP. Acting as a CoAP client, a device issues RESTful requests to read or modify specific resources (objects) exposed by the NMS server. Likewise, the device serves requests from the NMS to manipulate resources exposed by the device.¶
CSMP specializes the usage of CoAP in the following ways:¶
CSMP defines a CSMP Device interface and a CSMP NMS interface. Each of these interfaces defines a set of resources (objects), the addresses at which these resources exist (URLs), and the methods (GET, PUT, POST, DELETE) which may be used to manipulate the resources to implement device management operations. The CSMP Device and CSMP NMS interfaces are expressed using the OpenAPI interface definition language.¶
OpenAPI's heritage is the expression of interfaces typically constructed with HTTP and JSON. This document defines a few conventions required to express a CoAP interface in OpenAPI:¶
A CSMP device MUST implement the interface specified within [CSMPDEV]. Various forms of the GET method are used for retrieving registration information, device information, and monitoring information from the devices. Various forms of the POST method are used to deliver configuration and commands to devices. Usage of this interface is detailed in the sections which follow.¶
A CSMP NMS MUST implement the interfaces specified within [CSMPNMS]. Various forms of the POST method are used for device registration, device metrics reporting, and asynchronous GET responses (resulting from a request including the "a" option and default response URL). Usage of this interface is detailed in the sections which follow.¶
CSMP devices and CSMP NMS use the CoAP GET, POST, and DELETE methods to manipulate the resources specified in [CSMPMSG], which are the Protocol Buffer object payloads contained in the various CoAP requests and responses. Usage of these payloads is detailed in the sections which follow.¶
A CSMP server is located at a <base-url> of the form coap://hostname:port/<base-path>¶
It is RECOMMENDED that a default port of 61628 be used.¶
The <base-path> for all CSMP resources at a particular hostname:port MUST be identical.¶
It is RECOMMENDED that a default <base-path> of "/." be used.¶
Because an NMS CSMP request message may be multicast to a large number of devices, all CSMP devices within a multicast domain MUST have identical port and <base-path>.¶
The following <base-path> are reserved for future use:¶
Full details of the NMS and Device service URLs are defined in [CSMPDEV] and [CSMPNMS].¶
The message payloads of CSMP requests and responses MUST be formatted as a sequence of Type-Length-Value objects. Each TLV object has the following format:¶
| Type | Length | Value |¶
The Type field is an unsigned integer identifying a specific CSMP TLV ID and MUST be encoded as a Protocol Buffers varint.¶
The Length field is an unsigned integer containing the number of octets occupied by the Value field. The Length field MUST be encoded as a Protocol Buffers varint.¶
The Value field MUST contain the Protocol Buffers encoded TLV corresponding to the indicated Type.¶
The set of objects defined by CSMP and their Type (TLV ID) are specified in [CSMPMSG].¶
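The Type-Length-Value framing above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names are invented for this example, the TLV IDs in the usage note are arbitrary placeholders (real IDs are specified in [CSMPMSG]), and the Value octets are treated as opaque bytes that some Protocol Buffers library has already produced.

```python
def encode_varint(value):
    """Encode a non-negative integer as a Protocol Buffers varint."""
    if value < 0:
        raise ValueError("Type and Length are unsigned")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more octets follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf, pos):
    """Decode a varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        octet = buf[pos]
        pos += 1
        result |= (octet & 0x7F) << shift
        if not octet & 0x80:
            return result, pos
        shift += 7


def encode_tlvs(tlvs):
    """Serialize (type, value_bytes) pairs as Type | Length | Value records."""
    out = bytearray()
    for tlv_type, value in tlvs:
        out += encode_varint(tlv_type)
        out += encode_varint(len(value))
        out += value
    return bytes(out)


def decode_tlvs(payload):
    """Split a payload into (type, value_bytes) pairs, in payload order."""
    pos = 0
    tlvs = []
    while pos < len(payload):
        tlv_type, pos = decode_varint(payload, pos)
        length, pos = decode_varint(payload, pos)
        if pos + length > len(payload):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, payload[pos:pos + length]))
        pos += length
    return tlvs
```

For example, `encode_tlvs([(1, b"ab")])` yields the four octets `01 02 61 62`, and `decode_tlvs` recovers the original pair from them.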
A single CSMP TLV MUST NOT be larger than the space available in a single CoAP request message payload, minus the space occupied by mandatory TLVs. CSMP requests containing large TLVs or many TLVs may exceed available space within a CoAP request / UDP datagram.¶
If a POST request is larger than the UDP MTU, the request MUST be split into multiple POST requests with the TLVs spread across the message bodies. The server MUST be prepared to handle the TLVs in any order.¶
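One way to split an oversized POST is a greedy packer, sketched below. The names `split_tlvs`, `max_payload`, and `mandatory` are assumptions made for this illustration; in particular, `mandatory` stands in for whatever already-encoded TLVs a deployment requires in every message, and `max_payload` for the payload space left after CoAP and UDP overhead.

```python
def split_tlvs(encoded_tlvs, max_payload, mandatory=b""):
    """Greedily pack already-encoded TLVs into POST payloads.

    Each returned payload starts with the `mandatory` octets and is at
    most `max_payload` octets long. TLVs are never split across payloads.
    """
    budget = max_payload - len(mandatory)
    payloads = []
    current = bytearray()
    for tlv in encoded_tlvs:
        if len(tlv) > budget:
            # A single TLV must fit in one message after mandatory TLVs.
            raise ValueError("single TLV exceeds available payload space")
        if current and len(current) + len(tlv) > budget:
            payloads.append(bytes(mandatory) + bytes(current))
            current = bytearray()
        current += tlv
    if current:
        payloads.append(bytes(mandatory) + bytes(current))
    return payloads
```

Because the server must handle TLVs in any order, a packer is free to reorder TLVs for a tighter fit; this sketch keeps the original order for simplicity.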
If a GET request exceeds the UDP MTU because the max length of the "q" option is exceeded, the request MUST be split into multiple GET requests, each with a subset of the query option.¶
The GET response from a server may not be able to fit all requested TLVs into the response. The server will respond with only the TLVs it is able to fit within the message body. A client SHOULD issue additional GET requests to obtain the missing TLVs.¶
Recommended network MTU will be deployment / technology dependent. For example, an MTU of 1024 is often used for large scale IEEE 802.15.4 mesh networks.¶
The NMS signs outgoing device messaging. Devices verify the signature to confirm source and integrity of incoming NMS messages. NMS-Device trust is established with an NMS certificate/public key programmed into the device at time of manufacture. Signing TLVs included in the message payload enable signature verification by a device. The Signing TLVs are:¶
If either of the Signing TLVs is missing from a message payload, the device MUST ignore the message.¶
Additional layer 2, 3, or 4 security mechanisms may be utilized to meet the requirements of specific deployment models (Wi-SUN layer 2 security, VPN at layer 3, DTLS at layer 4, etc.). Details of these additional security mechanisms are out of scope of this specification.¶
For situations in which a request payload signature adds overhead without improving security, the Signing TLVs may be elided for certain payloads. For example, the overhead of signing each block of a firmware update may be unnecessary as a full image integrity check is performed over the entire file and reported to the NMS.¶
The signature exemptible TLVs are:¶
The NMS MAY elide the Signing TLVs provided the request body contains only exemptible TLVs. Otherwise, the Signing TLVs MUST be included.¶
A device MAY accept incoming message payloads without Signing TLVs provided the payload contains only exemptible TLVs.¶
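The acceptance rule for unsigned payloads can be sketched as a simple set check. The TLV IDs below are hypothetical placeholders chosen for this example; the real Signing and exemptible TLV IDs are defined in [CSMPMSG]. The sketch models a device that opts to accept exemptible-only payloads, and it treats a payload carrying only one of the two Signing TLVs as unsigned and therefore ignored, which is an interpretation rather than something the text states explicitly.

```python
# Hypothetical TLV IDs, for illustration only (see [CSMPMSG] for real IDs).
SIGNING_TLV_IDS = frozenset({9001, 9002})
EXEMPTIBLE_TLV_IDS = frozenset({9101, 9102})


def may_process(tlv_ids):
    """Return True if the device may go on to process this payload.

    For signed payloads this only checks that both Signing TLVs are
    present; the signature itself must still be verified afterwards.
    """
    ids = set(tlv_ids)
    present = ids & SIGNING_TLV_IDS
    if present == SIGNING_TLV_IDS:
        return True   # both Signing TLVs present
    if present:
        return False  # assumption: a partial signing set is ignored
    # Unsigned payload: acceptable only if every TLV is exemptible.
    return bool(ids) and ids <= EXEMPTIBLE_TLV_IDS
```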
CSMP groups are used to support multicast messaging to devices.¶
A group is uniquely defined by a group-type/group-id pair. A device MAY be a member of multiple group-types, but MUST be a member of only one group-id within a group-type. A device MUST support membership in at least two group types.¶
The NMS assigns a device to a group using the GroupAssign TLV. On initial boot, a device has no group assignments. To be assigned to a device group, a GroupAssign TLV MUST be sent to the device either by a POST request from the NMS or within the response to the device's registration request to the NMS.¶
The NMS removes a device from a group by POST-ing a GroupEvict TLV to the device.¶
If a device's group assignment is changed at the NMS, upon receipt of the next metrics report from the device, the NMS MUST POST a new GroupAssign TLV to the device.¶
Group assignments are not additive. Assignments MUST be replaced upon receipt of a subsequent GroupAssign TLV.¶
A GroupAssign TLV MUST NOT be sent within a multicast message.¶
A GroupEvict TLV MUST NOT be sent within a multicast message.¶
Devices MUST maintain group assignments in durable storage (across power cyclings / reboots).¶
CSMP multicast messages MUST contain a GroupMatch TLV. Upon receipt of a multicast CSMP message:¶
Group type 1 is reserved for configuration.¶
Group type 2 is reserved for firmware.¶
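The group membership rules can be modeled with a mapping from group-type to group-id, as in the sketch below. The class and method names are invented for this illustration, durable storage is omitted, and the assumption that a GroupEvict identifies the group by its group-type is not taken from the text above.

```python
CONFIG_GROUP_TYPE = 1    # reserved for configuration
FIRMWARE_GROUP_TYPE = 2  # reserved for firmware


class GroupTable:
    """In-memory model of a device's group assignments.

    One group-id per group-type; a real device would persist this
    table across power cycles and reboots.
    """

    def __init__(self):
        self._groups = {}  # group-type -> group-id

    def assign(self, group_type, group_id):
        # Not additive: a new assignment replaces the stored group-id
        # for the same group-type.
        self._groups[group_type] = group_id

    def evict(self, group_type):
        # Assumption: eviction is keyed by group-type.
        self._groups.pop(group_type, None)

    def matches(self, group_type, group_id):
        """Compare a received GroupMatch pair against stored assignments."""
        return self._groups.get(group_type) == group_id
```

A device holding such a table would compare the GroupMatch TLV of each multicast message against it before acting on the rest of the payload.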
A device processes message payload TLVs in the following order:¶
This section describes the major operational flows of the CSMP protocol.¶
For understanding of CSMP device behavior, it is helpful to consider the NMS' view of device states and state transitions (presented below).¶
The NMS views a device as transitioning through the following states:¶
A device requires the <nms-base-url> of its NMS. Acquisition of the NMS URL may be accomplished via a variety of means including a DHCP option, pre-deployment administrative configuration setting, etc. The specific mechanism to be used is beyond the scope of this specification.¶
For devices using DHCPv6 address assignment, a device MAY request DHCPv6 option 26484 sub-option 1 to obtain the URL of its NMS.¶
Registration is the messaging flow via which a device announces its entry onto the network and provides a means for the NMS to push configuration information to the device.¶
A device registers with an NMS by issuing a registration request to an NMS. The NMS subsequently responds to reject or accept the registration, with device configuration included in a successful registration response. A device issues a registration request for a variety of reasons:¶
A device and NMS implement the registration messaging flow depicted in Figure 2.¶
A device MUST implement two configurable parameters that control the registration process. These parameters are initially set at manufacture time and MUST be maintained in durable storage.¶
A device MUST issue registration requests using the following algorithm.¶
Set tInterval = tIntervalMin.
Wait an initial period between 0 and tInterval seconds.
Do {
    1. Wait tBackoff seconds, where tBackoff is a random interval
       between tInterval/2 and tInterval seconds. A tBackoff value in
       the latter half of the interval ensures a minimum time between
       successive registration attempts.
    2. Send a new CoAP CON POST request message to NMS
       <nms-base-url>/r. The message payload MUST contain the
       registration TLVs described in section 3.3.1.2.
    3. Wait the remaining (tInterval - tBackoff) seconds.
    4. Set tInterval to 2 * tInterval. If tInterval is greater than
       tIntervalMax, set tInterval to tIntervalMax.
} While the device has not received an ACK to its registration POST.¶
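The timing of this retry loop can be sketched as a Python generator that yields the waits for each attempt, leaving the actual sleeping and CoAP transmission to the caller. The function names and the injectable `rng` parameter are choices made for this illustration; only tIntervalMin and tIntervalMax come from the algorithm above.

```python
import random


def initial_delay(t_interval_min, rng=random):
    """Initial wait before the first loop iteration, in seconds."""
    return rng.uniform(0, t_interval_min)


def registration_schedule(t_interval_min, t_interval_max, rng=random):
    """Yield (t_backoff, remainder, t_interval) for successive attempts.

    For each tuple the caller waits t_backoff seconds, sends the
    registration POST, then waits `remainder` seconds. The caller stops
    iterating once an ACK to a registration POST has been received.
    """
    t_interval = t_interval_min
    while True:
        # Backoff falls in the latter half of the current interval.
        t_backoff = rng.uniform(t_interval / 2, t_interval)
        yield t_backoff, t_interval - t_backoff, t_interval
        # Double the interval, capped at the configured maximum.
        t_interval = min(2 * t_interval, t_interval_max)
```

With illustrative values of 300 and 3600 seconds for the two parameters, the interval sequence is 300, 600, 1200, 2400, 3600, 3600, and so on.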
An example execution of a full registration POST retry sequence is presented in Section 8.¶
If the device receives an ACK message with CoAP response code 2.03 (Valid) from the NMS at any time before the device's next registration POST, the TLVs within the ACK message MUST be processed.¶
The following TLVs MUST be included in the device registration POST to the NMS:¶
Previously registered devices SHOULD already have (durably stored) values for the SessionID, GroupInfo, and ReportSubscribe TLVs and MUST include these TLVs in the device registration POST to the NMS:¶
The following Device Information TLVs MUST be included in the registration POST:¶
Note that the SessionID, GroupAssign, and ReportSubscribe TLV set is considered to be generic device Configuration. The Configuration TLV set is technology specific and MAY be extended with additional technology specific TLVs (beyond the scope of this specification).¶
Upon receipt of a registration request, the NMS looks up the information for the device identified by DeviceID to confirm correctness of the request. See [CSMPNMS] for details of DeviceID and SessionID validation and related error response codes.¶
If the device is found in inventory, authorized to register, and all other registration request content is confirmed to be valid, the NMS MUST send an ACK response message with response code 2.03 (Valid) to the device. The ACK takes one of two forms, depending upon whether or not the NMS will redirect the device to another NMS.¶
In the case where the NMS is not redirecting, the response body to a valid registration request contains the following TLVs:¶
In the case where the NMS is redirecting, the response body to a valid registration request contains the following TLVs:¶
When the response contains a SessionID TLV, the device MUST durably store this TLV and MUST include it in all future CSMP requests to the NMS.¶
When the response contains a GroupAssign TLV, the device MUST durably store the group-id (overwriting any other stored group-id for the same group-type) and MUST use the new GroupAssign values for comparison with all GroupMatch TLVs received thereafter.¶
When the response contains a ReportSubscribe TLV, the device MUST begin reporting the indicated metrics TLVs (ignoring any TLVs requested by the ReportSubscribe which are unknown to the device).¶
The NMS includes the NMSRedirectRequest TLV in a registration response to request the device register with an alternate NMS instance (load balancing, etc.). Upon receipt of this TLV, the device MUST cease registration attempts with the original NMS and start the registration process with the NMS indicated in the NMSRedirectRequest TLV. If registration succeeds with this new NMS, all subsequent device CSMP messaging MUST be directed to this new NMS. Note that device receipt of NMSRedirectRequest TLV is a one-time redirect and not persisted across device restarts.¶
The NMS considers device registration to be complete when all of the following conditions are met:¶
Upon receipt of a ReportSubscribe TLV, a device configures as many as two metrics reports:¶
The primary metrics report MUST be used for mains powered devices (with the secondary report disabled). To conserve power, metrics reporting for low power devices MAY be split across primary and secondary reports, with the primary report configured to provide TLVs needed at a more frequent interval and the secondary report configured for TLVs required at a more relaxed interval.¶
A device configures metrics parameters to control the device's primary metrics report as follows:¶
A device configures metrics parameters to control a device's secondary metrics report as follows:¶
The TLV content of the primary and secondary metrics reports is deployment and application specific. For example, the primary metrics report for a 6LoWPAN, mains-powered mesh node might be configured as:¶
TLV content of the secondary metrics report is similarly application specific and beyond the scope of this specification.¶
For each configured metrics report, a device MUST commence reporting immediately after receipt of a successful registration ACK by sending a NON POST to <nms-url>/c containing the following TLVs:¶
Following the initial metrics report, the device MUST implement the following algorithm for subsequent metric reports (for both primary and secondary report):¶
Send a new CoAP NON POST to <nms-url>/c with the required metrics TLVs.
Wait an initial random interval between 0 and tMetricsInterval seconds.
Do {
    1. Wait tMetricsBackoff seconds, where tMetricsBackoff is a random
       value between tMetricsInterval/2 and tMetricsInterval seconds.
    2. Send a new CoAP NON POST message to <nms-url>/c with the
       required metrics TLVs.
    3. Wait the remaining (tMetricsInterval - tMetricsBackoff) seconds
       so that the full tMetricsInterval has expired.
} While the device has a valid IP address.¶
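The reporting loop sends exactly one report in the latter half of each tMetricsInterval window. The sketch below computes the resulting send times, measured in seconds from the initial report; the function name, the `count` parameter, and the injectable `rng` are assumptions made for this illustration.

```python
import random


def metrics_send_times(t_metrics_interval, count, rng=random):
    """Return the first `count` periodic send times after the initial report."""
    # Initial random wait before the first loop iteration.
    start = rng.uniform(0, t_metrics_interval)
    times = []
    for n in range(count):
        # Iteration n begins at start + n * interval because each pass
        # (backoff plus remainder) consumes one full interval.
        backoff = rng.uniform(t_metrics_interval / 2, t_metrics_interval)
        times.append(start + n * t_metrics_interval + backoff)
    return times
```

A consequence of this scheme is that the gap between consecutive reports varies between 0.5 and 1.5 times tMetricsInterval, which spreads the reports of devices that registered at the same moment.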
CSMP defines a device firmware update process optimized for LPWANs. A key aspect of this process is the separation of image placement on a device from activation (execution) of the image on the device.¶
The device firmware update process consists of three sub-flows:¶
A device should implement the following mechanisms in support of firmware update:¶
A CSMP firmware image file consists of three main parts: a CSMP defined image header, the vendor defined image binary, and the vendor defined image signature (as depicted below).¶
Field | Size (octets) | Description |
---|---|---|
Begin Header | ||
Header Version | 4 | 32 bit unsigned integer which MUST be set to 2. |
Header Length | 4 | 32 bit unsigned integer which MUST be set to 256. |
App Rev Major | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the major revision number of the application image. |
App Rev Minor | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the minor revision number of the application image. |
App Build | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the build of the application image. |
App Length | 4 | 32 bit unsigned integer which MUST be set to the octet length of the Header + Image Binary field. |
App Name | 32 | Vendor specific 32 octet string which is set to indicate the name of the application. |
App SCC Branch | 32 | Vendor specific 32 octet string which is set to indicate the source code control system branch ID. |
App SCC Commit | 8 | Vendor specific 8 octet string which is set to indicate the source code control system commit ID. |
App SCC Flags | 4 | Vendor specific 32 bit unsigned integer which is set to indicate the source code control system build flags. |
App Build Date | 16 | Vendor specific 16 octet string which is set to indicate the build date and time of the application image. |
hwid | 32 | 32 octet field which is RECOMMENDED to be set to a concatenation of a unique manufacturer ID and product model identifier. |
sub_hwid | 32 | 32 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
kernel_rev | 16 | 16 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
sub_kernel_rev | 16 | 16 octet field which MAY be set for a manufacturer specific purpose, or functionally elided by filling with 0x20 (ASCII space character). |
Reserved | 44 | Pad to 256 octets, octets MUST be set to 0. |
End Header, Begin Image | ||
Image Binary | Variable | Vendor specific image data. |
End Image, Begin Signature | ||
Signature | Variable | Vendor specific image signature. Calculated over entire content of this structure except the signature and pad fields. |
End Signature, Begin Pad | ||
Pad | Variable | Optional pad field to enable image to fill vendor specific flash memory boundary. When present, octets MUST be set to 0xFF. |
All multi-octet fields are encoded as little-endian.¶
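The 256-octet header layout in the table maps directly onto a little-endian `struct` format string in Python. The sketch below is illustrative only: the field names in the named tuple are invented from the table's row labels, and the string fields are returned as raw bytes because their padding and character encoding are not specified above.

```python
import struct
from collections import namedtuple

# "<" selects little-endian; six 32-bit unsigned integers, then the
# fixed-width fields in table order, ending with 44 reserved octets.
HEADER = struct.Struct("<6I32s32s8sI16s32s32s16s16s44s")

FirmwareHeader = namedtuple("FirmwareHeader", [
    "header_version", "header_length", "app_rev_major", "app_rev_minor",
    "app_build", "app_length", "app_name", "app_scc_branch",
    "app_scc_commit", "app_scc_flags", "app_build_date", "hwid",
    "sub_hwid", "kernel_rev", "sub_kernel_rev", "reserved",
])


def parse_header(image):
    """Parse and sanity-check the header at the start of an image file."""
    if len(image) < HEADER.size:
        raise ValueError("image shorter than the 256-octet header")
    header = FirmwareHeader._make(HEADER.unpack_from(image))
    if header.header_version != 2:
        raise ValueError("unsupported header version")
    if header.header_length != 256:
        raise ValueError("unexpected header length")
    if any(header.reserved):
        raise ValueError("reserved octets must be 0")
    return header
```

The field sizes in the format string sum to 256 octets, matching the required Header Length value.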
An NMS implements the messaging flow depicted in Figure 3 and Figure 4 to download an image to a group of devices. Unicast distribution is also supported, with the difference being use of unicast addresses, omission of the GroupMatch TLV and omission of the 'a' query option.¶
A firmware download begins with the NMS informing the device group of the meta-data of an image the NMS wishes to download, and the devices subsequently informing the NMS of the images already loaded on the devices.¶
The NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
Devices receiving a TransferRequest message:¶
The NMS MUST proceed with image block transfer when at least one member of the target device group indicates Response Code of OK in the TransferResponse TLV. Otherwise, the transfer MUST be aborted.¶
Images are often hundreds of kilobytes in size and will likely require fragmentation into multiple blocks (N) to be transferred to a device.¶
For the transfer of each image block, the NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
To determine image download progress, the NMS MUST periodically include a DescriptionRequest TLV, requesting the FirmwareImageInfo TLV, in the image block request. It is RECOMMENDED that the NMS request the FirmwareImageInfo TLV at 10% increments of the image download. The NMS MUST request the FirmwareImageInfo TLV with the transfer of the last block of the image.¶
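As an illustration of the recommended 10% cadence, the sketch below computes the block count N for an image and the block numbers at which an NMS might attach the DescriptionRequest TLV. It assumes blocks are numbered 1 through N and that all blocks except possibly the last share one size; both the numbering and the function names are assumptions of this example.

```python
def block_count(image_size, block_size):
    """Number of blocks N needed to carry image_size octets."""
    return (image_size + block_size - 1) // block_size  # integer ceiling


def progress_request_blocks(n_blocks, step_percent=10):
    """Return 1-based block numbers at which to request FirmwareImageInfo."""
    if n_blocks < 1:
        raise ValueError("an image has at least one block")
    marks = set()
    for pct in range(step_percent, 101, step_percent):
        # First block at which at least pct percent has been sent.
        marks.add((n_blocks * pct + 99) // 100)
    marks.add(n_blocks)  # the last block always carries the request
    return sorted(marks)
```

For a 25-block image this gives blocks 3, 5, 8, 10, 13, 15, 18, 20, 23, and 25; for very small images the marks collapse so that no block is listed twice.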
Devices receiving the image block request message:¶
Receipt of the final FirmwareImageInfo TLV enables the NMS to confirm the integrity of the completely downloaded image.¶
The messaging flows for scheduling image activation across a group of devices and for cancelling a scheduled image activation are depicted in Figure 5 and described below. Unicast activation is also supported, with the difference being use of unicast addresses, omission of the GroupMatch TLV and omission of the 'a' query option.¶
An NMS implements the following message flow to command devices to designate an image as the active executable and the time at which the image is to be activated.¶
The NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
Devices receiving a LoadRequest message:¶
An NMS implements the following message flow to cancel a previously scheduled image activation.¶
The NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
Devices receiving a CancelLoadRequest message:¶
An NMS implements the following message flow to command a device to designate a stored image as the backup image.¶
The NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
Devices receiving a SetBackupRequest message:¶
Many TLVs served by a device are used by the NMS to interrogate the device for configuration state and operational status. There are, however, several TLVs which direct a device to execute internal actions. Examples of these TLVs are PingRequest and RebootRequest.¶
Usage of command TLVs is illustrated with the following PingRequest example directed at a single device.¶
The NMS MUST issue a NON POST to <device-url>/c configured as follows:¶
Devices receiving a PingRequest message:¶
The NMS will subsequently interrogate the device with one or more GET requests to the device for the PingResponse TLV.¶
Note that the specific messaging exchanges vary per the definition of each command. Details are provided within [CSMPMSG].¶
As discussed in previous sections, a CSMP NMS signs outgoing device messaging using an NMS private key. Signing TLVs included in the message payload enable signature verification by a device using an NMS signing certificate/public key, thereby providing source authentication and an integrity check of messages incoming to the device (without confidentiality).¶
Additional layer 2, 3, or 4 security mechanisms may be utilized to meet additional security requirements of specific deployment models. Examples include:¶
Specific details of the usage profile for these additional security mechanisms are highly specific to the LPWAN deployment and are thus out of scope of this specification.¶
This document requires no IANA actions.¶
This specification documents the technical details of CSMP as it has been deployed by Cisco and partners since 2012. Today, CSMP deployments manage many millions of LPWAN endpoints across a wide variety of energy utility and smart city applications.¶
As this information is time dependent, the RFC Editor is requested to remove this section before publication.¶
The authors would like to express their gratitude to reviewers and early implementors, including but not limited to Chris Hett, Klaus Hueske, Hideki Tanaka, and Johannes van der Horst.¶