Dynamic MultiPath Routing Protocol

Internet-Draft	Dynamic MultiPath Routing	November 2021
Pfeifer & Widmann	Expires 31 May 2022

Abstract

Dynamic MultiPath Routing (DMPR) is a loop-free path-vector routing protocol with built-in support for policy-based multipath routing. It has been designed from scratch to work on both low- and high-bandwidth networks, even with high packet loss. The objective was to keep routing overhead low and ensure deterministic protocol exchange behavior. DMPR can be used to manage larger networks, with characteristics based on BGPv4 and with transport and self-configuration properties taken from OSPF/OLSR. Unlike BGPv4 or OSPF, DMPR does not support higher-level network separation concepts. A DMPR network is a flat network in which all DMPR nodes have equal tasks. This also applies to DMPR communication. Unlike OLSR/OSPF, there are no flooding messages (topology broadcasts); information is stored, accumulated/compressed, and forwarded at each DMPR node. This feature contributes to the message load being deterministic.

1. Introduction

Today's mobile wireless networks place diverse requirements on their wireless links. To meet these requirements, it is possible to attach multiple network access technologies to the router and select, depending on the Class of Service (CoS) of the packet, over which wireless link the packet is sent. This is the main idea of policy-based multipath routing. The established routing protocols do not support the use of multiple access technologies on a single router. To tackle these issues, DMPR has been developed as a protocol for policy-based multipath routing in and between mobile networks, which consist of multiple wireless links with different characteristics. DMPR makes it possible to calculate multiple routing tables and maintain the best paths for multiple policies.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

1.2. Terminology

The following list describes the terminology used in this RFC:

Router, Node: A router (or a node) is a routing entity of a network. It runs an implementation of DMPR, has zero or more networks attached and at least one link to another router.
Link: A link is the direct connection between two interfaces of two distinct routers.
Link Attribute: A link attribute is a basic attribute of a link, for example the maximum bandwidth, the average packet loss or the cost of the link.
Path: A path in DMPR is one or more successive, directly connected links which define a loop-free path from one node to another.
Policy: A policy describes an ordering relation over a set of paths using the link attributes of these paths. Note that to guarantee loop-free properties, this ordering function MUST be commutative and associative. Policy examples can be found in Appendix A.
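
As a rough illustration of the Policy definition, a policy can be modeled as a function that folds the link attributes of a path into one value and then ranks paths by that value. The sketch below is illustrative only: the attribute names `bandwidth` and `loss`, the list-of-dicts path layout, and all function names are assumptions and are not defined by this document.

```python
from functools import reduce

# A path is modeled as a list of per-link attribute dicts (hypothetical layout).
Path = list[dict[str, float]]

def path_bandwidth(path: Path) -> float:
    """Bottleneck bandwidth of a path; min is commutative and associative."""
    return reduce(min, (link["bandwidth"] for link in path))

def path_delivery(path: Path) -> float:
    """End-to-end delivery probability; multiplication is commutative and associative."""
    return reduce(lambda a, b: a * b, (1.0 - link["loss"] for link in path), 1.0)

def best_path(paths: list[Path], metric) -> Path:
    """Return the path that ranks highest under the given policy metric."""
    return max(paths, key=metric)

# Two hypothetical paths to the same destination.
wide = [{"bandwidth": 100.0, "loss": 0.10}, {"bandwidth": 50.0, "loss": 0.10}]
clean = [{"bandwidth": 10.0, "loss": 0.01}]
```

With these example values, `best_path([wide, clean], path_bandwidth)` selects `wide`, while `best_path([wide, clean], path_delivery)` selects `clean`, which is the situation policy-based multipath routing is meant to exploit.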

1.3. Organization of this Document

Section 2 describes the information every node has and gets and how it should handle this information. Section 3 describes the behavior of the protocol in different scenarios and how it achieves particular features. Section 5 describes the message format in detail. Section 6 describes optional features that can be implemented to improve or extend DMPR but are not required for the basic functionality of propagating routing information.

Appendix A has a few policy examples. Appendix B lists all constants used in this RFC, some of which are configurable. Appendix C shows a few simple examples of how DMPR behaves under specific network conditions.

1.4. Overview

Every router periodically sends out unicast or multicast messages on all routing-enabled links. Each message includes the router's database, i.e. the best path to each node known by this router. A router receiving this message adds itself to all these paths, chooses the best path to each node among all those it got from its neighbors, and advertises these paths itself via unicast or multicast messages (if received as unicast, it MUST be returned as unicast). This is standard procedure in a path-vector routing protocol. In DMPR the paths are further separated by policy, so it is possible to have more than one path to each node. The message also includes all attributes of each path so that the receiving node can make an informed decision on whether this path is the best under the current policy. In the end, multiple paths exist through the network, and packets can be routed according to their requirements, for example along the path with the highest bandwidth or the one with the lowest latency.
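
The path-vector step described above can be sketched as follows. This is a simplified illustration under stated assumptions: paths are plain lists of node IDs, and a single additive `cost` stands in for a real policy evaluated over link attributes. The function name and the data layout are hypothetical.

```python
def merge_advertisement(self_id, neighbor_paths, link_cost, best):
    """Merge one neighbor's advertised paths into this node's best-path table.

    neighbor_paths: {destination: (path_as_list_of_node_ids, cost)}
    best:           same layout, holding this node's current best paths
    """
    for dest, (path, cost) in neighbor_paths.items():
        if self_id in path:
            continue  # adding ourselves again would create a loop, so drop it
        candidate = ([self_id] + path, cost + link_cost)
        if dest not in best or candidate[1] < best[dest][1]:
            best[dest] = candidate
    return best

best = {}
# Neighbor B advertises itself, C via B, and a path that already contains A.
merge_advertisement("A", {"B": (["B"], 0), "C": (["B", "C"], 5), "A": (["B", "A"], 3)}, 1, best)
# Neighbor D advertises a cheaper path to C.
merge_advertisement("A", {"C": (["D", "C"], 2)}, 1, best)
```

In a full implementation, one such table would exist per policy, and the comparison would be the policy's own ordering rather than a numeric cost.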

1.5. Distinction from other Routing Protocols

Traditionally, routing protocols find the best path using a scalar metric. This metric may be a simple constant stating the preference of the link or may be computed from several factors such as bandwidth, latency, or cost. Furthermore, this metric is only known locally or to a small subset of the network, as in BGP-4, where external paths can be marked with a local preference for internal peers. In DMPR, a policy is a globally defined function that determines the best path. For this reason, the policy has all link attributes of each path at its disposal and therefore has finer control over which path it chooses.

3. Behavior

3.1. Neighbor Detection

DMPR supports unicast and multicast neighbor detection and transport schemes. The scheme MUST be configurable at link level (e.g. eth0: multicast, eth1: unicast). Multicast provides ad-hoc capabilities without prior knowledge of neighbor nodes. The multicast detection mechanism and transport mechanism are similar to those of other ad-hoc routing protocols such as OLSR.

Unicast detection and transport lack support for ad-hoc configuration: the neighbor list must be configured a priori. The advantage of using unicast is that DMPR can also be used on networks that do not support multicast or support it insufficiently. Note: an alternative to unicast is the use of tunnels (IPIP, GRE, ...). For example, the tunnel is the preferred solution when BGAN terminals or routing foreign segments must be bridged.

DMPR is built around the concept of identifying nodes uniquely via a DMPR ID. It is therefore irrelevant whether neighbors are detected several times via different links and via unicast or multicast.

3.1.1. Multicast Neighbor Detection

Each node listens to a unicast or multicast address at each enabled DMPR link. If multicast, the multicast address SHOULD be configurable. Periodically, after a defined message interval + jitter, a node sends out its routing message, which includes:

All nodes it has a path to, including the networks they advertise.
For each policy, a path to each node it knows of.
The attributes of the links along these paths.

When a node receives a message, it deduces a connection, and therefore a path, to the sender via the interface on which the message arrived. With this mechanism the path to a node propagates through the network. Every received message SHOULD be assigned a hold timer. When this hold timer expires, the message MUST be deleted from the Message Information Base. The timeout SHOULD be a multiple of the routing message interval. Asymmetric links are handled with a feature called reflection, which is described in Section 5.4.

3.1.2. Unicast Neighbor Detection

For unicast, DMPR MUST support a list of IPv4/IPv6 addresses of DMPR neighbors at a particular link. The messages and timeout constraints are identical to those in the previous multicast section.

3.2. Detection of a Lost Neighbor

Whenever a neighbor is no longer reachable (e.g. due to a topology change), no further routing messages will be received from this neighbor. As all received messages are assigned a hold timer, no routing messages of the lost neighbor will be present in the Message Information Base after this timer expires. This means that the loss of a neighbor is detected when the last routing message received from that neighbor times out. For this reason, the routing table SHOULD be recalculated whenever the Message Information Base is purged due to timeouts.
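
The hold-timer behavior can be sketched as a small store that timestamps each received message and reports when a purge removed something. This is a minimal sketch, not the data structure defined in Section 2.1: the class name, the `(interface, neighbor_id)` key, and the `now` parameter (used to make the example deterministic) are assumptions.

```python
import time

class MessageInformationBase:
    """Keeps the latest message per (interface, neighbor) with an expiry time."""

    def __init__(self, hold_time: float):
        self.hold_time = hold_time  # should be a multiple of the message interval
        self.entries = {}           # (interface, neighbor_id) -> (message, expires_at)

    def store(self, interface, neighbor_id, message, now=None):
        """Save a received message and (re)start its hold timer."""
        now = time.monotonic() if now is None else now
        self.entries[(interface, neighbor_id)] = (message, now + self.hold_time)

    def purge(self, now=None) -> bool:
        """Delete expired messages; True means the routing tables need recalculation."""
        now = time.monotonic() if now is None else now
        expired = [key for key, (_, expires_at) in self.entries.items() if expires_at <= now]
        for key in expired:
            del self.entries[key]
        return bool(expired)

mib = MessageInformationBase(hold_time=3.0)
mib.store("eth0", "B", {"id": "B", "seq": 1}, now=0.0)
```

A caller would run `purge()` periodically and trigger the routing data calculation whenever it returns `True`.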

3.3. Interface Handling

In DMPR, all interfaces that are registered with the DMPR daemon are treated completely separately from each other. Routing messages are sent over each one individually. This ensures that all possibly accessible neighbors are reachable. The Message Information Base (see Section 2.1) is grouped by interface. Therefore nodes can detect links on each interface individually.

3.4. Message Handling

Each message received from a neighbor is saved in the Message Information Base (see Section 2.1). With this message a hold timer is set that purges the message after the given time. To reduce the message size, a node can choose to send only partial updates (i.e. differential updates containing only the changes since the last full update). To support this, a node has to retain a copy of the last full message it received so that it can apply the new partial message.

3.5. Policies

To support routing different traffic types over different routes, DMPR supports multiple policies. A policy defines how the best path to a destination is computed using the available link attributes. For each defined policy, separate routes are calculated to every reachable destination. The best path to all destinations is included in the routing message for every policy. For each policy, a separate routing table MUST be generated. In large networks, this results in multi-topology routing: when different parts of the network have different attributes (e.g. one path has a low loss rate while another has a higher bandwidth), different subsets of the topology will be used to forward packets that require different policies (Class of Service).

3.6. Link Attributes

DMPR MUST know the link attributes that are required to determine the best path for all registered policies on all links (interfaces) of the router. These attributes can either be configured statically by the network administrator or be gathered dynamically from the attached modem devices. This RFC does not specify how the link information is obtained, nor does it define which attributes (e.g. bandwidth, loss rate, latency) must be given to the protocol. The required attributes depend on the policies meant to be used.

3.7. Route Selection

The calculation of the best route MUST be defined for each policy separately. For example, simple policies only consider a single link metric (such as bandwidth or loss rate) for the calculation of the best route. Combined policies might use a combination of multiple link metrics. The route selection mechanism is part of a policy's definition and can therefore be defined individually.

3.8. Routing Data Calculation

For every reachable destination, the routing data is calculated for all defined policies using each policy's specific route selection mechanism. As a result, there MUST be a separate routing table for every defined policy. Whenever new crucial information is received, the routing data MUST be recalculated. Information that causes a recalculation of the routing data can be:

A destination is reachable via a new interface
The hold timer of a message in the Message Information Base has expired (a destination is no longer reachable via a specific interface)
Link attributes have changed

3.9. Network Retraction

Each network advertised in a message has an optional flag called "retracted". This flag is set to true when a node no longer advertises this network as available. The only node to ever set this flag to true MUST be the originally advertising node. A set retracted flag always supersedes an unset flag. Networks are forwarded with this ruleset:

Table 1
Network known as not retracted	Network known as retracted	Network set as retracted in a message	Action to take
false	false	false	Insert network in known networks and forward
false	false	true	ignore network, do not forward
false	true	false	forward network as retracted
false	true	true	forward network as retracted
true	false	false	just forward network
true	false	true	set network in known networks as retracted and forward retracted
true	true	false	illegal
true	true	true	illegal
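
Table 1 can be read as a small decision function. The sketch below encodes the eight rows directly; the function name and the returned action strings are hypothetical labels chosen for this example.

```python
def handle_network(known_active: bool, known_retracted: bool, msg_retracted: bool) -> str:
    """Return the action from Table 1 for one network entry in a received message."""
    if known_active and known_retracted:
        # Rows 7 and 8: a network cannot be known as both active and retracted.
        raise ValueError("illegal state")
    if known_retracted:
        # Rows 3 and 4: a set retracted flag supersedes an unset one.
        return "forward as retracted"
    if known_active:
        # Rows 5 and 6.
        return "mark retracted and forward as retracted" if msg_retracted else "forward"
    # Rows 1 and 2: the network is not known yet.
    return "ignore" if msg_retracted else "insert and forward"
```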

5. Message Format

5.1. Header

A DMPR packet consists of a preamble, followed by zero or more extension headers followed by zero or one payload. Each extension header and payload is defined by a type.

Table 2: Possible Types
Type	Use
0-119	Extension Header
120-127	Extension Header, reserved for private use
128-247	Payload
248-255	Payload, reserved for private use

Possible Types are defined in further detail below.

5.1.1. Preamble

The preamble of a DMPR packet is as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Magic| Reserved|    NextType   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The preamble of a packet

Magic: A 3-bit Magic: 0b010
Reserved: Reserved for future use
NextType: The type of the next header or payload.
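
The preamble layout can be encoded and decoded with a few bit operations. This sketch assumes that bit 0 in the diagram is the most significant bit of the first octet (the usual convention for such diagrams) and that the reserved bits are sent as zero; neither is stated explicitly here. The function names are hypothetical.

```python
import struct

MAGIC = 0b010  # 3-bit magic value from the preamble definition

def pack_preamble(next_type: int) -> bytes:
    """Build the 16-bit preamble: 3-bit magic, 5 reserved bits (zero), 8-bit NextType."""
    if not 0 <= next_type <= 255:
        raise ValueError("NextType must fit in 8 bits")
    return struct.pack("!BB", MAGIC << 5, next_type)

def unpack_preamble(data: bytes) -> int:
    """Check the magic bits and return NextType; reserved bits are ignored."""
    first, next_type = struct.unpack_from("!BB", data)
    if first >> 5 != MAGIC:
        raise ValueError("bad magic")
    return next_type
```

For example, `pack_preamble(128)` yields the two octets `0x40 0x80`.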

5.1.2. Extension Header

An extension header consists of the type immediately following this header, a length specifier, and the Extension Header data.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    NextType   |     Length    |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                              Data                             +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

NextType: 8-bit unsigned integer: The type of the immediately following header or payload (same as specified in the packet preamble description).
Length: 8-bit unsigned integer: The length of the Data field in units of 2 octets. This does not include the NextType and Length fields themselves.
Data: The header-specific data according to Length. The encoding and type are specified by the header itself.
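
Walking the chain of extension headers can be sketched as below. The sketch follows Table 2 literally (types below 128 are extension headers, everything else is a payload) and assumes that every packet ends in a payload type; how a packet without payload terminates the chain is not covered. The function name and return layout are hypothetical.

```python
def split_packet(packet: bytes):
    """Return (extension_headers, payload_type, payload) for one DMPR packet.

    extension_headers is a list of (type, data) tuples in wire order.
    """
    if len(packet) < 2 or packet[0] >> 5 != 0b010:
        raise ValueError("not a DMPR packet")
    next_type, pos, headers = packet[1], 2, []
    while next_type < 128:  # extension header range according to Table 2
        if pos + 2 > len(packet):
            raise ValueError("truncated extension header")
        this_type = next_type
        next_type = packet[pos]
        length = packet[pos + 1] * 2  # Length counts units of 2 octets
        data = packet[pos + 2 : pos + 2 + length]
        if len(data) != length:
            raise ValueError("truncated extension header data")
        headers.append((this_type, data))
        pos += 2 + length
    return headers, next_type, packet[pos:]

# Preamble (NextType 5), one extension header with 2 data octets, then payload type 128.
example = bytes([0x40, 5, 128, 1, 0xAA, 0xBB]) + b"{}"
```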

5.1.3. Payload

The Payload consists of the data from the end of the preamble or last extension header until the end of the packet. Payloads may be recursive, i.e. contain a valid packet (or parts of it) in themselves. Payload processors therefore MUST have the ability to feed their result back into the message processing chain. This behavior is defined by the payload itself.

5.1.3.1. Payload: keep-alive

Type: 127

This is a keep-alive packet; the payload length is zero. Implementations SHOULD reset the message hold timer for the sending node upon receiving a keep-alive packet.

5.1.3.2. Payload: uncompressed JSON

Type: 128

Uncompressed, plain, standard-compliant I-JSON data as described in [RFC7493]. This is the main routing data; its structure is defined in Section 5.2.

5.1.3.3. Payload: compressed JSON

Type: 129

LZMA-compressed standard-compliant I-JSON data as described in [RFC7493]. This is the main routing data; its structure is defined in Section 5.2.
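
Encoding and decoding the two JSON payload types might look like the following sketch. It uses Python's standard `lzma` module with its default settings, which produce an `.xz` container; the exact LZMA container format is not specified here, so that choice is an assumption, as are the function and constant names.

```python
import json
import lzma

TYPE_JSON, TYPE_JSON_LZMA = 128, 129  # payload types from Sections 5.1.3.2 and 5.1.3.3

def encode_payload(message: dict, compress: bool) -> tuple[int, bytes]:
    """Serialize a message to ASCII-only JSON and optionally compress it."""
    raw = json.dumps(message, ensure_ascii=True, separators=(",", ":")).encode("ascii")
    if compress:
        return TYPE_JSON_LZMA, lzma.compress(raw)
    return TYPE_JSON, raw

def decode_payload(payload_type: int, payload: bytes) -> dict:
    """Decode a JSON payload; receivers accept UTF-8, of which ASCII is a subset."""
    if payload_type == TYPE_JSON_LZMA:
        payload = lzma.decompress(payload)
    elif payload_type != TYPE_JSON:
        raise ValueError(f"unsupported payload type {payload_type}")
    return json.loads(payload.decode("utf-8"))
```

Compression tends to pay off only for larger routing messages, since the container adds a fixed overhead to every packet.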

5.1.3.4. Payload: Fragmentation

Type: 130

A packet greater than the MTU between two nodes SHOULD be fragmented using the fragmentation payload.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Identifier  |L|Packet offset|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                            Payload                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Fragmentation Payload Header

Identifier: Identifies possibly concurrent fragmented packets. Implementations SHOULD use an incrementing counter to practically eliminate the possibility of a collision.
L(ast): Last, set to 1 if this packet has the highest packet offset in this fragmentation collection, i.e. is the last packet.
Packet offset: 7-bit unsigned integer. Defines the index of this packet in the list of fragments resulting from the fragmentation of the original packet. The first packet has offset zero.

When a packet is larger than the MTU of the link between two nodes, it SHOULD be fragmented. For this purpose, the sending node computes the maximum effective payload size for packets sent (i.e. the MTU less the preamble, any extension headers, and the fragmentation header) and splits the original packet into parts of this size. For each of these parts, it sends a packet with the fragmentation header set to a common identifier, the corresponding packet offset, and the L(ast) bit set for the last fragment.

The receiving node keeps track of all received fragments, grouping them by source address and identifier. As soon as all fragments of a packet have been received, the reconstructed packet MUST be fed back into the message processing chain as if it were a new, just received packet. Fragments MUST be regularly purged based on a hold timer.
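
Fragmentation and reassembly can be sketched as follows. The code builds only the fragmentation payload (the 2-octet header plus data), assumes the L bit is the most significant bit of the second octet as drawn in the diagram, and leaves the hold-timer purge out for brevity. `max_payload` is expected to be computed by the caller as described above; all names are hypothetical.

```python
def fragment(packet: bytes, max_payload: int, identifier: int) -> list[bytes]:
    """Split a packet into fragmentation payloads: Identifier, L bit + offset, data."""
    chunks = [packet[i : i + max_payload] for i in range(0, len(packet), max_payload)]
    if len(chunks) > 128:
        raise ValueError("packet offset is only 7 bits wide")
    fragments = []
    for index, chunk in enumerate(chunks):
        last = 0x80 if index == len(chunks) - 1 else 0
        fragments.append(bytes([identifier & 0xFF, last | index]) + chunk)
    return fragments

class Reassembler:
    """Groups fragments by (source address, identifier) until a packet is complete."""

    def __init__(self):
        self.pending = {}  # (source, identifier) -> {"parts": {offset: data}, "count": int or None}

    def add(self, source, payload: bytes):
        """Store one fragment; return the reassembled packet once complete, else None."""
        identifier, flags = payload[0], payload[1]
        offset, is_last = flags & 0x7F, bool(flags & 0x80)
        key = (source, identifier)
        entry = self.pending.setdefault(key, {"parts": {}, "count": None})
        entry["parts"][offset] = payload[2:]
        if is_last:
            entry["count"] = offset + 1
        count = entry["count"]
        if count is not None and all(i in entry["parts"] for i in range(count)):
            del self.pending[key]
            return b"".join(entry["parts"][i] for i in range(count))
        return None
```

The returned packet would then be fed back into the normal message processing chain, as required above.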

5.2. JSON Payload

A JSON payload is a 7-bit ASCII encoded JSON object. A sending node SHOULD use ASCII encoding for the JSON data. A receiving node MUST be able to decode UTF-8 encoded data. The payload can be compressed (see Section 5.1.3.3). A compressed payload has to be announced in the header.

Augmented requirements language for this section:

REQUIRED: This field is required, the sending node MUST include it.
REQUIRED if not empty: If the field would be empty, it can be omitted, otherwise it is REQUIRED.
OPTIONAL: This field can be inserted to activate specific features or use other functionality. A sending node can choose to omit it and a receiving node MUST be able to work without this field.

General Message Structure

{
  "id": <NODE_ID>,
  "seq": <SEQUENCE_NUMBER>,
  "type": <TYPE>,
  "partial-base": <SEQUENCE_NUMBER>,
  "addr-v4": <IPv4_ADDRESS>,
  "addr-v6": <IPv6_ADDRESS>,
  "networks": {
    <IPvX_NETWORK>: {},
    <IPvX_NETWORK>: {
      "retracted": true
    }
  },
  "routing-data": {
    <POLICY>: {
      <NODE_ID>: {
        "path": <PATH>
      }
    }
  },
  "node-data": {
    <NODE_ID>: {
      "networks": {
        <IPvX_NETWORK>: {},
        <IPvX_NETWORK>: {
          "retracted": true
        }
      }
    }
  },
  "link-attributes": {
    <LINK_ATTRIBUTE_ID>: {
      <LINK_ATTRIBUTE>: <METRIC>
    }
  },
  "request-full": union(true, [<NODE_ID>, ...]),
  "reflect": {
    <REFLECT_DATA>: <DATA>
  },
  "reflected": {
    <NODE_ID>: {
      <REFLECT_DATA>: <DATA>
    }
  }
}

Key and value description:

id

string: The sending node's id, NODE_ID: MUST NOT contain any of the brackets: ()[]{}<>

seq

integer: The message sequence number, strictly monotonically increasing

type

string: The type of the message, specified in further detail below (Section 5.2, Paragraph 8).

partial-base

integer: The base message of a partial update, the message then only includes the difference between the actual data and the base message

addr-v4

string: The IPv4 address of the sending node over the link this packet has been sent.

addr-v6

string: The IPv6 address of the sending node over the link this packet has been sent.

networks

object: The networks advertised by this node. The keys are valid IPv4/IPv6 network identifications with subnet prefix. If the value of a network key is an object itself and the "retracted" key of this object is set to true, the network MUST be handled as retracted. See Section 3.9.

routing-data

object: A path to each reachable node for each policy. POLICY is the name of a policy defined in the sending node. If the receiving node does not understand this policy, the entry MUST be ignored. PATH: a path to a node described according to this syntax:

path = node-id [">[" link-attribute-id "]>" path]
node-id = *ALPHA
link-attribute-id = *DIGIT

node-data

object: a list of networks for each reachable node defined in "routing-data". "networks" is handled like "networks" defined above.

link-attributes

object: the set of link-attributes used in the paths of routing-data. Each key SHOULD be an integer and MUST NOT contain any of the brackets ()[]{}<>. The value of an entry is itself an object containing LINK_ATTRIBUTE: METRIC pairs, where LINK_ATTRIBUTE is the name of a link attribute and METRIC is its value as defined in Section 1.2.

request-full

array or true: A list of NODE_IDs from which the sending node requests a full update message. If true, the node requests a full update from all neighbors.

reflect

object: arbitrary data the sending node wants to have included in the "reflected" object in the next message of the receiver

reflected

object: a set of reflected data; contains, for each neighboring node, the data that node requested to reflect.
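
The PATH syntax given under "routing-data" can be handled with a small parser. The sketch assumes that a path is a sequence of node IDs joined by `>[link-attribute-id]>` separators, for example the hypothetical string `A>[1]>B>[2]>C`; this reading of the grammar, the example IDs, and the function names are assumptions.

```python
import re

_HOP = re.compile(r">\[(\d+)\]>")  # separator between two node IDs
_BRACKETS = set("()[]{}<>")        # characters a NODE_ID must not contain

def parse_path(path: str):
    """Split 'A>[1]>B>[2]>C' into (['A', 'B', 'C'], ['1', '2'])."""
    parts = _HOP.split(path)  # alternates node IDs and captured attribute IDs
    nodes, attribute_ids = parts[0::2], parts[1::2]
    if any(not node or set(node) & _BRACKETS for node in nodes):
        raise ValueError(f"malformed path: {path!r}")
    return nodes, attribute_ids

def prepend_hop(self_id: str, attribute_id: str, path: str) -> str:
    """Add this node in front of a received path before re-advertising it."""
    return f"{self_id}>[{attribute_id}]>{path}"

def has_loop(path: str) -> bool:
    """A path is loop-free only if no node ID occurs twice."""
    nodes, _ = parse_path(path)
    return len(set(nodes)) != len(nodes)
```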

Each message has a type. This RFC describes two types, namely full and partial, which are described in further detail here.

5.2.1. Full Update

A full update SHOULD replace all data from the sending node in the receiver's Message Information Base. It MUST NOT require any previous knowledge of the sender by the receiver. The following keys are specified:

addr-v4: string, REQUIRED: The IPv4 address of the sending node over the link this packet has been sent.
addr-v6: string, REQUIRED: The IPv6 address of the sending node over the link this packet has been sent.
Note: Only one of addr-v4 and addr-v6 is required.
networks: object, REQUIRED if not empty: The networks advertised by this node. See above.
routing-data: object, REQUIRED if not empty: The paths advertised by the sender, grouped by POLICY and target NODE_ID. The syntax of a path is described above. A path MUST include the sender of this message, all hops with their link-attribute IDs and the target node.
node-data: object, OPTIONAL: The networks from other nodes known to the sender including their retraction status.
link-attributes: object, REQUIRED if not empty: The link-attributes used in "routing-data". Each entry MUST contain all attributes known to the sender, even if they are not needed by a policy.
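
A concrete full update could look like the message below. All values are made up for illustration: the node IDs, the policy name `highest-bandwidth`, the attribute names, the documentation address ranges, the assumed path form `A>[1]>B`, and the literal `"full"` for the type (the text names the two types but does not spell out their string values). The helper checks that every link-attribute ID used in a path is present in "link-attributes".

```python
import json
import re

full_update = {
    "id": "A",
    "seq": 42,
    "type": "full",  # assumed literal for the full update type
    "addr-v4": "192.0.2.1",
    "networks": {"198.51.100.0/24": {}, "203.0.113.0/24": {"retracted": True}},
    "routing-data": {
        "highest-bandwidth": {
            "B": {"path": "A>[1]>B"},
            "C": {"path": "A>[1]>B>[2]>C"},
        },
    },
    "node-data": {"B": {"networks": {"2001:db8:1::/48": {}}}},
    "link-attributes": {
        "1": {"bandwidth": 54000, "loss": 2},
        "2": {"bandwidth": 1000, "loss": 0},
    },
}

def missing_link_attributes(message: dict) -> set:
    """Return link-attribute IDs referenced in paths but absent from the message."""
    used = set()
    for targets in message.get("routing-data", {}).values():
        for entry in targets.values():
            used.update(re.findall(r">\[(\d+)\]>", entry["path"]))
    return used - set(message.get("link-attributes", {}))

# ASCII-only serialization, as recommended for senders in Section 5.2.
wire = json.dumps(full_update, ensure_ascii=True).encode("ascii")
```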

5.2.2. Partial Update

A partial update only replaces the changed data in the receiver's Message Information Base. It therefore has the additional field "partial-base", which holds the sequence number of the base message to which the changes apply. The base message MUST be a full update message. Note: A partial update only describes changes to a previous full update, never to a previous partial update. If the receiving node is unable to apply the partial update, e.g. because it lacks the base message, then this node SHOULD use the "request-full" procedure to request a new full update (see Section 5.3).

The following keys are specified:

addr-v4: string, OPTIONAL: The IPv4 address of the sending node over the link this packet has been sent, only included if it has changed. If it became invalid, the value is null.
addr-v6: string, OPTIONAL: The IPv6 address of the sending node over the link this packet has been sent, only included if it has changed. If it became invalid, the value is null.
Note: A partial update MUST NOT produce an invalid configuration by deleting the only address available for a node.
networks: object, OPTIONAL: The networks advertised by the sender. The entries replace the base message entries on a per-network basis. If an entry has been deleted, the value for the specific network is null.
routing-data: object, OPTIONAL: The paths advertised by the sender, grouped by POLICY and target NODE_ID. The entries replace the base message entries on a per NODE_ID basis. If an entry has been deleted, the value for the specific NODE_ID is null.
node-data: object, OPTIONAL: The networks from other nodes known to the sender. The entries replace the base message entries on a per NODE_ID basis. If an entry has been deleted, the value for the specific NODE_ID is null.
link-attributes: object, REQUIRED if not empty: The link-attributes used in "routing-data". Note that link-attributes are only valid on a per-message basis and MUST NOT replace link-attribute entries in the base message.
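
Applying a partial update to its base can be sketched as an overlay in which null values delete entries. This is a simplified sketch: it assumes the type literal `"full"`, and it deliberately leaves "link-attributes" untouched, so a real implementation would still have to keep each message's link-attributes together with the paths that arrived with them. All function names are hypothetical.

```python
import copy

def _overlay(target: dict, changes: dict) -> None:
    """Replace entries key by key; a null (None) value deletes the entry."""
    for name, value in changes.items():
        if value is None:
            target.pop(name, None)
        else:
            target[name] = value

def apply_partial(base: dict, partial: dict) -> dict:
    """Return the effective message: the base full update overlaid with the partial one."""
    if base.get("type") != "full" or partial.get("partial-base") != base.get("seq"):
        # The base is missing or not a full update: the caller should send "request-full".
        raise LookupError("cannot apply partial update")
    result = copy.deepcopy(base)  # the stored base must stay unchanged
    result["seq"] = partial["seq"]
    _overlay(result, {k: partial[k] for k in ("addr-v4", "addr-v6") if k in partial})
    for key in ("networks", "node-data"):
        _overlay(result.setdefault(key, {}), partial.get(key, {}))
    for policy, entries in partial.get("routing-data", {}).items():
        _overlay(result.setdefault("routing-data", {}).setdefault(policy, {}), entries)
    # "link-attributes" of the partial message are per-message and never replace the base entries.
    return result
```

Because every partial update refers to a full update, the receiver keeps the unmodified base and recomputes the effective state for each new partial message instead of chaining partial updates.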

5.3. Requests

When a node is not able to apply a partial update or has just joined a network, it SHOULD send out a request for a full update using the request-full key in the message. This key may be an array containing the NODE_IDs from which a full message is needed or may be the Boolean value true, to indicate that a full message from every neighbor is required. When a node receives a message whose request-full key either has the value true or contains its own ID in the array, it MUST do one of the following. Variant 1: schedule the next message it sends to be a full message. Variant 2: send a full update immediately and reset its message interval timer, except when the last message already was an out-of-band full message, in which case it MUST/SHOULD schedule the next message according to the message interval timer to be a full message.
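
The check a receiving node performs on the request-full key is small enough to show in full. The function name is hypothetical.

```python
def wants_full_update_from(self_id: str, message: dict) -> bool:
    """True if the sender of this message requests a full update from this node."""
    request = message.get("request-full")
    if request is True:
        return True  # a full update is requested from every neighbor
    return isinstance(request, list) and self_id in request
```

Whether a positive result leads to an immediate full update or only marks the next scheduled message as full depends on which of the two variants the implementation follows.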

5.4. Reflections

Reflections are an extensible mechanism that allows a node to exchange data with neighboring nodes, with 2-hop neighbors, and with itself. When a node includes arbitrary JSON data in the reflect key in its message, each node receiving this message MUST send this data in the reflected key of its messages under the corresponding NODE_ID. A node MUST support reflecting all requests but is not required to actually parse that data. Because the messages are sent to all neighbors, every 2-hop neighbor becomes aware of the reflected data of a node. This fact is not used in this RFC but may be used in extensions of this protocol.
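
Building the "reflected" object is a direct copy of what each neighbor sent under "reflect". The second helper shows one plausible use, assumed here rather than specified: if a neighbor's "reflected" object contains our own NODE_ID, that neighbor has received at least one of our messages, which hints at a bidirectional link. Both function names and the input layout are hypothetical.

```python
def build_reflected(latest_by_neighbor: dict) -> dict:
    """Collect each neighbor's 'reflect' object for our next outgoing message.

    latest_by_neighbor maps a neighbor NODE_ID to the latest message received from it.
    """
    return {
        node_id: message["reflect"]
        for node_id, message in latest_by_neighbor.items()
        if "reflect" in message
    }

def neighbor_hears_us(self_id: str, neighbor_message: dict) -> bool:
    """Assumed asymmetry check: the neighbor can echo our data only if it received it."""
    return self_id in neighbor_message.get("reflected", {})
```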