Internet-Draft | BP EID-Pattern | January 2023 |
Sipos | Expires 27 July 2023 | [Page] |
This document extends the Endpoint ID (EID) concept into an EID Pattern, which is used to categorize any EID as matching a specific pattern or not. EID Patterns are suitable for expressing agent configuration, for being used on-the-wire by DTN protocols, and for being easily understandable by a layperson. EID Patterns include scheme-specific optimizations for expressing set membership, and each scheme pattern includes text and CBOR encoding forms; the pattern for the "ipn" EID scheme is designed to be highly compressible in its CBOR form. This document also defines a Public Key Infrastructure Using X.509 (PKIX) Other Name form to contain an EID Pattern and a handling rule to use a pattern to match an EID.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 27 July 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Bundle Protocol (BP) Version 7 specification [RFC9171] defines text and CBOR encoding forms of an Endpoint ID (EID) which is used as both a source and a destination for individual bundles. BP Agent implementations have necessarily used methods of defining patterns for matching multiple EIDs in order to configure routing, forwarding, and delivery of bundles, but these have not yet been standardized and do not have a concise form suitable for on-the-wire messaging.¶
In much the same way that the Classless Inter-domain Routing (CIDR) mechanism of [RFC4632] can be used to aggregate a contiguous and bit-aligned block of IP addresses in a concise unit (encoded as text or otherwise), this concept of EID Pattern is used to aggregate a set of EIDs into a single concise unit. This is especially valuable because an EID includes both an identifier of the node sending or receiving the bundle and an identifier for the specific service which generated or will process the bundle. An EID Pattern can aggregate EIDs based on node identifier, service identifier, or both.¶
A purely text-based pattern mechanism such as [W3C-PAT] could handle the general case of matching the text form of EIDs (as URIs) but would not be able to achieve the same level of encoding compression and would not be able to express exact numeric ranges like the scheme-specific mechanism defined in this document.¶
The certificate profile and NODE-ID definition of [RFC9174] uses the text form of EID to authenticate nodes based on EID. This document defines a Public Key Infrastructure Using X.509 (PKIX) Other Name Form to contain an EID Pattern and a handling rule to use a pattern to match an EID. This allows authenticating an individual EID based on an EID Pattern in much the same way as using a "wildcard" certificate (Section 6.4.3 of [RFC6125]) to match a DNS name.¶
One other aspect of this patterning mechanism is that the text form of each scheme-specific pattern is intended to be, in a subjective sense, natural and understandable for the case of a human manually typing patterns into a text document or quick email message; the interpretation of the text pattern should "make sense" with minimal training.¶
This document defines a logical model of pattern matching BP Endpoint IDs and both text and CBOR encoding forms, as well as a PKIX extension to make use of an EID Pattern.¶
This document does not define a method of disambiguating an EID from an EID Pattern (in either encoded form) without any other context. Given a pure text or CBOR encoding of an arbitrary value, there must be some external context to determine how to interpret it.¶
Although the same EID definitions apply to BP Version 6 [RFC5050] this document does not provide any mechanisms of integrating with that protocol. It is an implementation matter for a BP Agent to use EID Patterns with BP Version 6 bundles and their compressed bundle header encoding (CBHE).¶
This document defines text structure using the Augmented Backus-Naur Form (ABNF) of [RFC5234]. The entire ABNF structure can be extracted from the XML version of this document using the XPath expression:¶
'//sourcecode[@type="abnf"]'¶
The following initial fragment defines the top-level rules of this document's ABNF.¶
eid-pattern = ipn-pattern / dtn-pattern

; Shared wildcard rules
wildcard = "*"
multi-wildcard = "**"¶
From the document [RFC3986] the definition is taken for pchar. From the document [RFC5234] the definition is taken for digit. From the document [RFC9171] the definition is taken for nbr-delim.¶
This document defines CBOR structure using the Concise Data Definition Language (CDDL) of [RFC8610]. The entire CDDL structure can be extracted from the XML version of this document using the XPath expression:¶
'//sourcecode[@type="cddl"]'¶
The following initial fragment defines the top-level symbols of this document's CDDL, which includes the example CBOR content.¶
start = eid-pattern

eid-pattern = $eid-pattern .within eid-structure¶
From the document [RFC9171] the definition is taken for eid-structure.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document does not define a universal form of EID Pattern, though text forms of EID Patterns do share concepts and rules for wildcard matching. Instead, in order to achieve efficiencies in non-text encoding, each EID scheme uses a different form of complex pattern matching.¶
The text form of an EID Pattern is not a URI and is not bound by the character set restrictions imposed in [RFC3986]. This is much the same as how a URI template [RFC6570] is not itself a URI. Although some forms of EID Pattern can contain reserved URI characters, it is not guaranteed that any particular EID Pattern will be intrinsically differentiable from an EID. See Section 4 for details on handling concerns.¶
For the pattern forms defined in this document, the exact-match pattern's text form is identical to its matching EID. This behavior is not required or necessary but is a convenient side effect of the text definitions and makes the EID Pattern a proper superset of EID. The IPN pattern has an exact-match CBOR form which is identical to its matching EID, while the DTN pattern CBOR form is always a component pattern array.¶
As defined in Section 4.2.5.1.1 of [RFC9171], DTN scheme EIDs have an authority (node name) part and a sequence of path (service demux) segment components. Combining these components together, the whole EID SSP is treated as a sequence of these unstructured text components. Because of the lack of more specific structure, outside of match-all wildcards only a generic pattern matching mechanism like a regular expression can be used.¶
The conceptual model of the DTN pattern is that the node name and the sequence of path segments can be matched as one of:¶
A DTN pattern SHALL contain at least two components: the first for the node name and the others for the service demux. A DTN pattern SHALL contain no more than one multi-segment wildcard component. If present, a DTN pattern SHALL only contain a multi-segment wildcard in its last (demux path segment) component.¶
When matching a DTN pattern, any query or fragment parts of an EID SHALL be ignored and not treated as comparison components. A DTN pattern SHALL be considered to match a specific EID when both have the same scheme, the pattern has the same number of components as the EID, and each component of the pattern matches the corresponding component of the EID SSP. If the number of components differs or if any component doesn't match, the whole pattern does not match. Each pattern component SHALL be considered to match according to the following rules:¶
Because these are dealing with text values in an information model, the matching occurs in the percent-encoding normalized or percent-decoded domain (i.e. it's not a pattern for the encoded URI, the matching is performed within the information model of the SSP).¶
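The component-wise comparison in the percent-decoded domain can be sketched as follows. This is only an illustration under stated assumptions: the component representation (`SINGLE`, `MULTI`, compiled regular expressions) is hypothetical, a trailing multi-segment wildcard is assumed to absorb all remaining EID components, and regular expressions are assumed to use unanchored search semantics. A multi-segment wildcard in any position other than the last is not handled.¶

```python
import re
from urllib.parse import unquote

# Hypothetical in-memory component kinds (not names from this document):
#   str            -> specific value (already percent-decoded)
#   re.Pattern     -> regular expression component
#   SINGLE / MULTI -> single-segment and multi-segment wildcards
SINGLE = object()
MULTI = object()


def eid_components(eid_text):
    """Split a DTN EID text form into percent-decoded components."""
    if not eid_text.startswith("dtn://"):
        raise ValueError("not a DTN scheme EID with an authority")
    ssp = eid_text[len("dtn://"):]
    # Query and fragment parts are ignored for matching purposes.
    ssp = re.split(r"[?#]", ssp, maxsplit=1)[0]
    return [unquote(part) for part in ssp.split("/")]


def dtn_match(pattern, eid_text):
    """Return True if every pattern component matches the EID component."""
    comps = eid_components(eid_text)
    if pattern and pattern[-1] is MULTI:
        # Assumption: a trailing multi-segment wildcard absorbs all
        # remaining components, so only the leading part is compared.
        head = pattern[:-1]
        if len(comps) < len(head):
            return False
        comps = comps[:len(head)]
    else:
        head = pattern
        if len(comps) != len(head):
            return False
    for pat, value in zip(head, comps):
        if pat is SINGLE:
            continue
        if isinstance(pat, re.Pattern):
            # Assumption: unanchored search semantics.
            if pat.search(value) is None:
                return False
        elif pat != value:
            return False
    return True
```

For example, `dtn_match(["node", "service"], "dtn://node/ser%76ice")` compares the decoded text "service" on both sides, so the percent-encoded EID still matches.¶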
Because of the arbitrarily complex nesting rules allowed by regular expressions, and the multiple techniques available for different expressions to match the same subsets of text, DTN pattern sets can only be consistently computed when the node-name or demux path segments are either exact-text matches or one of the match-all wildcards.¶
Users of the DTN pattern SHALL have a mechanism to perform set logic with specific value and wildcard components. EID Pattern processors MAY, but cannot be assumed to, have a mechanism to perform set logic on regular expression components.¶
The text form of the DTN pattern conforms to the ABNF in Figure 1. The authority begins with the same string "//" and authority and demux components are separated by the same character "/" as in the DTN URI scheme.¶
This pattern uses reserved URI characters of "[" and "]" (see Section 2.2 of [RFC3986]) to indicate the presence of a regular expression for a component. This allows completely disambiguating a DTN pattern from a specific DTN EID when a regular expression or wildcard is present. Because neither of those is required to be present in a DTN pattern and the asterisk "*" is a valid path segment character, the considerations of Section 4 always apply when decoding text as an EID Pattern versus an EID.¶
A concrete use of this text form is illustrated in this example:¶
dtn://node/[%5Eanchored]/other%20part/**
  <-- P --> <--- P --->  <--- P ---->¶
Where the "P" sections are percent-encoded (with no reserved characters) and square brackets unambiguously delimit the expression component. The actual components in this example are the specific value "node", the regular expression "^anchored", and the specific value "other part" and all are UTF-8 and percent-encoded. Further examples are given in Appendix B.1.¶
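Splitting such a text pattern into its components might look like the sketch below. The function name and the `(kind, value)` tuple representation are hypothetical, and the sketch assumes that any "/" inside a regular expression is percent-encoded so that splitting on "/" is safe; the normative syntax is the ABNF of Figure 1.¶

```python
from urllib.parse import unquote

SINGLE = "*"    # single-segment wildcard text
MULTI = "**"    # multi-segment wildcard text


def parse_dtn_pattern(text):
    """Split a DTN pattern text form into a list of (kind, value) pairs."""
    prefix = "dtn://"
    if not text.startswith(prefix):
        raise ValueError("expected a pattern starting with 'dtn://'")
    components = []
    for part in text[len(prefix):].split("/"):
        if part == MULTI:
            components.append(("multi-wildcard", None))
        elif part == SINGLE:
            components.append(("wildcard", None))
        elif part.startswith("[") and part.endswith("]"):
            # Square brackets delimit a regular expression whose body
            # is percent-encoded.
            components.append(("regexp", unquote(part[1:-1])))
        else:
            components.append(("exact", unquote(part)))
    return components
```

Applied to the example above, this yields the specific value "node", the regular expression "^anchored", the specific value "other part", and a trailing multi-segment wildcard.¶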
The CBOR form of the DTN pattern conforms to the CDDL in Figure 2. Just as in the DTN URI scheme the pattern scheme identifier is 1, the first component of the SSP identifies the node and the last components identify the service path segments. The well-known SSP SHALL be encoded using the same uint value specified for the DTN URI scheme.¶
Each of the DTN pattern components SHALL be CBOR encoded as follows:¶
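A specific value component is encoded as the dtn-exact symbol.¶
A regular expression component is encoded as the regexp symbol.¶
A single-segment wildcard component is encoded as the true item.¶
A multi-segment wildcard component is encoded as the false item.¶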
The wildcard sentinel values have no intrinsic meaning and were simply chosen to be one-octet-encoded special items. The CBOR form of the DTN pattern is not as compressible as the IPN pattern, but the exact text is not percent encoded and the regular expression tag "regexp" does save one octet per instance.¶
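To make the octet-level encoding concrete, the following is a minimal hand-written encoder for just the CBOR subset used by the pattern examples (unsigned integers, text strings, arrays, tags, and the true/false simple values). The `Tag` class and `encode` function are hypothetical helpers for this sketch, not part of any CBOR library, and tag number 35 is taken from the regular expression example in Appendix B.1.¶

```python
import struct


class Tag:
    """A CBOR tagged item (hypothetical helper for this sketch)."""
    def __init__(self, number, value):
        self.number = number
        self.value = value


def _head(major, arg):
    """Encode a CBOR initial byte plus argument for a major type."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    if arg < 0x100:
        return bytes([(major << 5) | 24, arg])
    if arg < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", arg)
    if arg < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", arg)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", arg)


def encode(item):
    """Encode the small subset of CBOR needed for EID Pattern examples."""
    # Booleans are checked first because bool is a subclass of int.
    if item is True:
        return b"\xf5"
    if item is False:
        return b"\xf4"
    if isinstance(item, int):
        return _head(0, item)  # unsigned integers only
    if isinstance(item, str):
        raw = item.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(item, list):
        return _head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, Tag):
        return _head(6, item.number) + encode(item.value)
    raise TypeError(f"unsupported item: {item!r}")
```

With this sketch, `encode([1, ["node", True]])` produces nine octets, of which the wildcard contributes exactly one.¶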
As defined in Section 4.2.5.1.2 of [RFC9171] and updated in [I-D.ietf-dtn-ipn-update], IPN scheme EIDs have an SSP which is divided into a bounded number of integer numeric components. Because of this, the pattern for IPN scheme EIDs is based on matching a numeric value or range for each component.¶
The conceptual model of the IPN pattern is that each of the components of the SSP can be matched as one of:¶
An IPN pattern SHALL contain between two and four components, inclusive, corresponding to the IPN scheme EID components.¶
Within a single component of the IPN pattern, the range intervals SHALL be disjoint and non-contiguous. Any overlapping or contiguous intervals within a set can be coalesced into a single covering interval with the same meaning. The text form of a range can, but SHOULD NOT, contain overlapping or contiguous intervals. The CBOR form of a range does not allow overlapping intervals because of its compressed form, but does allow contiguous intervals. The decoder for any form of an IPN pattern SHALL normalize all interval sets to satisfy information model requirements. The decoder for any form of an IPN pattern SHOULD treat a failure to decode any part of a pattern as a failure to decode the whole pattern.¶
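One way a decoder could perform this normalization is a sort followed by a single merging pass, sketched below. The representation of an interval as an inclusive `(first, last)` tuple and the function name `normalize` are assumptions made for illustration only.¶

```python
def normalize(intervals):
    """Coalesce inclusive (first, last) intervals into canonical form."""
    merged = []
    for first, last in sorted(intervals):
        if first > last:
            raise ValueError(f"invalid interval {first}-{last}")
        if merged and first <= merged[-1][1] + 1:
            # Overlapping or contiguous: extend the previous interval.
            prev_first, prev_last = merged[-1]
            merged[-1] = (prev_first, max(prev_last, last))
        else:
            merged.append((first, last))
    return merged
```

The `+ 1` in the comparison is what merges contiguous intervals such as 0-9 and 10-19 in addition to overlapping ones.¶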
A limitation of this mechanism is that there is no intermediate component pattern between a specific set of finite intervals and the match-all (unbounded) wildcard. There is no capability of including a non-finite bound within any interval.¶
An IPN pattern SHALL be considered to match a specific EID when both have the same scheme, the pattern has the same number of components as the EID, and each component of the pattern matches the corresponding component of the EID SSP. If the number of components differs or if any component doesn't match, the whole pattern does not match. Each pattern component SHALL be considered to match according to the following rules:¶
Because these are dealing with numeric values in an information model, the matching occurs after any encoding-specific normalization (i.e. it's not a text pattern for the text encoding, the matching is performed within the information model of the SSP).¶
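A minimal sketch of this numeric matching follows, assuming a hypothetical in-memory model in which a pattern component is either the wildcard marker, a single integer, or a list of inclusive `(first, last)` intervals. The EID is given as its already-decoded list of numeric components.¶

```python
WILDCARD = True  # mirrors the CBOR "true" wildcard item


def component_matches(pattern_comp, value):
    """Match one numeric EID component against one pattern component."""
    # Identity check first, because True == 1 in Python.
    if pattern_comp is WILDCARD:
        return True
    if isinstance(pattern_comp, int):
        return pattern_comp == value
    # Otherwise a range set: a list of inclusive (first, last) intervals.
    return any(first <= value <= last for first, last in pattern_comp)


def ipn_match(pattern, eid):
    """Match a whole pattern; a differing component count never matches."""
    if len(pattern) != len(eid):
        return False
    return all(component_matches(p, v) for p, v in zip(pattern, eid))
```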
One benefit of using an EID pattern with an information model of a sequence of numbers or ranges is that performing set logic such as intersection or containment is straightforward. For set logical behavior, the specific value case is treated as a singleton set and the wildcard case is treated as the unbounded-interval.¶
Two IPN patterns intersect if all of their corresponding components intersect, and the intersection of each component range can be readily computed using multi-interval set logic. Likewise, one IPN pattern is a subset (or proper subset) of another pattern if each of its components is a subset (or proper subset) of the other's corresponding component.¶
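The set logic can be illustrated with a two-pointer sweep over sorted interval lists. In this sketch every component is assumed to be already normalized into a sorted list of disjoint, non-contiguous inclusive intervals, with a specific value held as a singleton interval and the wildcard as the unbounded interval; `EVERYTHING` and the function names are hypothetical.¶

```python
import math

# Hypothetical model: each component is a sorted list of disjoint
# inclusive (first, last) intervals; the wildcard is (0, math.inf).
EVERYTHING = [(0, math.inf)]


def intersect(a, b):
    """Intersect two normalized interval sets with a two-pointer sweep."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def is_subset(a, b):
    """True if every value in normalized set a is also in normalized set b."""
    return intersect(a, b) == a


def patterns_intersect(p, q):
    """Patterns intersect only if every pair of components intersects."""
    return len(p) == len(q) and all(intersect(x, y) for x, y in zip(p, q))


def pattern_is_subset(p, q):
    """Pattern p is a subset of q if each component of p is a subset."""
    return len(p) == len(q) and all(is_subset(x, y) for x, y in zip(p, q))
```

The subset test relies on both inputs being normalized; with contiguous intervals left unmerged, the equality comparison could report a false negative.¶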
The text form of the IPN pattern conforms to the ABNF in Figure 3. Each component is separated by the same character "." as in the IPN URI scheme. This pattern uses reserved URI characters of "[" and "]" (see Section 2.2 of [RFC3986]) to indicate the presence of a range set for a component, the character "," to separate each range, and the character "-" to indicate the inclusive range within the set. Each of the numeric values within the range is inclusive. If the range does not contain two values it is a length-one range.¶
The canonical text form of an IPN pattern SHALL order all range sets in ascending numeric order.¶
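A rough sketch of parsing and re-emitting this text form is given below. It follows the prose description rather than the ABNF of Figure 3, so details such as how a length-one range is written inside brackets are assumptions, as are all function names. The formatter only sorts the intervals; coalescing overlapping or contiguous intervals is a separate normalization step.¶

```python
import re

WILDCARD = "*"


def parse_component(text):
    """Parse one component: wildcard, single number, or range set."""
    if text == WILDCARD:
        return WILDCARD
    if re.fullmatch(r"[0-9]+", text):
        return int(text)
    m = re.fullmatch(r"\[(.+)\]", text)
    if not m:
        raise ValueError(f"bad component: {text!r}")
    intervals = []
    for item in m.group(1).split(","):
        first, _, last = item.partition("-")
        # A range without a second value is a length-one range.
        intervals.append((int(first), int(last or first)))
    return intervals


def parse_ipn_pattern(text):
    """Split an IPN pattern text form into parsed components."""
    scheme, _, ssp = text.partition(":")
    if scheme != "ipn":
        raise ValueError("not an IPN pattern")
    return [parse_component(c) for c in ssp.split(".")]


def format_component(comp):
    """Emit one component, ordering any range set ascending."""
    if comp == WILDCARD or isinstance(comp, int):
        return str(comp)
    items = [f"{a}-{b}" if a != b else str(a) for a, b in sorted(comp)]
    return "[" + ",".join(items) + "]"


def format_ipn_pattern(components):
    """Join formatted components with the '.' separator."""
    return "ipn:" + ".".join(format_component(c) for c in components)
```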
The CBOR form of the IPN pattern conforms to the CDDL in Figure 4. Just as in the IPN URI scheme the pattern scheme identifier is 2, the first components of the SSP identify the node and the last component identifies the service.¶
Each of the IPN pattern components SHALL be CBOR encoded as follows:¶
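A specific value component is encoded as the uint symbol.¶
A range component is encoded as the ipn-range symbol.¶
A wildcard component is encoded as the true item.¶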
The wildcard sentinel values have no intrinsic meaning and were simply chosen to be one-octet-encoded special items. The encoding of ranges is a compressed form in which each pair of values in the range indicates:¶
Another way to interpret these pairs is that each number indicates the length of alternating "excluded" and "included" intervals for the range.¶
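The alternating "excluded" and "included" lengths can be expanded into inclusive intervals, and compressed back, as in the sketch below. The function names are hypothetical, and rejecting a zero-length included interval is an assumption of this sketch rather than a stated rule.¶

```python
def decode_range(pairs):
    """Expand alternating gap/length values into inclusive intervals."""
    if len(pairs) % 2 != 0:
        raise ValueError("range array must hold an even number of values")
    intervals = []
    cursor = 0  # first value not yet covered by any interval or gap
    for i in range(0, len(pairs), 2):
        gap, length = pairs[i], pairs[i + 1]
        if length < 1:
            # Assumption: zero-length included intervals are rejected.
            raise ValueError("included interval must not be empty")
        first = cursor + gap
        last = first + length - 1
        intervals.append((first, last))
        cursor = last + 1
    return intervals


def encode_range(intervals):
    """Compress sorted, disjoint inclusive intervals into gap/length values."""
    pairs = []
    cursor = 0
    for first, last in intervals:
        pairs += [first - cursor, last - first + 1]
        cursor = last + 1
    return pairs
```

Decoding `[0, 5, 5, 10]` gives the intervals 0-4 and 10-19: five included values, five excluded, then ten included. A gap of zero after the first pair decodes to contiguous intervals, which is why this form can express contiguity but never overlap.¶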
This document expands upon the PKIX profile of TCPCLv4 [RFC9174] to allow an EID Pattern in any certificate where a Node ID is required or allowed.¶
This document defines a PKIX Other Name Form identifier, id-on-bundleEIDPattern in Appendix A; this identifier can be used as the type-id in a Subject Alternative Name entry of type otherName. The BundleEIDPattern value associated with the otherName type-id id-on-bundleEIDPattern SHALL be an EID Pattern text form, encoded as a UTF8String, with a scheme that is present in the IANA "Bundle Protocol URI Scheme Types" registry [IANA-BP].¶
This specification defines an EID-PATTERN-ID of a certificate as being the Subject Alternative Name entry of type otherName with a name form of BundleEIDPattern and a value limited to an EID Pattern text form. An entity SHALL ignore any entry of type otherName with a name form of BundleEIDPattern and a value that is some text other than an EID Pattern.¶
The EID-PATTERN-ID is similar to the NODE-ID as defined in Section 4.4.1 of [RFC9174] but can match many different and distinct Node IDs. URI matching of an EID-PATTERN-ID SHALL use the scheme-specific matching logic defined in this specification. An EID Pattern scheme can refine this matching logic with rules regarding how node IDs within that scheme are to be compared with the issued EID-PATTERN-ID.¶
As an augmentation of Section 4.4.2 of [RFC9174]: Unless prohibited by CA policy, a TCPCL end-entity certificate SHALL contain either a NODE-ID or an EID-PATTERN-ID that authenticates the node ID of the peer. All other requirements of that certificate profile are unchanged by this document.¶
It is critical for applications handling EIDs and EID Patterns to positively distinguish between the two based on the context in which the value is being used. For PKIX Subject Alternative Name this is distinguished by the different Other Name forms. An EID which is inappropriately interpreted as an EID Pattern could allow an attacker to elevate access depending upon other aspects of the system being accessed.¶
CAs which issue certificates containing EID Patterns need to consider the implications of an overly-broad pattern in the same way that current Web PKI CAs must manage certificates with wildcard DNS-IDs.¶
Although the reserved characters "[" and "]" are disallowed within the URI authority and path segments by [RFC3986] there are still URI processors which could be lax about enforcing that restriction and could allow an EID pattern to be decoded in a place where an actual EID is expected. This could allow unwanted side-effects when the EID is handled by a BP Agent.¶
Both URI authority and path segments are percent-encoded text and need to be handled by EID processors as such for both pattern matching and equality comparison. Additionally, for the IPN scheme there are numeric values that must be handled as such for pattern matching and comparison.¶
This specification re-uses the "Bundle Protocol URI Scheme Types" sub-registry within the "Bundle Protocol" registry [IANA-BP] for the CBOR encoding of EID Patterns and adds an informative column "EID Pattern Reference" as in the following table.¶
Value | Description | ... | EID Pattern Reference |
---|---|---|---|
1 | dtn | ... | [This specification] |
2 | ipn | ... | [This specification] |
IANA has created, under the "Structure of Management Information (SMI) Numbers" registry [IANA-SMI], a sub-registry titled "SMI Security for PKIX Other Name Forms". The other name forms table is updated to include a row "id-on-bundleEIDPattern" for containing an Endpoint ID Pattern as in the following table.¶
Decimal | Description | References |
---|---|---|
ON-TBD | id-on-bundleEIDPattern | [This specification] |
The formal structure of the associated other name form is in Appendix A. The use of this OID is defined in Section 3.¶
The following ASN.1 module formally specifies the BundleEIDPattern structure and its Other Name form in the syntax of [X.680]. This specification uses the ASN.1 definitions from [RFC5912] with the 2002 ASN.1 notation used in that document.¶
<CODE BEGINS>
DTN-EIDPATTERN-2023
  { iso(1) identified-organization(3) dod(6)
    internet(1) security(5) mechanisms(5) pkix(7)
    id-mod(0) id-mod-dtn-eidpattern-2023(MOD-TBD) }

DEFINITIONS IMPLICIT TAGS ::=
BEGIN

IMPORTS
  OTHER-NAME
  FROM PKIX1Implicit-2009 -- [RFC5912]
    { iso(1) identified-organization(3) dod(6)
      internet(1) security(5) mechanisms(5) pkix(7)
      id-mod(0) id-mod-pkix1-implicit-02(59) }

  id-pkix
  FROM PKIX1Explicit-2009 -- [RFC5912]
    { iso(1) identified-organization(3) dod(6)
      internet(1) security(5) mechanisms(5) pkix(7)
      id-mod(0) id-mod-pkix1-explicit-02(51) } ;

id-on OBJECT IDENTIFIER ::= { id-pkix 8 }

DTNOtherNames OTHER-NAME ::= { on-bundleEIDPattern, ... }

-- The otherName definition for Bundle EID Pattern
on-bundleEIDPattern OTHER-NAME ::= {
    BundleEIDPattern IDENTIFIED BY { id-on-bundleEIDPattern }
}

id-on-bundleEIDPattern OBJECT IDENTIFIER ::= { id-on ON-TBD }

-- Same encoding as BundleEID, which allows URI reserved characters
BundleEIDPattern ::= IA5String

END
<CODE ENDS>¶
This trivial example matches only one EID (which itself has the same text form) dtn://node/service which has a CBOR form of:¶
[1, ["node", "service"]]¶
An example of normalized matching is that the pattern dtn://node/service will still match the EIDs dtn://node/ser%76ice and dtn://no%64e/service because each component match is performed in percent-decoded and UTF-8 decoded form.¶
This example matches a single-segment service demux on a single node dtn://node/* which has a CBOR form of:¶
[1, ["node", true]]¶
That single wildcard will match the empty demux dtn://node/ but will not match demux paths with two or more segments, such as dtn://node/long/name.¶
This example matches all service demux on a single node with a multi-wildcard dtn://node/** which has a CBOR form of:¶
[1, ["node", false]]¶
This example matches a service demux with a prefix segment "pre" dtn://node/pre/** which has a CBOR form of:¶
[1, ["node", "pre", false]]¶
This example matches all node names having the same service demux dtn://**/some/serv which has a CBOR form of:¶
[1, [false, "some", "serv"]]¶
This example includes a single regular expression for a single-segment service that starts with the letter "a" in the text form of dtn://**/[^a] which has a CBOR form of:¶
[1, [false, 35("^a")]]¶
This trivial example matches only one EID (which itself has the same text and CBOR forms) ipn:0.3.4 which has a CBOR form of:¶
[2, [0, 3, 4]]¶
This example matches all service numbers on a single node ipn:0.3.* which has a CBOR form of:¶
[2, [0, 3, true]]¶
This example matches all no-authority nodes with the same service number ipn:*.4 which has a CBOR form of:¶
[2, [true, 4]]¶
This example includes a single range over the service numbers ipn:0.3.0 to ipn:0.3.19 inclusive as ipn:0.3.[0-19] which has a CBOR form of:¶
[2, [0, 3, [0, 20]]]¶
This example includes an offset range over the service numbers ipn:0.3.10 to ipn:0.3.19 inclusive as ipn:0.3.[10-19] which has a CBOR form of:¶
[2, [0, 3, [10, 10]]]¶
This example includes multiple ranges of service numbers ipn:0.3.0 to ipn:0.3.4 and ipn:0.3.10 to ipn:0.3.19 inclusive as ipn:0.3.[0-4,10-19] which has a CBOR form of:¶
[2, [0, 3, [0, 5, 5, 10]]]¶
An overlapping or contiguous pattern such as ipn:0.3.[0-9,10-19] or ipn:0.3.[0-15,10-19] or ipn:0.3.[10-19,0-9] would be normalized to ipn:0.3.[0-19].¶
An unordered pattern such as ipn:0.3.[10-19,0-4] would be normalized to ipn:0.3.[0-4,10-19].¶
The DTN pattern expressiveness is based on use case examples provided by Carlo Caini and Lucien Loiseau.¶