TOC |
|
There are a number of anonymous RxRPC applications which require identity assertions in order to ensure that the desired peer receives and processes the procedure call. This memo defines a replacement for the rxnull security class which provides a means for mutually agreeing upon who is communicating, without incurring cryptographic overhead. It should be noted that, much like rxnull, this security object is not suitable for use in a distributed environment due to its inability to provide integrity protection.
This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 16, 2010.
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.
1.
Introduction
1.1.
Requirements Language
2.
Overview of Rx RPC
2.1.
Packet Mux
3.
Presenting Problems
3.1.
Node Renumbering
3.2.
Epoch ID Multi-Homing Bit
4.
Rx Clear Security Class
4.1.
Constants
4.2.
Security Header
4.3.
Data Packet Validation
4.4.
Abort Packet Handling
4.4.1.
RXCL_ERR_UNKNOWN_VERS
4.4.2.
RXCL_ERR_UNKNOWN_ID_TYPE
4.4.3.
RXCL_ERR_WRONG_PEER
4.4.4.
RXCL_ERR_XCID_UNUSPP
5.
Multi-Home Behavior
6.
Acknowledgements
7.
IANA Considerations
8.
GCO Registrar Considerations
8.1.
RxClear security index
8.2.
Rx error codes
8.3.
Rx Clear Security Header Version
8.4.
Endpoint Identifier Type
9.
Security Considerations
10.
Normative References
§
Author's Address
TOC |
Many high-performance applications based upon Rx RPC cannot tolerate cryptographic overhead. In order to ensure correctness in the face of transport-layer address renumbering, some form of context needs to be established between client and server to permit upper-layer applications to reject processing of remote procedure calls which were misdirected.
TOC |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [RFC2119].
TOC |
Rx RPC is a remote procedure call mechanism built on top of UDP. In order to establish a stateful call context on top of a stateless datagram protocol, Rx relies upon a number of client-asserted header fields to establish a flow-controlled communications channel between peers. To eliminate the need for context-establishment round-trips, Rx relies upon client assertions to establish a stateful context.
TOC |
Rx connection objects are identified by a tuple of packet header fields. The most important control field is the most-significant bit of the epoch header field. When this bit is asserted, the connection object is operating in multi-homing mode, as specified in RXRPC (Zeldovich, N., “Rx protocol specification draft,” 2002.) [rx_spec]. In the normal Rx operating mode (with the multi-homing bit set to zero), Rx connections are identified by the following tuple: (host, port, epoch, cid), where these elemenets are defined as:
- host:
- IPv4 address of peer
- port:
- UDP port of peer
- epoch:
- Rx header epoch field
- cid:
- Rx header cid field (channel ID bits masked to zero)
However, when the multi-homing bit is asserted, the connection identifier tuple becomes: (epoch, cid). Thus, multi-homed Rx connection objects have a shared (epoch, cid) namespace, independent of peer address.
TOC |
The design of this Rx security class is motivated by server and client renumbering incidents at large AFS-3 deployments. When a file server is renumbered, there is a several hour window until the next VL_GetAddrsU RPC is performed to refresh the file server UUID to IPv4 address mappings in the client. Due to the TTL-based invalidation of stale cached mappings, there is a substantial time interval during which RPCs can be delivered to the wrong file server, potentially leading to incorrect behavior.
Similarly, client renumbering can lead to incorrect behavior due to a loss of cache coherence. The AFS-3 callback mechanism relies upon correct knowledge of client UUID to IPv4 address mappings in order to deliver cache invalidation messages to clients. When these mappings become stale due to intervening address renumbering events, advertisement of incorrect addresses, NATs, etc. these "call back" remote procedure calls may be delivered to the wrong client node. In some circumstances this can lead to false state of success on the file server because an unintended client received, processed, and sent a response of success to the file server. Due to the success return code, the file server will no longer attempt to deliver the invalidation, and the client to which the call back was supposed to be delivered will continue to operate on stale cached data because it never received the cache invalidation message.
TOC |
When servers are renumbered, one potential outcome is that two or more machines running the same service will swap addresses. In this case, there is a possibility for the wrong machine to correctly interpret, and attempt to execute, a procedure call.
In some cases, execution of an RPC by the wrong endpoint will still result in correct behavior. However, this is not generally true, where execution by an unintended target could result in undefined, or even dangerous, behavior. For example, in afs3, existence of shadow clones could result in a situation where the RW shadow is updated by an afsint RPC, instead of the canonical RW site.
TOC |
When the multi-homing bit is asserted, (connection,epoch) tuples become globally unique. This mode of operation permits clients to contact the server on multiple addresses, thus allowing client operating systems to route datagrams as desired. Current implementations of Rx bind the connection to the first peer address on which a datagram was received. Since all reply datagrams are sent to the bound peer, connection hijacking becomes impossible. Unfortunately, this comes at the expense of handling client renumbering events.
TOC |
In order to overcome the dangers inherent in assuming stability of transport addresses, the Rx Clear security class embeds a security header in all data packets. This security header contains application-specific endpoint identifier assertions for both the source and destination.
When a datagram is received by the wrong peer, an Rx abort packet will be dispatched notifying the peer of the need to re-bind transport addresses for this connection object. When such an abort packet is received by a client connection, the error will be immediately propagated back to the caller so that application-specific logic may be invoked to refresh transport-layer address mappings for the intended destination endpoint. In the server case, this memo standardizes new multi-homing Rx connection peer binding semantics which allow for graceful handling of client renumbering events.
TOC |
- RX_SEC_ID_CLEAR:
- An Rx security index will be allocated by the Grand Central Registrar. As with all Rx security indices, this 8 bit integer will uniquely identify the security class bound to a given Rx datagram.
- RXCL_HDR_VERS_1:
- Rx Clear security header version 1 will be allocated by the Grand Central Registrar. This version number will correspond to the XDR-encoded data structure RxClear_Header, as specified below.
- RXCL_ERR_UNKNOWN_VERS:
- An Rx error code will be allocated which communicates that this version of the Rx Clear security header is unsupported by the peer. This error code will be sent as the user payload of an Rx abort packet.
- RXCL_ERR_UNKNOWN_ID_TYPE:
- An Rx error code will be allocated which communicates that this endpoint identifier type is not supported by the peer. This error code will be sent as the user payload of an Rx abort packet.
- RXCL_ERR_WRONG_PEER:
- An Rx error code will be allocated which communicates mis-delivery of an Rx Clear-protected datagram to the wrong peer. This error code will be sent as the user payload of an Rx abort packet.
- RXCL_ERR_XCID_UNSUPP:
- An Rx error code will be allocated which communicates to the peer that this node is incapable of supporting the extended connection id field. This error code will be sent as the user payload of an Rx abort packet upon receipt of an RxClear header containing a non-zero clh_xcid field by a node which cannot support extended connection identifiers.
TOC |
In order to communicate expectations to the peer, all data packets travelling over an RxClear-protected connection will include an XDR-encoded security header which carries identity assertions. The RxClear mechanism uses a header rather than a challenge- response mechanism because the additional round-trips required by the Rx challenge- response mechanism were deemed too costly for the typical unauthenticated Rx call workload.
The proposed security header is an XDR-encoded structure defined as follows:
struct rxClear_Header { u_char clh_version; /* authenticator version number */ u_char reserved; u_short clh_data_off; /* data payload offset */ afs_uint32 clh_xcid; /* extended CID field */ afs_uint32 clh_data_len; /* data payload length */ afs_uint32 clh_trl_off; /* security trailer offset */ afs_uint32 clh_flags; /* miscellaneous control flags */ afs_uint32 clh_id_type; /* how to interpret opaque peer identifier payloads */ opaque clh_src_id; /* assertion of client identity */ opaque clh_dst_id; /* assertion of server identity */ };
Rx Clear Security Header
Figure 1 |
This security header will be an XDR-encoded data structure, which will occupy the first octets of the data offset in an Rx packet -- it will start at the offset directly following the Rx packet header. The normal packet data will begin at the data offset specified in the clh_data_off field of the security header. All implementations MUST obey an upper limit of 256 octets on the size of the XDR encoded security header. Implementors SHOULD safely handle packets with oversize security headers.
- clh_version:
- This is the version of the Rx Clear security object header. If this version is unknown by the peer, then the connection must be aborted.
- clh_data_off:
- This value specifies the beginning of the data payload, in units of octets from the beginning of the Rx packet payload. This field is used by receivers to determine where to begin reading the data payload.
- clh_xcid:
- This field is reserved for future use as an extended connection identifier. This field is included because it is known that there are drafts in-progress which steal bits from the Rx header connection id field in order to provide for additional channel identifiers. Implementations MUST set this field to zero until extended cid protocol behavior is standardized.
- clh_trl_off:
- This value specifies the offset in octets of the clear security class packet trailer. A value of zero indicates the absence of a security trailer.
- clh_flags:
- These bits are reserved for future allocation by the Grand Central Registrar.
- clh_id_type:
- This field identifies the contents of the src_id and dst_id opaque fields. Values within this namespace are allocated by the Grand Central Registrar.
- clh_src_id:
- This field contains an application-specific source endpoint identifier. For example, in the case of afs3, this will likely be a node UUID.
- clh_dst_id:
- This field contains an application-specific destination endpoint identifier. For example, in the case of afs3, this will likely be a node UUID.
TOC |
Upon receipt of a data packet with the security index set to RX_SEC_ID_CLEAR, the node will XDR decode the security header, and subsequently validate the security header. Following XDR decode, the node shall first verify that the clh_version field contains a supported version number. In the event that the node does not support this RxClear version, the node will send an Rx abort packet to the peer with error code RXCL_ERR_UNKNOWN_VERS.
The second step in validation involves the extended connection identifier field, clh_xcid. If this node does not support extended cid, and the clh_xcid field is non-zero, then an abort packet with user payload RXCL_ERR_XCID_UNSUPP should be sent to the peer, and the connection should transition to an error state.
Next, the application-specific endpoint identifier type specified in clh_id_type field is validated to ensure that the application layer can handle this identifier type. If this endpoint identifier type is not supported by the application layer, then the node will send an Rx abort packet with user payload of RXCL_ERR_UNKNOWN_ID_TYPE, and the connection should transition to an error state.
The application layer will then be asked to validate the clh_dst_id field. If there is a mismatch, an abort packet will be sent to the peer with user payload RXCL_ERR_WRONG_PEER, and the Rx connection will then transition into an error state.
TOC |
Processing of received Rx Abort packets must be updated to handle the new RXCL_ERR_ error codes. If such an error code is received on a connection with security index other than RX_SEC_ID_CLEAR, then behavior is undefined.
TOC |
This error code indicates that the peer is unable to support the version of the RxClear security header sent in a packet. The connection is transitioned into an error state.
TOC |
This error code indicates that the peer is unable to support this application-specific endpoint identifier type. The connection is transitioned into an error state.
TOC |
This error code indicates that the packet was delivered to the wrong peer. Behavior in this situation depends on several factors. First, for connections where the epoch multi-homing bit is zero, the connection must be transitioned to an error state. For multi-homed connections, behavior further depends upon whether this is a client connection, or a server connection. For client connections, the easiest course of action is to set the connection to an error state, and allow the client to re-resolve the application-specific endpoint-identifier to transport identifier mapping, allocate a new Rx connection, and re-try the call. In the case of a multi-homed server connection, the implementation SHOULD make a best-effort try to deliver the call reply data to the correct destination, as this may be a non-idempotent procedure call. This memo outlines in detail new peer binding semantics for multi-homed Rx connections in another section. Hence, whenever it is possible, the server will not transition a server connection into an error state upon receipt of this message. Instead, it SHOULD invalidate the peer currently bound to the connection so that future replies go to a different, hopefully correct, transport address.
TOC |
This error code indicates that the peer is unable to support the extended connection identifier field in the RxClear security header. The connection is transitioned to error state, and the implementation SHOULD mark the peer as being incapable of supporting extended connection identifiers so that connections allocated to this peer in the future contain a clh_xcid field with value zero.
TOC |
Rx supports multi-homed clients through the assertion of the most-significant bit in the Rx header epoch field. When this bit is asserted, a server will accept datagrams into a connection regardless of the source host address and port. However, reply packets are always sent to the first peer address which contacted the server on any given (epoch, cid) tuple. This behavior prevents connection hijacking, at the expense of robust multi-homing support.
In order to properly support multi-homing this memo specifies relaxation of the peer binding policies. Most importantly, upon receipt of an RXCL_ERR_WRONG_PEER abort packet, an Rx server should not transition a server-mode connection to an error state. Rather, it SHOULD mark the peer currently bound to the Rx connection as being incorrect so that responses may be sent to a different peer, as determined upon receipt of the next ping packet. Although this does open up room for connection hijacking, it does so only for anonymous connections, which are otherwise exposed to denial of service attacks.
To address the issue of lack of response, new Rx server implementations SHOULD permit re-binding of the peer on server-mode connections. To this end, servers should track liveness of peer addresses on a server connection in order to remove a dead peer from a connection. If an Rx ping comes from an address other than the currently bound peer transport address, the Rx implementation MAY try to re-send unacknowledged packets to this other address. If these re-transmits are correctly aknowledged, the connection may be re-bound to the new peer.
TOC |
I would like to thank all of the participants at the 2009 Edinburgh AFS hackathon for their input into the design of this security mechanism. Specifically, I would like to thank Jeffrey Altman for suggesting that it would be architecturally cleaner to place peer identity assertions into a security header, rather than modifying AFS-3 RPCs to explicitly include application-layer identity assertions as IN parameters.
TOC |
This memo includes no request to IANA.
TOC |
This memo includes several assigned numbers requests which must be considered by the Grand Central Registrar.
TOC |
A new Rx security index must be allocated. It is anticipated that given the small size of the security index namepsace, the allocation will only be satisfied after rough consensus is established on the afs3-standardization@openafs.org mailing list.
TOC |
The Rx Clear security class allocates several new Rx error codes for use in Rx abort packet payloads. Given that there are multiple Rx implementations, it is assumed that the GCO Registrar will be responsible for allocating new error table values.
TOC |
The Rx Clear security class includes a version number in its packet header. This memo requests that the GCO Registrar control allocation of all version numbers for this protocol field. Specifically, this memo requests allocation of version 1 within this new namespace.
TOC |
The Rx Clear security class provides a means of sending opaque application data, which is intended to provide a means of transmitting application-specific transport-independent endpoint identifiers. This memo requests that the GCO Registrar control allocation of endpoint identifier type codes.
TOC |
This protocol explicitly provides no means of encrypting nor integrity checking the contents of Rx headers or payloads. It SHOULD NOT be used, except in physically secured and isolated high-performance computing environments where cryptographic overhead is deemed to be unacceptable.
TOC |
[RFC2119] | Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML). |
[rx_spec] | Zeldovich, N., “Rx protocol specification draft,” 2002. |
TOC |
Thomas Keiser | |
Sine Nomine Associates | |
43596 Blacksmith Square | |
Ashburn, VA 20147 | |
USA | |
Phone: | +1 703 723 6673 |
Email: | tkeiser@sinenomine.net |