TOC 
S/MIME Working GroupP. Gutmann
Internet-DraftUniversity of Auckland
Intended status: Standards TrackMay 17, 2010
Expires: November 18, 2010 


Using HMAC-authenticated Encryption in the Cryptographic Message Syntax (CMS)
draft-gutmann-cms-hmac-enc-01.txt

Abstract

This document specifies the conventions for using HMAC-authenticated encryption with the Cryptographic Message Syntax (CMS) authenticated-enveloped-data content type. This mirrors the use of HMAC combined with an encryption algorithm that's already employed in IPsec, SSL/TLS, and SSH, which is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

This Internet-Draft will expire on November 18, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.



Table of Contents

1.  Introduction
    1.1.  Conventions Used in This Document
2.  Background
3.  CMS Encrypt-and-Authenticate Overview
    3.1.  Rationale
4.  CMS Encrypt-and-Authenticate
    4.1.  Encrypt-and-Authenticate Message Processing
    4.2.  Rationale
    4.3.  Test Vectors
5.  SMIMECapabilities Attribute
6.  Security Considerations
7.  IANA Considerations
8.  References
    8.1.  Normative References
    8.2.  Informative References
§  Author's Address




 TOC 

1.  Introduction

This document specifies the conventions for using HMAC-authenticated encryption with the Cryptographic Message Syntax (CMS) authenticated-enveloped-data content type. This mirrors the use of HMAC combined with an encryption algorithm that's already employed in IPsec, SSL/TLS, and SSH, which is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community.



 TOC 

1.1.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.).



 TOC 

2.  Background

Integrity-protected encryption is a standard feature of session-oriented security protocols like IPsec [IPsec] (Kent, S. and K. Seo, “Security Architecture for the Internet Protocol,” December 2005.), SSH [SSH] (Ylonen, T. and C. Lonvick, “The Secure Shell (SSH) Transport Layer Protocol,” Janaury 2006.), and SSL/TLS [TLS] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.), but until recently wasn't available for message-based security protocols like CMS, although PGP added a form of integrity protection by encrypting a SHA-1 hash of the message alongside the message contents to provide authenticate-and-encrypt protection [OpenPGP] (Callas, J., Donnerhacke, L., Hal, H., Shaw, D., and R. Thayer, “OpenPGP Message Format,” November 2007.). Usability studies have shown that users expect encryption to provide integrity protection [Garfinkel] (Garfinkel, S., “Design Principles and Patterns for Computer Systems That Are Simultaneously Secure and Usable,” May 2005.), creating cognitive dissonance problems when the security mechanisms don't in fact provide this assurance.

This document applies the same encrypt-and-authenticate mechanism already employed in IPsec, SSH, and SSL/TLS, to CMS (technically some of these actually use authenticate-and-encrypt rather than encrypt-and-authenticate, since what's authenticated is the plaintext and not the ciphertext). This mechanism is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community.



 TOC 

3.  CMS Encrypt-and-Authenticate Overview

Conventional CMS encryption uses a content encryption key (CEK) to encrypt a message payload. Authenticated encryption requires two keys, one for encryption and a second one for authentication. Like other mechanisms that use authenticated encryption, this document employs a pseudorandom function (PRF) to convert a single block of keying material into the two keys required for encryption and authentication. This converts the standard CMS encryption operation:

     KEK( CEK ) || CEK( data )

into:

     KEK( master_secret ) || HMAC( encrypt( data ) )

where the HMAC and encryption keys are derived from the master_secret via:

     HMAC-K := PRF( master_secret, "authentication" );
     CEK-K := PRF( master_secret, "encryption" );



 TOC 

3.1.  Rationale

There are several possible means of deriving the two keys required for the encrypt-and-authenticate process from the single key normally provided by the key exchange or key transport mechanisms. Several of these however have security or practical issues. For example any mechanism that uses the single exchanged key in its entirety for encryption (using, perhaps, PRF( key ) as the HMAC key) can be converted back to unauthenticated data by removing the outer HMAC layer and rewriting the CMS envelope back to plain EnvelopedData or EncryptedData. By applying the PRF intermediate step, any attempt at a rollback attack will result in a decryption failure.

The option chosen here, the use of a PRF to derive the necessary sets of keying material from a master secret, is well-established through its use in IPsec, SSH, and SSL/TLS, and is widely supported in both crypto libraries and in encryption hardware.

The PRF used is PBKDF2 because its existing use in CMS makes it the most obvious candidate for such a function. If in the future a universal PRF, for example the proposed HKDF, is adopted, this can be substituted for PBKDF2 by specifying it in the prfAlgo field covered in Section 4 (CMS Encrypt-and-Authenticate ).

The resulting processing operations consist of a combination of the operations used for the existing CMS content types EncryptedData and AuthenticatedData, allowing them to be implemented relatively simply using existing code.



 TOC 

4.  CMS Encrypt-and-Authenticate

The encrypt-and-authenticate mechanism is implemented within the existing CMS RecipientInfo framework by defining a new pseudo-algorithm type authEnc which is used in place of the existing single-purpose encrypt-only or HMAC-only algorithm. This pseudo-algorithm is used as a key container for the master secret from which the encryption and authentication keys are derived. Thus instead of using the RecipientInfo to communicate (for example) an AES or HMAC-SHA1 key, it communicates an authEnc keying value from which the required AES encryption and HMAC-SHA1 authentication keys are derived.

The authEnc pseudo-algorithm comes in two forms, one providing 128 bits of keying material and one providing 256 bits:

     id-smime OBJECT IDENTIFIER ::= { iso(1) member-body(2)
                us(840) rsadsi(113549) pkcs(1) pkcs9(9) 16 }

     id-alg  OBJECT IDENTIFIER ::= { id-smime 3 }

     id-alg-authEnc-128 OBJECT IDENTIFIER ::= { id-alg 15 }
     id-alg-authEnc-256 OBJECT IDENTIFIER ::= { id-alg 16 }

The algorithm parameters are:

     AuthEncParams ::= SEQUENCE {
         prfAlgo   [0] AlgorithmIdentifier DEFAULT PBKDF2,
         encAlgo       AlgorithmIdentifier,
         hmacAlgo      AlgorithmIdentifier
         }

prfAlgo is the PRF algorithm used to convert the authEnc value into the encryption and MAC keys. The default PRF is PBKDF2 [PBKDF2] (Kaliski, B., “PKCS #5: Password-Based Cryptography Specification,” September 2000.).

encAlgo is the encryption algorithm and associated parameters to be used to encrypt the content.

hmacAlgo is the HMAC algorithm and associated parameters to be used to authenticate/integrity-protect the content.



 TOC 

4.1.  Encrypt-and-Authenticate Message Processing

The randomly-generated authEnc key to be communicated via the RecipientInfo(s) is converted to separate encryption and authentication keys and applied to the encrypt-and-authenticate process as follows. The notation "PRF( key, salt, iterations )" is used to denote an application of the PRF to the given keying value and salt, for the given number of iterations:

  1. The HMAC algorithm key is derived from the authEnc key via:
              HMAC-K ::= PRF( authEnc_key, "authentication", 1 );
    
  2. The encryption algorithm key is derived from the authEnc key via:
              Enc-K ::= PRF( authEnc_key, "encryption", 1 );
    
  3. The data is processed as described in [AuthEnv] (Housley, R., “Cryptographic Message Syntax (CMS) Authenticated-Enveloped-Data Content Type,” November 2007.), and specifically since the mechanisms used are a union of EncryptedData and AuthenticatedData, as per [CMS] (Housley, R., “Cryptographic Message Syntax (CMS),” September 2009.). The EncryptedData processing is applied first and then the AuthenticatedData processing is applied to the result, so that the nesting is:
              HMAC( encrypt( content ) );
    



 TOC 

4.2.  Rationale

The authEnc pseudo-algorithm has two "key sizes" rather than the one-size-fits-all that the PRF impedance-matching would provide. This is done to address real-world experience from use of AES keys where users demanded AES-256 alongside AES-128 because of some perception that the former was "twice as good" as the latter. Providing an option for keys that go to 11 avoids potential user acceptance problems when someone notices that the authEnc pseudo-key has "only" 128 bits when they expect their AES keys to be 256 bits long.

Using a fixed-length key rather than making it a user-selectable parameter is done for the same reason as AES' quantised key lengths: there's no benefit to allowing, say, 137-bit keys over basic 128- and 256-bit lengths, it adds unnecessary complexity, and if the lengths are user-defined then there'll always be someone who wants keys that go up to 12. Providing a choice of two commonly-used lengths gives users the option of choosing a "better" key size should they feel the need, while not overloading the system with unneeded flexibility.

Apart from the extra step added to key management, all of the processing is already specified as part of the definition of the standard CMS content-types Encrypted/EnvelopedData and AuthenticatedData. This significantly simplifies both the specification and the implementation task, as no new content-processing mechanisms are introduced.



 TOC 

4.3.  Test Vectors

The following test vectors may be used to verify an implementation of HMAC-authenticated encryption. This represents a text string encrypted and authenticated using the password "Password" via CMS PasswordRecipientInfo. The encryption algorithm used is triple DES, whose short block size (compared to AES) makes it easier to corrupt arbitrary bytes for testing purposes within the self-healing CBC mode which will result in correct decryption but a failed MAC check.

   0  227: SEQUENCE {
   3   11:   OBJECT IDENTIFIER authEnvelopedData
                               (1 2 840 113549 1 9 16 1 23)
  16  211:   [0] {
  19  208:     SEQUENCE {
  22    1:       INTEGER 0
  25   97:       SET {
  27   95:         [3] {
  29    1:           INTEGER 0
  32   27:           [0] {
  34    9:             OBJECT IDENTIFIER pkcs5PBKDF2
                                         (1 2 840 113549 1 5 12)
  45   14:             SEQUENCE {
  47    8:               OCTET STRING B9 2B 0A 27 E2 18 EE CC
  57    2:               INTEGER 2000
         :               }
         :             }
  61   35:           SEQUENCE {
  63   11:             OBJECT IDENTIFIER pwriKEK
                                         (1 2 840 113549 1 9 16 3 9)
  76   20:             SEQUENCE {
  78    8:               OBJECT IDENTIFIER des-EDE3-CBC
                                           (1 2 840 113549 3 7)
  88    8:               OCTET STRING 82 42 45 76 64 C3 EC AE
         :               }
         :             }
  98   24:           OCTET STRING
         :             32 30 75 13 74 DA 48 AD 7B 12 93 56 80 30 99 83
         :             8D DC B9 F5 CF 00 CD A4
         :           }
         :         }
 124   82:       SEQUENCE {
 126    9:         OBJECT IDENTIFIER data (1 2 840 113549 1 7 1)
 137   51:         SEQUENCE {
 139   11:           OBJECT IDENTIFIER authEnc128
                                       (1 2 840 113549 1 9 16 3 15)
 152   36:           SEQUENCE {
 154   20:             SEQUENCE {
 156    8:               OBJECT IDENTIFIER des-EDE3-CBC
                                           (1 2 840 113549 3 7)
 166    8:               OCTET STRING 86 60 D4 78 D7 8F 2C 13
         :               }
 176   12:             SEQUENCE {
 178    8:               OBJECT IDENTIFIER hmacSHA (1 3 6 1 5 5 8 1 2)
 188    0:               NULL
         :               }
         :             }
         :           }
 190   16:         [0] 6F 46 3F 60 69 D7 13 B5 73 9E 8A 7D 51 B1 80 C6
         :         }
 208   20:       OCTET STRING
         :         DC 4C 18 3D 46 2E 9C 7C 5D 2A A3 07 C6 BF AE 09
         :         F2 B6 84 5D
         :       }
         :     }
         :   }



 TOC 

5.  SMIMECapabilities Attribute

An S/MIME client SHOULD announce the set of cryptographic functions that it supports by using the S/MIME capabilities attribute [SMIME] (Ramsdell, B. and S. Turner, “Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Message Specification,” January 2010.). If the client wishes to indicate support for HMAC-authenticated encryption, the capabilities attribute MUST contain the authEnc128 and/or authEnc256 OID specified above with algorithm parameters of NULL. Other parameters such as the HMAC and encryption algorithms are specified in the standard manner, for example through their own SMIMECapabilities attributes or by mutual agreement.



 TOC 

6.  Security Considerations

Unlike other CMS authenticated-data mechanisms like SignedData and AuthenticatedData, AuthEnv's primary transformation isn't authentication but encryption, so that AuthEnvData may decrypt successfully (in other words the primary data transformation present in the mechanism will succeed) but the secondary function of authentication using the MAC value that follows the encrypted data could still fail. This can lead to a situation in which an implementation might output decrypted data before it reaches and verifies the MAC value. In other words decryption is performed inline and the result is available immediately, while the authentication result isn't available until all of the content has been processed. If the implementation prematurely provides data to the user and later comes back to inform them that the earlier data was, in retrospect, tainted, this may cause users to act prematurely on the tainted data.

This situation could occur in a streaming implementation where data has to be made available as soon as possible (so the initial plaintext is emitted before the final ciphertext and MAC value are read), or one where the quantity of data involved rules out buffering the recovered plaintext until the MAC value can be read and verified. In addition an implementation that tries to be overly helpful may treat missing non-payload trailing data as non-fatal, allowing an attacker to truncate the data somewhere before the MAC value and thereby defeat the data authentication. This is complicated even further by the fact that an implementation may not be able to determine, when it encounters truncated data, whether the remainder (including the MAC value) will arrive presently (a non-failure) or whether it's been truncated by an attacker and should therefore be treated as a MAC failure. (Note that this same issue affects other types of data authentication like signed and MACd data as well, since an over-optimistic implementation may return data to the user before checking for a verification failure is possible).

The exact solution to these issues is somewhat implementation-specific, with some suggested mitigations being as follows: Implementations should buffer the entire message if possible and verify the MAC before performing any decryption. If this isn't possible due to streaming or message-size constraints, implementations should consider breaking long messages into a sequence of smaller ones, each of which can be processed atomically as above. If even this isn't possible implementations should make obvious to the caller or user that an authentication failure has occurred and the previously-returned or output data shouldn't be used. Finally, any data-formatting problem such as obviously truncated data or missing trailing data should be treated as a MAC verification failure even if the rest of the data was processed correctly.



 TOC 

7.  IANA Considerations

This document contains two algorithm identifiers defined by the SMIME Working Group Registrar in an arc delegated by RSA to the SMIME Working Group: iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) modules(0). No action by IANA is necessary for this document or any anticipated updates.



 TOC 

8.  References



 TOC 

8.1. Normative References

[AuthEnv] Housley, R., “Cryptographic Message Syntax (CMS) Authenticated-Enveloped-Data Content Type,” RFC 5083, November 2007 (TXT).
[CMS] Housley, R., “Cryptographic Message Syntax (CMS),” RFC 5652, September 2009 (TXT).
[PBKDF2] Kaliski, B., “PKCS #5: Password-Based Cryptography Specification,” RFC 2898, September 2000 (TXT).
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML).
[SMIME] Ramsdell, B. and S. Turner, “Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Message Specification,” RFC 5751, January 2010 (TXT).


 TOC 

8.2. Informative References

[Garfinkel] Garfinkel, S., “Design Principles and Patterns for Computer Systems That Are Simultaneously Secure and Usable,” May 2005.
[IPsec] Kent, S. and K. Seo, “Security Architecture for the Internet Protocol,” RFC 4301, December 2005 (TXT).
[OpenPGP] Callas, J., Donnerhacke, L., Hal, H., Shaw, D., and R. Thayer, “OpenPGP Message Format,” RFC 4880, November 2007 (TXT).
[SSH] Ylonen, T. and C. Lonvick, “The Secure Shell (SSH) Transport Layer Protocol,” RFC 4253, Janaury 2006 (TXT).
[TLS] Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” RFC 5246, August 2008 (TXT).


 TOC 

Author's Address

  Peter Gutmann
  University of Auckland
  Department of Computer Science
  New Zealand
Email:  pgut001@cs.auckland.ac.nz