TOC 
S/MIME Working GroupP. Gutmann
Internet-DraftUniversity of Auckland
Intended status: Standards TrackNovember 18, 2010
Expires: May 22, 2011 


Using MAC-authenticated Encryption in the Cryptographic Message Syntax (CMS)
draft-gutmann-cms-hmac-enc-02.txt

Abstract

This document specifies the conventions for using MAC-authenticated encryption with the Cryptographic Message Syntax (CMS) authenticated-enveloped-data content type. This mirrors the use of a MAC combined with an encryption algorithm that's already employed in IPsec, SSL/TLS, and SSH, which is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

This Internet-Draft will expire on May 22, 2011.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.



Table of Contents

1.  Introduction
    1.1.  Conventions Used in This Document
2.  Background
3.  CMS Encrypt-and-Authenticate Overview
    3.1.  Rationale
4.  CMS Encrypt-and-Authenticate
    4.1.  Encrypt-and-Authenticate Message Processing
    4.2.  Rationale
    4.3.  Test Vectors
5.  SMIMECapabilities Attribute
6.  Security Considerations
7.  IANA Considerations
8.  Acknowledgements
9.  References
    9.1.  Normative References
    9.2.  Informative References
§  Author's Address




 TOC 

1.  Introduction

This document specifies the conventions for using MAC-authenticated encryption with the Cryptographic Message Syntax (CMS) authenticated-enveloped-data content type. This mirrors the use of a MAC combined with an encryption algorithm that's already employed in IPsec, SSL/TLS, and SSH, which is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community.



 TOC 

1.1.  Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.).



 TOC 

2.  Background

Integrity-protected encryption is a standard feature of session-oriented security protocols like [IPsec] (Kent, S. and K. Seo, “Security Architecture for the Internet Protocol,” December 2005.), [SSH] (Ylonen, T. and C. Lonvick, “The Secure Shell (SSH) Transport Layer Protocol,” January 2006.), and [TLS] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.), but until recently wasn't available for message-based security protocols like CMS, although [OpenPGP] (Callas, J., Donnerhacke, L., Hal, H., Shaw, D., and R. Thayer, “OpenPGP Message Format,” November 2007.) added a form of integrity protection by encrypting a SHA-1 hash of the message alongside the message contents to provide authenticate-and-encrypt protection. Usability studies have shown that users expect encryption to provide integrity protection [Garfinkel] (Garfinkel, S., “Design Principles and Patterns for Computer Systems That Are Simultaneously Secure and Usable,” May 2005.), creating cognitive dissonance problems when the security mechanisms don't in fact provide this assurance.

This document applies the same encrypt-and-authenticate mechanism already employed in IPsec, SSH, and SSL/TLS, to CMS (technically some of these actually use authenticate-and-encrypt rather than encrypt-and-authenticate, since what's authenticated is the plaintext and not the ciphertext). This mechanism is widely supported in existing crypto libraries and hardware, and has been extensively analysed by the crypto community [EncryptThenAuth] (Krawczyk, H., “The Order of Encryption and Authentication for Protecting Communications (or: How Secure Is SSL?),” August 2001.).



 TOC 

3.  CMS Encrypt-and-Authenticate Overview

Conventional CMS encryption uses a content encryption key (CEK) to encrypt a message payload. Authenticated encryption requires two keys, one for encryption and a second one for authentication. Like other mechanisms that use authenticated encryption, this document employs a pseudorandom function (PRF) to convert a single block of keying material into the two keys required for encryption and authentication. This converts the standard CMS encryption operation:

    KEK( CEK ) || CEK( data )

into:

    KEK( master_secret ) || MAC( CEK( data ) )

where the MAC and encryption keys are derived from the master_secret via:

    MAC-K := PRF( master_secret, "authentication" );
    CEK-K := PRF( master_secret, "encryption" );



 TOC 

3.1.  Rationale

There are several possible means of deriving the two keys required for the encrypt-and-authenticate process from the single key normally provided by the key exchange or key transport mechanisms. Several of these however have security or practical issues. For example any mechanism that uses the single exchanged key in its entirety for encryption (using, perhaps, PRF( key ) as the MAC key) can be converted back to unauthenticated data by removing the outer MAC layer and rewriting the CMS envelope back to plain EnvelopedData or EncryptedData. By applying the PRF intermediate step, any attempt at a rollback attack will result in a decryption failure.

The option chosen here, the use of a PRF to derive the necessary sets of keying material from a master secret, is well-established through its use in IPsec, SSH, and SSL/TLS, and is widely supported in both crypto libraries and in encryption hardware.

The PRF used is PBKDF2 because its existing use in CMS makes it the most obvious candidate for such a function. If in the future a universal PRF, for example [HKDF] (Krawczyk, H. and P. Eronen, “HMAC-based Extract-and-Expand Key Derivation Function (HKDF),” May 2010.), is adopted then this can be substituted for PBKDF2 by specifying it in the prfAlgo field covered in Section 4 (CMS Encrypt-and-Authenticate ).

The resulting processing operations consist of a combination of the operations used for the existing CMS content types EncryptedData and AuthenticatedData, allowing them to be implemented relatively simply using existing code.



 TOC 

4.  CMS Encrypt-and-Authenticate

The encrypt-and-authenticate mechanism is implemented within the existing CMS RecipientInfo framework by defining a new pseudo-algorithm type authEnc which is used in place of a monolithic encrypt and hash algorithm. The RecipientInfo is used as a key container for the master secret used by the pseudo-algorithm from which the encryption and authentication keys for existing single-purpose encrypt-only and MAC-only algorithms are derived. Thus instead of using the RecipientInfo to communicate (for example) an AES or HMAC-SHA1 key, it communicates an authEnc keying value from which the required AES encryption and HMAC-SHA1 authentication keys are derived.

The authEnc pseudo-algorithm comes in two forms, one providing 128 bits of keying material and one providing 256 bits:

    id-smime OBJECT IDENTIFIER ::= { iso(1) member-body(2)
                us(840) rsadsi(113549) pkcs(1) pkcs9(9) 16 }

    id-alg  OBJECT IDENTIFIER ::= { id-smime 3 }

    id-alg-authEnc-128 OBJECT IDENTIFIER ::= { id-alg 15 }
    id-alg-authEnc-256 OBJECT IDENTIFIER ::= { id-alg 16 }

The algorithm parameters are:

    AuthEncParams ::= SEQUENCE {
        prfAlgo   [0] AlgorithmIdentifier DEFAULT PBKDF2,
        encAlgo       AlgorithmIdentifier,
        macAlgo       AlgorithmIdentifier
        }

prfAlgo is the PRF algorithm used to convert the authEnc value into the encryption and MAC keys. The default PRF is [PBKDF2] (Kaliski, B., “PKCS #5: Password-Based Cryptography Specification,” September 2000.).

encAlgo is the encryption algorithm and associated parameters to be used to encrypt the content.

macAlgo is the MAC algorithm and associated parameters to be used to authenticate/integrity-protect the content.

When the PRF AlgorithmIdentifier is used to specify a non-default PRF function, the salt parameter MUST be an empty (zero-length) string and the iterationCount MUST be one, since these values aren't used in the PRF process. In their encoded form, these two parameters have the value 08 00 02 01 01.



 TOC 

4.1.  Encrypt-and-Authenticate Message Processing

The randomly-generated authEnc key to be communicated via the RecipientInfo(s) is converted to separate encryption and authentication keys and applied to the encrypt-and-authenticate process as follows. The notation "PRF( key, salt, iterations )" is used to denote an application of the PRF to the given keying value and salt, for the given number of iterations:

  1. The MAC algorithm key is derived from the authEnc key via:
            MAC-K ::= PRF( authEnc_key, "authentication", 1 );
    
  2. The encryption algorithm key is derived from the authEnc key via:
            Enc-K ::= PRF( authEnc_key, "encryption", 1 );
    
  3. The data is processed as described in [AuthEnv] (Housley, R., “Cryptographic Message Syntax (CMS) Authenticated-Enveloped-Data Content Type,” November 2007.), and specifically since the mechanisms used are a union of EncryptedData and AuthenticatedData, as per [CMS] (Housley, R., “Cryptographic Message Syntax (CMS),” September 2009.). The EncryptedData processing is applied first and then the AuthenticatedData processing is applied to the result, so that the nesting is:
            MAC( encrypt( content ) );
    
  4. If authenticated attributes are present then they are encoded as described in [AuthEnv] (Housley, R., “Cryptographic Message Syntax (CMS) Authenticated-Enveloped-Data Content Type,” November 2007.) and MACed after the encrypted content, so that the processing is:
            MAC( encrypt( content ) || authAttr );
    



 TOC 

4.2.  Rationale

When choosing between encrypt-and-authenticate and authenticate-and-encrypt the more secure option is encrypt-and-authenticate. There has been extensive analysis of this in the literature, the best coverage is probably [EncryptThenAuth] (Krawczyk, H., “The Order of Encryption and Authentication for Protecting Communications (or: How Secure Is SSL?),” August 2001.).

The authEnc pseudo-algorithm has two "key sizes" rather than the one-size-fits-all that the PRF impedance-matching would provide. This is done to address real-world experience in the use of AES keys where users demanded AES-256 alongside AES-128 because of some perception that the former was "twice as good" as the latter. Providing an option for keys that go to 11 avoids potential user acceptance problems when someone notices that the authEnc pseudo-key has "only" 128 bits when they expect their AES keys to be 256 bits long.

Using a fixed-length key rather than making it a user-selectable parameter is done for the same reason as AES' quantised key lengths: there's no benefit to allowing, say, 137-bit keys over basic 128- and 256-bit lengths, it adds unnecessary complexity, and if the lengths are user-defined then there'll always be someone who wants keys that go up to 12. Providing a choice of two commonly-used lengths gives users the option of choosing a "better" key size should they feel the need, while not overloading the system with unneeded flexibility.

The use of the PRF AlgorithmIdentifier presents some problems because it's usually not specified in a manner that allows it to be easily used as a straight KDF. For example PBKDF2 has parameters:

    PBKDF2-params ::= SEQUENCE {
        salt OCTET STRING,
        iterationCount INTEGER (1..MAX),
        prf AlgorithmIdentifier {{PBKDF2-PRFs}}
                                DEFAULT algid-hmacWithSHA1
        }

of which only the prf AlgorithmIdentifier is used here. In order to avoid having to define new AlgorithmIdentifiers for each possible PRF, this specification sets any parameters not required for KDF functionality to no-op values. In the case of PBKDF2 this means that the salt has length zero and the iteration count is set to one, with only the prf AlgorithmIdentifier playing a part in the processing. Although it's not possible to know what form other PRFs-as-KDFs will take, a general note for their application within this specification is that any non-PRF parameters should similarly be set to no-op values.

As with other uses of PRFs for cryto impedance matching in protocols like IPsec, SSL/TLS and SSH, the amount of input to the PRF generally doesn't match the amount of output. The general philosophical implications of this are covered in various analyses of the properties and uses of PRFs. If you're worried about this then you can try and approximately match the authEnc "key size" to the key size of the encryption algorithm being used, although even there a perfect match for algorithms like Blowfish (448 bits) or RC5 (832 bits) is going to be difficult.

Apart from the extra step added to key management, all of the processing is already specified as part of the definition of the standard CMS content-types Encrypted/EnvelopedData and AuthenticatedData. This significantly simplifies both the specification and the implementation task, as no new content-processing mechanisms are introduced.



 TOC 

4.3.  Test Vectors

The following test vectors may be used to verify an implementation of MAC-authenticated encryption. This represents a text string encrypted and authenticated using the password "Password" via CMS PasswordRecipientInfo. The encryption algorithm used is triple DES, whose short block size (compared to AES) makes it easier to corrupt arbitrary bytes for testing purposes within the self-healing CBC mode which will result in correct decryption but a failed MAC check.

   0  227: SEQUENCE {
   3   11:   OBJECT IDENTIFIER authEnvelopedData
                               (1 2 840 113549 1 9 16 1 23)
  16  211:   [0] {
  19  208:     SEQUENCE {
  22    1:       INTEGER 0
  25   97:       SET {
  27   95:         [3] {
  29    1:           INTEGER 0
  32   27:           [0] {
  34    9:             OBJECT IDENTIFIER pkcs5PBKDF2
                                         (1 2 840 113549 1 5 12)
  45   14:             SEQUENCE {
  47    8:               OCTET STRING B9 2B 0A 27 E2 18 EE CC
  57    2:               INTEGER 2000
         :               }
         :             }
  61   35:           SEQUENCE {
  63   11:             OBJECT IDENTIFIER pwriKEK
                                         (1 2 840 113549 1 9 16 3 9)
  76   20:             SEQUENCE {
  78    8:               OBJECT IDENTIFIER des-EDE3-CBC
                                           (1 2 840 113549 3 7)
  88    8:               OCTET STRING 82 42 45 76 64 C3 EC AE
         :               }
         :             }
  98   24:           OCTET STRING
         :             32 30 75 13 74 DA 48 AD 7B 12 93 56 80 30 99 83
         :             8D DC B9 F5 CF 00 CD A4
         :           }
         :         }
 124   82:       SEQUENCE {
 126    9:         OBJECT IDENTIFIER data (1 2 840 113549 1 7 1)
 137   51:         SEQUENCE {
 139   11:           OBJECT IDENTIFIER authEnc128
                                       (1 2 840 113549 1 9 16 3 15)
 152   36:           SEQUENCE {
 154   20:             SEQUENCE {
 156    8:               OBJECT IDENTIFIER des-EDE3-CBC
                                           (1 2 840 113549 3 7)
 166    8:               OCTET STRING 86 60 D4 78 D7 8F 2C 13
         :               }
 176   12:             SEQUENCE {
 178    8:               OBJECT IDENTIFIER hmacSHA (1 3 6 1 5 5 8 1 2)
 188    0:               NULL
         :               }
         :             }
         :           }
 190   16:         [0] 6F 46 3F 60 69 D7 13 B5 73 9E 8A 7D 51 B1 80 C6
         :         }
 208   20:       OCTET STRING
         :         DC 4C 18 3D 46 2E 9C 7C 5D 2A A3 07 C6 BF AE 09
         :         F2 B6 84 5D
         :       }
         :     }
         :   }



 TOC 

5.  SMIMECapabilities Attribute

An S/MIME client SHOULD announce the set of cryptographic functions that it supports by using the S/MIME capabilities attribute [SMIME] (Ramsdell, B. and S. Turner, “Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Message Specification,” January 2010.). If the client wishes to indicate support for MAC-authenticated encryption, the capabilities attribute MUST contain the authEnc128 and/or authEnc256 OID specified above with algorithm parameters ABSENT. The other algorithms used in the authEnc algorithm such as the MAC and encryption algorithm are selected based on the presence of these algorithms in the SMIMECapabilities or by mutual agreement.



 TOC 

6.  Security Considerations

Unlike other CMS authenticated-data mechanisms like SignedData and AuthenticatedData, AuthEnv's primary transformation isn't authentication but encryption, so that AuthEnvData may decrypt successfully (in other words the primary data transformation present in the mechanism will succeed) but the secondary function of authentication using the MAC value that follows the encrypted data could still fail. This can lead to a situation in which an implementation might output decrypted data before it reaches and verifies the MAC value. In other words decryption is performed inline and the result is available immediately, while the authentication result isn't available until all of the content has been processed. If the implementation prematurely provides data to the user and later comes back to inform them that the earlier data was, in retrospect, tainted, this may cause users to act prematurely on the tainted data.

This situation could occur in a streaming implementation where data has to be made available as soon as possible (so that the initial plaintext is emitted before the final ciphertext and MAC value are read), or one where the quantity of data involved rules out buffering the recovered plaintext until the MAC value can be read and verified. In addition an implementation that tries to be overly helpful may treat missing non-payload trailing data as non-fatal, allowing an attacker to truncate the data somewhere before the MAC value and thereby defeat the data authentication. This is complicated even further by the fact that an implementation may not be able to determine, when it encounters truncated data, whether the remainder (including the MAC value) will arrive presently (a non-failure) or whether it's been truncated by an attacker and should therefore be treated as a MAC failure. (Note that this same issue affects other types of data authentication like signed and MACd data as well, since an over-optimistic implementation may return data to the user before checking for a verification failure is possible).

The exact solution to these issues is somewhat implementation-specific, with some suggested mitigations being as follows: Implementations should buffer the entire message if possible and verify the MAC before performing any decryption. If this isn't possible due to streaming or message-size constraints, implementations should consider breaking long messages into a sequence of smaller ones, each of which can be processed atomically as above. If even this isn't possible implementations should make obvious to the caller or user that an authentication failure has occurred and the previously-returned or output data shouldn't be used. Finally, any data-formatting problem such as obviously truncated data or missing trailing data should be treated as a MAC verification failure even if the rest of the data was processed correctly.



 TOC 

7.  IANA Considerations

This document contains two algorithm identifiers defined by the SMIME Working Group Registrar in an arc delegated by RSA to the SMIME Working Group: iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) modules(0). No action by IANA is necessary for this document or any anticipated updates.



 TOC 

8.  Acknowledgements

The author would like to thank Jim Schaad and the members of the S/MIME mailing list for their feedback on this document.



 TOC 

9.  References



 TOC 

9.1. Normative References

[AuthEnv] Housley, R., “Cryptographic Message Syntax (CMS) Authenticated-Enveloped-Data Content Type,” RFC 5083, November 2007 (TXT).
[CMS] Housley, R., “Cryptographic Message Syntax (CMS),” RFC 5652, September 2009 (TXT).
[PBKDF2] Kaliski, B., “PKCS #5: Password-Based Cryptography Specification,” RFC 2898, September 2000 (TXT).
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML).
[SMIME] Ramsdell, B. and S. Turner, “Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Message Specification,” RFC 5751, January 2010 (TXT).


 TOC 

9.2. Informative References

[EncryptThenAuth] Krawczyk, H., “The Order of Encryption and Authentication for Protecting Communications (or: How Secure Is SSL?),” Springer-Verlag LNCS 2139, August 2001.
[Garfinkel] Garfinkel, S., “Design Principles and Patterns for Computer Systems That Are Simultaneously Secure and Usable,” May 2005.
[HKDF] Krawczyk, H. and P. Eronen, “HMAC-based Extract-and-Expand Key Derivation Function (HKDF),” RFC 5869, May 2010 (TXT).
[IPsec] Kent, S. and K. Seo, “Security Architecture for the Internet Protocol,” RFC 4301, December 2005 (TXT).
[OpenPGP] Callas, J., Donnerhacke, L., Hal, H., Shaw, D., and R. Thayer, “OpenPGP Message Format,” RFC 4880, November 2007 (TXT).
[SSH] Ylonen, T. and C. Lonvick, “The Secure Shell (SSH) Transport Layer Protocol,” RFC 4253, January 2006 (TXT).
[TLS] Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” RFC 5246, August 2008 (TXT).


 TOC 

Author's Address

  Peter Gutmann
  University of Auckland
  Department of Computer Science
  New Zealand
Email:  pgut001@cs.auckland.ac.nz