Network Working Group                                           D. Meyer
Internet-Draft                                   Universitaet Bremen TZI
Intended status: Standards Track                          P. Saint-Andre
Expires: September 10, 2009                                        Cisco
                                                          March 09, 2009


Extensible Messaging and Presence Protocol (XMPP) End-to-End Encryption Using Transport Layer Security ("XTLS")
draft-meyer-xmpp-e2e-encryption-01

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on September 10, 2009.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Abstract

This document specifies "XTLS", a protocol for end-to-end encryption of Extensible Messaging and Presence Protocol (XMPP) traffic via an application-level usage of Transport Layer Security (TLS). XTLS treats the end-to-end exchange of XML stanzas as a virtual transport and uses TLS to secure that transport, thus enabling XMPP entities to communicate in a way that is designed to prevent eavesdropping, tampering, and forgery of XML stanzas. The protocol can be used for secure end-to-end messaging as well as for other applications such as file transfer.



Table of Contents

1.  Introduction
2.  Scope
3.  Threat Analysis
4.  Requirements
5.  Approach
6.  XTLS Protocol Flow
7.  End-to-End Streams over XTLS Protocol Flow
8.  Bootstrapping Trust on First Communication
    8.1.  Exchanging Certificates
    8.2.  Verification of Non-Human Parties
9.  Session Termination
10.  Determining Support
11.  Security Considerations
    11.1.  Mandatory-to-Implement Technologies
    11.2.  Certificates
    11.3.  Denial of Service
12.  IANA Considerations
13.  References
    13.1.  Normative References
    13.2.  Informative References
Appendix A.  XML Schema
Appendix B.  Copying Conditions
§  Authors' Addresses





1.  Introduction

End-to-end encryption of traffic sent over the Extensible Messaging and Presence Protocol (XMPP) is a desirable goal. Since 1999, the Jabber/XMPP developer community has experimented with several such technologies, including OpenPGP [XEP‑0027] (Muldowney, T., “Current Jabber OpenPGP Usage,” November 2006.), S/MIME [RFC3923] (Saint-Andre, P., “End-to-End Signing and Object Encryption for the Extensible Messaging and Presence Protocol (XMPP),” October 2004.), and encrypted sessions or "ESessions" [XEP‑0218] (Saint-Andre, P. and I. Paterson, “Bootstrapping Implementation of Encrypted Sessions,” May 2007.). For various reasons, these technologies have not been widely implemented and deployed. When the XMPP Standards Foundation asked various Internet security experts to complete a security review of encrypted sessions, the reviewers recommended exploring the possibility of instead using Transport Layer Security [TLS] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.) as the base technology for end-to-end encryption in XMPP. That possibility is explored in this document.

TLS is the most widely implemented protocol for securing network traffic. In addition to applications in the email infrastructure, the World Wide Web [HTTP‑TLS] (Rescorla, E., “HTTP Over TLS,” May 2000.), and datagram transport for multimedia session negotiation [DTLS] (Rescorla, E. and N. Modadugu, “Datagram Transport Layer Security,” April 2006.), TLS is used in XMPP to secure TCP connections from client to server and from server to server, as specified in [rfc3920bis] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” March 2009.). Therefore TLS is already familiar to XMPP developers.

This specification, called "XTLS", defines a method whereby any XMPP entity that supports the XMPP Jingle negotiation framework [XEP‑0166] (Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan, S., and J. Hildebrand, “Jingle,” December 2008.) can use TLS semantics for end-to-end encryption, whether the application data is sent over a streaming transport (like TCP) or a datagram transport (like UDP). The basic use case is to tunnel XMPP stanzas between two IM users for end-to-end secure chat using end-to-end XML streams. However, XTLS is not limited to encryption of one-to-one text chat, since it can be used between two XMPP clients for encryption of any XMPP payloads, between an XMPP client and a remote XMPP service (i.e., a service with which a client does not have a direct XML stream, such as a [XEP‑0045] (Saint-Andre, P., “Multi-User Chat,” July 2008.) chatroom), or between two remote XMPP services. Furthermore, XTLS can be used for encrypted file transfer using [XEP‑0234] (Saint-Andre, P., “Jingle File Transfer,” February 2009.), for encrypted voice or video sessions using [XEP‑0167] (Ludwig, S., Saint-Andre, P., Egan, S., McQueen, R., and D. Cionoiu, “Jingle RTP Sessions,” December 2008.) and [DTLS‑SRTP] (McGrew, D. and E. Rescorla, “Datagram Transport Layer Security (DTLS) Extension to Establish Keys for Secure Real-time Transport Protocol (SRTP),” February 2009.), and other applications.

Note: The following capitalized keywords are to be interpreted as described in [TERMS] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.): "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".




2.  Scope

The XMPP communication exchanges of interest here exist in the context of a one-to-one communication "session" between two entities, where the information exchanged takes the form of XMPP stanzas. However, several other kinds of XMPP exchanges exist outside the context of one-to-one communication sessions:

Ideally, any technology for end-to-end encryption in XMPP could be extended to cover all the scenarios above as well as one-to-one communication sessions. However, many-to-many sessions, one-to-many broadcast, and offline messages are out of scope for this specification.




3.  Threat Analysis

XMPP technologies are typically deployed using a client-server architecture. As a result, XMPP endpoints (often but not always controlled by human users) need to communicate through one or more servers. For example, the user juliet@capulet.lit connects to the capulet.lit server and the user romeo@montague.lit connects to the montague.lit server, but in order for Juliet to send a message to Romeo the message will be routed over her client-to-server connection with capulet.lit, over a server-to-server connection between capulet.lit and montague.lit, and over Romeo's client-to-server connection with montague.lit. Although [rfc3920bis] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” March 2009.) requires support for Transport Layer Security [TLS] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.) to make it possible to encrypt all of these connections, when XMPP is deployed any of these connections might be unencrypted. Furthermore, even if the server-to-server connection is encrypted and both of the client-to-server connections are encrypted, the message would still be in the clear while processed by both the capulet.lit and montague.lit servers.

In this specification we primarily address communications security ("commsec") between two parties, especially confidentiality, data integrity, and peer entity authentication. Communications security can be subject to a variety of attacks, which [RFC3552] (Rescorla, E. and B. Korver, “Guidelines for Writing RFC Text on Security Considerations,” July 2003.) divides into passive and active categories. In a passive attack, information is leaked (e.g., a passive attacker could read all of the messages that Juliet sends to Romeo). In an active attack, the attacker can add, modify, or delete messages between the parties, thus disrupting communications.

Traditionally, it seems that XMPP users have been concerned more about passive attacks (such as eavesdropping) than about active attacks (such as man-in-the-middle), perhaps because they have thought that their communications are "just chat", because they have had no expectation that endpoints could be authenticated, or because they have believed that hijacked communications would be detected socially (e.g., because the other party did not have an authentic "voice" in a text conversation). However, both forms of attack are of concern in this protocol.

In particular, we consider the following types of attacks and attackers:

Other attacks are possible, and the foregoing list is best considered incomplete at this time.




4.  Requirements

(This section borrows some text from [XEP‑0210] (Paterson, I., “Requirements for Encrypted Sessions,” May 2007.).)

This document stipulates the following requirements for end-to-end encryption of XMPP communications. It is possible that some of those requirements can be met only with particular TLS cipher suites, or cannot be met at all without defining extensions to TLS itself; a full gap analysis has not yet been completed.




5.  Approach

In broad outline, XTLS takes the following approach to end-to-end encryption of XMPP traffic:

  1. We assume that all XMPP entities will have X.509 certificates; realistically these certificates are likely to be self-signed and automatically generated by an XMPP client. However, CA-issued certificates are encouraged to overcome problems with self-signed certificates.
  2. We use the XMPP Jingle extensions as the negotiation framework (see [XEP‑0166] (Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan, S., and J. Hildebrand, “Jingle,” December 2008.)).
  3. We define a <security/> element that can be included in any Jingle negotiation, and a new "security-info" Jingle action for sending security-related information.
  4. When an entity wishes to encrypt its communications with a second entity, it sends a Jingle session-initiate request that specifies the desired application type, a possible transport, the sender's X.509 fingerprint, and optionally hints about the sender's supported TLS methods.
  5. If both parties support XTLS, the first data sent over the negotiated transport is TLS handshake data, not application data. Once the TLS handshake has finished, the parties can then send application data over the now-encrypted transport.
  6. The simplest scenario is end-to-end encryption of traditional XMPP text chat using end-to-end XML streams, in-band bytestreams (see [XEP‑0047] (Karneges, J., “In-Band Bytestreams (IBB),” November 2006.)), and previously-accepted X.509 certificates.
  7. On first use of end-to-end encryption between two entities, it is encouraged to use secure remote passwords rather than leap-of-faith to bootstrap the subsequent use of the client-generated X.509 certificates.
  8. More complex scenarios are theoretically supported (e.g., encrypted file transfer using SOCKS5 bytestreams and encrypted voice chat using DTLS-SRTP) but have not yet been fully defined.
  9. XTLS theoretically can be used to establish a TLS-encrypted streaming transport or a DTLS-encrypted datagram transport, but integration with DTLS [DTLS] (Rescorla, E. and N. Modadugu, “Datagram Transport Layer Security,” April 2006.) has not yet been prototyped so use with streaming transports is the more stable scenario.

We expand on this approach in the following section.




6.  XTLS Protocol Flow

The basic flow for an XTLS session is as follows, where traffic represented by single dashes (---) is sent over the XMPP signalling channel and traffic represented by double lines (===) is sent over the negotiated transport.

Initiator                   Responder
  |                            |
  |  session-initiate          |
  |  (with security info)      |
  |--------------------------->|
  |  ack                       |
  |<---------------------------|
  |  session-accept            |
  |<---------------------------|
  |  ack                       |
  |--------------------------->|
  |  open transport            |
  |<==========================>|
  |  TLS ClientHello           |
  |===========================>|
  |  TLS ServerHello, [...]    |
  |<===========================|
  |  TLS [...], Finished       |
  |===========================>|
  |  TLS [...], Finished       |
  |<===========================|
  |  application data          |
  |<==========================>|
  |  session-terminate         |
  |<---------------------------|
  |  ack                       |
  |--------------------------->|
  |                            |

To simplify the description we assume here that the parties already trust each other's certificates. See discussion under Section 8 (Bootstrapping Trust on First Communication) for information about bootstrapping of certificate trust on the first communication.

First the initiator sends a Jingle session-initiate request (here the simple case of an end-to-end text chat session using in-band bytestreams [XEP‑0047] (Karneges, J., “In-Band Bytestreams (IBB),” November 2006.)). This request includes a <security/> element that contains the fingerprint of the certificate that the initiator will use during the TLS negotiation and a list of TLS methods the initiator supports (here X.509 certificate based authentication and TLS-SRP). Note that this information is exchanged over the insecure server-based connection. The purpose of the exchange is to determine which TLS method should be used in the TLS handshake; e.g., if a client cannot verify the fingerprint of the peer it MAY omit the X.509 method. If both clients can verify the fingerprint of the other, it is likely that X.509 certificate based authentication will succeed (unless the data is altered); if one client cannot verify the fingerprint, the client MAY prompt the user for a password for TLS-SRP based authentication (see Section 8 (Bootstrapping Trust on First Communication) for details).

<iq from='romeo@montague.lit/orchard'
    id='xn28s7gk'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-initiate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='xmlstream'>
      <description xmlns='urn:xmpp:jingle:apps:xmlstream:0'/>
      <transport xmlns='urn:xmpp:jingle:transports:ibb:0'
                 block-size='4096'
                 sid='ch3d9s71'/>
      <security xmlns='urn:xmpp:jingle:security:xtls:0'>
        <fingerprint algo='sha1'>RomeoX509CertSHA1Hash</fingerprint>
        <method name='x509'/>
        <method name='srp'/>
      </security>
    </content>
  </jingle>
</iq>
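
As a rough illustration of how a client might assemble the <security/> element shown in the request above, the following Python sketch builds it from a DER-encoded certificate. This document does not define the fingerprint encoding, so the hexadecimal SHA-1 digest over the DER bytes is an assumption, as are the function name build_security_element and the placeholder certificate bytes.

```python
import hashlib
import xml.etree.ElementTree as ET

XTLS_NS = "urn:xmpp:jingle:security:xtls:0"


def build_security_element(cert_der: bytes, methods=("x509", "srp")) -> ET.Element:
    """Build an XTLS <security/> element for a Jingle offer or answer.

    Assumption: the fingerprint is the hex SHA-1 digest of the DER-encoded
    certificate; the draft itself does not pin down the encoding.
    """
    security = ET.Element(f"{{{XTLS_NS}}}security")
    fingerprint = ET.SubElement(
        security, f"{{{XTLS_NS}}}fingerprint", {"algo": "sha1"}
    )
    fingerprint.text = hashlib.sha1(cert_der).hexdigest()
    # One <method/> child per TLS method the sender is willing to use.
    for name in methods:
        ET.SubElement(security, f"{{{XTLS_NS}}}method", {"name": name})
    return security


# Placeholder bytes stand in for a real DER-encoded certificate.
element = build_security_element(b"placeholder-certificate-bytes")
ET.register_namespace("", XTLS_NS)
print(ET.tostring(element, encoding="unicode"))
```

A client that cannot verify the peer's fingerprint could call the same helper with methods=("srp",) to leave out the X.509 method.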

The responder immediately acknowledges receipt of the session-initiate by sending an IQ stanza of type "result" (not shown here).

Depending on the application type, a user agent controlled by a human user might need to wait for the user to affirm a desire to proceed with the session before continuing. When the user agent has received such affirmation (or if the user agent can automatically proceed for any reason, e.g. because no human intervention is expected or because a human user has configured the user agent to automatically accept sessions with a given entity), it returns a Jingle session-accept message. This message will typically contain the offered application type, transport method, and a <security/> element that includes the fingerprint of the responder's X.509 certificate as well as the responder's supported TLS methods.

<iq from='juliet@capulet.lit/balcony'
    id='hf64hl'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-accept'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='xmlstream'>
      <description xmlns='urn:xmpp:jingle:apps:xmlstream:0'/>
      <transport xmlns='urn:xmpp:jingle:transports:ibb:0'
                 block-size='4096'
                 sid='ch3d9s71'/>
      <security xmlns='urn:xmpp:jingle:security:xtls:0'>
        <fingerprint algo='sha1'>JulietX509CertSHA1Hash</fingerprint>
        <method name='x509'/>
        <method name='srp'/>
      </security>
    </content>
  </jingle>
</iq>

The following rules apply to the responder's handling of the session-initiate message:

  1. If the responder does not support Jingle-XTLS it will silently ignore the <security/> element in the offer and therefore will return a session-accept message without a <security/> element.
  2. If the responder supports Jingle-XTLS it SHOULD return a session-accept message that contains a <security/> element.
  3. If the responder thinks it will be able to verify the initiator's certificate, it MUST include the fingerprint for the responder's certificate in the <security/> element of the session-accept message. This is the "happy path" and will occur when the parties have already verified each other's certificates.
  4. If the responder thinks it will not be able to verify the initiator's certificate, it MAY omit the fingerprint for the responder's certificate in the <security/> element of the session-accept message. This indicates that certificate-based authentication is not possible. In this case the responder SHOULD signal that it wishes to use some other authentication method, such as secure remote passwords (see discussion under Section 8 (Bootstrapping Trust on First Communication)).
  5. If the responding client cannot verify the initiator's certificate, it SHOULD ask the responding user if a password was exchanged between the parties that can be used for TLS-SRP. If this is not the case, setting up a mutually-authenticated link will fail and the responder MAY terminate the session. Alternatively it could send its own fingerprint knowing it cannot authenticate the initiator, in which case the responder has to trust that there is no man-in-the-middle (see discussion under Section 8 (Bootstrapping Trust on First Communication)).
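
The rules above can be read as a small decision procedure. The Python sketch below is one possible reading, not a normative algorithm: the names AcceptSecurity and plan_session_accept are hypothetical, and the choice to advertise both methods on the "happy path" simply mirrors the example session-accept.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AcceptSecurity:
    """Contents of the <security/> element in a session-accept."""
    fingerprint: Optional[str]   # None when the fingerprint is omitted
    methods: Tuple[str, ...]     # TLS methods offered back to the initiator


def plan_session_accept(supports_xtls: bool,
                        can_verify_initiator: bool,
                        own_fingerprint: str,
                        password_available: bool) -> Optional[AcceptSecurity]:
    """Return the <security/> contents for the session-accept, or None.

    None models rule 1: a responder without XTLS support ignores the
    offer's <security/> element and answers without one.
    """
    if not supports_xtls:
        return None
    if can_verify_initiator:
        # Rule 3, the "happy path": the fingerprint MUST be included.
        return AcceptSecurity(own_fingerprint, ("x509", "srp"))
    if password_available:
        # Rules 4 and 5: omit the fingerprint and signal TLS-SRP instead.
        return AcceptSecurity(None, ("srp",))
    # Rule 5 alternative: send the fingerprint anyway and accept the
    # man-in-the-middle risk. Terminating the session is the other option.
    return AcceptSecurity(own_fingerprint, ("x509",))
```

Whether to fall back to the last branch or to terminate the session is a policy decision that a real client would most likely leave to the user.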

When the responder sends the session-accept message, the initiator acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

The following rules apply to the initiator's handling of the session-accept message:

  1. If the initiator receives a session-accept without a <security/> element, setting up a secure transport layer has failed. The initiator MAY terminate the session at this point or instead proceed without securing the transport. The client SHOULD ask the initiating user how to proceed. This depends on the Jingle application and the initiator's preferences: it makes no sense to use end-to-end XML streams without encryption, but the initiator might continue a file transfer without encryption.
  2. If the initiating client cannot verify the responder's certificate it SHOULD ask the initiating user if a password was exchanged between the parties that can be used for TLS-SRP. If this is not the case, setting up a mutually-authenticated link will fail and the initiator MAY terminate the session or proceed with leap-of-faith (see discussion under Section 8 (Bootstrapping Trust on First Communication)).
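
The initiator's side of these rules can be sketched in the same way. The function name choose_tls_method and its return labels ("ask-user", "x509", "srp", "leap-of-faith", "terminate") are hypothetical and local to this illustration; only "x509" and "srp" correspond to method names used in the <security/> element.

```python
from typing import Optional, Sequence


def choose_tls_method(responder_methods: Optional[Sequence[str]],
                      can_verify_responder: bool,
                      password_available: bool,
                      allow_leap_of_faith: bool = False) -> str:
    """Pick what the initiator does after receiving session-accept.

    responder_methods is None when the session-accept carried no
    <security/> element at all.
    """
    if responder_methods is None:
        # Rule 1: no secure transport is possible; the user decides whether
        # to terminate or to continue without encryption.
        return "ask-user"
    if can_verify_responder and "x509" in responder_methods:
        return "x509"
    if password_available and "srp" in responder_methods:
        return "srp"
    # Rule 2: mutual authentication is not possible.
    return "leap-of-faith" if allow_leap_of_faith else "terminate"
```

When the result is "x509" or "srp", that value is what the initiator would place in the <method/> element of a security-info message.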

The initiator can now determine if X.509 certificate based authentication will work or if TLS-SRP will be used. It sends an additional security-info message to the responder to signal its choice. This step is not really necessary because the responder will see the initiator's choice in the first message of the TLS handshake, but it can help an implementation to set up its TLS library properly. Because in this section we assume that the parties already have validated each other's certificates, the security method signalled here is "x509".

<iq from='romeo@montague.lit/orchard'
    id='hf749j'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='security-info'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <content creator='initiator' name='xmlstream'>
      <security xmlns='urn:xmpp:jingle:security:xtls:0'>
        <method name='x509'/>
      </security>
    </content>
  </jingle>
</iq>

The responder acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

Parallel to the security-info exchange, the clients negotiate a transport for the Jingle session (here the transport is an in-band bytestream as defined in [XEP‑0047] (Karneges, J., “In-Band Bytestreams (IBB),” November 2006.), for which the Jingle negotiation process is specified in [XEP‑0261] (Saint-Andre, P., “Jingle In-Band Bytestreams Transport,” February 2009.); however other transports could be used, for example SOCKS5 bytestreams as defined in [XEP‑0065] (Smith, D., Miller, M., and P. Saint-Andre, “SOCKS5 Bytestreams,” May 2007.) and negotiated for Jingle as specified in [XEP‑0260] (Saint-Andre, P. and D. Meyer, “Jingle SOCKS5 Bytestreams Transport Method,” February 2009.)). Because the parties wish to establish end-to-end encryption, they do not send application data over the transport until the transport has been secured. Therefore the first data that they exchange over the transport consists of the standard four-way TLS handshake, encoded in accordance with the negotiated transport method.

Note: Each transport MUST define a specific time when both clients know that the transport is secured. When XTLS is not used, the Jingle implementation would signal to the using application that the transport is open when the session-accept is sent or received, or when connectivity checks determine media can flow over one of the transport candidates. When XTLS is used, the Jingle implementation starts a TLS handshake on the transport and signals to the using application that the transport is open only after the TLS handshake has finished successfully.

During the TLS handshake, the responder MUST take the role of the TLS server and the initiator MUST take the role of the TLS client. Because the transport is an in-band bytestream, the TLS handshake data is prepared as described in [XEP‑0047] (Karneges, J., “In-Band Bytestreams (IBB),” November 2006.) (i.e., Base64-encoded). First the initiator (acting as the TLS client) constructs a TLS ClientHello, encodes it according to IBB, and sends it to the responder.

<iq from='romeo@montague.lit/orchard'
    id='vh38s618'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='0'
        sid='vj3hs98y'>
    Base64-encoded-TLS-data
  </data>
</iq>
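
Because the TLS records travel inside XMPP stanzas rather than over a socket, the TLS engine has to be driven through in-memory buffers. The following Python sketch shows one way to obtain the ClientHello bytes and Base64-encode them for the IBB <data/> element, using the standard ssl module (ssl.MemoryBIO and SSLContext.wrap_bio). Certificate verification is disabled only to keep the sketch self-contained; the variable names are illustrative.

```python
import base64
import ssl

# Client-side context. Verification is switched off here only so the sketch
# runs without any certificates; a real XTLS client must check the peer's
# certificate (for example against the fingerprint from session-accept).
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

incoming = ssl.MemoryBIO()   # TLS bytes received from the peer go in here
outgoing = ssl.MemoryBIO()   # TLS bytes to send to the peer come out here
tls = context.wrap_bio(incoming, outgoing, server_side=False)

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # Expected: the engine has written its ClientHello and now waits for
    # the responder's ServerHello.
    pass

client_hello = outgoing.read()
# This string is what would be placed inside the IBB <data/> element.
ibb_payload = base64.b64encode(client_hello).decode("ascii")
```

Each later handshake step follows the same pattern: decoded bytes from the peer are written to incoming, do_handshake() is called again, and whatever appears in outgoing is encoded and sent.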

The responder (acting as the TLS server) then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

The responder then constructs an appropriate TLS message or messages, such as a ServerHello and a CertificateRequest.

Note: The responder MUST send a CertificateRequest to the initiator.

<iq from='juliet@capulet.lit/balcony'
    id='xyw516d0'
    to='romeo@montague.lit/orchard'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='0'
        sid='vj3hs98y'>
    Base64-encoded-TLS-data
  </data>
</iq>

(Because in-band bytestreams are bidirectional and this data is sent from the responder to the initiator, the IBB 'seq' attribute has a value of zero, not 1.)
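
The per-direction sequence numbering can be sketched as a small helper that each party keeps for its own outbound data. The class name IbbSender is hypothetical, and applying block_size to the raw bytes before Base64 encoding is an assumption of this sketch; the 16-bit wrap-around of 'seq' follows XEP-0047.

```python
import base64
from typing import Iterator, Tuple


class IbbSender:
    """Tracks the outbound IBB sequence number for one direction."""

    def __init__(self, sid: str, block_size: int = 4096) -> None:
        self.sid = sid
        self.block_size = block_size
        self.seq = 0  # every sender starts its own counter at zero

    def packets(self, data: bytes) -> Iterator[Tuple[int, str]]:
        """Yield (seq, base64 payload) pairs for the given TLS bytes."""
        for start in range(0, len(data), self.block_size):
            chunk = data[start:start + self.block_size]
            seq = self.seq
            # XEP-0047 defines seq as a 16-bit counter that wraps to zero.
            self.seq = (self.seq + 1) % 65536
            yield seq, base64.b64encode(chunk).decode("ascii")


# The initiator and the responder count independently of each other.
initiator = IbbSender(sid="vj3hs98y")
responder = IbbSender(sid="vj3hs98y")
first_from_initiator = next(initiator.packets(b"client-hello-bytes"))
first_from_responder = next(responder.packets(b"server-hello-bytes"))
```

Both first packets carry seq 0, which matches the two stanzas shown above.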

The initiator then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

After some number of TLS messages, the initiator eventually sends a TLS Finished message to the responder.

<iq from='romeo@montague.lit/orchard'
    id='s91vd527'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='3'
        sid='vj3hs98y'>
    Base64-encoded-TLS-data
  </data>
</iq>

The responder then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

The responder then also sends a TLS Finished message.

<iq from='juliet@capulet.lit/balcony'
    id='z71gs73t'
    to='romeo@montague.lit/orchard'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='3'
        sid='vj3hs98y'>
    Base64-encoded-TLS-data
  </data>
</iq>

The initiator then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

If the TLS negotiation has finished successfully, then the Jingle implementation shall signal to the using application that the transport has been secured and is ready to be used. The parties can then begin to exchange application data over the encrypted transport.




7.  End-to-End Streams over XTLS Protocol Flow

For end-to-end encryption of XMPP traffic, the application data is an end-to-end XML stream. After the XTLS session is set up, the peers open an XML stream to exchange messages. The XML streams are sent through the XTLS connection. In this example the streams are sent over TLS over IBB.

First the initiator constructs an initial stream header.

<stream:stream
        xmlns='jabber:client'
        xmlns:stream='http://etherx.jabber.org/streams'
        from='romeo@montague.lit/orchard'
        to='juliet@capulet.lit/balcony'
        version='1.0'>

Note: In accordance with [rfc3920bis] (Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” March 2009.), the initial stream header SHOULD include the 'to' and 'from' attributes, which SHOULD specify the full JIDs of the clients. The initiator SHOULD include the version='1.0' flag as shown in the previous example.

The initiator then writes the stream header to the TLS stream, encodes the resulting TLS data according to IBB, and sends it to the responder.

<iq from='romeo@montague.lit/orchard'
    id='ur73n153'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='4'
        sid='vj3hs98y'>
    Base64-TLS-data-of-the-stream-header
  </data>
</iq>

The responder then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

The responder then constructs a response stream header back to the initiator.

<stream:stream
        xmlns='jabber:client'
        xmlns:stream='http://etherx.jabber.org/streams'
        from='juliet@capulet.lit/balcony'
        id='hs91gh1836d8s717'
        to='romeo@montague.lit/orchard'
        version='1.0'>

The responder then sends the response stream header over the TLS link to the initiator.

<iq from='juliet@capulet.lit/balcony'
    id='pd61g397'
    to='romeo@montague.lit/orchard'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='4'
        sid='vj3hs98y'>
    Base64-TLS-data-of-the-response-stream-header
  </data>
</iq>

The initiator then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

Once the streams are established over the bytestreams, either entity then can send XMPP message, presence, and IQ stanzas, with or without 'to' and 'from' addresses.

For example, the initiator could construct an XMPP message.

<message from='romeo@montague.lit/orchard'
         to='juliet@capulet.lit/balcony'>
  <body>
    M&apos;lady, I would be pleased to make your acquaintance.
  </body>
</message>

The initiator then sends the message over the XTLS connection to the responder.

<iq from='romeo@montague.lit/orchard'
    id='iq7dh294'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='5'
        sid='vj3hs98y'>
    Base64-TLS-data
  </data>
</iq>

The responder then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

The responder could then construct a reply.

<message from='juliet@capulet.lit/balcony'
         to='romeo@montague.lit/orchard'>
  <body>Art thou not Romeo, and a Montague?</body>
</message>

The responder then sends the reply over the XTLS connection to the initiator.

<iq from='juliet@capulet.lit/balcony'
    id='hr91hd63'
    to='romeo@montague.lit/orchard'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='5'
        sid='vj3hs98y'>
    Base64-TLS-data
  </data>
</iq>

The initiator then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

To close the end-to-end XML stream, either party (here the responder) constructs a closing </stream:stream> element.

</stream:stream>

The client sends the closing element to the peer over the XTLS connection.

<iq from='juliet@capulet.lit/balcony'
    id='kr91n475'
    to='romeo@montague.lit/orchard'
    type='set'>
  <data xmlns='http://jabber.org/protocol/ibb'
        seq='6'
        sid='vj3hs98y'>
    Base64-TLS-data
  </data>
</iq>

The peer then acknowledges receipt by sending an IQ stanza of type "result" (not shown here).

However, even after the application-level XML stream is terminated, the negotiated Jingle transport (here in-band bytestream) continues and could be re-used. To completely terminate the Jingle session, the terminating party would then also send a Jingle session-terminate message.

<iq from='juliet@capulet.lit/balcony'
    id='psy617r4'
    to='romeo@montague.lit/orchard'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'/>
</iq>

The other party then acknowledges the Jingle session-terminate by sending an IQ stanza of type "result" (not shown here).




8.  Bootstrapping Trust on First Communication

When two parties first attempt to use XTLS, their certificates might not be accepted (e.g., because they are self-signed or issued by unknown certification authorities). Therefore each party needs to accept the other's certificate for use in future communication sessions. There are several ways to do so:

If the parties use a password or SASL channel binding to bootstrap trust, the process needs to be completed only once. After the clients have authenticated with the shared secret, they can exchange their certificates for future communication.
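
The "only once" property depends on each client remembering which certificates it has already accepted. A minimal sketch of such a store follows; the class name TrustStore, keying by bare JID, and the use of hexadecimal fingerprints are assumptions made for illustration and are not defined by this document.

```python
from typing import Dict, Set


class TrustStore:
    """Remembers which certificate fingerprints were accepted per peer."""

    def __init__(self) -> None:
        self._accepted: Dict[str, Set[str]] = {}

    def accept(self, bare_jid: str, fingerprint: str) -> None:
        """Record a fingerprint after the peer was authenticated,
        for example through a successful TLS-SRP handshake."""
        self._accepted.setdefault(bare_jid, set()).add(fingerprint.lower())

    def is_trusted(self, bare_jid: str, fingerprint: str) -> bool:
        """True if this fingerprint was accepted earlier for this peer."""
        return fingerprint.lower() in self._accepted.get(bare_jid, set())


store = TrustStore()
# First session: authenticated with a shared password, then pinned.
store.accept("juliet@capulet.lit", "AB12CD34")
# Later sessions: certificate-based authentication can be attempted
# when the offered fingerprint is already known.
known = store.is_trusted("juliet@capulet.lit", "ab12cd34")
```

Keeping a set of fingerprints per peer allows for a user who has a different certificate on each device.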




8.1.  Exchanging Certificates

To retrieve the certificate of the peer for future communications, a client SHOULD request the certificate according to [XEP‑0189] (Paterson, I., Saint-Andre, P., and D. Meyer, “Public Key Publishing,” March 2009.) over the secure connection. This works only if XTLS was used to set up an end-to-end secure XML stream; exchanging certificates is not possible if XTLS was used for other purposes such as file transfer. A client MUST NOT request the certificate over the insecure stream that is routed through the XMPP server.

<iq from='romeo@montague.lit/orchard'
    id='hf7634k4'
    to='juliet@capulet.lit/balcony'
    type='get'>
  <pubkeys xmlns='urn:xmpp:tmp:pubkey'/>
</iq>

The peer MUST return its own client certificate. If the user has different clients with different client certificates and one user certificate, the user certificate SHOULD also be returned. The user certificate allows the requesting client to verify the user's other client certificates using the public key retrieval described in [XEP‑0189] (Paterson, I., Saint-Andre, P., and D. Meyer, “Public Key Publishing,” March 2009.).

<iq from='juliet@capulet.lit/balcony'
    id='hf7634k4'
    to='romeo@montague.lit/orchard'
    type='result'>
  <pubkeys xmlns='urn:xmpp:tmp:pubkey'>
    <keyinfo>
      <x509cert>
MIICCTCCAXKgAwIBAgIJALhU0Id6xxwQMA0GCSqGSIb3DQEBBQUAMA4xDDAKBgNV
BAMTA2ZvbzAeFw0wNzEyMjgyMDA1MTRaFw0wODEyMjcyMDA1MTRaMA4xDDAKBgNV
BAMTA2ZvbzCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEA0DPcfeJzKWLGE22p
RMINLKr+CxqozF14DqkXkLUwGzTqYRi49yK6aebZ9ssFspTTjqa2uNpw1U32748t
qU6bpACWHbcC+eZ/hm5KymXBhL3Vjfb/dW0xrtxjI9JRFgrgWAyxndlNZUpN2s3D
hKDfVgpPSx/Zp8d/ubbARxqZZZkCAwEAAaNvMG0wHQYDVR0OBBYEFJWwFqmSRGcx
YXmQfdF+XBWkeML4MD4GA1UdIwQ3MDWAFJWwFqmSRGcxYXmQfdF+XBWkeML4oRKk
EDAOMQwwCgYDVQQDEwNmb2+CCQC4VNCHesccEDAMBgNVHRMEBTADAQH/MA0GCSqG
SIb3DQEBBQUAA4GBAIhlUeGZ0d0msNVxYWAXg2lRsJt9INHJQTCJMmoUeTtaRjyp
ffJtuopguNNBDn+MjrEp2/+zLNMahDYLXaTVmBf6zvY0hzB9Ih0kNTh23Fb5j+yK
QChPXQUo0EGCaODWhfhKRNdseUozfNWOz9iTgMGw8eYNLllQRL//iAOfOr/8
      </x509cert>
    </keyinfo>
  </pubkeys>
</iq>
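
As an illustration only, the following Python sketch shows one way a client might pull the certificates out of such a result stanza. The function names extract_certificates and fingerprint are hypothetical and not part of this specification or of [XEP-0189]; the namespace is the one used in the examples above, and the SHA-256 fingerprint is merely an assumed convenience for showing a certificate to the user.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

# Namespace as it appears in the example stanzas above.
PUBKEY_NS = "urn:xmpp:tmp:pubkey"


def extract_certificates(iq_xml: str) -> list:
    """Return the DER-encoded certificates carried in a pubkeys result.

    Hypothetical helper: raises ValueError if the stanza is not a result.
    """
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        raise ValueError("expected an IQ stanza of type 'result'")
    certs = []
    for node in iq.iter(f"{{{PUBKEY_NS}}}x509cert"):
        # Strip line breaks and indentation before decoding the base64 text.
        b64 = "".join((node.text or "").split())
        certs.append(base64.b64decode(b64, validate=True))
    return certs


def fingerprint(der: bytes) -> str:
    """Hex SHA-256 digest of a DER certificate (an assumed display format)."""
    return hashlib.sha256(der).hexdigest()
```

The sketch only decodes the certificates; validating them and deciding whether to store them for future sessions is left to the client.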



8.2.  Verification of Non-Human Parties

If one of the parties is a "bot" (e.g., an automated service or a device such as a set-top box), the password exchange is more complicated. If the user has access to both clients at the same time, the process is similar to Bluetooth pairing. One of the following scenarios might apply:

A user might have a different X.509 certificate for each device. Public Key Publishing [XEP-0189] can be used to manage the user's certificates. A client SHOULD check the peer's PubSub node for certificates. This makes it possible to use the password method only once between two users, even if one or both users switch clients. A user can also communicate with a friend's bots: the two users first open a secure link between two chat clients with a password and exchange their user certificates. After that, each device of one user can verify all devices of the other without the need for a password.

The certificate retrieved from the PubSub node might be signed by a CA that the client can verify. In that case the client MAY skip the password authentication and rely on the X.509 certificate chain. The client SHOULD ask the user whether the certificate should be accepted or whether a password exchange is desired.
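
The decision described in this section can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not normative behavior: the names Step and next_step are hypothetical, and the rule that a previously accepted certificate needs no further prompting is an assumption drawn from the bootstrapping text in Section 8.

```python
from enum import Enum


class Step(Enum):
    ACCEPT = "accept"      # certificate was accepted in an earlier session
    ASK_USER = "ask-user"  # chain verifies; user chooses accept or password
    PASSWORD = "password"  # no usable chain; fall back to the password method


def next_step(already_accepted: bool, chain_verifies: bool) -> Step:
    """Pick the next trust-bootstrapping step for a peer certificate."""
    if already_accepted:
        return Step.ACCEPT
    if chain_verifies:
        # The client MAY skip the password, but SHOULD ask the user first.
        return Step.ASK_USER
    return Step.PASSWORD
```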




9.  Session Termination

If either client cannot verify the certificate of the peer or receives an invalid message on the TLS layer, it MUST terminate the Jingle session immediately by sending a Jingle session-terminate message that includes a Jingle reason of <security-error/>.

<iq from='romeo@montague.lit/orchard'
    id='hz81vf48'
    to='juliet@capulet.lit/balcony'
    type='set'>
  <jingle xmlns='urn:xmpp:jingle:0'
          action='session-terminate'
          initiator='romeo@montague.lit/orchard'
          sid='a73sjjvkla37jfea'>
    <reason><security-error/></reason>
  </jingle>
</iq>

The other party then acknowledges the session-terminate by sending an IQ stanza of type "result" (not shown here), and the Jingle session is finished.
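
For illustration, a client might assemble the session-terminate stanza shown above along the following lines. The function name security_terminate and its parameters are hypothetical; only the element names, attributes, and namespace are taken from the example.

```python
from xml.sax.saxutils import quoteattr

# Jingle namespace as used in the example stanzas.
JINGLE_NS = "urn:xmpp:jingle:0"


def security_terminate(from_jid: str, to_jid: str, initiator: str,
                       sid: str, stanza_id: str) -> str:
    """Build a Jingle session-terminate IQ with a <security-error/> reason.

    quoteattr() escapes and quotes each attribute value, so JIDs or
    identifiers containing markup characters cannot break the stanza.
    """
    return (
        f"<iq from={quoteattr(from_jid)} id={quoteattr(stanza_id)} "
        f"to={quoteattr(to_jid)} type='set'>"
        f"<jingle xmlns={quoteattr(JINGLE_NS)} action='session-terminate' "
        f"initiator={quoteattr(initiator)} sid={quoteattr(sid)}>"
        "<reason><security-error/></reason>"
        "</jingle></iq>"
    )
```

A client would send the returned string over its XML stream as soon as certificate verification fails or an invalid TLS message is received, and then discard any state associated with the session.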




10.  Determining Support

If an entity wishes to request the use of XTLS, it SHOULD first determine whether the intended responder supports the protocol. This can be done directly via Service Discovery [XEP-0030] or indirectly via Entity Capabilities [XEP-0115].

If an entity supports XTLS, it MUST report that by including a service discovery feature of "urn:xmpp:jingle:security:xtls:0" in response to disco#info requests.

<iq from='romeo@montague.lit/orchard'
    id='disco1'
    to='juliet@capulet.lit/chamber'
    type='get'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>

<iq from='juliet@capulet.lit/chamber'
    id='disco1'
    to='romeo@montague.lit/orchard'
    type='result'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='urn:xmpp:jingle:security:xtls:0'/>
    <feature var='urn:xmpp:jingle:apps:xmlstream:0'/>
  </query>
</iq>

Both service discovery and entity capabilities information could be corrupted or intercepted; for details, see Section 11.3 (Denial of Service).
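
As a minimal sketch, a requesting entity might inspect a disco#info result for the XTLS feature as shown below. The function name supports_xtls is hypothetical; the feature and query namespaces are those given in this section. Because the result can be tampered with in transit, a negative answer is not proof that the peer lacks support.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
XTLS_FEATURE = "urn:xmpp:jingle:security:xtls:0"


def supports_xtls(disco_result_xml: str) -> bool:
    """Return True if a disco#info result advertises the XTLS feature."""
    iq = ET.fromstring(disco_result_xml)
    if iq.get("type") != "result":
        return False
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    # Collect the 'var' attribute of every advertised feature.
    features = {
        feature.get("var")
        for feature in query.findall(f"{{{DISCO_INFO_NS}}}feature")
    }
    return XTLS_FEATURE in features
```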




11.  Security Considerations

This entire document addresses security. Particular security-related issues are discussed in the following sections.




11.1.  Mandatory-to-Implement Technologies

An implementation MUST at a minimum support the "srp" and "x509" methods. A future version of this specification will document mandatory-to-implement TLS cipher suites.




11.2.  Certificates

As noted, XTLS can be used between XMPP clients, between an XMPP client and a remote XMPP service (i.e., a service with which a client does not have a direct XML stream), or between remote XMPP services. Therefore, a party to an XTLS bytestream will present either a client certificate or a server certificate as appropriate. Such certificates MUST be generated and validated in accordance with the certificate guidelines provided in [rfc3920bis].

A future version of this specification might provide additional guidelines regarding certificate validation in the context of client-to-client encryption.




11.3.  Denial of Service

Currently, XMPP stanzas such as Jingle negotiation messages and service discovery exchanges are not encrypted or signed. As a result, an attacker can intercept and modify these stanzas, convincing one party that the other party does not support XTLS and thereby denying the parties an opportunity to use XTLS.

This is a more general problem with XMPP technologies and needs to be addressed at the core XMPP layer.




12.  IANA Considerations

It might be helpful to create a registry of TLS methods that can be used in the context of XTLS (e.g., "openpgp" for use of [RFC5081], "srp" for use of [TLS-SRP], and "x509" for use of [TLS] with certificates). The registry could be maintained by the IANA or by the XMPP Registrar. A future version of this specification will provide more detailed information about the registration requirements.




13.  References




13.1. Normative References

[rfc3920bis] Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Core,” draft-saintandre-rfc3920bis-09 (work in progress), March 2009.
[TERMS] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[TLS] Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” RFC 5246, August 2008.
[XEP-0047] Karneges, J., “In-Band Bytestreams (IBB),” XSF XEP 0047, November 2006.
[XEP-0166] Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan, S., and J. Hildebrand, “Jingle,” XSF XEP 0166, December 2008.



13.2. Informative References

[DTLS] Rescorla, E. and N. Modadugu, “Datagram Transport Layer Security,” RFC 4347, April 2006.
[DTLS-SRTP] McGrew, D. and E. Rescorla, “Datagram Transport Layer Security (DTLS) Extension to Establish Keys for Secure Real-time Transport Protocol (SRTP),” draft-ietf-avt-dtls-srtp-07 (work in progress), February 2009.
[HTTP-TLS] Rescorla, E., “HTTP Over TLS,” RFC 2818, May 2000.
[RFC3552] Rescorla, E. and B. Korver, “Guidelines for Writing RFC Text on Security Considerations,” BCP 72, RFC 3552, July 2003.
[rfc3921bis] Saint-Andre, P., “Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence,” draft-saintandre-rfc3921bis-07 (work in progress), October 2008.
[RFC3923] Saint-Andre, P., “End-to-End Signing and Object Encryption for the Extensible Messaging and Presence Protocol (XMPP),” RFC 3923, October 2004.
[RFC5056] Williams, N., “On the Use of Channel Bindings to Secure Channels,” RFC 5056, November 2007.
[RFC5081] Mavrogiannopoulos, N., “Using OpenPGP Keys for Transport Layer Security (TLS) Authentication,” RFC 5081, November 2007.
[TLS-SRP] Taylor, D., Wu, T., Mavrogiannopoulos, N., and T. Perrin, “Using the Secure Remote Password (SRP) Protocol for TLS Authentication,” RFC 5054, November 2007.
[SCRAM] Menon-Sen, A., Melnikov, A., and C. Newman, “Salted Challenge Response (SCRAM) SASL Mechanism,” draft-newman-auth-scram-10 (work in progress), February 2009.
[XEP-0027] Muldowney, T., “Current Jabber OpenPGP Usage,” XSF XEP 0027, November 2006.
[XEP-0030] Hildebrand, J., Millard, P., Eatmon, R., and P. Saint-Andre, “Service Discovery,” XSF XEP 0030, June 2008.
[XEP-0045] Saint-Andre, P., “Multi-User Chat,” XSF XEP 0045, July 2008.
[XEP-0060] Millard, P., Saint-Andre, P., and R. Meijer, “Publish-Subscribe,” XSF XEP 0060, September 2008.
[XEP-0065] Smith, D., Miller, M., and P. Saint-Andre, “SOCKS5 Bytestreams,” XSF XEP 0065, May 2007.
[XEP-0115] Hildebrand, J., Saint-Andre, P., Tronçon, R., and J. Konieczny, “Entity Capabilities,” XSF XEP 0115, February 2008.
[XEP-0160] Saint-Andre, P., “Best Practices for Handling Offline Messages,” XSF XEP 0160, January 2006.
[XEP-0167] Ludwig, S., Saint-Andre, P., Egan, S., McQueen, R., and D. Cionoiu, “Jingle RTP Sessions,” XSF XEP 0167, December 2008.
[XEP-0189] Paterson, I., Saint-Andre, P., and D. Meyer, “Public Key Publishing,” XSF XEP 0189, March 2009.
[XEP-0210] Paterson, I., “Requirements for Encrypted Sessions,” XSF XEP 0210, May 2007.
[XEP-0218] Saint-Andre, P. and I. Paterson, “Bootstrapping Implementation of Encrypted Sessions,” XSF XEP 0218, May 2007.
[XEP-0234] Saint-Andre, P., “Jingle File Transfer,” XSF XEP 0234, February 2009.
[XEP-0260] Saint-Andre, P. and D. Meyer, “Jingle SOCKS5 Bytestreams Transport Method,” XSF XEP 0260, February 2009.
[XEP-0261] Saint-Andre, P., “Jingle In-Band Bytestreams Transport,” XSF XEP 0261, February 2009.



Appendix A.  XML Schema

The XML schema will be provided in a later version of this document.




Appendix B.  Copying Conditions

Regarding this entire document or any portion of it, the authors make no guarantees and are not responsible for any damage resulting from its use. The authors grant irrevocable permission to anyone to use, modify, and distribute it in any way that does not diminish the rights of anyone else to use, modify, and distribute it, provided that redistributed derivative works do not contain misleading author or version information. Derivative works need not be licensed under similar terms.




Authors' Addresses

  Dirk Meyer
  Universitaet Bremen TZI
  Email: dmeyer@tzi.de

  Peter Saint-Andre
  Cisco
  Email: psaintan@cisco.com