Internet-Draft AVORS July 2024
Schott, et al. Expires 6 January 2025
Workgroup: ART
Internet-Draft: draft-schott-sip-avors-01
Published: July 2024
Intended Status: Informational
Expires: 6 January 2025
Authors:
R. Schott
Deutsche Telekom
M. Kreipl
Deutsche Telekom
B. Dreyer
Deutsche Telekom
R. Jesske
Deutsche Telekom

Avoiding Registration Storms by adapted Registration Behavior for Voice Cloud Applications

Abstract

This document describes the AVORS (Avoiding Registration Storms) concept, which allows active registrations to be resumed. The concept can be mapped onto any architecture with a distributed structure and could work for different protocols. It is explained here by way of example for a voice architecture and is mapped onto a 3GPP (3rd Generation Partnership Project) architecture, where it allows active UE (User Equipment) registrations to be resumed on other outbound proxies (P-CSCFs) within the SIP voice architecture. The AVORS concept increases service continuity, improves network resilience, and offers savings potential. Additionally, this document gives an outlook on stateless voice architectures, load calculation aspects, and Service Based Interfaces (SBI) in the context of data base interworking. Security aspects are covered in the security section.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 6 January 2025.

1. Introduction

The AVORS ("Avoiding Registration Storms") concept, applied to SIP and the IMS (IP Multimedia Subsystem), helps, as the name suggests, to reduce registration storms after an outage, especially a site outage. Today, registration storms are prevented by overcapacity. This overcapacity is either used for other applications, where available, or it idles until an outage occurs. Idle capacity incurs electricity costs even with intelligent power management.

In stateless architectures the registration context is stored in a session data base that normally all instances can access. According to [TS_23.228], the Service Based Architecture (SBA) and Service Based Interfaces (SBI) offer, in principle, access to the session data base for cloud-based IMS applications. Under the currently standardized registration behavior, the IMS UE (User Equipment) MUST initiate a new initial registration after such an outage. This registration needs to pass the outbound proxy (P-CSCF) and the registrar (S-CSCF) before reaching the data base.

With the AVORS principle, the outbound proxy (P-CSCF) performs a dip into the data base, recognizes that a UE is already registered, and is able to resume the registration. Resuming a registration is feasible because the registration context can be stored in a session data base. Instead of sending an initial registration after an outage, the UE sends a re-register message to the secondary outbound proxy, i.e., the failover P-CSCF. The failover P-CSCF retrieves the session information from the session data base and resumes the registration without sending the message via the registrar (S-CSCF). This works especially well when the registrar (S-CSCF) is fully stateless, and it reduces the number of messages sent in a failover scenario. The idea of resuming a registration or a session also exists in protocols other than SIP, e.g., TLS [RFC5246] [RFC8446].

AVORS uses the idea of session resumption for SIP, via a data base dip from the P-CSCF, as an alternative approach to optimize registration behavior, especially in case of heavy outages and registration storms. The mechanism does not obsolete the classical registration behavior; it is complementary. UEs or end devices can run in either classical or AVORS mode, and SIP core or IMS core systems can implement both modes. The AVORS mode also works when the failover P-CSCF (secondary outbound proxy) uses a different IP address than the primary one. The mechanism can be combined with TLS resumption in the case of a wireline Residential Gateway (RG). In a wireless or mobile context, where IPsec is used for authentication, SIP registration resumption works similarly to the wireline case.

The aim of this document is to specify how resumption of a SIP registration works in combination with a session data base. The focus is on registrations and on recovery time in case of outages, e.g., site outages. Full session resumption, including resumption of media streams, needs to be analyzed in additional work. The principle is described above for the geo-redundancy use case; it also works for locally redundant instances, or for a combination of both.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Rationale of the usage of AVORS mode for SIP sessions

Using the AVORS mode for SIP sessions reduces the overcapacity needed within the IMS core or voice systems, because fewer messages flow when the P-CSCF is able to resume the registration directly. The approach fits into the 3GPP SBA and the introduction of a session data base in the context of 5G. The recovery time of the system is also shortened, which benefits the users. In the future, cloud-based SIP applications will run as virtual instances or in containers and will have a context data base anyway. Therefore, it is reasonable to provide an additional registration or re-registration mechanism when migrating SIP or IMS systems into the cloud.
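
To make the load argument concrete, the following sketch compares the average message rate on the interface between the UEs and a failover P-CSCF for the classical and the AVORS case. It only counts messages on that interface and ignores the additional traffic towards the S-CSCF in the classical case. All input values (number of UEs, recovery window) are hypothetical and chosen purely for illustration; the function name average_message_rate is likewise an assumption of this sketch.

```python
def average_message_rate(num_ues: int, window_seconds: float, messages_per_ue: int) -> float:
    """Average messages per second if num_ues recover evenly over window_seconds."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return num_ues * messages_per_ue / window_seconds


# Hypothetical inputs, not taken from any measurement.
NUM_UES = 600_000
WINDOW_SECONDS = 600.0   # recovery spread over 10 minutes (assumption)
CLASSICAL_MESSAGES = 4   # REGISTER, 401, REGISTER, 200 OK (challenged initial registration)
AVORS_MESSAGES = 2       # re-REGISTER, 200 OK (resumed at the P-CSCF)

classical = average_message_rate(NUM_UES, WINDOW_SECONDS, CLASSICAL_MESSAGES)
avors = average_message_rate(NUM_UES, WINDOW_SECONDS, AVORS_MESSAGES)
print(f"classical: {classical:.0f} msg/s, AVORS: {avors:.0f} msg/s")
```

Under these assumed inputs the AVORS case halves the message rate at the failover P-CSCF; real savings depend on the deployment and on how many core elements a classical registration traverses.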

Regarding the UE there are two paths possible:

Depending on the operator's installed base or operational requirements, both options are valid approaches. If the UE indicates that it operates in AVORS mode, an AVORS header needs to be specified. A suggestion for what such a header or parameter could look like is described in Annex A.

4. Architecture Overview of AVORS Concept

This section gives an overview of the registration resumption concept for SIP sessions. As stated above, the classical reaction to a site failure or a P-CSCF failure is to start a new initial registration towards the secondary outbound proxy (P-CSCF). This registration process involves the P-CSCF, the S-CSCF, and the data base. With AVORS, the P-CSCF is able to resume a registration by handling a normal re-register message from the UE. Instead of starting a new initial registration, the UE sends the normal re-registration message to the secondary outbound proxy (P-CSCF). The P-CSCF retrieves the registration context from the data base in which the session context is stored, i.e., the session data base. If TLS is used, session resumption can be implemented for both protocols, TLS and SIP, making the recovery and failover process more efficient. The spare capacity needed for processing initial registrations can be reduced while the recovery mechanism remains efficient. This document does not aim to investigate TCP session resumption.
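
The decision a failover P-CSCF takes on a re-register can be sketched as follows. This is a minimal illustration under stated assumptions, not a specified procedure: the class SessionDataBase is an in-memory stand-in for the shared session data base, the fields of RegistrationContext and the function name handle_re_register are hypothetical, and checking Contact, Call-ID, and CSeq at the P-CSCF simply mirrors the UE requirements listed in Annex A.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RegistrationContext:
    """Registration state written at initial registration (fields are assumptions)."""
    contact: str
    call_id: str
    cseq: int
    expires_at: float  # absolute expiry time in seconds


class SessionDataBase:
    """In-memory stand-in for the shared session data base."""

    def __init__(self) -> None:
        self._contexts: Dict[str, RegistrationContext] = {}

    def store(self, aor: str, context: RegistrationContext) -> None:
        self._contexts[aor] = context

    def lookup(self, aor: str) -> Optional[RegistrationContext]:
        return self._contexts.get(aor)


def handle_re_register(db: SessionDataBase, aor: str, contact: str,
                       call_id: str, cseq: int, now: float) -> str:
    """Return "resumed" if the stored context allows resumption,
    otherwise "initial-registration-required" (classical behavior)."""
    context = db.lookup(aor)
    if context is None or context.expires_at <= now:
        return "initial-registration-required"
    unchanged = context.contact == contact and context.call_id == call_id
    if not unchanged or cseq <= context.cseq:
        return "initial-registration-required"
    context.cseq = cseq  # remember the highest CSeq seen so far
    return "resumed"
```

Falling back to the classical initial registration whenever the lookup fails keeps the sketch complementary to the existing behavior, as described above.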

The AVORS architecture is described in the following figures. Figure 1 describes the geo-redundancy use case.

Initial register in case of an initial registration or
from time-to-time for security or data base cleaning.

                    +---------+
   +-------+        |         |
   |       |        | P-CSCF  |
   |  DNS  |        | Site #1 |
   |       |       /|         | \
   +-------+      / +---------+  \
       |         /                \
       |        /                  \
       |       / sip                \
       |      /  initial           +---------+
   +-------+ /   register          |         |
   |       |/                      |  Data   |
   |  UE   |                       |  Base   |
   |       |                       |         |
   +-------+                       +---------+
                                   /
                    +---------+   /
                    |         |  /
                    | P-CSCF  | /
                    | Site #2 |/
                    |         |
                    +---------+

Registration resumption in case of a failover to
the secondary outbound proxy or P-CSCF respectively.

                     \     /
                    +---------+
   +-------+        |  \ /    |
   |       |        | P-CSCF  |
   |  DNS  |        | Site #1 |
   |       |        | /    \  | \
   +-------+        +---------+  \
       |            /        \    \
       |                           \
       |                            \
       |                           +---------+
   +-------+                       |         |
   |       |                       |  Data   |
   |  UE   |\                      |  Base   |
   |       | \                     |         |
   +-------+  \                    +---------+
               \                   /
                \   +---------+   /
      sip        \  |         |  /
      registration\ | P-CSCF  | /
      resumption   \| Site #2 |/
                    |         |
                    +---------+
Figure 1: AVORS Geo-Redundancy.

Figure 2 describes the local-redundancy use case.

Initial register in case of an initial registration or
from time-to-time for security or data base cleaning.

                    +---------+
   +-------+        |         |
   |       |        | P-CSCF  |
   |  DNS  |        | Inst.#1 |
   |       |       /|         | \
   +-------+      / +---------+  \
       |         /                \
       |        /                  \
       |       / sip                \
       |      /  initial           +---------+
   +-------+ /   register          |  Local  |
   |       |/                      |  Data   |
   |  UE   |                       |  Base   |
   |       |                       |         |
   +-------+                       +---------+
                                   /
                    +---------+   /
                    |         |  /
                    | P-CSCF  | /
                    | Inst.#2 |/
                    |         |
                    +---------+

Registration resumption in case of a failover to
the secondary local P-CSCF instance.

                     \     /
                    +---------+
   +-------+        |  \ /    |
   |       |        | P-CSCF  |
   |  DNS  |        | Inst.#1 |
   |       |        | /    \  | \
   +-------+        +---------+  \
       |            /        \    \
       |                           \
       |                            \
       |                           +---------+
   +-------+                       |  Local  |
   |       |                       |  Data   |
   |  UE   |\                      |  Base   |
   |       | \                     |         |
   +-------+  \                    +---------+
               \                   /
                \   +---------+   /
      sip        \  |         |  /
      registration\ | P-CSCF  | /
      resumption   \| Inst.#2 |/
                    |         |
                    +---------+
Figure 2: AVORS Local-Redundancy.

5. Functions of Registration Resumption Feature

This document focuses on introducing registration resumption in a SIP (Session Initiation Protocol) environment. The method can be used for SIP proxies and SIP registrars or, in 3GPP IMS (IP Multimedia Subsystem) terms, for a P-CSCF and S-CSCF architecture. It also works in the context of the 3GPP SBA (Service Based Architecture). In a second step, a similar, more advanced mechanism could be considered for complete session resumption. The procedures and functions described here are part of a stateless voice architecture and are suitable for use in a cloud environment.

7. Security and operational considerations

Registration or session resumption makes security measures necessary to prevent unauthenticated and unauthorized access to the platform. Security can be hardened by combining SIP session resumption with TLS session resumption. The AVORS mechanism helps to speed up recovery of the platform in special failure situations. It is up to the implementation to request a new initial registration after a longer time interval; such a mechanism would increase security. With TCP or TLS, IP address spoofing is difficult or impossible to achieve. With UDP, a nonce and next-nonce mechanism with a short re-registration timer ensures security. This also holds when AVORS is used.
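
The two safeguards mentioned above, a rotating nonce and a forced initial registration after a longer interval, can be sketched together as follows. The class name NonceChain, its fields, and the age limit are hypothetical and only illustrate the idea; how the nonce is carried in SIP and where this state is stored are implementation decisions outside this sketch.

```python
import hmac
import secrets
from typing import Optional


class NonceChain:
    """Tracks the nonce expected on the next re-register (illustrative only)."""

    def __init__(self, max_age_seconds: float, created_at: float) -> None:
        self.max_age_seconds = max_age_seconds
        self.created_at = created_at  # time of the last initial registration
        self.expected_nonce = secrets.token_hex(16)

    def verify_and_rotate(self, presented_nonce: str, now: float) -> Optional[str]:
        """Return the next nonce on success, or None if a new initial
        registration should be requested instead of a resumption."""
        if now - self.created_at >= self.max_age_seconds:
            return None  # context is too old; force an initial registration
        matches = hmac.compare_digest(presented_nonce.encode(),
                                      self.expected_nonce.encode())
        if not matches:
            return None  # unknown or replayed nonce
        self.expected_nonce = secrets.token_hex(16)
        return self.expected_nonce
```

Because each accepted nonce is replaced immediately, a captured re-register cannot be replayed, and the age check bounds how long a registration can be resumed without a full challenge.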

Other security considerations will be addressed in future versions of the document.

8. Abbreviations

Table 4
IAD     Integrated Access Device
P-CSCF  Proxy Call Session Control Function
RG      Residential Gateway
S-CSCF  Serving Call Session Control Function
UE      User Equipment

9. Annex A

AVORS (Avoidance of Registration Storms): The following sections describe the procedures for AVORS, which allow a seamless switchover of an IAD in case of a faulty connection towards the first SIP proxy to which the IAD is connected. When changing the SIP proxy, a simple (re-)REGISTER is sufficient to reconnect instead of a challenged initial registration procedure.

The following SIP option tag shall apply: This amendment specifies a single option tag, avors. The required information for this registration, as specified in [RFC3261], is:

  Name: avors

Description: This option tag is for the procedure used to send a re-register instead of a register when changing the first network proxy due to network failure/proxy failure. To allow this procedure the network will indicate if this procedure is implemented.

The next part describes a possible process. The UE is requested to include the SIP Instance-ID in the Contact header field. For TCP-based transport, TFO (TCP Fast Open) according to [RFC7413] shall be supported; this is relevant for the P-CSCF instances. For TLS-based transport, TLS session resumption according to [RFC8446] is used at the failover P-CSCF instance. Additionally, the "avors" option tag is introduced in order to query P-CSCF support for the Registration Recovery Procedure, i.e., registration resumption.
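
Handling of the "avors" option tag in the Supported header field can be sketched as follows. The helper names are hypothetical, headers are modeled as a plain dictionary with one value per header field, and the compact header form is not handled; a real SIP stack would use its own header API.

```python
from typing import Dict, List

AVORS_TAG = "avors"


def option_tags(headers: Dict[str, str], name: str = "Supported") -> List[str]:
    """Return the lower-cased option tags of a comma-separated header field."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return [tag.strip().lower() for tag in value.split(",") if tag.strip()]
    return []


def add_avors_support(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with "avors" listed in Supported."""
    tags = option_tags(headers)
    if AVORS_TAG not in tags:
        tags.append(AVORS_TAG)
    updated = {k: v for k, v in headers.items() if k.lower() != "supported"}
    updated["Supported"] = ", ".join(tags)
    return updated


def peer_supports_avors(response_headers: Dict[str, str]) -> bool:
    """True if the 200 OK lists "avors" in its Supported header field."""
    return AVORS_TAG in option_tags(response_headers)
```

A UE would call add_avors_support when building a REGISTER request and peer_supports_avors on the 200 OK to decide whether registration resumption may be used later.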

The following procedures shall apply:
1.  The UE determines a P-CSCF instance by a standard P-CSCF
    discovery mechanism.

2.  The UE performs an initial SIP registration at the P-CSCF.
    a.  In addition, the UE sends the option tag "avors" in the
        Supported header field of every SIP REGISTER request.
    b.  The UE expects the option tag "avors" in the Supported
        header field of the 200 OK response to a SIP registration
        from the P-CSCF. If the P-CSCF supports AVORS, it includes
        the option tag "avors" in the Supported header field of
        the 200 OK.

3.  For all cases that require the UE to change to a different
    P-CSCF instance after a registration was successfully
    negotiated, the following behavior applies:
    Note: For AVORS, a re-register on a new P-CSCF is
    considered a new request in an existing dialog.
    Note: The Contact header field may contain no port number
    or a port number according to [RFC3261].
    For UEs supporting AVORS, the Contact header field must not
    be changed on a re-register to a new P-CSCF.
    Note: It is requested that the UE supports the SIP Instance-ID
    and includes it in the Contact header field.
    Note: For AVORS, any re-register sent to a new P-CSCF
    MUST also follow the re-registration procedures regarding
    commitment to the nonce, retaining the Call-ID, and increasing
    the CSeq by at least 1.
    a.  If TLS is used as a transport protocol:
        i.   If the TCP session used by the TLS transport fails and
             "Timer F" has not expired, the UE shall not send a
             re-register message to the secondary P-CSCF until the
             re-registration timer has expired. This avoids a mass
             registration at the secondary P-CSCF; the
             re-registrations are spread over the UE re-registration
             time of a defined value of e.g., x min.
        ii.  In case of a failed TCP session, the UE shall attempt
             a TLS session resumption according to [RFC8446] on the
             new P-CSCF instance using the TLS session data
             obtained from the initial handshake with the original
             P-CSCF instance.
             The UE shall delete the TLS session data obtained
             during the initial TLS handshake with the original
             P-CSCF from its internal memory when a new initial
             register for this contact has to be executed, when
             the timers belonging to the TLS session have expired,
             or when the TLS resumption has failed.
        iii. The UE will send a re-register request to the new P-CSCF
             instance instead of an initial register message.
    b.  If UDP is used between UE and P-CSCF:
        i.   If a SIP message is not answered or a keepalive
             fails, the UE will send a re-register request to the
             new P-CSCF instance instead of an initial register
             message.
        ii.  The UE shall not send a re-register message to the
             secondary P-CSCF until the re-registration timer has
             expired. This avoids a mass registration at the
             secondary P-CSCF; the re-registrations are spread over
             the UE re-registration time of a defined value of
             e.g., x min.
    c.  If TCP is used between UE and P-CSCF:
        i.   If the TCP session, used with or without TLS, fails
             and "Timer F" has not expired, the UE shall not send
             a re-register message to the secondary P-CSCF until
             the re-registration timer has expired. This avoids a
             mass registration at the secondary P-CSCF; the
             re-registrations are spread over the UE
             re-registration time of a defined value of x min.
        ii.  In case of a failed TCP session, the UE shall attempt
             a TCP session resumption (TCP Fast Open (TFO))
             according to [RFC7413] on a new P-CSCF instance using
             the TFO session cookie obtained from the initial
             handshake with the original P-CSCF instance.
        iii. The UE shall delete the TFO session cookie obtained
             during the initial TCP handshake with the original
             P-CSCF from its internal memory when the user
             de-registers or gets de-registered by the network,
             when the timers belonging to the TFO session cookie
             have expired, or when the TCP session resumption has
             failed.
        iv.  The UE will send a re-register request to the new
             P-CSCF instance instead of an initial register
             message.

4.  Optional "Re-register on not answered INVITE message"
    i.   If an INVITE message to the primary P-CSCF receives no
         response, the UE shall send a re-register to the secondary
         P-CSCF and, after receiving a 200 OK, shall send the
         INVITE message to the secondary P-CSCF.
         Note: Upon receiving a 503 (Service Unavailable) response
         containing a Retry-After header field to an initial INVITE
         request, the originating UE shall not automatically
         reattempt the request until the period indicated by the
         Retry-After header field has elapsed.

5.  UE behavior if the P-CSCF doesn't support AVORS
    i.   If the P-CSCF doesn't include the option tag "avors" in
         the Supported header field of the 200 OK to a SIP
         registration, an initial registration has to be performed
         when switching to a new P-CSCF IP address (no change to
         current behavior).
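
The UE side of the procedures above can be sketched as follows. This is an illustration under stated assumptions rather than a normative algorithm: the type and function names are hypothetical, the transport-specific steps (TLS resumption, TFO) are left out, and using a fresh Call-ID with CSeq 1 for the classical fallback is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UeRegistration:
    contact: str
    call_id: str
    cseq: int
    avors_negotiated: bool  # "avors" was present in Supported of the 200 OK


@dataclass(frozen=True)
class RegisterRequest:
    target: str
    contact: str
    call_id: str
    cseq: int
    initial: bool


def seconds_until_failover_register(re_registration_due: float, now: float) -> float:
    """The UE waits until its re-registration timer expires, which spreads
    the load on the secondary P-CSCF over the re-registration interval."""
    return max(0.0, re_registration_due - now)


def build_failover_register(reg: UeRegistration, new_pcscf: str,
                            fresh_call_id: str) -> RegisterRequest:
    """Build the REGISTER sent to the new P-CSCF after a failover."""
    if reg.avors_negotiated:
        # AVORS: unchanged Contact, same Call-ID, CSeq increased by at least 1.
        return RegisterRequest(new_pcscf, reg.contact, reg.call_id,
                               reg.cseq + 1, initial=False)
    # No AVORS support: classical initial registration (step 5).
    return RegisterRequest(new_pcscf, reg.contact, fresh_call_id, 1, initial=True)
```

For example, a UE with a negotiated AVORS registration at CSeq 7 would send a re-register with CSeq 8 and the unchanged Contact and Call-ID to the secondary P-CSCF once its re-registration timer has expired.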

10. IANA Considerations

TBD

11. Acknowledgements

This work has been supported by various contributors. Special thanks to TBD.

12. References

12.1. Normative References

[RFC2246]
Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC 2246, DOI 10.17487/RFC2246, January 1999, <https://www.rfc-editor.org/info/rfc2246>.
[RFC3261]
Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, DOI 10.17487/RFC3261, June 2002, <https://www.rfc-editor.org/info/rfc3261>.
[RFC5246]
Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, DOI 10.17487/RFC5246, August 2008, <https://www.rfc-editor.org/info/rfc5246>.
[RFC6223]
Holmberg, C., "Indication of Support for Keep-Alive", RFC 6223, DOI 10.17487/RFC6223, April 2011, <https://www.rfc-editor.org/info/rfc6223>.
[RFC7413]
Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014, <https://www.rfc-editor.org/info/rfc7413>.
[RFC8446]
Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/info/rfc8446>.

12.2. Informative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[TS_23.228]
"IP Multimedia Subsystem (IMS); Stage 2".
[TS_24.224]
"IP Multimedia Call Control Protocol based on Session Initiation Protocol (SIP) and Session Description Protocol (SDP); Stage 3".
[TS_29.598]
"tbd."

Authors' Addresses

Roland Schott
Deutsche Telekom
Ida-Rhodes-Str. 2
64295 Darmstadt
Germany
Michael Kreipl
Deutsche Telekom
90441 Nürnberg
Germany
Bastian Dreyer
Deutsche Telekom
20359 Hamburg
Germany
Roland Jesske
Deutsche Telekom
64295 Darmstadt
Germany