Internet-Draft | Routing for Satellites | November 2023
Li | Expires 19 May 2024
Satellite networks present some interesting challenges for packet networking. The entire topology is continually in motion, with links that are far less reliable than what is common in terrestrial networks. Some changes to link connectivity can be anticipated due to orbital mechanics.¶
This document proposes a routing architecture for satellite networks based on existing routing protocols and mechanisms, enhanced with scheduled link connectivity change information. This document proposes no protocol changes.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 May 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Downlink: The half of a ground link leading from a satellite to a ground station.¶
Gateway: A ground station that participates as part of the network and acts as the interconnect between satellite constellations and the planetary network. Gateways have a much higher bandwidth than user stations, have ample computing capabilities, and perform traffic engineering duties.¶
GEO: Geostationary Earth Orbit. A satellite in GEO has an orbit that is synchronized to planetary rotation, so it effectively sits over one spot on the planet.¶
Ground link: A link between a satellite and a ground station.¶
IGP: Interior Gateway Protocol. A routing protocol that is used within a single administrative domain. Note that 'gateway' in this context is semantically equivalent to 'router' and has no relationship to the 'gateway' used in the rest of this document.¶
IS-IS: Intermediate System to Intermediate System routing protocol. An IGP that is commonly used by service providers.¶
ISL: Inter-satellite link. Frequently a free space laser.¶
L1: IS-IS Level 1¶
L1L2: IS-IS Level 1 and Level 2¶
L2: IS-IS Level 2¶
LEO: Low Earth Orbit.¶
LSP: IS-IS Link State Protocol Data Unit. An LSP is a set of packets that describe a node's connectivity to other nodes.¶
MEO: Medium Earth Orbit.¶
Stripe: A set of satellites in a few adjacent orbits. These form an IS-IS L1 area.¶
Uplink: The half of a ground link leading from a ground station to a satellite.¶
User station: A ground station interconnected with a small end user network.¶
Satellites travel in specific orbits around their parent planet. Some of them have their orbital periods synchronized to the rotation of the planet, so they are effectively stationary over a single point. Other satellites have orbits that cause them to travel across regions of the planet. These orbits are typically known as Geostationary Earth Orbit (GEO), Medium Earth Orbit (MEO), or Low Earth Orbit (LEO), depending on altitude. Nothing in this discussion is Earth-specific; it generalizes to any celestial body, so we use these common terms with the understanding that they could be equally applicable to Mars, Venus, lunar, or even solar orbits.¶
Satellites may have data interconnections with one another through Inter-Satellite Links (ISLs). Due to differences in orbits, ISLs may be connected temporarily, with periods of potential connectivity computed through orbital mechanics. Multiple satellites may be in the same orbit but separated in time and space, with a roughly constant separation. Satellites in the same orbit may have ISLs that have a higher duty cycle than cross-orbit ISLs but are still not guaranteed to always be connected.¶
Ground stations can communicate with one or more satellites that are in their region. Some ground stations have a limited capacity and communicate with only a single satellite at a time. These are known as user stations. Other ground stations have richer connectivity and higher bandwidth. These are commonly called gateways and provide connectivity between the satellite network and conventional wired networks.¶
Like conventional network links, ISLs and ground links can fail at any time. However, unlike conventional links, there are predictable times when ISLs and ground links can potentially connect and disconnect. These predictions can be computed and cataloged in a schedule that can be distributed to relevant network elements. Predictions of a link connecting are not a guarantee: a link may not connect for a variety of reasons. Predictions of a link disconnecting are effectively guaranteed, as the underlying physics is extremely unlikely to improve unexpectedly.¶
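As an illustration only, such a schedule could be held as a list of predicted contact windows per link. The sketch below is not taken from any defined schedule format; the names `ContactWindow`, `may_be_up`, and `next_disconnect`, the link names, and the times are all hypothetical. It tries to capture the asymmetry described above: being inside a window means only that a link may connect, while being outside one means the link is effectively certain to be down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactWindow:
    """One predicted period of potential connectivity (hypothetical format)."""
    link: str     # illustrative link name, e.g. "sat1-sat2"
    start: float  # predicted earliest connect time, seconds since some epoch
    end: float    # predicted disconnect time


def may_be_up(schedule, link, t):
    """True if `link` is inside a predicted window at time t.

    True is only a prediction that the link can connect;
    False is effectively a guarantee that it is down.
    """
    return any(w.link == link and w.start <= t < w.end for w in schedule)


def next_disconnect(schedule, link, t):
    """End of the window containing t, or None if the link is predicted down."""
    for w in schedule:
        if w.link == link and w.start <= t < w.end:
            return w.end
    return None


# Made-up example schedule.
schedule = [
    ContactWindow("sat1-sat2", 0.0, 600.0),
    ContactWindow("sat1-gw1", 120.0, 300.0),
]
```

A node holding this schedule would still rely on ordinary liveness detection to learn whether a predicted link actually came up.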
Some proposed satellite networks are fairly large, with tens of thousands of proposed satellites. A key concern is the ability to reach this scale and larger.¶
As we know, the key to scalability is the ability to create hierarchical abstractions, so a key question of any routing architecture will be about the abstractions that can be created to contain topological information.¶
Normal routing protocols are architected to operate with a static, if unreliable, topology. Satellite networks lack the static organization of terrestrial networks, so normal architectural practices may not apply and alternative approaches may need consideration.¶
In this section, we discuss some of the assumptions that are the basis for this architectural proposal.¶
We assume that the primary use of the satellite network is to provide access from a wide range of geographic locations. We assume that providing high bandwidth bulk transit between peer networks is not a goal. It has been noted that satellite networks can provide lower latencies than terrestrial fiber networks [Handley]. This proposal does not preclude such applications but also does not articulate the mechanisms necessary for user stations to perform the necessary traffic engineering computations. Low-latency applications are not discussed further.¶
As with most access networks, we assume that there will be bidirectional traffic between the user station and the gateway, but that the bulk of the traffic will be from the gateway to the user station. We expect that the uplink from the gateway to the satellite network will be the bandwidth bottleneck, and that gateways will need to be replicated to scale the uplink bandwidth.¶
We assume that it is not essential to provide optimal routing for traffic from user station to user station. If this traffic is sent first to a gateway and then back into the satellite network, this would be acceptable. This type of route is commonly called a 'hairpin' and is not discussed further.¶
We assume that traffic for a user station should enter the network through a gateway that is in some close topological proximity to the user station. This is to maximize the practical capacity of the satellite network. Similarly, we assume that user station traffic should exit the network through the gateway that is in the closest topological proximity.¶
This architecture does not preclude gateway-to-gateway traffic, but it does not seek to optimize it.¶
We assume that a user station registers with one satellite at a time, forming a temporary association that is relayed to the local gateway. The mechanism for this registration is outside of the scope of this document.¶
We assume that links in general will be available when scheduled. As with any network, there will be failures, and the schedule is not a guarantee, but we also expect that the schedule is not grossly inaccurate. We assume that at any given instant, there is enough connectivity to run the network and support the traffic demand. If this assumption does not hold, then no routing architecture can magically make the network more capable.¶
We assume that, in general, intra-orbit ISLs have higher reliability and persistence than inter-orbit ISLs.¶
The goal of the routing architecture is to provide an organizational structure to protocols running on the satellite network such that topology information is conveyed through relevant portions of the network, paths are computed across the network, and data can be delivered along those paths. The structure should also scale to a very large network without undue changes to that structure.¶
The end goal of a network is to deliver traffic. In a satellite network where the topology is in a continual state of flux and the user stations are frequently changing their association with the satellites, having a highly flexible and adaptive forwarding plane is essential. Toward this end, we propose to use MPLS as the fundamental forwarding plane architecture [RFC3031]. Specifically, we propose to use a Segment Routing (SR) [RFC8402] based approach, where each satellite is assigned a node Segment Identifier (SID). A path through the network can then be expressed as a label stack of node SIDs. IP forwarding is not used within the internals of the satellite network, although each satellite may be assigned an IP address for management purposes. Existing SR label stack compression algorithms may be used, so that the label stack need only contain the significant waypoints along the path. This implies that the label stack operates as a form of loose source routing through the network.¶
We assume that there is a link-layer mechanism for a user station to associate with a satellite. Each user station will have an IP address that is assigned from a prefix managed by its local gateway. The mechanisms for this assignment and its communication to the end station are not discussed herein but might be similar to DHCP [RFC2131]. User station IP addresses change infrequently and do not reflect their association with their first-hop satellite. Each gateway advertises a prefix into the global Internet for all of its local user stations.¶
User stations may be assigned a node SID, in which case MPLS forwarding can be used all the way to the user station. Alternatively, if the user station does not have a node SID, then the last hop from the satellite to the end station can be performed based on the destination IP address of the packet. This does not require a full longest prefix match lookup as the IP address is merely a unique identifier at this point.¶
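To illustrate why no longest prefix match is needed on the last hop, the sketch below treats the destination address purely as a key in an exact-match table held by the downlink satellite. The class name `LastHopTable`, the downlink identifiers, and the register/deregister calls are hypothetical; the actual association mechanism is not specified here.

```python
import ipaddress


class LastHopTable:
    """Exact-match map from a user station address to a downlink (illustrative)."""

    def __init__(self):
        self._by_addr = {}

    def register(self, addr, downlink):
        # Parsing normalizes different textual forms of the same address.
        self._by_addr[ipaddress.ip_address(addr)] = downlink

    def deregister(self, addr):
        self._by_addr.pop(ipaddress.ip_address(addr), None)

    def lookup(self, addr):
        # A single hash lookup; no prefix lengths are involved.
        return self._by_addr.get(ipaddress.ip_address(addr))


table = LastHopTable()
table.register("2001:db8::1", "beam-3")  # documentation address, made-up beam
```

Because the table only holds currently associated user stations, its size tracks the number of associations on that satellite rather than the size of any routing table.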
Similarly, gateways may be assigned a node SID. A possible optimization is that a single SID value be assigned as a global constant to always direct traffic to the topologically closest gateway. If traffic engineering is required for traffic that is flowing to a gateway, a specific path may be encoded in a label stack that is attached to the packet by the user station or by the first-hop satellite.¶
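One way to read "topologically closest gateway" is the gateway with the lowest IGP cost from the sending node; that interpretation is an assumption of this sketch, as are the function name `closest_gateway`, the node names, and the metrics. The search is an ordinary Dijkstra run that stops at the first gateway it reaches.

```python
import heapq


def closest_gateway(graph, start, gateways):
    """Return (cost, gateway) for the lowest-cost gateway from start, or None.

    graph is {node: {neighbor: metric}}; gateways is a set of node names.
    """
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        if node in gateways:
            return d, node  # first gateway popped has the lowest cost
        for nbr, metric in graph[node].items():
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return None


# Made-up topology: two satellites and two gateways.
sat_graph = {
    "s1": {"s2": 10, "gw_a": 30},
    "s2": {"s1": 10, "gw_b": 10},
    "gw_a": {"s1": 30},
    "gw_b": {"s2": 10},
}
gateway_nodes = {"gw_a", "gw_b"}
```

With a single shared SID value, each satellite would in effect resolve that SID to whichever gateway this kind of computation selects from its own position.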
Gateways can also perform traffic engineering by using different paths and label stacks for different traffic flows. Routing a single traffic flow across multiple paths has proven to cause performance issues with transport protocols, so that approach is not recommended.¶
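A minimal sketch of keeping each flow on one path while spreading different flows across several paths is to hash the flow identifier and use the result to pick a label stack. This is one common technique, not something this document mandates; the function name `pick_path`, the 5-tuple layout, and the SID values are made up for illustration.

```python
import hashlib


def pick_path(flow, paths):
    """Deterministically map a flow 5-tuple to one candidate label stack."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


# Illustrative label stacks (lists of made-up SID values).
paths = [[16001, 16007], [16002, 16007], [16003, 16007]]
# (source address, destination address, protocol, source port, destination port)
flow = ("192.0.2.1", "198.51.100.9", 6, 40000, 443)
```

Every packet of the same flow yields the same hash and therefore the same label stack. Note that a simple modulo mapping like this one reshuffles flows whenever the set of candidate paths changes, which a real design would likely want to limit.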
The IETF currently actively supports two Interior Gateway Protocols (IGPs): OSPF [RFC2328] [RFC5340] and IS-IS [ISO10589] [RFC1195].¶
OSPF requires that the network operate around a backbone area, with subsidiary areas hanging off of the backbone. While this works well for traditional terrestrial networks, it does not seem appropriate for satellite networks, where there is no centralized portion of the topology.¶
IS-IS has a different hierarchical structure, where Level 1 (L1) areas are connected sets of nodes, and then Level 2 (L2) is a connected subset of the topology that intersects all of the L1 areas. Individual nodes can be L1, L2, or both (L1L2). In particular, we propose that all nodes in the network be L1L2 so that local routing is done based on L1 information and then global routing is done based on L2 information.¶
IS-IS also has the interesting property that it does not require interface addresses. This feature is commonly known as 'unnumbered interfaces'. This is particularly helpful in satellite topologies because it implies that ISLs may be used flexibly. Sometimes an interface might be used as an L1 link to another satellite and a few orbits later it might be used as an L1L2 link to a completely different satellite without any reconfiguration or renumbering.¶
Scalability for IS-IS can be achieved through the use of a proposal known as Area Proxy [I-D.ietf-lsr-isis-area-proxy]. With this proposal, all of the nodes in an L1 area combine their information into a single L2 Link State Protocol Data Unit (LSP). This implies that the size of the L1 Link State Database (LSDB) scales as the number of nodes in the L1 area and the size of the L2 LSDB scales with the number of L1 areas.¶
The Area Proxy proposal also includes the concept of an Area SID. This is useful because it allows traffic engineering to construct a path that traverses areas with a minimal number of label stack entries.¶
Suppose, for example, that a network has 1,000 L1 areas, each with 1,000 satellites. This would then mean that the network supports 1,000,000 satellites, but only requires 1,000 entries in its L1 LSDB and 1,000 entries in its L2 LSDB; numbers that are easily achievable today. The resulting MPLS label table would contain 1,000 node SIDs from the L1 LSDB and 1,000 area SIDs from the L2 LSDB. If each satellite advertises an IP address for management purposes, then the IP routing table would have 1,000 entries for the L1 management addresses and 1,000 area proxy addresses from L2.¶
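The arithmetic in this example can be reproduced with a few lines of code. The function name `lsdb_sizes` and the dictionary keys are illustrative, and the `flat_lsdb_entries` figure assumes, for comparison only, a hypothetical single-area network with one LSDB entry per satellite.

```python
def lsdb_sizes(num_areas, sats_per_area):
    """Per-node state under the Area Proxy arrangement described above."""
    return {
        "satellites": num_areas * sats_per_area,
        # L1 LSDB: one entry per node in the local area.
        "l1_lsdb_entries": sats_per_area,
        # L2 LSDB: one proxy entry per L1 area.
        "l2_lsdb_entries": num_areas,
        # Label table: local node SIDs plus one area SID per area.
        "label_table_entries": sats_per_area + num_areas,
        # Comparison only (assumption): a flat, single-area network.
        "flat_lsdb_entries": num_areas * sats_per_area,
    }


sizes = lsdb_sizes(1000, 1000)
```

For a fixed total number of satellites, the per-node state (entries per area plus number of areas) is smallest when the two factors are roughly equal, which is why the example uses 1,000 of each.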
A significant problem with any link state routing protocol is that of area partition. While there have been many proposals for automatic partition repair, none has seen significant production deployment. It seems best to simply avoid this issue altogether and ensure that areas have an extremely low probability of partitioning.¶
As discussed above, intra-orbit ISLs are assumed to have higher reliability and persistence than inter-orbit ISLs. However, even intra-orbit ISLs are not sufficiently reliable to avoid partition issues. Therefore, we propose to group a small number of adjacent orbits as an IS-IS L1 area, called a stripe. We assume that for any given reliability requirement, there is a small number of orbits that can be used to form a stripe that satisfies the reliability requirement.¶
MEO and GEO constellations that have intra-constellation ISLs can also form an IS-IS L1L2 area. Satellites that lack intra-constellation ISLs are better as independent L2 nodes.¶
Forwarding in this architecture is straightforward. A path from a gateway to a user station in the same stripe (L1 area) only requires a single node SID for the satellite that provides the downlink to the user station.¶
Similarly, a user station returning a packet to a gateway need only provide a gateway node SID.¶
For forwarding between stripes, the situation is a bit more complex. A gateway would need to provide the area SID of the destination area plus the node SID of the downlink satellite. For return traffic, user stations or first-hop satellites would want to provide the area SID for the gateway as well as the gateway SID.¶
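The forwarding rules above can be sketched as a small function that builds the label stack for a destination. The function name `label_stack`, the area names, and the SID values are hypothetical, and the sketch ignores any additional traffic engineering labels.

```python
def label_stack(src_area, dst_area, dst_area_sid, dst_node_sid):
    """Return the label stack, outermost label first.

    Within one L1 area a single node SID is enough; across areas the
    destination's area SID is placed ahead of the node SID.
    """
    if src_area == dst_area:
        return [dst_node_sid]
    return [dst_area_sid, dst_node_sid]


# Made-up SIDs: area "stripe-b" has area SID 17002, downlink satellite 16042.
same_area = label_stack("stripe-b", "stripe-b", 17002, 16042)
other_area = label_stack("stripe-a", "stripe-b", 17002, 16042)
```

The same function covers return traffic if the destination node SID is the gateway's SID and the area SID is that of the gateway's area.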
Very frequently, access networks congest due to oversubscription and the economics of access. Network operators can use traffic engineering to ensure that they are getting higher efficiency out of their networks by utilizing all available paths and capacity near any congestion points. In this particular case, the gateway will have information about all of the traffic that it is generating and can use all of the possible paths through the network in its topological neighborhood. Since we're already using SR, this is easily done just by adding more explicit SIDs to the label stack. These can be additional area SIDs, node SIDs, or adjacency SIDs. Path computation can be performed by traditional Path Computation Elements (PCE).¶
Each gateway or its PCE will need topological information from all of the areas that it will route through. It can obtain this by participating in the IGP, either directly or via a tunnel, or through another delivery mechanism such as BGP-LS [RFC7752]. User stations do not participate in the IGP.¶
Traffic engineering for traffic into a gateway can also be provided by an explicit SR path on the traffic. This can help ensure that ISLs near the gateway do not congest with traffic for the gateway. These paths can be computed by the gateway or PCE and then distributed to the first-hop satellite or user station, which would then apply them to traffic. The delivery mechanism is outside of the scope of this document.¶
The most significant difference between terrestrial and satellite networks from a routing perspective is that some of the topological changes that will happen to the network can be anticipated and computed. Both link and node changes will affect the topology and the network should react smoothly and predictably.¶
The management plane is responsible for providing information about scheduled topological changes. The exact details of how the information is disseminated are outside of the scope of this document but could be done through a YANG model [I-D.united-tvr-schedule-yang]. Scheduling information needs to be accessible to all of the nodes that will make routing decisions based on the topological changes in the schedule, so information about an L1 topological change will need to be circulated to all nodes in the L1 area and information about L2 changes will need to propagate to all L2 nodes, plus the gateways and PCEs that carry the related topological information.¶
The network needs to do very little in response to a topological addition. A link coming up or a node joining the topology should not cause any functional change until the addition is proven to be fully operational based on the usual liveness mechanisms found within IS-IS. Nodes may pre-compute their routing table changes but should not install them before all of the relevant adjacencies are flooded. The benefits of this pre-computation appear to be very small. Gateways and PCEs may also choose to pre-compute paths based on these changes, but should be careful not to install paths using the new parts of the topology until they are confirmed to be operational. If some path pre-installation is performed, gateways and PCEs must be prepared for the situation where the topology does not become operational and may need to take alternate steps instead, such as reverting any related pre-installed paths.¶
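A minimal sketch of this caution at a gateway or PCE is shown below: pre-computed paths are parked until every link they use has been confirmed operational, and are dropped if a scheduled addition fails to appear. The class `PathInstaller` and its method names are hypothetical, and how confirmation and failure are actually signalled (for example, by IGP flooding or a timeout) is left open.

```python
class PathInstaller:
    """Holds pre-computed paths until their links are confirmed (illustrative)."""

    def __init__(self, confirmed_links=()):
        self.confirmed_links = set(confirmed_links)  # links proven operational
        self.pending = {}    # path name -> links, pre-computed but not installed
        self.installed = {}  # path name -> links, in active use

    def precompute(self, name, links):
        self.pending[name] = list(links)

    def link_confirmed(self, link):
        """Called once liveness mechanisms report the link as operational."""
        self.confirmed_links.add(link)
        for name in list(self.pending):
            if all(l in self.confirmed_links for l in self.pending[name]):
                self.installed[name] = self.pending.pop(name)

    def addition_failed(self, link):
        """Called when a scheduled link did not come up; drop dependent paths."""
        for name in [n for n, links in self.pending.items() if link in links]:
            del self.pending[name]


installer = PathInstaller(confirmed_links={"a-b"})
installer.precompute("p1", ["a-b", "b-c"])  # "b-c" is a scheduled addition
```

In this arrangement nothing is installed early, so there is nothing to revert; a design that does pre-install would also need the revert step described above.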
The network may choose to not do any pre-installation or pre-computation in reaction to topological additions, at a small cost of some operational efficiency.¶
Topological deletions are an entirely different matter. If a link or node is to be removed from the topology, then the network should act before the anticipated change to route traffic around the expected topological loss. Specifically, at some point before the topology change, the affected links should be set to a high metric to direct traffic to alternate paths. This is a common operational procedure in existing networks when links are taken out of service, such as when proactive maintenance needs to be performed. This type of change does require some time to propagate through the network, so the metric change should be initiated far enough in advance that the network converges before the actual topological change. Gateways and PCEs should also update paths around the topology change and install these changes before the topology change takes place. The time necessary for both IGP and path changes will vary depending on the exact network and configuration.¶
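The procedure can be sketched in two parts: choosing when to raise the metric, and observing that shortest paths move away from the affected link once the metric is raised. Everything named here (`metric_raise_time`, `raise_metric`, `shortest_path`, the `HIGH_METRIC` value, the topology, and the timing figures) is an assumption made for illustration; real convergence times depend on the network.

```python
import heapq

HIGH_METRIC = 1_000_000  # illustrative; any value that dwarfs normal metrics


def metric_raise_time(scheduled_down, convergence_time, safety_margin=0.0):
    """Latest time to raise the metric so the IGP converges before the loss."""
    return scheduled_down - convergence_time - safety_margin


def raise_metric(graph, a, b, metric=HIGH_METRIC):
    """Make the a-b link unattractive in both directions without removing it."""
    graph[a][b] = metric
    graph[b][a] = metric


def shortest_path(graph, src, dst):
    """Plain Dijkstra over {node: {neighbor: metric}}; returns (cost, path)."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, metric in graph[node].items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + metric, nbr, path + [nbr]))
    return None


# Made-up topology; the s1-s3 link is scheduled to disconnect.
graph = {
    "gw": {"s1": 10, "s2": 10},
    "s1": {"gw": 10, "s3": 10},
    "s2": {"gw": 10, "s3": 20},
    "s3": {"s1": 10, "s2": 20},
}
```

Before the metric change, the lowest-cost path from `gw` to `s3` runs through `s1`; after `raise_metric(graph, "s1", "s3")` it runs through `s2`, while the original link remains usable as a last resort until it actually disconnects.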
This document discusses one possible routing architecture for satellite networks. It proposes no new protocols or mechanisms and thus has no new security impact. Security for IS-IS is provided by [RFC5304].¶
We would like to thank Dino Farinacci for his comments.¶
This document makes no requests for IANA.¶