SRP Replication is a cooperative process. In order to ensure cooperation between SRP replication partners on a link, it is necessary that each replication partner be aware of other potential partners. This is accomplished by maintaining a continuous browse for services of the service type "_srpl-tls._tcp".¶
An SRP Replication Partner MUST maintain an ongoing DNS-SD browse on the service name '_srpl-tls._tcp' within the local browsing domain. The ongoing browse will produce two different types of events: 'add' events and 'remove' events. When the browse is started, it should produce an 'add' event for every SRP replication partner currently present on the network, including the partner that is doing the browsing. Whenever a partner goes offline, a 'remove' event should be produced. 'remove' events are not guaranteed, however.¶
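The browse handling described above can be sketched as a small table keyed by service instance name. This is an illustrative sketch only: the class name `PartnerTable` and the shape of the TXT data are assumptions, not part of the protocol, and the DNS-SD browse itself is assumed to be driven by whatever library delivers the 'add' and 'remove' events.

```python
from dataclasses import dataclass, field


@dataclass
class PartnerTable:
    """Hypothetical record of partners seen on the '_srpl-tls._tcp' browse."""
    partners: dict = field(default_factory=dict)  # instance name -> TXT data

    def on_add(self, instance: str, txt: dict) -> None:
        # An 'add' event may repeat for a known instance; keep the latest TXT data.
        self.partners[instance] = dict(txt)

    def on_remove(self, instance: str) -> None:
        # 'remove' events are not guaranteed and may name an instance we never
        # recorded, so removal must tolerate a missing key.
        self.partners.pop(instance, None)
```

Because 'remove' events are not guaranteed, a table like this can hold stale entries; it should be treated as a hint rather than as proof that a partner is reachable.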
When a new service is added, the SRP partner checks to see if it is in a compatible domain. If the SRP partner has a domain to advertise, it compares that domain to the domain advertised in the added service instance: if they are not the same, then this instance is not a candidate for connection, and should be ignored.¶
If the SRP partner does not have a domain to advertise, then when it begins to browse for partners, it sets a timer for DOMAIN_DISCOVERY_TIMEOUT seconds.¶
If the SRP partner does not have a domain to advertise, and is therefore willing to join an existing domain, it checks to see if the TXT record for the service indicates that joining is permitted. If so, the SRP partner adopts the provided domain name. Once it has adopted such a domain name, it updates its own TXT record to indicate that domain name, and sets the 'join=yes' key/value pair in the TXT record. It also cancels the DOMAIN_DISCOVERY_TIMEOUT timer.¶
If the DOMAIN_DISCOVERY_TIMEOUT timer goes off, then the SRP partner MUST propose a zone name using one of the methods mentioned previously in Section 2.1. It advertises that zone name in its TXT record, with 'join=yes'. It then sets a new timer for DOMAIN_PROPOSE_TIMEOUT seconds.¶
While waiting for the DOMAIN_PROPOSE_TIMEOUT timer to go off, any new 'add' events that arrive are examined to see if they are potential domains to join. If a potential domain to join is seen, and it is the same as the proposed domain, then the partner adopts that domain and treats it as its domain to advertise. It then cancels the DOMAIN_PROPOSE_TIMEOUT timer.¶
At this point the SRP replication partner has a domain to advertise: either the one it produced, or one that it discovered.¶
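The domain discovery steps above can be read as a three-state machine. The following is a minimal sketch under stated assumptions: the names `State`, `DomainChooser`, and `propose_domain` are hypothetical, timers are modeled as explicit callbacks rather than real clocks, a partner in the proposing state is assumed to ignore joinable domains that differ from its proposal, and expiry of DOMAIN_PROPOSE_TIMEOUT is assumed to settle the partner on its own proposed domain.

```python
from enum import Enum, auto


class State(Enum):
    DISCOVERING = auto()  # DOMAIN_DISCOVERY_TIMEOUT is running
    PROPOSING = auto()    # DOMAIN_PROPOSE_TIMEOUT is running
    SETTLED = auto()      # the partner has a domain to advertise


class DomainChooser:
    def __init__(self, propose_domain, configured_domain=None):
        self.propose_domain = propose_domain  # callable returning a zone name
        self.domain = configured_domain
        self.state = State.SETTLED if configured_domain else State.DISCOVERING

    def on_add(self, domain, join_permitted):
        """Handle an 'add' event; return True if the instance is a candidate."""
        if self.state is State.DISCOVERING:
            if not join_permitted:
                return False
            # Adopt the advertised domain and cancel the pending timer.
            self.domain, self.state = domain, State.SETTLED
            return True
        if self.state is State.PROPOSING:
            if join_permitted and domain == self.domain:
                self.state = State.SETTLED  # cancel the pending timer
                return True
            return False  # assumption: other domains are ignored while proposing
        return domain == self.domain

    def on_discovery_timeout(self):
        if self.state is State.DISCOVERING:
            self.domain = self.propose_domain()
            self.state = State.PROPOSING

    def on_propose_timeout(self):
        if self.state is State.PROPOSING:
            self.state = State.SETTLED  # keep the proposed domain
```

A partner started with a configured domain begins in the settled state and only ever compares advertised domains against its own.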
Once an SRP replication partner has settled on a domain to advertise, it must either join other SRP replication partners in replicating that domain, or if it is the first, it must advertise its willingness to participate in replicating the domain. In order to do this, it must settle on a dataset ID.¶
The dataset ID is a random 64-bit number, generated by the first server to offer that dataset. There should always be exactly one dataset ID per domain, but the dataset ID has a separate purpose: it represents the set of data that is being replicated by a set of cooperating SRP replication partners. This data is then offered under the agreed-upon domain, but it's possible that there might be several sets of SRP replication partners that agree to replicate a particular domain, and then some event occurs which renders these partners visible to each other. When this happens, the independent sets of partners must converge on a single dataset. This is done using the dataset ID.¶
When more than one dataset ID is present for a particular domain, the dataset ID that is numerically lowest is preferred. This means that SRP replication partners that are currently replicating a dataset with a numerically higher dataset ID will have to abandon that dataset and join together in replicating the numerically lowest dataset. Servers that are not replicating the numerically lowest dataset will therefore stop advertising SRP replication service and begin attempting to join in replicating the preferred dataset.¶
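The preference rule can be illustrated with two small helpers. The function names are hypothetical, and dataset IDs are assumed to be plain unsigned integers already parsed from the advertisements for one domain.

```python
def preferred_dataset_id(advertised_ids):
    """Return the numerically lowest dataset ID among those advertised."""
    return min(advertised_ids)


def must_abandon(own_id, advertised_ids):
    """True if some advertised dataset ID is lower than the one we replicate."""
    return any(other < own_id for other in advertised_ids)
```

A partner for which `must_abandon` is true would stop advertising and try to join the dataset returned by `preferred_dataset_id`.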
When a set of servers is advertising a particular dataset ID, the server with the lowest precedence is primary. The primary server is responsible for handing out precedence values to new partners as they join in replicating the dataset. Precedence values are always allocated starting with the precedence that is one greater than the primary's precedence.¶
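As an illustration, primary selection and precedence allocation might look like the sketch below. The tuple layout `(instance, dataset_id, precedence)` and the names `find_primary` and `PrecedenceAllocator` are assumptions made for the example. The text only states where allocation starts; handing out consecutive values afterwards is an assumption.

```python
def find_primary(adverts, dataset_id):
    """Return the advert with the lowest precedence for dataset_id, or None.

    adverts is an iterable of (instance, dataset_id, precedence) tuples.
    """
    same_dataset = [a for a in adverts if a[1] == dataset_id]
    return min(same_dataset, key=lambda a: a[2], default=None)


class PrecedenceAllocator:
    """Hypothetical allocator run by the primary for its dataset."""

    def __init__(self, own_precedence):
        self.max_assigned = own_precedence

    def allocate(self):
        # The first value handed out is one greater than the primary's own
        # precedence; later values are assumed to keep increasing by one.
        self.max_assigned += 1
        return self.max_assigned
```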
When an SRP replication partner has stopped advertising a particular dataset ID, or has just started and therefore hasn't started advertising a particular dataset ID, and there is a dataset ID present that it can join in replicating, it attempts to connect to the SRP replication partner that is primary for the dataset. If the startup handshake succeeds, the primary will assign a new precedence to the connecting partner as part of the handshake.¶
Once the synchronization phase has finished, the connecting partner will begin advertising the SRP service for the chosen domain using the new dataset ID and the precedence it received from the primary. The connecting server will then also attempt to connect to every SRP partner it sees advertising the same dataset ID and a lower precedence.¶
It is possible that an SRP partner will attempt to join in replicating a dataset after the primary for that dataset has discontinued service, while the advertisement for the primary is still in the cache. In this case, the SRP partner will attempt to reconfirm the primary's advertisement. In mDNS, this is done as described in Section 10.4 of [RFC6762]. For DNS Push connections, this is done using the RECONFIRM message, described in Section 6.5 of [RFC8765]. For regular (polled) DNS, the SRP partner must trigger a new DNS query. If the primary advertisement is successfully confirmed, this indicates that there is a problem connecting to the primary, in which case the connecting partner SHOULD discontinue attempting to connect for at least MIN_RECONNECT_AFTER_FAILURE seconds.¶
Otherwise, the connecting partner will attempt to connect to the new primary if there is one. If there are no other servers advertising the dataset ID, then the connecting partner reverts to attempting to start its own replication of that dataset.¶
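The decision taken after a failed connection to the primary can be summarized as follows. This is a sketch only: the `Action` names and the function signature are hypothetical, and the reconfirmation itself (mDNS, DNS Push, or a fresh query) is assumed to have completed before the function is called.

```python
from enum import Enum, auto


class Action(Enum):
    BACK_OFF = auto()                # wait MIN_RECONNECT_AFTER_FAILURE seconds
    CONNECT_TO_NEW_PRIMARY = auto()
    START_OWN_DATASET = auto()


def after_failed_connect(primary_reconfirmed, other_precedences):
    """Return (action, precedence of the new primary or None).

    other_precedences holds the precedences of the remaining partners that
    advertise the same dataset ID, excluding the stale primary.
    """
    if primary_reconfirmed:
        # The advertisement is genuine, so the problem is the connection.
        return Action.BACK_OFF, None
    if other_precedences:
        # The lowest remaining precedence identifies the new primary.
        return Action.CONNECT_TO_NEW_PRIMARY, min(other_precedences)
    return Action.START_OWN_DATASET, None
```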
When an SRP replication partner has attempted to discover partners with which to connect, and failed to do so, it then creates its own dataset ID and precedence and begins advertising that dataset. Both the dataset ID and precedence should be generated using a non-deterministic random number generator. The dataset ID should be a random number greater than or equal to zero and less than 2^64. The precedence should be a random number greater than or equal to 0 and less than 2^15. The reason for the upper limit is to allow for a large range of numbers toward which the precedence can increase.¶
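One way to generate values in these ranges is shown below, using the operating system's random source through Python's `secrets` module. The function name `new_dataset` is hypothetical, and the choice of `secrets` is an assumption about what counts as a suitable generator.

```python
import secrets


def new_dataset():
    """Return a fresh (dataset_id, precedence) pair."""
    dataset_id = secrets.randbits(64)  # 0 <= dataset_id < 2**64
    precedence = secrets.randbits(15)  # 0 <= precedence < 2**15
    return dataset_id, precedence
```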
The replication partner begins advertising this new dataset as soon as the dataset ID and precedence have been generated. As in the previous section, if a new dataset ID is seen shortly afterwards, this most likely indicates that two SRP replication instances came up at the same time. In this case, as in the previous one, the lower dataset ID is preferred, and the partner advertising the higher dataset ID abandons that dataset ID to join the partner with the lower dataset ID.¶
The replication partner that first advertises the dataset is the primary replication partner for that dataset. It is responsible for assigning precedences to new partners.¶
It is possible that two SRP replication partners that see different service advertisements could identify different SRP replication servers as primary and attempt to get their precedence values from those different servers. When this happens, it's possible that they might both get the same precedence value. When this occurs, as soon as each partner sees another partner advertising the same precedence value as its own in an SRP replication advertisement, it must discontinue advertising and restart the dataset discovery process.¶
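Detecting such a collision amounts to scanning the advertisements for another instance with the same dataset ID and precedence. The sketch below assumes adverts are `(instance, dataset_id, precedence)` tuples and that the partner knows its own instance name, since the browse also reports the partner's own advertisement; the function name is hypothetical.

```python
def precedence_collision(own_instance, own_dataset_id, own_precedence, adverts):
    """True if another instance advertises our dataset ID and precedence."""
    return any(
        instance != own_instance          # skip our own advertisement
        and dataset_id == own_dataset_id
        and precedence == own_precedence
        for instance, dataset_id, precedence in adverts
    )
```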
An SRP partner either identifies itself as primary or not. When an SRP partner is primary, it never connects to other SRP servers--it only receives connections. When a non-primary partner connects to the primary partner, it knows it is connecting to the primary partner. If the connection with the primary drops, or if the primary's advertisement goes away, then the non-primary evaluates the set of advertisements that it sees. If its precedence is lowest, it identifies itself as primary.¶
Non-primary servers receive updates from the primary whenever the maximum precedence value changes. Non-primary servers should track this precedence value. When a non-primary becomes primary, it should add ten to the most recently received precedence value, so as to skip any possible precedence assignments that haven't yet propagated.¶
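The takeover rule can be sketched as two helpers. The names are hypothetical, and the list passed to `is_new_primary` is assumed to contain only the precedences still advertised by other partners for the same dataset after the old primary has gone away.

```python
PRECEDENCE_SKIP = 10  # gap added on takeover, per the text above


def is_new_primary(own_precedence, remaining_precedences):
    """True if our precedence is lower than every other advertised precedence."""
    return all(own_precedence < p for p in remaining_precedences)


def takeover_max_precedence(last_received_max):
    """Maximum precedence to assume after becoming primary."""
    # Skipping ahead avoids reusing values the old primary may have assigned
    # but whose updates had not yet propagated.
    return last_received_max + PRECEDENCE_SKIP
```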