ALTO Extension: Composition Mode of Cost Maps

Internet-Draft	ALTO-COMP	October 2023
Gao	Expires 24 April 2024

Abstract

This document introduces an extension to the Application-Layer Traffic Optimization (ALTO) protocol, which enables announcements of the composition modes of multiple cost map services. Specifically, the composition mode defines how the results of multiple cost map services are combined to get the final prediction between two network endpoints. This extension allows ALTO servers to improve the accuracy of the prediction model at similar map sizes, and to efficiently enable differentiated services.¶

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶

This Internet-Draft will expire on 24 April 2024.¶

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶

1. Introduction

The Application-Layer Traffic Optimization (ALTO) protocol provides abstractions for application operators and/or end users to query network distance or property information. Specifically, ALTO defines the network map and the cost map, which are typically used together to provide a prediction model of distance information between endpoints in a network.¶

Given the scale of the Internet today, it is unlikely that the prediction model can overfit. Thus, with higher model complexity, an ALTO service tends to provide better accuracy from the same implementation method. As a consequence, operators of ALTO maps have to make a trade-off between service quality (accuracy of the predicted value) and model complexity (sizes of the maps).¶

Currently, there is no standard way of composing the prediction results from multiple ALTO cost maps. Clients either request only a single pair of network and cost maps, or blindly select ALTO maps and compose the results. The former makes an inefficient trade-off, i.e., the accuracy gain is substantially smaller than the increase in map size. The latter may make incorrect use of the server's exposed maps, i.e., the composition mode applied by the client differs from how the server internally constructs the models.¶

This extension is motivated by the ensemble method in machine learning [ENSEMBLE]. The ensemble method uses multiple prediction models to improve the "efficiency" and can typically achieve higher accuracy with the same model complexity. When the models are composed (or "ensembled") using the boosting method [BOOSTING], the models are ordered, and higher-order models are trained not directly on the samples but on the residuals (prediction errors) of lower-order models. Thus, model accuracy and model complexity typically grow simultaneously with the number of models -- in the context of ALTO, the number of maps. Consequently, an ALTO server may realize differentiated service by controlling access to higher-order maps.¶

Specifically, this extension defines a new type of ALTO resource called the ALTO composition advertisement (Section 4). The resource specifies the list of ALTO cost maps and how they are intended to be composed.¶

2. Conventions and Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶

All numeric values are in network byte order. Values are unsigned unless otherwise indicated. Literal values are provided in decimal or hexadecimal as appropriate. Hexadecimal literals are prefixed with "0x" to distinguish them from decimal literals.¶

This document reuses the terms defined in RFC 7285 [RFC7285].¶

3. Composition Modes

This document places some requirements on the cost maps that can be composed. For cost maps that satisfy these requirements, three composition modes are specified to define how the results of these maps must be combined.¶

3.1. Basic Requirements

This extension has the following requirements: First, the cost maps to be composed must support a common cost type. Second, the prediction using a network map and a cost map must follow the same process. Specifically, for a given pair of source and destination network hosts (identified by their IP addresses), the prediction result must be computed as follows:¶

1. Find the source PID with the longest matching prefix for the source host.¶
2. Find the destination PID with the longest matching prefix for the destination host.¶
3. The prediction result is the distance between the source PID and the destination PID.¶
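
The three steps above can be sketched in Python as follows. This is only an illustration, not part of the extension: the function names `find_pid` and `predict`, the flattened network map layout (PID name to a list of CIDR prefixes), and the sample prefixes and costs are all assumptions made for the example.

```python
import ipaddress

def find_pid(network_map, host):
    """Return the PID whose prefix is the longest match for `host`.

    `network_map` is a simplified, hypothetical layout: PID -> list of CIDRs.
    """
    addr = ipaddress.ip_address(host)
    best_pid, best_len = None, -1
    for pid, prefixes in network_map.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            # Only compare prefixes of the same address family.
            if net.version != addr.version:
                continue
            if addr in net and net.prefixlen > best_len:
                best_pid, best_len = pid, net.prefixlen
    if best_pid is None:
        raise LookupError(f"no PID covers {host}")
    return best_pid

def predict(network_map, cost_map, src_host, dst_host):
    """Steps 1-3: map both hosts to PIDs, then look up the PID-to-PID cost."""
    src_pid = find_pid(network_map, src_host)
    dst_pid = find_pid(network_map, dst_host)
    return cost_map[src_pid][dst_pid]

# Illustrative data only (documentation address ranges).
network_map = {
    "PID1": ["192.0.2.0/24"],
    "PID2": ["198.51.100.0/25"],
    "PID3": ["0.0.0.0/0"],
}
cost_map = {
    "PID1": {"PID1": 1, "PID2": 5, "PID3": 10},
    "PID2": {"PID1": 5, "PID2": 1, "PID3": 15},
    "PID3": {"PID1": 20, "PID2": 15, "PID3": 1},
}

print(predict(network_map, cost_map, "192.0.2.10", "198.51.100.7"))  # 5
```

In this sketch, a host that is covered only by the default prefix (for example 198.51.100.200, which falls outside 198.51.100.0/25) maps to PID3.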

3.2. Composition Mode and Result Ensembling

3.2.1. All

This composition mode is indicated by the string "all".¶

If the composition mode is "all", for each pair of source and destination hosts, the client MUST compute the (weighted) sum of the prediction results from each cost map and its corresponding network map. This mode implies that missing the prediction result of any cost map may lead to substantial prediction error.¶
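
A minimal Python sketch of the "all" mode is shown below. The function name `compose_all` is hypothetical, and treating absent weights as 1.0 for every map is an assumption of this example; the text above does not define a default.

```python
def compose_all(results, weights=None):
    """Weighted sum over the results of every cost map ("all" mode).

    `results[i]` is the prediction from the i-th cost map for one host pair.
    Assumption: when no weights are given, each map has weight 1.0.
    """
    if any(r is None for r in results):
        # Every map must contribute; a missing result is treated as an error.
        raise ValueError("'all' mode needs a result from every cost map")
    if weights is None:
        weights = [1.0] * len(results)
    if len(weights) != len(results):
        raise ValueError("weights and results must have the same length")
    return sum(w * r for w, r in zip(weights, results))

print(compose_all([10.0, 2.0, 0.5]))          # 12.5
print(compose_all([10.0, 2.0], [0.5, 2.0]))   # 9.0
```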

3.2.2. Random

This composition mode is indicated by the string "random".¶

If the composition mode is "random", the client MAY obtain a prediction result by computing the (weighted) average of prediction results from any non-empty subset of the cost maps. This mode typically implies that the maps are generated using a bagging method, e.g., random forests.¶
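
As an illustration of the "random" mode, the following Python sketch averages the results of a chosen subset of maps. The name `compose_random`, the normalization by the sum of the selected weights, and the default weight of 1.0 are assumptions of this example rather than rules stated by this document.

```python
def compose_random(results, weights=None, subset=None):
    """Weighted average over a non-empty subset of maps ("random" mode).

    `subset` holds indices into `results`; by default all maps are used.
    Assumption: the average is sum(w_i * r_i) / sum(w_i) over the subset.
    """
    n = len(results)
    if weights is None:
        weights = [1.0] * n
    indices = list(range(n)) if subset is None else list(subset)
    if not indices:
        raise ValueError("the subset of cost maps must be non-empty")
    total_weight = sum(weights[i] for i in indices)
    return sum(weights[i] * results[i] for i in indices) / total_weight

print(compose_random([10.0, 12.0, 14.0]))                 # 12.0
print(compose_random([10.0, 12.0, 14.0], subset=[0, 2]))  # 12.0
print(compose_random([10.0, 20.0], weights=[1.0, 3.0]))   # 17.5
```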

3.2.3. Gradient

This composition mode is indicated by the string "gradient".¶

If the composition mode is "gradient", the client MUST interpret the cost maps as an ordered list and MAY obtain a prediction result by computing the (weighted) sum of the first K maps, where K is an arbitrary number that is no less than 1 and no greater than the number of cost maps. This mode typically implies that the maps are generated using a boosting method. It must be noted that prediction results of higher-order maps are useless without the results of lower-order maps in this mode.¶
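
The "gradient" mode can be sketched in Python as a sum over a prefix of the ordered map list. The name `compose_gradient` and the default weight of 1.0 are assumptions of this example. The sample values are invented and only illustrate how later maps act as corrections to earlier ones.

```python
def compose_gradient(results, k, weights=None):
    """Weighted sum of the first k maps of an ordered list ("gradient" mode).

    `results` must follow the order in which the cost maps are listed.
    Assumption: when no weights are given, each map has weight 1.0.
    """
    n = len(results)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of cost maps")
    if weights is None:
        weights = [1.0] * n
    # Only a prefix of the list is valid; higher-order maps cannot be
    # used without all of the lower-order maps before them.
    return sum(w * r for w, r in zip(weights[:k], results[:k]))

# Invented values: a base prediction followed by two residual corrections.
results = [10.0, 1.5, -0.25]
print(compose_gradient(results, 1))  # 10.0
print(compose_gradient(results, 2))  # 11.5
print(compose_gradient(results, 3))  # 11.25
```

A client with access to only the first map would use K = 1, which matches the differentiated-service use case described in the Introduction.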

4. ALTO Composition Advertisement

4.1. Media Type

The composition advertisement resource is a virtual resource and the media type is only used to identify the type of the resource. The "media-type" field in its IRD entry MUST be "application/alto-composition+json".¶

4.2. HTTP Method

The composition advertisement resource is a virtual resource and does not accept any HTTP method.¶

4.3. Accept Input Parameters

None.¶

4.4. Capabilities

The capabilities of a composition advertisement are a JSON object of type CompAdvCapabilities:¶

    object {
        JSONString  comp-mode;
        JSONString  cost-type-names<1..*>;
        [JSONNumber weights<1..*>;]
    } CompAdvCapabilities;

with fields:¶

comp-mode: A JSONString whose value MUST be one of "all", "random", or "gradient", as introduced in Section 3.2.¶

cost-type-names: A list of cost type names. Each cost type name MUST appear in the "cost-types" field in the "meta" field of the IRD, and MUST appear in the "cost-type-names" of each cost map whose resource ID is in the "uses" field of the composition advertisement resource's IRD entry. The cost mode of each listed cost type MUST be "numerical".¶

weights: An optional list of weight coefficients, one for each cost map in the "uses" field of this resource. The length of this list MUST be equal to the length of the "uses" field.¶
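
The constraints on these fields can be checked mechanically. The Python sketch below is one possible client-side check, not a normative procedure: the function name `check_comp_adv`, the resource IDs, and the cost type name in the sample entry are invented for illustration, and the check that each used cost map also lists the cost type is omitted for brevity.

```python
VALID_MODES = {"all", "random", "gradient"}
MEDIA_TYPE = "application/alto-composition+json"

def check_comp_adv(entry, ird_cost_types):
    """Return a list of problems found in a composition advertisement entry.

    `entry` is the IRD entry as a dict; `ird_cost_types` is the "cost-types"
    object taken from the IRD's "meta" field.
    """
    errors = []
    caps = entry.get("capabilities", {})
    uses = entry.get("uses", [])
    if entry.get("media-type") != MEDIA_TYPE:
        errors.append("unexpected media-type")
    if caps.get("comp-mode") not in VALID_MODES:
        errors.append("comp-mode must be 'all', 'random' or 'gradient'")
    names = caps.get("cost-type-names", [])
    if not names:
        errors.append("cost-type-names must contain at least one name")
    for name in names:
        cost_type = ird_cost_types.get(name)
        if cost_type is None:
            errors.append(f"cost type {name!r} is not defined in the IRD")
        elif cost_type.get("cost-mode") != "numerical":
            errors.append(f"cost type {name!r} is not numerical")
    weights = caps.get("weights")
    if weights is not None and len(weights) != len(uses):
        errors.append("weights must have one coefficient per used cost map")
    return errors

# Hypothetical IRD fragments; all names here are made up for the example.
ird_cost_types = {
    "num-routingcost": {"cost-mode": "numerical", "cost-metric": "routingcost"},
}
entry = {
    "media-type": MEDIA_TYPE,
    "uses": ["cost-map-base", "cost-map-residual"],
    "capabilities": {
        "comp-mode": "gradient",
        "cost-type-names": ["num-routingcost"],
        "weights": [1.0, 0.5],
    },
}

print(check_comp_adv(entry, ird_cost_types))  # []
```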

5. References

5.1. Normative References

[RFC7285]: Alimi, R., Ed., Penno, R., Ed., Yang, Y., Ed., Kiesel, S., Previdi, S., Roome, W., Shalunov, S., and R. Woundy, "Application-Layer Traffic Optimization (ALTO) Protocol", RFC 7285, DOI 10.17487/RFC7285, September 2014, <https://www.rfc-editor.org/rfc/rfc7285>.

5.2. Informative References

[BOOSTING]: Friedman, J. H., "Stochastic gradient boosting", Computational Statistics & Data Analysis 38.4, pp. 367-378, 2002.
[ENSEMBLE]: Dietterich, T. G., "Ensemble learning", The Handbook of Brain Theory and Neural Networks 2.1, pp. 110-125, 2002.