Internet-Draft Task Binding and In-Band Provisioning fo July 2024
Wang & Patton Expires 9 January 2025
Workgroup:
Privacy Preserving Measurement
Internet-Draft:
draft-wang-ppm-dap-taskprov-07
Published:
Intended Status:
Informational
Expires:
Authors:
S. Wang
Apple Inc.
C. Patton
Cloudflare

Task Binding and In-Band Provisioning for DAP

Abstract

An extension for the Distributed Aggregation Protocol (DAP) is specified that cryptographically binds the parameters of a task to the task's execution. In particular, when a client includes this extension with its report, the servers will only aggregate the report if all parties agree on the task parameters. This document also specifies an optional mechanism for in-band task provisioning that builds on the report extension.

About This Document

This note is to be removed before publishing as an RFC.

The latest revision of this draft can be found at https://wangshan.github.io/draft-wang-ppm-dap-taskprov/draft-wang-ppm-dap-taskprov.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-wang-ppm-dap-taskprov/.

Discussion of this document takes place on the Privacy Preserving Measurement Working Group mailing list (mailto:ppm@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/ppm/. Subscribe at https://www.ietf.org/mailman/listinfo/ppm/.

Source for this draft and an issue tracker can be found at https://github.com/wangshan/draft-wang-ppm-dap-taskprov.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 9 January 2025.

1. Introduction

The DAP protocol [DAP] enables secure aggregation of a set of reports submitted by Clients. This process is centered around a "task" that determines, among other things, the cryptographic scheme to use for the secure computation (a Verifiable Distributed Aggregation Function [VDAF]), how reports are partitioned into batches, and privacy parameters such as the minimum size of each batch. See Section 4.2 of [DAP] for a complete listing.

In order to execute a task securely, it is required that all parties agree on all parameters associated with the task. However, the core DAP specification does not specify a mechanism for accomplishing this. In particular, it is possible that the parties successfully aggregate and collect a batch, but some party does not know the parameters that were enforced.

A desirable property for DAP to guarantee is that successful execution implies agreement on the task parameters. On the other hand, disagreement between a Client and the Aggregators should prevent reports uploaded by that Client from being processed.

Section 3 specifies a report extension (Section 4.4.3 of [DAP]) that endows DAP with this property. First, it specifies an encoding of all task parameters that are relevant to all parties. This excludes cryptographic assets, such as the secret VDAF verification key (Section 5 of [VDAF]) or the public HPKE configurations [RFC9180] of the Aggregators or Collector. Second, the task ID is computed by hashing the encoded parameters. If a report includes the extension, then each Aggregator checks whether the task ID was computed properly; if not, it rejects the report. This cryptographic binding of the task to its parameters ensures that the report is only processed if the Client and Aggregators agree on the task parameters.

One reason this task-binding property is desirable is that it makes the process by which parties are provisioned with task parameters more robust. This is because misconfiguration of a party would manifest in a server's telemetry as report rejection. This is preferable to failing silently, as misconfiguration could result in privacy loss.

Section 4 specifies one possible mechanism for provisioning DAP tasks that is built on top of the extension in Section 3. Its chief design goal is to make task configuration completely in-band, via HTTP request headers. Note that this mechanism is an optional feature of this specification; it is not required to implement the protocol extension in Section 3.

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses the same conventions for error handling as [DAP]. In addition, this document extends the core specification by adding the following error types:

Table 1

Type          Description
invalidTask   An Aggregator has opted out of the indicated task as described in Section 4.4.

The terms used follow those described in [DAP]. The following new terms are used:

Task configuration:

The non-secret parameters of a task.

Task author:

The entity that defines a task's configuration in the provisioning mechanism of Section 4.

3. The Taskbind Extension

To use the Taskbind extension, the Client includes the following extension in the report extensions for each Aggregator as described in Section 4.4.3 of [DAP]:

[RFC EDITOR: Change this to the IANA-assigned codepoint.]

enum {
    taskbind(0xff00),
    (65535)
} ExtensionType;

The payload of the extension MUST be empty. If the payload is non-empty, then the Aggregator MUST reject the report.

When the Client uses the Taskbind extension, it computes the task ID (Section 4.2 of [DAP]) as follows:

task_id = SHA-256(task_config)

where task_config is a TaskConfig structure defined in Section 3.1. Function SHA-256() is as defined in [SHS].
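The computation above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name derive_task_id and the placeholder input are hypothetical, and the input is assumed to already be the encoded TaskConfig of Section 3.1.

```python
import hashlib


def derive_task_id(encoded_task_config: bytes) -> bytes:
    """Return the 32-byte task ID: SHA-256 over the encoded TaskConfig."""
    return hashlib.sha256(encoded_task_config).digest()


# Placeholder bytes standing in for a real encoded TaskConfig (hypothetical).
example_config = b"\x04demo"
task_id = derive_task_id(example_config)
```

Because the ID is a hash of the full encoding, changing any single parameter yields a different task ID.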

The task ID is bound to each report share (via the HPKE associated data, see Section 4.4.2 of [DAP]). Binding the parameters to the ID this way ensures, in turn, that the report is only aggregated if the Client and Aggregator agree on the parameters. This is accomplished by the Aggregator behavior below.

During aggregation (Section 4.5 of [DAP]), each Aggregator processes a report with the Taskbind extension as follows.

First, it looks up the ID and parameters associated with the task. Note the task has already been configured; otherwise the Aggregator would have already aborted the request due to not recognizing the task.

Next, the Aggregator encodes the parameters as a TaskConfig defined in Section 3.1 and computes the task ID as above. If the derived task ID does not match the task ID of the request, then it MUST reject the report with error "invalid_message".

During the upload flow (Section 4.4 of [DAP]), the Leader SHOULD abort the request with "unrecognizedTask" if the derived task ID does not match the task ID of the request.
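The Aggregator-side check during aggregation might look like the following Python sketch. The function name and return convention are hypothetical: it returns None when the report passes and otherwise the name of the report error. Using "invalid_message" for a non-empty extension payload is an assumption of this sketch; the text above only requires that such a report be rejected.

```python
import hashlib
import hmac
from typing import Optional


def check_taskbind_report(
    request_task_id: bytes,
    local_encoded_task_config: bytes,
    extension_payload: bytes,
) -> Optional[str]:
    """Return None if the report passes the Taskbind checks, else an error name."""
    # The extension payload MUST be empty.
    if extension_payload != b"":
        return "invalid_message"  # error name is an assumption for this case
    # Re-derive the task ID from the Aggregator's own view of the parameters.
    derived = hashlib.sha256(local_encoded_task_config).digest()
    if not hmac.compare_digest(derived, request_task_id):
        return "invalid_message"
    return None
```

The comparison uses hmac.compare_digest as a matter of habit; the task ID is not secret, so a plain equality test would also be acceptable.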

3.1. Task Encoding

The task configuration is encoded as follows:

struct {
    /* Info specific for a task. */
    opaque task_info<1..2^8-1>;

    /* Leader API endpoint as defined in I-D.draft-ietf-ppm-dap-09. */
    Url leader_aggregator_endpoint;

    /* Helper API endpoint as defined in I-D.draft-ietf-ppm-dap-09. */
    Url helper_aggregator_endpoint;

    /* This determines the query type for batch selection and the
    properties that all batches for this task must have. */
    opaque query_config<1..2^16-1>;

    /* Time up to which Clients are allowed to upload to this task.
    Defined in I-D.draft-ietf-ppm-dap-09. */
    Time task_expiration;

    /* Determines the VDAF type and its config parameters. */
    opaque vdaf_config<1..2^16-1>;
} TaskConfig;

The purpose of TaskConfig is to define all parameters that are necessary for configuring each party. It includes all the fields to be associated with a task. In addition to the Aggregator endpoints, maximum batch query count, and task expiration, the structure includes an opaque task_info field that is specific to a deployment. For example, this can be a string describing the purpose of this task. It does not include cryptographic assets shared by only a subset of the parties, including the secret VDAF verification key [VDAF] or public HPKE configurations [RFC9180].

The opaque query_config field defines the DAP query configuration used to guide batch selection. Its content is structured as follows:

struct {
    Duration time_precision;
    uint16 max_batch_query_count;
    uint32 min_batch_size;
    QueryType query_type;
    select (QueryConfig.query_type) {
        case time_interval: Empty;
        case fixed_size:    uint32 max_batch_size;
    };
} QueryConfig;

The length prefix of the query_config ensures that the QueryConfig structure can be decoded even if an unrecognized variant is encountered (i.e., an unimplemented query type).

The maximum batch size for fixed_size queries is optional. If query_type is fixed_size and max_batch_size is 0, then the task has no maximum batch size limit. In particular, during batch validation (Section 4.6.5.2.2 of [DAP]), the Aggregator does not check len(X) <= max_batch_size, where X is the set of reports successfully aggregated into the batch.
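As an illustration, the QueryConfig structure could be serialized as in the Python sketch below. It assumes, based on [DAP], that Duration is a 64-bit unsigned integer and that QueryType is a one-byte enumeration with time_interval = 1 and fixed_size = 2; these codepoints and the function name are not defined in this document and should be checked against [DAP].

```python
import struct
from typing import Optional

# Assumed QueryType codepoints from [DAP]; verify before relying on them.
TIME_INTERVAL = 1
FIXED_SIZE = 2


def encode_query_config(
    time_precision: int,
    max_batch_query_count: int,
    min_batch_size: int,
    query_type: int,
    max_batch_size: Optional[int] = None,
) -> bytes:
    """Encode QueryConfig in network byte order (without the outer length prefix)."""
    # Duration (uint64), uint16, uint32, QueryType (uint8)
    body = struct.pack(
        ">QHIB", time_precision, max_batch_query_count, min_batch_size, query_type
    )
    if query_type == FIXED_SIZE:
        # A value of 0 means "no maximum batch size limit".
        body += struct.pack(">I", 0 if max_batch_size is None else max_batch_size)
    elif query_type != TIME_INTERVAL:
        raise ValueError("unsupported query type")
    return body
```

The result is the content of the opaque query_config field; the two-byte length prefix is added when the enclosing TaskConfig is encoded.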

The vdaf_config defines the configuration of the VDAF in use for this task. Its content is as follows (codepoints are as defined in [VDAF]):

enum {
    prio3_count(0x00000000),
    prio3_sum(0x00000001),
    prio3_sum_vec(0x00000002),
    prio3_histogram(0x00000003),
    poplar1(0x00001000),
    (2^32-1)
} VdafType;

struct {
    opaque dp_config<1..2^16-1>;  /* Encoded differential privacy parameters */
    VdafType vdaf_type;
    select (VdafConfig.vdaf_type) {
        case prio3_count:
            Empty;
        case prio3_sum:
            uint8;  /* bit length of the summand */
        case prio3_sum_vec:
            uint32; /* length of the vector */
            uint8;  /* bit length of each summand */
            uint32; /* size of each proof chunk */
        case prio3_histogram:
            uint32; /* number of buckets */
            uint32; /* size of each proof chunk */
        case poplar1:
            uint16; /* bit length of input string */
    };
} VdafConfig;

The length prefix of the vdaf_config ensures that the VdafConfig structure can be decoded even if an unrecognized variant is encountered (i.e., an unimplemented VDAF).

Apart from the VDAF-specific parameters, this structure includes a mechanism for differential privacy (DP). The opaque dp_config contains the following structure:

enum {
    reserved(0), /* Reserved for testing purposes */
    none(1),
    aggregator_discrete_gaussian(5),
    (255)
} DpMechanism;

struct {
    DpMechanism dp_mechanism;
    select (DpConfig.dp_mechanism) {
        case none: Empty;
        case aggregator_discrete_gaussian:
          RealNumber sigma;
          RealNumber sensitivity;
    };
} DpConfig;
  • OPEN ISSUE: Should spell out definition of DpConfig for various differential privacy mechanisms and parameters. See draft draft for discussion.

The length prefix of the dp_config ensures that the DpConfig structure can be decoded even if an unrecognized variant is encountered (i.e., an unimplemented DP mechanism).
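The following Python sketch shows one way the VdafConfig structure could be serialized for a few of the VDAF types listed above, with the DP mechanism fixed to none. The function names are hypothetical, and only prio3_count, prio3_sum, and prio3_histogram are handled; RealNumber is not defined here, so aggregator_discrete_gaussian is deliberately left out.

```python
import struct
from typing import Sequence

DP_NONE = 1                   # DpMechanism.none
PRIO3_COUNT = 0x00000000
PRIO3_SUM = 0x00000001
PRIO3_HISTOGRAM = 0x00000003


def encode_dp_config_none() -> bytes:
    """Encode a DpConfig whose mechanism is `none` (one byte, empty body)."""
    return bytes([DP_NONE])


def encode_vdaf_config(vdaf_type: int, params: Sequence[int] = ()) -> bytes:
    """Encode VdafConfig (without the outer vdaf_config length prefix)."""
    dp = encode_dp_config_none()
    # opaque dp_config<1..2^16-1>, then the uint32 VdafType
    out = struct.pack(">H", len(dp)) + dp + struct.pack(">I", vdaf_type)
    if vdaf_type == PRIO3_COUNT:
        if params:
            raise ValueError("prio3_count takes no parameters")
    elif vdaf_type == PRIO3_SUM:
        (bits,) = params                    # bit length of the summand
        out += struct.pack(">B", bits)
    elif vdaf_type == PRIO3_HISTOGRAM:
        length, chunk_length = params       # buckets, proof chunk size
        out += struct.pack(">II", length, chunk_length)
    else:
        raise ValueError("VDAF type not handled by this sketch")
    return out
```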

The definitions of Time, Duration, Url, and QueryType follow those in [DAP].
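Putting the pieces together, the top-level TaskConfig could be assembled as in the sketch below. It assumes, following [DAP], that Url is encoded as opaque<1..2^16-1> and Time as a 64-bit unsigned integer; the helper names and the example endpoints are hypothetical. The query_config and vdaf_config arguments are the already-encoded QueryConfig and VdafConfig bodies.

```python
import struct


def opaque8(data: bytes) -> bytes:
    """Encode opaque<1..2^8-1>: one-byte length prefix followed by the data."""
    if not 1 <= len(data) <= 0xFF:
        raise ValueError("length out of range for opaque<1..2^8-1>")
    return bytes([len(data)]) + data


def opaque16(data: bytes) -> bytes:
    """Encode opaque<1..2^16-1>: two-byte length prefix followed by the data."""
    if not 1 <= len(data) <= 0xFFFF:
        raise ValueError("length out of range for opaque<1..2^16-1>")
    return struct.pack(">H", len(data)) + data


def encode_task_config(
    task_info: bytes,
    leader_url: bytes,
    helper_url: bytes,
    query_config: bytes,
    task_expiration: int,
    vdaf_config: bytes,
) -> bytes:
    """Concatenate the TaskConfig fields in the order given by the struct."""
    return (
        opaque8(task_info)
        + opaque16(leader_url)      # Url, assumed opaque<1..2^16-1> per [DAP]
        + opaque16(helper_url)
        + opaque16(query_config)
        + struct.pack(">Q", task_expiration)  # Time, assumed uint64 per [DAP]
        + opaque16(vdaf_config)
    )
```

The output of encode_task_config is the byte string that is hashed to obtain the task ID in Section 3.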

4. In-band Task Provisioning with the Taskbind Extension

Before a task can be executed, it is necessary to first provision the Clients, Aggregators, and Collector with the task's configuration. The core DAP specification does not define a mechanism for provisioning tasks. This section describes a mechanism whose key feature is that task configuration is performed completely in-band, via HTTP request headers.

This method presumes the existence of a logical "task author" (written as "Author" hereafter) who is capable of pushing configurations to Clients. All parameters required by downstream entities (the Aggregators and Collector) are carried by HTTP headers piggy-backed on the protocol flow.

This mechanism is designed with the same security and privacy considerations as the core DAP protocol. The Author is not regarded as a trusted third party: it is incumbent on all protocol participants to verify the task configuration disseminated by the Author and opt out if the parameters are deemed insufficient for privacy. In particular, adopters of this mechanism should presume the Author is under the adversary's control. In fact, we expect that in a real-world deployment the Author may be co-located with the Collector.

The DAP protocol also requires configuring the entities with a variety of assets that are not task-specific, but are important for establishing Client-Aggregator, Collector-Aggregator, and Aggregator-Aggregator relationships. These include:

This section does not specify a mechanism for provisioning these assets; as in the core DAP protocol, these are presumed to be configured out-of-band.

Note that we consider the VDAF verification key [VDAF], used by the Aggregators to aggregate reports, to be a task-specific asset. This document specifies how to derive this key for a given task from a pre-shared secret, which in turn is presumed to be configured out-of-band.

4.1. Overview

The process of provisioning a task begins when the Author disseminates the task configuration to the Collector and each of the Clients. When a Client issues an upload request to the Leader (as described in Section 4.4 of [DAP]), it includes in an HTTP header the task configuration it used to generate the report. We refer to this process as "task advertisement". Before consuming the report, the Leader parses the configuration and decides whether to opt in; if not, the task's execution halts.

Otherwise, if the Leader does opt in, it advertises the task to the Helper during the aggregation protocol (Section 4.5 of [DAP]). In particular, it includes the task configuration in an HTTP header of each aggregation job request for that task. Before proceeding, the Helper must first parse the configuration and decide whether to opt in; if not, the task's execution halts.

4.2. Task Advertisement

To advertise a task to its peer, a protocol participant includes a "dap-taskprov" header with a request incident to the task execution. The value is the TaskConfig structure defined in Section 3.1, encoded in URL-safe, unpadded base64 as specified in Sections 5 and 3.2 of [RFC4648].
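A minimal Python sketch of the header encoding follows. The function names are hypothetical; the input to the encoder is assumed to be the already-encoded TaskConfig. Note that the decoder shown here is lenient: the standard library's urlsafe_b64decode silently discards characters outside the alphabet, so a production implementation would likely want stricter validation.

```python
import base64


def encode_taskprov_header(encoded_task_config: bytes) -> str:
    """URL-safe base64 without '=' padding, for the "dap-taskprov" header."""
    return base64.urlsafe_b64encode(encoded_task_config).rstrip(b"=").decode("ascii")


def decode_taskprov_header(value: str) -> bytes:
    """Restore the padding that was stripped, then decode."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)
```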

4.3. Deriving the VDAF Verification Key

When a Leader and Helper implement this mechanism, they SHOULD compute the shared VDAF verification key [VDAF] as described in this section.

The Aggregators are presumed to have securely exchanged a pre-shared secret out-of-band. The length of this secret MUST be 32 bytes. Let us denote this secret by verify_key_init.

Let VERIFY_KEY_SIZE denote the length of the verification key for the VDAF indicated by the task configuration. (See [VDAF], Section 5.)

The VDAF verification key used for the task is computed as follows:

verify_key = HKDF-Expand(
    HKDF-Extract(
        taskprov_salt,   # salt
        verify_key_init, # IKM
    ),
    task_id,             # info
    VERIFY_KEY_SIZE,     # L
)

where taskprov_salt is defined to be the SHA-256 hash of the octet string "dap-taskprov" and task_id is as defined in Section 3. Functions HKDF-Extract() and HKDF-Expand() are as defined in [RFC5869]. Both functions are instantiated with SHA-256.
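The derivation can be sketched with the Python standard library as shown below. HKDF is written out directly from [RFC5869] using HMAC-SHA-256; the function names are hypothetical. The value of VERIFY_KEY_SIZE depends on the VDAF and must be taken from [VDAF]; the 16 used in the usage lines is only an assumed example value.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output length in bytes
TASKPROV_SALT = hashlib.sha256(b"dap-taskprov").digest()


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract from RFC 5869, instantiated with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand from RFC 5869, instantiated with SHA-256."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_verify_key(verify_key_init: bytes, task_id: bytes, verify_key_size: int) -> bytes:
    """Derive the per-task VDAF verification key from the pre-shared secret."""
    if len(verify_key_init) != 32:
        raise ValueError("verify_key_init MUST be 32 bytes")
    prk = hkdf_extract(TASKPROV_SALT, verify_key_init)
    return hkdf_expand(prk, task_id, verify_key_size)


# Usage with placeholder inputs (all values here are illustrative only).
example_task_id = hashlib.sha256(b"example encoded TaskConfig").digest()
example_key = derive_verify_key(b"\x00" * 32, example_task_id, 16)
```

Since the task ID is the HKDF info input, each task gets an independent verification key from the same pre-shared secret.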

4.4. Opting into a Task

Prior to participating in a task, each protocol participant must determine if the TaskConfig disseminated by the Author can be configured. The participant is said to "opt in" to the task if the derived task ID (see Section 3) corresponds to an already configured task or the task ID is unrecognized and therefore corresponds to a new task.

A protocol participant MAY "opt out" of a task if:

  1. The derived task ID corresponds to an already configured task, but the task configuration disseminated by the Author does not match the existing configuration.

  2. The VDAF, DP, or query configuration is deemed insufficient for privacy.

  3. A secure connection to one or both of the Aggregator endpoints could not be established.

  4. The task lifetime is too long.

A protocol participant MUST opt out if the task has expired or if it does not support an indicated task parameter (e.g., VDAF, DP mechanism, or query type).

The behavior of each protocol participant is determined by whether or not they opt in to a task.
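The opt-in decision can be pictured as a local policy check, as in the Python sketch below. Everything here is hypothetical: the TaskSummary fields, the policy thresholds, and the treatment of a task whose expiration equals the current time as expired are assumptions of this sketch. The first two checks correspond to the MUST conditions above; the remaining ones illustrate conditions under which a participant MAY opt out.

```python
from dataclasses import dataclass


@dataclass
class TaskSummary:
    """Hypothetical, already-parsed view of a TaskConfig."""
    task_expiration: int          # seconds since the epoch
    min_batch_size: int
    vdaf_supported: bool
    dp_mechanism_supported: bool
    query_type_supported: bool


def decide_opt_in(
    task: TaskSummary,
    now: int,
    required_min_batch_size: int,
    max_lifetime_seconds: int,
) -> bool:
    """Return True to opt in, False to opt out."""
    # MUST opt out: the task has expired.
    if task.task_expiration <= now:
        return False
    # MUST opt out: an indicated parameter is not supported.
    if not (task.vdaf_supported and task.dp_mechanism_supported
            and task.query_type_supported):
        return False
    # MAY opt out: parameters deemed insufficient for privacy (local policy).
    if task.min_batch_size < required_min_batch_size:
        return False
    # MAY opt out: the task lifetime is too long (local policy).
    if task.task_expiration - now > max_lifetime_seconds:
        return False
    return True
```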

4.5. Supporting HPKE Configurations Independent of Tasks

In DAP, Clients need to know the HPKE configuration of each Aggregator before sending reports. (See HPKE Configuration Request in [DAP].) However, in a DAP deployment that supports the task provisioning mechanism described in this section, if a Client requests the Aggregator's HPKE configuration with the task ID computed as described in Section 3, the task ID may not be configured in the Aggregator yet, because the Aggregator is still waiting for the task to be advertised by a Client.

To mitigate this issue, each Aggregator SHOULD choose which HPKE configuration to advertise to Clients independent of the task ID. It MAY continue to support per-task HPKE configurations for other tasks that are configured out-of-band.

In addition, if a Client intends to advertise a task via the Taskbind extension, it SHOULD NOT specify the task_id parameter when requesting the HPKE configuration from an Aggregator.

4.6. Client Behavior

Upon receiving a TaskConfig from the Author, the Client decides whether to opt into the task as described in Section 4.4. If the Client opts out, it MUST NOT attempt to upload reports for the task.

  • OPEN ISSUE: In case of opt-out, would it be useful to specify how to report this to the Author?

Once the Client opts into a task, it may begin uploading reports for the task to the Leader. The taskbind extension MUST be included in the extensions field of both the Leader's and the Helper's PlaintextInputShare. In addition, each report's task ID MUST be computed as described in Section 3.

The Client SHOULD advertise the task configuration by specifying the encoded TaskConfig described in Section 3 in the "dap-taskprov" HTTP header, but MAY choose to omit this header in order to save network bandwidth. However, the Leader may respond with "unrecognizedTask" if it has not been configured with this task. In this case, the Client MUST retry the upload request with the "dap-taskprov" HTTP header.

4.7. Leader Behavior

4.7.1. Upload Protocol

Upon receiving a Client report, if the Leader does not support the Section 4 mechanism, it will ignore the "dap-taskprov" HTTP header. In particular, if the task ID is not recognized, then it MUST abort the upload request with "unrecognizedTask".

Otherwise, if the Leader does support this mechanism, it first checks if the "dap-taskprov" HTTP header is specified. If not, that means the Client has skipped task advertisement. If the Leader recognizes the task ID, it will include the client report in the aggregation of that task ID. Otherwise, it MUST abort with "unrecognizedTask". The Client will then retry with the task advertisement.

If the Client advertises the task, the Leader checks that the task ID indicated by the upload request matches the task ID derived from the advertised TaskConfig as specified in Section 3. If the task ID does not match, then the Leader MUST abort with "unrecognizedTask".

The Leader then decides whether to opt in to the task as described in Section 4.4. If it opts out, it MUST abort the upload request with "invalidTask".

  • OPEN ISSUE: In case of opt-out, would it be useful to specify how to report this to the Author?

Finally, once the Leader has opted in to the task, it completes the upload request as usual.

During the upload flow, if the Leader's report share does not include the taskbind extension, the Leader MUST abort the upload request with "invalidMessage".
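The Leader's upload-time checks can be summarized by the Python sketch below, for a Leader that supports this mechanism. The function name, its parameters, and the convention of returning None on success or the abort error name otherwise are all hypothetical. The sketch assumes the advertised TaskConfig has already been base64-decoded, and that the function is only used for tasks provisioned through this mechanism, so the taskbind check applies to every request it sees.

```python
import hashlib
import hmac
from typing import Callable, Optional, Set


def leader_check_upload(
    request_task_id: bytes,
    advertised_task_config: Optional[bytes],  # decoded "dap-taskprov" header, if any
    known_task_ids: Set[bytes],
    opt_in: Callable[[bytes], bool],          # local policy from Section 4.4
    leader_share_has_taskbind: bool,
) -> Optional[str]:
    """Return None to proceed with the upload, else the abort error name."""
    if advertised_task_config is None:
        # The Client skipped task advertisement.
        if request_task_id not in known_task_ids:
            return "unrecognizedTask"
    else:
        derived = hashlib.sha256(advertised_task_config).digest()
        if not hmac.compare_digest(derived, request_task_id):
            return "unrecognizedTask"
        if not opt_in(advertised_task_config):
            return "invalidTask"
    if not leader_share_has_taskbind:
        return "invalidMessage"
    return None
```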

4.7.2. Aggregate Protocol

When the Leader opts in to a task, it SHOULD derive the VDAF verification key for that task as described in Section 4.3. The Leader MUST advertise the task to the Helper in every request incident to the task as described in Section 4.2.

4.7.3. Collect Protocol

The Collector might issue a collect request for a task provisioned by this mechanism prior to opting into the task. In this case, the Leader would need to abort the collect request with "unrecognizedTask". When it does so, it is up to the Collector to retry its request.

  • OPEN ISSUE: These semantics are awkward, as there's no way for the Leader to distinguish between Collectors who support this mechanism and those that don't.

The Leader MUST advertise the task in every aggregate share request issued to the Helper as described in Section 4.2.

4.8. Helper Behavior

Upon receiving a task advertisement from the Leader, if the Helper does not support this mechanism, it will ignore the "dap-taskprov" HTTP header and process the aggregate request as usual. In particular, if the Helper does not recognize the task ID, it MUST abort the aggregate request with error "unrecognizedTask". Otherwise, if the Helper supports this mechanism, it proceeds as follows.

First, the Helper attempts to parse the payload of the "dap-taskprov" HTTP header. If this step fails, the Helper MUST abort with "invalidMessage".

Next, the Helper checks that the task ID indicated in the aggregation job request matches the task ID derived from the TaskConfig as defined in Section 3. If not, the Helper MUST abort with "unrecognizedTask".

Next, the Helper decides whether to opt in to the task as described in Section 4.4. If it opts out, it MUST abort the aggregation job request with "invalidTask".

  • OPEN ISSUE: In case of opt-out, would it be useful to specify how to report this to the Author?

Finally, the Helper completes the request as usual, deriving the VDAF verification key for the task as described in Section 4.3. For any report share that does not include the taskbind extension with an empty payload, the Helper MUST mark the report as invalid with error "invalid_message" and reject it.
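The request-level checks performed by a Helper that supports this mechanism can be sketched in Python as follows. The function name and return convention (None on success, else the abort error name) are hypothetical. Parsing is reduced here to strict base64 decoding of the header value; decoding the TaskConfig structure itself is left to the hypothetical opt_in callback.

```python
import base64
import binascii
import hashlib
import hmac
from typing import Callable, Optional


def helper_check_advertisement(
    header_value: str,                 # raw "dap-taskprov" header value
    request_task_id: bytes,
    opt_in: Callable[[bytes], bool],   # local policy from Section 4.4
) -> Optional[str]:
    """Return None to proceed with the request, else the abort error name."""
    # Reject padding and the standard (non-URL-safe) alphabet characters.
    if any(c in header_value for c in "+/="):
        return "invalidMessage"
    try:
        padded = header_value + "=" * (-len(header_value) % 4)
        task_config = base64.b64decode(padded, altchars=b"-_", validate=True)
    except (binascii.Error, ValueError):
        return "invalidMessage"
    if not task_config:
        return "invalidMessage"
    derived = hashlib.sha256(task_config).digest()
    if not hmac.compare_digest(derived, request_task_id):
        return "unrecognizedTask"
    if not opt_in(task_config):
        return "invalidTask"
    return None
```

The per-report check that each report share carries the taskbind extension with an empty payload happens afterwards, while processing the individual report shares.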

4.9. Collector Behavior

Upon receiving a TaskConfig from the Author, the Collector first decides whether to opt into the task as described in Section 4.4. If the Collector opts out, it MUST NOT attempt to issue collect requests for the task.

Otherwise, once opted in, the Collector MAY begin to issue collect requests for the task. The task ID for each request MUST be derived from the TaskConfig as described in Section 3. The Collector MUST advertise the task as described in Section 4.2.

If the Leader responds to a collect request with an "unrecognizedTask" error, the Collector MAY retry its collect request after waiting an appropriate amount of time.

5. Security Considerations

The Taskbind extension has the same security and privacy considerations as the core DAP protocol. In addition, successful execution of a DAP task implies agreement on the task configuration. This is provided by binding the parameters to the task ID, which in turn is bound to each report uploaded for a task. Furthermore, inclusion of the Taskbind extension in the report share means Aggregators that do not implement this extension will reject the report as required by Section 4.5.1.4 of [DAP].

The task provisioning mechanism in Section 4 extends the threat model of DAP by including a new logical role, called the Author. The Author is responsible for configuring Clients prior to task execution. For privacy we consider the Author to be under control of the adversary. It is therefore incumbent on protocol participants to verify the privacy parameters of a task before opting in.

Another risk is that the Author could configure a unique task to fingerprint a Client. Although Client anonymization is not guaranteed by DAP, some systems built on top of DAP may hope to achieve this property by using a proxy server with Oblivious HTTP [RFC9458] to forward Client reports to the Leader. If the Author colludes with the Leader, the attacker can learn some metadata about the Client (e.g., the Client's IP address or user agent string), which may deanonymize the Client. However, even if the Author succeeds in doing so, the Author should learn nothing other than the fact that the Client has uploaded a report, assuming the Client has verified the privacy parameters of the task before opting into it. For example, if a task is uniquely configured for the Client, the Client can enforce that the minimum batch size is strictly greater than 1.

Another risk for the Aggregators is that a malicious coalition of Clients might attempt to pollute an Aggregator's long-term storage by uploading reports for many (thousands or perhaps millions of) distinct tasks. While this does not directly impact tasks used by honest Clients, it does present a Denial-of-Service risk for the Aggregators themselves. This can be mitigated by limiting the rate at which new tasks are configured. In addition, deployments SHOULD arrange for the Author to digitally sign the task configuration so that Clients cannot forge task creation.

6. Operational Considerations

The Taskbind extension does not introduce any new operational considerations for DAP.

The task provisioning mechanism in Section 4 is designed so that the Aggregators do not need to store individual task configurations long-term. Because the task configuration is advertised in each request in the upload, aggregation, and collection protocols, the process of opting in and deriving the task ID and VDAF verification key can be re-run on the fly for each request. This is useful if a large number of concurrent tasks are expected. Once an Aggregator has opted in to a task, the expectation is that the task is supported until it expires. In particular, Aggregators that operate in this manner MUST NOT opt out once they have opted in.

7. IANA Considerations

8. Normative References

[DAP]
Geoghegan, T., Patton, C., Pitman, B., Rescorla, E., and C. A. Wood, "Distributed Aggregation Protocol for Privacy Preserving Measurement", Work in Progress, Internet-Draft, draft-ietf-ppm-dap-09, <https://datatracker.ietf.org/doc/html/draft-ietf-ppm-dap-09>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/rfc/rfc4648>.
[RFC5869]
Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand Key Derivation Function (HKDF)", RFC 5869, DOI 10.17487/RFC5869, <https://www.rfc-editor.org/rfc/rfc5869>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC9180]
Barnes, R., Bhargavan, K., Lipp, B., and C. Wood, "Hybrid Public Key Encryption", RFC 9180, DOI 10.17487/RFC9180, <https://www.rfc-editor.org/rfc/rfc9180>.
[RFC9458]
Thomson, M. and C. A. Wood, "Oblivious HTTP", RFC 9458, DOI 10.17487/RFC9458, <https://www.rfc-editor.org/rfc/rfc9458>.
[SHS]
"Secure Hash Standard", FIPS PUB 180-4.
[VDAF]
Barnes, R., Cook, D., Patton, C., and P. Schoppmann, "Verifiable Distributed Aggregation Functions", Work in Progress, Internet-Draft, draft-irtf-cfrg-vdaf-08, <https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-vdaf-08>.

Contributors

Junye Chen Apple Inc. junyec@apple.com

Suman Ganta Apple Inc. sganta2@apple.com

Gianni Parsa Apple Inc. gianni_parsa@apple.com

Michael Scaria Apple Inc. mscaria@apple.com

Kunal Talwar Apple Inc. ktalwar@apple.com

Christopher A. Wood Cloudflare caw@heapingbits.net

Authors' Addresses

Shan Wang
Apple Inc.
Christopher Patton
Cloudflare