Internet-Draft | OETP | November 2023 |
Lukianets | Expires 18 May 2024 | [Page] |
The Open Ethics Transparency Protocol (OETP) is an application-level protocol for publishing and accessing ethical Disclosures of IT Products and their Components. The Protocol is based on HTTP exchange of information about the ethical "postures", provided in an open and standardized format. The scope of the Protocol covers Disclosures for systems such as Software as a Service (SaaS) Applications, Software Applications, Software Components, Application Programming Interfaces (API), Automated Decision-Making (ADM) systems, and systems using Artificial Intelligence (AI). OETP aims to bring more transparent, predictable, and safe environments for the end-users. The OETP Disclosure Format is an extensible JSON-based format.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 18 May 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Open Ethics Transparency Protocol (OETP or Protocol) describes the creation and exchange of voluntary ethics Disclosures for IT products. It is proposed as a solution to increase the transparency of how IT products are built and deployed. This document provides details on how Disclosures of data collection and data processing practices are formed, stored, validated, and exchanged in a standardized and open format.¶
OETP provides facilities for:¶
Informed consumer choices : End-users are able to make informed choices based on their own ethical preferences and the product Disclosure.¶
Industrial-scale monitoring : Discovery of best and worst practices within market verticals, technology stacks, and product value offerings.¶
Legally-agnostic guidelines : Suggestions for developers and product-owners, formulated in factual language, which are legally-agnostic and could be easily transformed into product requirements and safeguards.¶
Iterative improvement : Digital products, specifically those powered by artificial intelligence, could receive nearly real-time feedback on how their performance and ethical posture could be improved to cover security, privacy, diversity, fairness, power balance, non-discrimination, and other requirements.¶
Labeling and certification : Mapping to existing and future regulatory initiatives and standards.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Disclosure (Ethics Disclosure, or self-disclosure) is application-specific information about the data collection, data processing, and decision-making practices of a Product, provided by the Product Vendor (an individual developer or an organization).¶
Disclosure Feed : A historical sequence of Disclosures, made for a specific Product.¶
Disclosure Identity Provider (DIP) : The automated Disclosure processing is enabled by requests to both the Open Ethics Disclosure database powered by Disclosure Identity Providers (DIP) and the Product's OETP Disclosure file, stored in the root of the Product's website following the OETP specification. A DIP serves as a service point to generate Disclosures and to retrieve generated Disclosures.¶
Vendor : A legal person (an individual developer or an organization) that owns one or several end-user Products, or acts as a Supplier and provides Components for other Vendors.¶
Integrator : A legal person (an individual developer or an organization) that deploys technology-powered services to the end-users based on Product(s) from third-party Vendors.¶
Product : An IT system in the form of software, software as a service system, application, software component, application programming interface, or a physically embodied automated decision-making agent.¶
Component : An IT system supplied by a Vendor and integrated/embedded into end-user Products. Components themselves do not necessarily interface with end-users.¶
Upstream Component : A Component that sends its outputs to the Product Downstream in the data processing chain. Disclosure for the Upstream Component is represented as a Child relative to the Disclosure node of the Downstream Product.¶
Downstream Component : A Component that receives inputs from the Components Upstream in the data processing chain. Disclosure for the Downstream Component is represented as a Parent relative to the Disclosure node of the Upstream Component.¶
Automated Decision-Making (ADM) : Automated decision-making is the process of making a decision by automated means without any human involvement. These decisions can be based on factual data, as well as on digitally created profiles or inferred data.¶
OETP Disclosure Format : A machine-readable Disclosure with a predefined structure, supplied in the JSON format.¶
Validation : A sequence of automated software-based checks to control validity and security elements in the OETP Disclosure.¶
Auditor : A third-party legal person trusted to perform Verification checks and to issue Verification Proofs.¶
Auditing software : An automated software-based tool authorized to perform Verification checks and to issue Verification Proofs.¶
Verification : A procedure to control the correspondence of the elements in the OETP Disclosure and the actual data processing and data collection practices of the Vendors.¶
Verification Proof : A result of the formal Disclosure Verification procedure presented to a requestor.¶
Chaining : A process of combining Disclosures of individual Components into a composite high-level Disclosure for a Product.¶
Label : User-facing graphical illustrations and textual descriptions of the Product that facilitate understanding of the values and risks the Product carries.¶
Disclosure creation and delivery consist of two parts: (I) creation, which covers the submission of the Disclosure form, chaining of the Suppliers' Disclosures, and Signature of the disclosed information; and (II) delivery, which first checks that the Disclosure is Valid and then that the information specified in it is Verified by third parties. Figure 4 shows the Disclosure creation steps.¶
The initial Disclosure is created by filling out a standardized disclosure form (for example, see https://openethics.ai/label/). A Vendor representative, a Product Owner, or a Developer MUST submit data-processing and data-collection information about the Product. The information about the end-point URL, as well as a contact email address, MUST be specified. A Disclosure MAY also be created in a fully automated way as a part of the CI/CD DevOps pipeline. Figure 5 shows the basic Disclosure submission process.¶
The Disclosure is organized into a predefined data schema and MUST be cryptographically signed by the Signature Generator (Open Ethics or federated providers) using a standard SHA3-512 hash implementation. The integrity hash MUST be appended to a disclosure as the OETP.schema.integrity element.¶
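The signing step can be sketched in Python as follows. This is only an illustration under stated assumptions: the draft does not define a canonical JSON serialization, so sorted keys with compact separators are assumed here, and the hash is assumed to cover the OETP.snapshot object, as the Security Considerations suggest. The helper names `canonical_bytes` and `append_integrity` are hypothetical.

```python
import copy
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    # Assumption: sorted keys, compact separators, UTF-8 encoding.
    # The draft itself does not specify a canonical serialization.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def append_integrity(disclosure: dict) -> dict:
    """Return a copy of the disclosure with OETP.schema.integrity set.

    Assumption: the SHA3-512 hash is computed over the OETP.snapshot object.
    """
    signed = copy.deepcopy(disclosure)  # leave the caller's object untouched
    oetp = signed.setdefault("OETP", {})
    snapshot = oetp.get("snapshot", {})
    digest = hashlib.sha3_512(canonical_bytes(snapshot)).hexdigest()
    oetp.setdefault("schema", {})["integrity"] = digest
    return signed
```

Because the serialization is deterministic, two parties that agree on the same canonical form obtain the same 128-character hexadecimal digest for the same snapshot.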
Both the signature integrity hash and the Disclosure SHOULD be stored in the log-centric root database and MAY be mirrored by other distributed databases for redundancy and safety.¶
Open Ethics Label SHOULD be automatically generated by mirroring the submitted Disclosure into a set of graphical icons and simple human-readable descriptions. Additional Labels MAY be generated following successful third-party Verification and by mapping the regulatory requirements to Verified Disclosures.¶
The most recent OETP file SHOULD be stored in the root of the Product's specified end-point URL, allowing requests to the OETP file from third-party domains. When establishing a Vendor relationship, the Integrator or a downstream Vendor MAY examine the Disclosure for their Components using the following HTTP request: GET https://testexample.com/oetp.json, where testexample.com is the URL of the Supplier's end-point.¶
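A minimal sketch of that request, using only the Python standard library, might look like the following. The function names `disclosure_url` and `fetch_disclosure`, the `Accept` header, and the timeout value are illustrative assumptions and are not defined by the Protocol.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def disclosure_url(endpoint: str) -> str:
    """Build the default Disclosure location: <end-point root>/oetp.json."""
    return urljoin(endpoint.rstrip("/") + "/", "oetp.json")


def fetch_disclosure(endpoint: str, timeout: float = 10.0) -> dict:
    """Send GET <end-point>/oetp.json and parse the JSON body."""
    request = Request(
        disclosure_url(endpoint),
        headers={"Accept": "application/json"},  # assumed header, not mandated
        method="GET",
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `disclosure_url("https://testexample.com")` yields the URL used in the request above.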
A Vendor SHOULD place a visual Label generated as a result of the Disclosure process in the Product informational materials (for example Marketing Materials, User Guides, Safety Instructions, Privacy Policy, Terms of Service, etc). The Label reflects the content of the Disclosure and SHOULD be displayed in any digital media by embedding a software widget. Visual labels in the print media SHOULD carry a visually distinguishable Integrity signature to enable manual Validation by the User.¶
Labels in the online digital media MUST be generated automatically based on the content of the Disclosure and MUST contain a URL that allows the User to check the complete Integrity hash and explore more detailed information about the Disclosure.¶
Labels in the offline media MUST be generated automatically based on the content of the Disclosure and should carry the first 10 digits of the corresponding Integrity hash.¶
Based on the Verification performed for the OETP Disclosure file, the labels MAY include Conformity assessment marks, Certification marks, as well as marks showing adherence to certain standards. These marks MAY be generated and displayed automatically based on the Verification Proofs.¶
Accessibility of the Labels for visually impaired Users SHOULD be considered. The OETP Processing system MUST provide alternative forms of the Label so that text-to-speech tools can narrate the Label without loss of meaning.¶
1) A Label MUST contain a title. The title could either be marked by the aria-label attribute for the narration software or be labeled by other content tag(s) referenced via the aria-labelledby attribute, pointing to the ID(s) of the elements describing the Label content.¶
2) Every icon that is present in the visual Label MUST contain a title, describing the property illustrated by the icon. A more extended description MAY be provided when necessary. The following patterns are suggested:¶
Pattern for images embedded using IMG tags: <img> + role="img" + alt="[title text here]" OR <img> + role="img" + aria-label="[title text here]"¶
Pattern for images embedded using SVG tags: <svg> + role="img" + <title> + <desc> + aria-labelledby="[ID]"¶
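As one illustration, a Label renderer written in Python could emit markup that follows the two suggested patterns. The helper names `img_icon` and `svg_icon`, and the convention of deriving the title and description IDs from a single icon ID, are hypothetical choices made for this sketch.

```python
from html import escape


def img_icon(src: str, title: str) -> str:
    """Icon embedded with an IMG tag: role="img" plus an alt title."""
    return f'<img src="{escape(src)}" role="img" alt="{escape(title)}">'


def svg_icon(icon_id: str, title: str, desc: str, body: str = "") -> str:
    """Icon embedded with an SVG tag: title and desc referenced by ID."""
    safe_id = escape(icon_id)
    title_id, desc_id = f"{safe_id}-title", f"{safe_id}-desc"
    return (
        f'<svg role="img" aria-labelledby="{title_id} {desc_id}">'
        f'<title id="{title_id}">{escape(title)}</title>'
        f'<desc id="{desc_id}">{escape(desc)}</desc>'
        f"{body}</svg>"  # body is assumed to be trusted SVG drawing markup
    )
```

Escaping the title and description keeps text taken from a Disclosure from breaking out of the attribute or element it is placed in.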
The automated Disclosure processing is enabled by requests to both the Open Ethics Disclosure database powered by Disclosure Identity Providers and the Product's OETP Disclosure file.¶
To allow efficient decentralization and access to the disclosures of autonomous systems, such as AI systems powered by trained machine learning models, the vendor (or a developer) MUST send requests to a Disclosure Identity Provider. Disclosures MAY be resolved using URIs. To satisfy these requirements, [OETP-RI] proposes the following Disclosure resource identifier (RI) formats:¶
oetp://<hash> - Here the integrity <hash> is the SHA3-512 hash generated during the disclosure process.¶
oetp://<component>@<alias>[:<disclosure>] - Here <component> is the ID assigned via the Disclosure Identity Provider under its <alias> during the first disclosure.¶
oetp://<domain>[:<disclosure>] - For verified domains (Domain Validation), the disclosure could be accessed using the product's <domain> instead of <component>@<alias>.¶
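The three identifier forms can be told apart with a small parser. The sketch below is an assumption-laden illustration, not the grammar from [OETP-RI]: it treats a 128-character hexadecimal string as the SHA3-512 hash form, treats anything containing "@" as the component form, and treats the rest as a domain. The `OetpUri` type and `parse_oetp_uri` function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class OetpUri(NamedTuple):
    kind: str  # "hash", "component" or "domain"
    integrity: Optional[str] = None
    component: Optional[str] = None
    alias: Optional[str] = None
    domain: Optional[str] = None
    disclosure: Optional[str] = None


# A SHA3-512 digest is 512 bits, i.e. 128 hexadecimal characters.
_HASH = re.compile(r"[0-9a-fA-F]{128}")


def parse_oetp_uri(uri: str) -> OetpUri:
    prefix = "oetp://"
    if not uri.startswith(prefix) or len(uri) == len(prefix):
        raise ValueError("not a usable oetp:// URI")
    rest = uri[len(prefix):]
    if _HASH.fullmatch(rest):
        return OetpUri(kind="hash", integrity=rest.lower())
    # Optional ":<disclosure>" suffix selects a specific Disclosure.
    target, _, disclosure = rest.partition(":")
    if "@" in target:
        component, _, alias = target.partition("@")
        if not component or not alias:
            raise ValueError("expected <component>@<alias>")
        return OetpUri(kind="component", component=component,
                       alias=alias, disclosure=disclosure or None)
    return OetpUri(kind="domain", domain=target, disclosure=disclosure or None)
```

For instance, `parse_oetp_uri("oetp://chatbot@acme:42")` is classified as the component form, with `chatbot`, `acme`, and `42` as made-up example values.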
The OETP Processing system MUST compare integrity hashes in the Open Ethics Disclosure database and entries that arrive as a result of the Disclosure Request response.¶
Every Disclosure SHOULD be checked for the existence of external Verification from Auditors, either for the entire Disclosure or for one of the Disclosure elements.¶
To raise the level of trust in a Disclosure, a Vendor MAY decide to opt in to a third-party Disclosure Verification. OETP suggests a Progressive Verification scheme where multiple independent external Verification Proofs MAY be issued by third parties to confirm the information specified in the Disclosure.¶
The Progressive Verification applies to a whole Disclosure, or to specific elements of the Disclosure.¶
Figure 6 displays a general scheme for Disclosure requests and responses.¶
The following elements MAY serve as sources for various kinds of Verification proofs:¶
* Qualified Auditor reports
* Qualified Vendor of Auditing software tests
* Certification Authority assessments
* Conformity assessments
* User Feedback
* Market Brokers
* Real-time Loggers¶
The IT industry is getting more mature with Vendors becoming more specialized. Surface-level transparency is not sufficient as supply chains are becoming more complex and distributed across various Components. The following steps MUST be satisfied for the end-to-end transparency:¶
Every Integrator or a Vendor SHOULD disclose the information about their Suppliers (sub-processing Vendors), indicating the scope of the data processing in the Components they provide.¶
If the Supplier information is not provided, Disclosure SHOULD contain information that a Vendor (Integrator) has not provided Supplier information.¶
For greater transparency, Vendors may decide to reveal Components even if they originate from themselves (first-party Components). For the first-party Component, the Supplier identity information SHOULD NOT be provided because it was already disclosed earlier.¶
Required: (Section 4.4.1.3.2) only¶
When disclosing Components originating from third-party Vendors, the Vendor SHOULD provide both the Supplier identity information and the Component information.¶
Required: (Section 4.4.1.3.1, Section 4.4.1.3.2)¶
Component Scope of use¶
Personal Data Being Processed by Component¶
Is a Safety Component (YES)/(NO)¶
Component URL (if different from the Vendor URL)¶
Component Disclosure URL (if different from the default Component URL/oetp.json)¶
Component DPO Contact (if different from Vendor DPO Contact Email)¶
The OETP Processing system MUST send GET requests to the URLs of each Component to obtain their Disclosures. Based on the response to each Disclosure request, the OETP Processing system MUST specify which Components have Disclosures and which do not.¶
Figure 7 shows the process of how Disclosure Chaining requests and responses happen.¶
The same request-response operation applies recursively to the Components of each Component, at every level of the supply chain. It is proposed to view the supply chain as a tree-like hierarchical data structure, where the information about Components is assembled using the Level Order Tree Traversal algorithm.¶
In this tree:¶
* Node is a structure that contains the Component's Disclosure;
* Root is the top Node representing a Product's Disclosure information;
* Edge is the connection between one Node and another, representing the scope of the Data Processing by the Component.¶
Figure 8 displays the order of the Disclosure Chaining with Level Order Tree Traversal algorithm.¶
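Level order traversal of the Disclosure tree can be sketched with a queue, as below. The sketch makes several assumptions that the draft does not specify: the keys `components` and `disclosure_url` are hypothetical field names, the `fetch` callable stands in for the GET request and returns `None` when a Component publishes no Disclosure, and `max_depth` is an added safeguard against very deep or cyclic chains.

```python
from collections import deque
from typing import Callable, List, Optional, Tuple

Fetch = Callable[[str], Optional[dict]]


def chain_disclosures(root_url: str, fetch: Fetch,
                      max_depth: int = 5) -> List[Tuple[int, str, bool]]:
    """Visit Disclosures level by level; return (depth, url, has_disclosure)."""
    visited = {root_url}
    queue = deque([(0, root_url)])
    report = []
    while queue:
        depth, url = queue.popleft()  # FIFO order gives level order
        disclosure = fetch(url)
        report.append((depth, url, disclosure is not None))
        if disclosure is None or depth >= max_depth:
            continue
        for component in disclosure.get("components", []):  # hypothetical key
            child = component.get("disclosure_url")          # hypothetical key
            if child and child not in visited:
                visited.add(child)
                queue.append((depth + 1, child))
    return report


# Usage with an in-memory stand-in for the network.
registry = {
    "https://product.example/oetp.json": {"components": [
        {"disclosure_url": "https://a.example/oetp.json"},
        {"disclosure_url": "https://b.example/oetp.json"},
    ]},
    "https://a.example/oetp.json": {"components": [
        {"disclosure_url": "https://c.example/oetp.json"},
    ]},
    "https://b.example/oetp.json": {"components": []},
}
report = chain_disclosures("https://product.example/oetp.json", registry.get)
```

The resulting report lists the Root first, then every Component at depth 1, then depth 2, and marks which Components have Disclosures and which do not.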
The current consensus from the user and developer community suggests that the Composite Disclosure should follow the "Weakest Link" model. According to this model, the risk that the Product carries should not be considered any lower than the risk of any of its Components. A similar approach to software licenses has been successful: it enables the generation of Software Bills of Materials (SBOMs) from package information provided in [SPDX] files.¶
Formally, this approach can be illustrated with a conjunction table for risk modeling (see Table 1). The truth table below takes one risk factor and evaluates the composite risk outcome as High (H) or Low (L) for hypothetical Disclosure options of the Product (P) and its Component (C). Treating Low as logical true, the composite is the Logical AND of the two inputs: it is Low only when both P and C disclose Low risk.¶
Disclosed risk of P | Disclosed risk of C | Composite P & C |
---|---|---|
L | L | L |
L | H | H |
H | L | H |
H | H | H |
Further evaluation of this approach is required.¶
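The "Weakest Link" rule from the table above amounts to taking the highest disclosed risk among the Product and its Components. A minimal sketch, assuming only the two levels L and H used in the table and a hypothetical function name `composite_risk`:

```python
from typing import Iterable

# Ordering of the two risk levels used in the truth table.
RISK_ORDER = {"L": 0, "H": 1}


def composite_risk(product_risk: str, component_risks: Iterable[str]) -> str:
    """Composite risk is never lower than the risk of any single part."""
    risks = [product_risk, *component_risks]
    # Unknown levels raise KeyError instead of being silently ignored.
    return max(risks, key=RISK_ORDER.__getitem__)
```

The same function extends to more than one Component, and to more than two risk levels if `RISK_ORDER` is given additional entries.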
OETP exchanges data using JSON [RFC8259], which is a lightweight data-interchange format. A JSON-based application can be attacked in multiple ways, such as by sending data in an improper format or by embedding attack vectors in the data. It is important for any application using the JSON format to validate inputs before processing them. To mitigate this attack type, the JSON Key Profile is provided for OETP responses.¶
OETP Processors should be aware of the potential for spoofing attacks where the attacker publishes an OETP disclosure with the OETP.snapshot value from another product, or perhaps with an outdated OETP.snapshot.label element. For example, an OETP Processor could suppress the display of falsified entries by comparing the snapshot integrity from the submission database with a hash calculated for the OETP.snapshot object. In that situation, the OETP Processor might also take steps to determine whether the disclosures originated from the same publisher, investigate the Disclosure Feed further, and alert the downstream OETP Processors.¶
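The comparison described above could be sketched as follows. The serialization used before hashing (sorted keys, compact separators) is an assumption, since the draft does not define one, and the function names `snapshot_hash` and `is_authentic` are hypothetical. The recorded value is assumed to be the hexadecimal SHA3-512 digest held in the submission database.

```python
import hashlib
import hmac
import json


def snapshot_hash(disclosure: dict) -> str:
    """Recalculate the SHA3-512 hash of the OETP.snapshot object."""
    snapshot = disclosure["OETP"]["snapshot"]
    # Assumption: sorted keys and compact separators as the canonical form.
    payload = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha3_512(payload.encode("utf-8")).hexdigest()


def is_authentic(disclosure: dict, recorded_integrity: str) -> bool:
    """True when the published snapshot matches the database record."""
    calculated = snapshot_hash(disclosure)
    # compare_digest runs in constant time with respect to the contents.
    return hmac.compare_digest(calculated, recorded_integrity.lower())
```

A Processor could hide the Label and flag the entry for review whenever `is_authentic` returns `False`.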
Dishonest or falsified Disclosures are a problem that is hard to address generally. The approach to it is public control and systematic checks. Vendors or user-facing applications and services could further raise the level of trust in their Disclosures by implementing programmatic control scoring mechanisms, as well as external verification by trusted Auditors.¶
Disclosures MAY be resolved using their URIs. To enable this, the oetp:// URI scheme should be registered with IANA.¶
The following topics not addressed in this version of the document are possible areas for the future study:¶
Extensibility of the OETP Disclosure Format.¶
Evaluation of other methods of generating the Composite Disclosure based on the Disclosure Tree.¶
Disclosure Chaining mechanisms and various use-cases.¶
Typical scenarios and templates for Disclosure submissions.¶
Mapping of the regulatory requirements and future Disclosure elements.¶
Standardizing Privacy Disclosure and PII data-collection practices.¶
Enhancing Label accessibility with ARIA W3C Recommendation and other approaches.¶
Use of the OETP Disclosure in the ADM explainability (XAI).¶
Disclosure formats for families of "Generative AI" technologies such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Conditional Variational Autoencoders (CVAEs), Attention Mechanisms, Transformer-based Models.¶
Part of this work related to Verification and Validation of Disclosure and Disclosure Chaining was supported by the H2020 Programme of the European Commission under Article 15 of Grant Agreement No. 951972 StandICT.eu 2023.¶
The Open Ethics community and expert volunteers contributed with their valuable feedback, discussions, and comments. Thank you Ashley Duque Kienzle, Angela Kim, Ioannis Zempekakis, Karl Müdespacher, Ida Varošanec, Claudia Del Pozo, Joerg Buss, Mariia Kriuchok, Minhaaj Rehman, Oleksii Molchanovskyi, Roberta Barone, Phil Volkofsky and others.¶