This document describes the Algorithm Management Toolkit (AMT) Reporting Standard.
For reproducibility, governance, auditing and sharing of algorithmic systems it is essential to have a reporting
standard so that information about an algorithmic system can be shared. This reporting standard describes how
information about the different phases of an algorithm's life cycle can be reported. It contains, among other things,
descriptive information combined with information about the technical tests and assessments applied.
Disclaimer
The AMT Reporting Standard is a work in progress. The current standard is probably suboptimal and will
change significantly in future versions.
Compared with the Hugging Face metadata specification, this standard adds the following:
More fine-grained information on performance metrics, by extending the metrics_field from the Hugging
Face metadata specification.
Capturing additional measurements on fairness and bias, which can be partitioned into bar plot like
measurements (such as mean absolute SHAP values) and graph plot like measurements (such as partial dependence). This
is achieved by defining a new field measurements.
Capturing assessments (such as IAMA and ALTAI). This is achieved by defining a new field assessments.
Following Hugging Face, this proposed standard will be written in YAML.
This standard does not contain all fields present in the Hugging Face metadata specification. The fields that are
optional in the Hugging Face specification and are specific to the Hugging Face interface are omitted.
Another difference is that we divide our implementation into three separate parts.
system_card, containing information about a group of ML-models which accomplish a specific task.
model_card, containing information about a specific data science model.
assessment_card, containing information about a regulatory assessment.
Include statements
These model_cards and assessment_cards can be included verbatim in a system_card or referenced with an
!include statement. This lets minimal cards stay compact in a single file, while extensive cards can be split up for
readability and maintainability. The standard allows !include to be used anywhere.
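As a minimal sketch of the two options, the fragment below shows a system_card that embeds one model card verbatim and
references a second model card and an assessment card through !include. All names, URLs and file paths are hypothetical,
the license value is assumed to be on the license list, and how !include resolves paths depends on the YAML loader in use.

```yaml
# system_card.yaml (illustrative; all names and paths are hypothetical)
schema_version: "0.1a2"
name: "Example system"
models:
  # Option 1: a minimal model card embedded verbatim.
  - name: "Embedded example model"
    license:
      license_name: "mit"  # assumed to be on the license list
    model_index:
      - name: "embedded-example-model"
        model: "https://example.com/models/embedded-example-model"
  # Option 2: a more extensive model card kept in its own file.
  - !include cards/model_card_example.yaml
assessments:
  - !include cards/assessment_card_example.yaml
```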
Specification of the standard
The standard will be written in YAML. Example YAML files are given in the next section. The standard defines three
cards: a system_card, a model_card and an assessment_card. A system_card contains information about an
algorithmic system. It can have multiple models and each of these models should have a model_card. Regulatory
assessments can be recorded in an assessment_card. Note that model_cards and assessment_cards can be included
directly in the system_card or as separate YAML files with the help of a YAML include mechanism. For
clarity the latter is preferred and is also used in the examples in the next section.
system_card
A system_card contains the following information.
schema_version (REQUIRED, string). Version of the schema used, for example "0.1a2".
provenance (OPTIONAL). In case this System Card is generated from another source file, this field can capture the
historical context of the contents of this System Card.
git_commit_hash (OPTIONAL, string). Git commit hash of the commit which contains the transformation file used
to create this card.
timestamp (OPTIONAL, string). A timestamp of the date, time and timezone of generation of this System Card.
The timestamp should be given, preferably in UTC (represented as Z), in
ISO 8601 format, e.g. 2024-04-16T16:48:14Z.
uri (OPTIONAL, string). URI to the tool that was used to perform the transformations.
author (OPTIONAL, string). Name of person that initiated the transformations.
name (OPTIONAL, string). Name used to describe the system.
instruments (OPTIONAL, list). List of instruments from
the task registry that must be executed to fill this system card.
There can be multiple instruments. For each instrument the following fields are present.
urn (REQUIRED, string). Uniform Resource Name (URN) of the instrument. It is required if an instrument object
is added.
version (OPTIONAL, string). The version of the instrument.
required (OPTIONAL, boolean). Specifies whether the instrument is required to be executed or not.
upl (OPTIONAL, string). If this algorithm is part of a product offered by the Dutch Government,
this field should contain a URI from the
Uniform Product List.
owners (OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
organization (OPTIONAL, string). Name of the organization that owns the model. If ion is NOT provided this
field is REQUIRED.
name (OPTIONAL, string). Name of a contact person within the organization.
email (OPTIONAL, string). Email address of the contact person or organization.
role (OPTIONAL, string). Role of the contact person. This field should only be set when the name field is
set.
description (OPTIONAL, string). A short description of the system.
ai_act_profile (OPTIONAL). Information about the system in relation to the EU AI Act.
The contents of this field can be retrieved by traversing the AI Act decision tree or can be specified by the user.
type (REQUIRED, enum[string]). The type of the system should be chosen from:
["AI-systeem", "AI-systeem voor algemene doeleinden", "AI-model voor algemene doeleinden", "geen algoritme",
"impactvol algoritme", "niet-impactvol algoritme"].
open_source (OPTIONAL, enum[string]). Whether the system is open source or not.
Options are ["open-source", "geen open-source"].
risk_group (OPTIONAL, enum[string]). The publication category of the system should be chosen from:
["hoog-risico AI", "geen hoog-risico AI", "verboden AI", "uitzondering van toepassing"].
systemic_risk (OPTIONAL, enum[string]). Whether the AI model is classified as having systemic risk.
Options are ["systeemrisico", "geen systeemrisico"].
transparency_obligations (OPTIONAL, enum[string]). Whether the system faces transparency obligations.
Options are ["transparantieverplichting", "geen transparantieverplichting"].
role (OPTIONAL, enum[string]). The organization’s role in relation to the system.
Options are (select multiple) ["aanbieder", "gebruiksverantwoordelijke", "distributeur", "importeur"].
conformity_assessment_body (OPTIONAL, enum[string]). Whether the conformity assessment should be conducted by a
conformity assessment body. Options are ["beoordeling door derde partij"].
decision_tree (OPTIONAL). This field is REQUIRED if the above fields are retrieved by traversing the decision tree.
version (REQUIRED, string). The version of the decision tree.
path (REQUIRED). The traversed path through the decision tree.
question (REQUIRED, string). The question id of the question.
answer (REQUIRED, string). The answer to the question.
requirements (OPTIONAL, list). To store the applicable requirements.
urn (REQUIRED, string). The URN of the requirement (from Algoritmekader).
state (REQUIRED, string). The state of the requirement.
version (OPTIONAL, string). The version of the Algoritmekader.
measures (OPTIONAL, list). To store the applicable measures.
urn (REQUIRED, string). The URN of the measure (from Algoritmekader).
state (REQUIRED, string). The state of the measure.
value (REQUIRED, string). Description of how the measure is implemented.
version (OPTIONAL, string). The version of the Algoritmekader.
accountable_persons (OPTIONAL, list). The persons who are accountable for the implementation of this measure.
name (REQUIRED, string). The name of the person.
uuid (REQUIRED, string). The uuid of the person.
responsible_persons (OPTIONAL, list). The persons responsible for the execution of this measure.
name (REQUIRED, string). The name of the person.
uuid (REQUIRED, string). The uuid of the person.
reviewer_persons (OPTIONAL, list). The persons who review how the responsible persons execute this measure.
name (REQUIRED, string). The name of the person.
uuid (REQUIRED, string). The uuid of the person.
labels (OPTIONAL, list). This field stores meta information about a system. There can be multiple labels.
For each label the following fields are present.
name (REQUIRED, string). Name of the label.
value (OPTIONAL, string). Value of the label.
status (OPTIONAL, string). The status of the system. For example the status can be "production".
begin_date (OPTIONAL, string). The first date the system was used. Date should be given in
ISO 8601 format, i.e. YYYY-MM-DD.
end_date (OPTIONAL, string). The last date the system was used. Date should be given in
ISO 8601 format, i.e. YYYY-MM-DD.
goal_and_impact (OPTIONAL, string). The purpose of the system and the impact it has on citizens and companies.
considerations (OPTIONAL, string). The pros and cons of using the system.
risk_management (OPTIONAL, string). Description of the risks associated with the system.
human_intervention (OPTIONAL, string). A description of the extent to which humans are involved in the system.
legal_base (OPTIONAL, list). If there exists a legal base for the process the system is embedded in, this field
can be filled in with the relevant laws. There can be multiple legal bases. For each legal base the following fields
are present.
name (REQUIRED, string). Name of the law.
link (OPTIONAL, string). URI pointing towards the contents of the law.
used_data (OPTIONAL, string). An overview of the data that is used in the system.
technical_design (OPTIONAL, string). Description of how the system works.
external_providers (OPTIONAL, list). If relevant, this field stores information on external providers.
There can be multiple external providers.
name (REQUIRED, string). Name of the external provider.
version (OPTIONAL, string). Version of the external provider reflecting its relation to previous versions.
references (OPTIONAL, list). This field stores references to the system.
name (REQUIRED, string). Name of the reference.
link (OPTIONAL, string). Link or URI to the reference.
interaction_details (OPTIONAL, list[string]). Explain how the AI system interacts with hardware or software,
including other AI systems, or how the AI system can be used to interact with hardware or software.
version_requirements (OPTIONAL, list[string]). Describe the versions of the relevant software or firmware, and any
requirements related to version updates.
deployment_variants (OPTIONAL, list[string]). Description of all the forms in which the AI system is placed on the
market or put into service, such as software packages embedded into hardware, downloads, or APIs.
hardware_requirements (OPTIONAL, list[string]). Provide a description of the hardware on which the AI system must
be run.
product_markings (OPTIONAL, list[string]). If the AI system is a component of products, describe the photos or
illustrations that show the external features, markings, and internal layout of those products.
user_interface (OPTIONAL, list). Provide information on the user interface provided to the user responsible for
its operation.
description (OPTIONAL, string). A description of the provided user interface.
link (OPTIONAL, string). A link to the user interface can be included.
snapshot (OPTIONAL, string). A snapshot/screenshot of the user interface can be included with the use of a
hyperlink.
models (OPTIONAL, list[ModelCard]). A list of model cards (as defined below) or !includes of a YAML file
containing a model card. This model card can for example be a model card described in the next section or a model
card from Hugging Face. There can be multiple model cards, meaning multiple models are used.
assessments (OPTIONAL, list[AssessmentCard]). A list of assessment cards (as defined below) or !includes of a
YAML file containing an assessment card. This assessment card is an assessment card described in the next section.
There can be multiple assessment cards, meaning multiple assessments were performed.
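The field list above can be illustrated with a small system_card. This is a sketch rather than a normative example:
every value, URN, person and path is made up, the state values are hypothetical (the standard only requires a string),
and the nesting follows the order in which the fields are listed above, which is an assumption wherever that listing
is ambiguous.

```yaml
# Illustrative system_card; every value below is made up.
schema_version: "0.1a2"
provenance:
  git_commit_hash: "0123456789abcdef0123456789abcdef01234567"
  timestamp: "2024-04-16T16:48:14Z"  # ISO 8601, UTC
  uri: "https://example.com/transformation-tool"
  author: "Jane Doe"
name: "Example permit triage system"
instruments:
  - urn: "urn:example:instrument:impact-assessment"  # hypothetical URN
    version: "1.0"
    required: true
owners:
  - organization: "Example Municipality"
    name: "Jane Doe"
    email: "jane.doe@example.com"
    role: "Product owner"  # only set because name is set
description: "Ranks incoming permit applications for manual review."
ai_act_profile:
  type: "AI-systeem"
  open_source: "open-source"
  risk_group: "geen hoog-risico AI"
  transparency_obligations: "geen transparantieverplichting"
requirements:
  - urn: "urn:example:requirement:documentation"  # hypothetical URN
    state: "to do"  # hypothetical state value
    version: "1.0"
measures:
  - urn: "urn:example:measure:logging"  # hypothetical URN
    state: "done"  # hypothetical state value
    value: "Predictions are logged together with the model version."
    version: "1.0"
    responsible_persons:
      - name: "John Smith"
        uuid: "123e4567-e89b-12d3-a456-426614174000"
labels:
  - name: "department"
    value: "permits"
status: "production"
begin_date: "2024-01-15"  # ISO 8601, YYYY-MM-DD
goal_and_impact: "Shorter waiting times for applicants with simple requests."
considerations: "Faster handling versus the risk of uneven review quality."
risk_management: "Rankings are sampled and checked by a reviewer every month."
human_intervention: "A civil servant takes the final decision on every application."
legal_base:
  - name: "Example Permit Act"
    link: "https://example.com/laws/permit-act"
used_data: "Application forms and historical decisions."
technical_design: "A gradient boosted classifier scores each application."
external_providers:
  - name: "Example Vendor"
    version: "2.3"
references:
  - name: "Project documentation"
    link: "https://example.com/docs/permit-triage"
hardware_requirements:
  - "A single CPU server with 8 GB of memory."
user_interface:
  - description: "Web dashboard listing the ranked applications."
    link: "https://example.com/dashboard"
models:
  - !include cards/model_card_example.yaml  # hypothetical path
assessments:
  - !include cards/assessment_card_example.yaml  # hypothetical path
```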
model_card
A model_card contains the following information.
provenance (OPTIONAL). In case this Model Card is generated from another source file, this field can capture the
historical context of the contents of this Model Card.
git_commit_hash (OPTIONAL, string). Git commit hash of the commit which contains the transformation file used
to create this card.
timestamp (OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Model Card.
The timestamp should be given, preferably in UTC (represented as Z), in
ISO 8601 format, e.g. 2024-04-16T16:48:14Z.
uri (OPTIONAL, string). URI to the tool that was used to perform the transformations.
author (OPTIONAL, string). Name of person that initiated the transformations.
name (OPTIONAL, string). Name used to describe the model.
language (OPTIONAL, list[string]). If relevant, the natural languages the model supports, given as
ISO 639 codes. There can be multiple languages.
license (REQUIRED).
license_name (REQUIRED, string). Any license from the
open source license list [1]. If the license is NOT present in the license list
this field must be set to 'other' and the following two fields will be REQUIRED.
license_link (OPTIONAL, string). A link to a file of that name inside the repo, or a URL to a remote file
containing the license contents.
tags (OPTIONAL, list[string]). Tags with keywords to describe the project. There can be multiple tags.
owners (OPTIONAL, list). There can be multiple owners. For each owner the following fields are present.
organization (OPTIONAL, string). Name of the organization that owns the model. If ion is NOT provided this
field is REQUIRED.
name (OPTIONAL, string). Name of a contact person within the organization.
email (OPTIONAL, string). Email address of the contact person or organization.
role (OPTIONAL, string). Role of the contact person. This field should only be set when the name field is
set.
model_index (REQUIRED, list). There can be multiple models. For each model the following fields are present.
name (REQUIRED, string). The name of the model.
model (REQUIRED, string). A URI pointing to a repository containing the model file.
artifacts (OPTIONAL, list). A list of artifacts.
uri (OPTIONAL, string). URI referring to a relevant model artifact.
content-type (OPTIONAL, string). Optional type, following the
Content-Type convention. Recognized values are
"application/onnx", to refer to an ONNX representation of the model.
md5-checksum (OPTIONAL, string). Optional checksum for the content of the file.
parameters (OPTIONAL, list). There can be multiple parameters. For each parameter the following fields are
present.
name (REQUIRED, string). The name of the parameter, for example "epochs".
dtype (OPTIONAL, string). The datatype of the parameter, for example "int".
value (OPTIONAL, string). The value of the parameter, for example 100.
labels (OPTIONAL, list). This field stores meta information about a parameter. There can be
multiple labels. For each label the following fields are present.
name (OPTIONAL, string). The name of the label.
dtype (OPTIONAL, string). The datatype of the feature. If name is set, this field is REQUIRED.
value (OPTIONAL, string). The value of the feature. If name is set, this field is REQUIRED.
results (OPTIONAL, list). There can be multiple results. For each result the following fields are present.
task (OPTIONAL, list).
task_type (REQUIRED, string). The task of the model, for example "object-classification".
task_name (OPTIONAL, string). A pretty name for the model task, for example "Object Classification".
datasets (OPTIONAL, list). There can be multiple datasets [2]. For each dataset the following fields are
present.
type (REQUIRED, string). The type of the dataset, can be a dataset id from
Hugging Face datasets or any other link to a repository containing the
dataset [3], for example "common_voice".
name (REQUIRED, string). A pretty name for the dataset, for example "Common Voice (French)".
split (OPTIONAL, string). The split of the dataset, for example "train".
features (OPTIONAL, list[string]). List of feature names.
revision (OPTIONAL, string). Version of the dataset, for example
"5503434".
metrics (OPTIONAL, list). There can be multiple metrics. For each metric the following fields are present.
type (REQUIRED, string). A metric-id from Hugging Face metrics [4], for
example accuracy.
name (REQUIRED, string). A descriptive name of the metric. For example "false positive rate" is not a
descriptive name, but "training false positive rate w.r.t class x" is.
dtype (REQUIRED, string). The data type of the metric, for example float.
value (REQUIRED, string). The value of the metric.
labels (OPTIONAL, list). This field stores meta information about a metric. For example,
metrics can be computed on subgroups of specific features: one can compute the
accuracy for examples where the feature "gender" is set to "male". There can be multiple subgroups, which
means that the metric is computed on the intersection of those subgroups. There can be multiple labels.
For each label the following fields are present.
name (OPTIONAL, string). The name of the feature. For example: "gender".
type (OPTIONAL, string). The type of the label. Can for example be set to "feature" or
"output_class". If name is set, this field is REQUIRED.
dtype (OPTIONAL, string). The datatype of the feature, for example float. If name is set, this
field is REQUIRED.
value (OPTIONAL, string). The value of the feature. If name is set, this field is REQUIRED. For
example: "male".
measurements.
bar_plots (OPTIONAL, list). The purpose of this field is to capture bar plot like measurements, for
example SHAP values. There can be multiple bar plots. For each bar plot the following fields are
present.
type (REQUIRED, string). The type of bar plot, for example "SHAP".
name (OPTIONAL, string). A pretty name for the plot, for example "Mean Absolute SHAP Values".
results (REQUIRED, list). The contents of the bar plot. A result represents a bar. There can be
multiple results. For each result the following fields are present.
name (REQUIRED, string). The name of the bar.
value (REQUIRED, float). The value of the corresponding bar.
graph_plots (OPTIONAL, list). The purpose of this field is to capture graph plot like measurements,
such as partial dependence plots. There can be multiple graph plots. For each graph plot the following
fields are present.
type (REQUIRED, string). The type of the graph plot, for example "partial_dependence".
name (OPTIONAL, string). A pretty name of the graph, for example "Partial Dependence Plot".
results (REQUIRED, list). Results contains the graph plot data. Each graph can depend on a specific
output class and feature. There can be multiple results. For each result the following fields are
present.
class (OPTIONAL, string/int/float/bool). The output class name that the graph corresponds to.
This field is not always present.
feature (REQUIRED, string). The feature the graph corresponds to. This is required, since all
relevant graphs are dependent on features.
data (REQUIRED, list)
x_value (REQUIRED, float). The x-value of the graph.
y_value (REQUIRED, float). The y-value of the graph.
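A small model_card can make the nesting of model_index, results, metrics and measurements easier to follow. The
fragment below is a sketch under assumptions: all names, URLs and numbers are invented, the license value is assumed
to be on the license list, and the nesting (results inside a model_index entry, measurements inside a result) is
inferred from the order of the field list above.

```yaml
# Illustrative model_card; every value below is made up.
name: "Example credit model"
language:
  - "nl"  # ISO 639 code
license:
  license_name: "mit"  # assumed to be on the license list
tags:
  - "tabular-classification"
owners:
  - organization: "Example Municipality"
model_index:
  - name: "example-credit-model"
    model: "https://example.com/models/example-credit-model"
    artifacts:
      - uri: "https://example.com/models/example-credit-model/model.onnx"
        content-type: "application/onnx"
    parameters:
      - name: "epochs"
        dtype: "int"
        value: "100"
    results:
      - task:
          - task_type: "tabular-classification"
            task_name: "Tabular Classification"
        datasets:
          - type: "https://example.com/datasets/credit"  # hypothetical link
            name: "Example Credit Data"
            split: "test"
            features:
              - "gender"
              - "age"
        metrics:
          - type: "accuracy"
            name: "test accuracy for gender male"
            dtype: "float"
            value: "0.91"
            labels:
              # The metric is computed on the subgroup gender == "male".
              - name: "gender"
                type: "feature"
                dtype: "string"
                value: "male"
        measurements:
          bar_plots:
            - type: "SHAP"
              name: "Mean Absolute SHAP Values"
              results:  # one entry per bar
                - name: "age"
                  value: 0.42
                - name: "gender"
                  value: 0.07
          graph_plots:
            - type: "partial_dependence"
              name: "Partial Dependence Plot"
              results:  # one entry per (class, feature) graph
                - class: "approved"
                  feature: "age"
                  data:
                    - x_value: 18.0
                      y_value: 0.21
                    - x_value: 65.0
                      y_value: 0.35
```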
assessment_card
An assessment_card contains the following information.
provenance (OPTIONAL). In case this Assessment Card is generated from another source file, this field can capture
the historical context of the contents of this Assessment Card.
git_commit_hash (OPTIONAL, string). Git commit hash of the commit which contains the transformation file used
to create this card.
timestamp (OPTIONAL, string). A timestamp of the date, time and timezone of generation of this Assessment Card.
The timestamp should be given, preferably in UTC (represented as Z), in
ISO 8601 format, e.g. 2024-04-16T16:48:14Z.
uri (OPTIONAL, string). URI to the tool that was used to perform the transformations.
author (OPTIONAL, string). Name of person that initiated the transformations.
name (REQUIRED, string). The name of the assessment.
urn (OPTIONAL, string). A Uniform Resource Name (URN) of the instrument in the task registry.
date (REQUIRED, string). The date at which the assessment is completed. Date should be given in
ISO 8601 format, i.e. YYYY-MM-DD.
contents (REQUIRED, list). There can be multiple items in contents. For each item the following fields are present:
question (REQUIRED, string). A question.
urn (OPTIONAL, string). A Uniform Resource Name (URN) of the corresponding task in the task registry.
answer (REQUIRED, string). An answer.
remarks (OPTIONAL, string). A field to put relevant discussion remarks in.
authors (OPTIONAL, list). There can be multiple authors. For each author the following field is present.
name (OPTIONAL, string). The name of the author of the question.
timestamp (OPTIONAL, string). A timestamp of the date, time and timezone of the answer. The timestamp should be
given, preferably in UTC (represented as Z), in
ISO 8601 format, e.g. 2024-04-16T16:48:14Z.
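For illustration, an assessment_card with a single question could look like the sketch below. The assessment name,
URNs, question text and people are all hypothetical, and the nesting of authors and timestamp under each contents item
is inferred from the order of the field list above.

```yaml
# Illustrative assessment_card; every value below is made up.
provenance:
  timestamp: "2024-04-16T16:48:14Z"  # ISO 8601, UTC
  author: "Jane Doe"
name: "Example impact assessment"
urn: "urn:example:instrument:impact-assessment"  # hypothetical URN
date: "2024-04-16"  # ISO 8601, YYYY-MM-DD
contents:
  - question: "What is the purpose of the algorithm?"
    urn: "urn:example:instrument:impact-assessment:task:1"  # hypothetical URN
    answer: "Ranking permit applications for manual review."
    remarks: "Answer agreed on in the project team meeting."
    authors:
      - name: "Jane Doe"
    timestamp: "2024-04-16T16:48:14Z"
```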
[2] Deviation from the Hugging Face specification is in the model_index:results:dataset field. Hugging Face only
accepts one dataset, while we accept a list of datasets.
[3] Deviation from the Hugging Face specification is in the Dataset Type field. Hugging Face only accepts dataset ids
from Hugging Face datasets while we also allow for any URL pointing to the dataset.
[4] For this extension to work, relevant metrics (such as false positive rate) have to be added to the
Hugging Face metrics. Possibly this can be done in our organizational namespace.