1. Executive summary
-Digital identities have been in development for decades. As governments increasingly consider becoming providers and consumers of these technologies, they more than ever have the potential to change the Web and the concept of identity as we know it.
-Given the scope and scale of this innovation, digital identities are significantly impacting the Web and, in particular, privacy, altering the assumptions and the balance that have shaped its ecosystem.
-This document further develops the concepts described in "Identity on the Web" at W3C’s Member Meeting of April 2024 [identity-on-the-web]. It reviews the intersections of digital identities through their societal, ethical, and technical impacts and highlights several areas where standardization, guidelines, and interoperability could help manage these changes:
- Enabling passwordless credentials for authentication and payments
- Enabling federated identity in the Web platform without third-party cookies
- Modeling security, privacy, and human rights threats of decentralized credentials
- Mitigating the threats at technological and governance levels
Through exploratory thinking, the following understandings emerge:
- Standards can help, as they have in the past, to drive innovation while mitigating threats and to enable technical progress while having a positive impact on the world
- The technology stack is composite and broad, and needs to be coordinated across standards and across Standards Development Organizations (SDOs)
- People, SDOs and governments are the key actors who need to collaborate to ensure that digital credentials/identities solve more problems than they create, because identity is not only technology, but also governance
- It is crucial to pay close attention to the impact on security, privacy, and human rights in general, and the proposed method of analysis is threat modeling
We seek input from the community on proposals that could help progress on these topics and other topics that this document may contribute to identifying.
-2. Introduction
-Digital Identities have been in development for decades, and at this moment in history, they are about to be implemented government-wide. They can change the Web and the concept of Identity as we know it. There are many opportunities but also threats to society and the Web.
-2.1. Terminology
-The concept of identity is very broad and covers psychology, social sciences, mathematics, and logic. There is no agreed-upon definition of all the terminology. Let us start with a set of definitions to have a common ground in this paper.
When we think about identity, we often think about our identity as individuals. It is inherent, although we tend to give a different meaning to our identity according to our culture, from the Western "Cogito ergo sum" (I think therefore I am) [discourse-on-the-method] to the African "Ubuntu" (I am because you are) [what-does-ubuntu-really-mean] or the Eastern "tat tvam asi" (that thou art), which expresses two notions: man’s real self (ātman) and the Cosmic Self (brahman) [a-dictionary-of-hinduism].
-Analyzing the etymology, the term identity comes from the Latin root “idem”, which means “the same” [oxford-etymology-identity]. From the Cambridge Dictionary, we can say it is “the fact of being, or feeling that you are, a particular type of person, organization, etc.; the qualities that make a person, organization, etc. different from others” [cambridge-dictionary-identity].
Looking more closely at the Information Technology (IT) domain, ISO/IEC 24760-1:2019 [ISO-IEC-24760-1] defines identity as “a set of attributes related to an entity”, where the entity is something "that has recognizably distinct existence" and that can be "logical or physical", such as "a person, an organization, a device, a group of such items, a human subscriber to a telecom service, a SIM card, a passport, a network interface card, a software application, a service or a website". These attributes are “characteristics or properties” such as “an entity type, address information, telephone number, a privilege, a MAC address, a domain name”. To complete the definition of entity and identifiers, it is important to note that they always refer to a domain of applicability, the specific context where they can be used (e.g., an organization, a country, a university).
Thus, a particularly important point is clear: identities do not belong only to people, individuals, or human beings. We can also have identities for organizations and pets, as well as Non-Human Identities (NHIs). NHIs are the accounts widely used by “devices, services, and servers” in Networking, Cloud, and Workloads [the-evolving-landscape-of-non-human-identity].
-Now, an important logical step. To claim our identities, we present credentials, whether in the physical or digital world. Just as we do not have a one-size-fits-all definition of identity, we also do not have a one-size-fits-all definition of credential, as it changes according to context. Starting with the definition from the Cambridge Dictionary, a (digital) credential is “a piece of information that is sent from one computer to another to check that a user is who they claim to be or to allow someone to see information” [cambridge-dictionary-identity]. While high-level, this definition considers two important aspects: on the one hand, the credential is used to prove our claims, such as who we are, and on the other hand, it can be used to gain access to information:
- The ISO/IEC 24760-1 definition is very close to the last aspect from the dictionary, where a credential is a “representation of an identity for use in authentication” [ISO-IEC-24760-1].
- The Identification for Development (ID4D) definition is close to the first aspect: “any document, object, or data structure that vouches for a person’s identity through some method of trust and authentication” [types-of-credentials-and-authenticators].
- The NIST SP 800-63-3 definition echoes the first aspect, “an object or data structure that authoritatively binds an identity—via an identifier or identifiers—and (optionally) additional attributes to at least one authenticator” [NIST-SP-800-63-3]. It adds the important concept of binding an identity to its attributes (recalling ISO’s definition of identity) and of using identifiers.
- The W3C Verifiable Credentials Data Model (VCDM) definition states, “a set of one or more claims made by an issuer” [vc-data-model-2.0]. On the one hand, this definition seems similar to NIST’s. However, it is framed in the decentralized rather than the federated model (which we will analyze shortly), and is thus closer to ISO’s definition of identity, with ISO’s attributes mapping to VCDM claims.
Note: Therefore, we will refer to the specific definition of credential in the various sections of the document according to the context.
-These definitions introduced important concepts such as identifiers, authentication, and trust that are good to clarify.
-Identifiers are pieces of information used to uniquely refer to an entity within a specific context. According to the W3C Decentralized Identifiers, there are various types of identifiers: “communication addresses (telephone numbers, email addresses, usernames on social media), ID numbers (for passports, driver’s licenses, tax IDs, health insurance), and product identifiers (serial numbers, barcodes, RFIDs). URIs (Uniform Resource Identifiers) are used for resources on the Web, and each web page you view in a browser has a globally unique URL (Uniform Resource Locator)” [did-core].
-Note: Although entity, identity, and identifier are related, they are distinct: Identity refers to the essence of who or what an entity is, while an identifier is a specific piece of information used to recognize and refer to that entity uniquely.
-Let us then try to understand the authentication process and how it differs from identification, verification, and authorization:
- Identification is recognizing an entity through the information it provides. For example, we enter our first name, surname, and email address in a social network (and there are different levels of proofing of our real identity).
- Verification allows us to confirm that the presented information is valid through further testing. Verification is a generic process that can take different forms and have different effects. For example, we often receive an email with a confirmation link to verify an email address. We confirm that the email address is under our control by clicking on it. This type of verification demonstrates control over the identifier. Verifying identity information online, such as a specific name and surname, is more complex. When verifying identity information, we use the term identity verification.
- Authentication is a specific, formal verification type that aims to grant access to a resource, service, or information. This process usually involves verifying control of our identifier with something we know (e.g., a password), something we have (e.g., a hardware token), or something we are (e.g., a biometric characteristic). For instance, similar to the email example, we demonstrate control over a username (the identifier of our identity) by entering the corresponding password.
- Authorization is another key process that follows authentication. It verifies whether our authenticated identity has the necessary permissions to access a particular resource. This step ensures that we are only granted access to resources we can use even after confirming our identity.
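The difference between authentication and authorization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `users` dictionary, the function names, and the permission strings are hypothetical, and PBKDF2 is used simply because it ships with the standard library.

```python
import hashlib
import hmac
import os


def _hash_password(password: str, salt: bytes) -> bytes:
    # Derive a password hash with PBKDF2-HMAC-SHA256 (standard library).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def register(users: dict, username: str, password: str, permissions: set) -> None:
    # Hypothetical store: identifier -> (salt, password hash, permissions).
    salt = os.urandom(16)
    users[username] = (salt, _hash_password(password, salt), permissions)


def authenticate(users: dict, username: str, password: str) -> bool:
    """Authentication: prove control of the identifier with something we know."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash, _ = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored_hash, _hash_password(password, salt))


def authorize(users: dict, username: str, permission: str) -> bool:
    """Authorization: check what an already authenticated identity may do."""
    _, _, permissions = users[username]
    return permission in permissions
```

A caller would first run `authenticate` and, only if it succeeds, ask `authorize` whether the requested resource is permitted; a correct password never implies access to every resource.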
Let us see how these concepts can be applied to physical credentials. When we present our passport to cross the border, here is an example of the processes that might be carried out:
- Identification: We present our passport to the border control officer, claiming our identity through our credential and its identifier (the passport ID).
- Verification: The border control officer verifies that the passport is genuine, not tampered with, not expired, and issued by a recognized government.
- Authentication: In this context, the authentication involves verifying that the person presenting the passport is the rightful holder. This might include checking biometric data stored in the passport against the person’s actual biometrics (e.g., fingerprints or facial recognition).
- Authorization: Finally, authorization is the process where border control determines whether the authenticated individual has permission to enter the country. This decision is based on various factors, including visa validity, passport not expiring in six months or less, and confirmation that the individual is not on any watchlists, unwanted lists, or other checks.
When we use digital credentials on the Internet instead, the issue is more challenging, as illustrated by Peter Steiner’s celebrated cartoon published in the New Yorker in 1993: "On the Internet, nobody knows you are a dog" [nobody-knows-you-re-a-dog]. Historically, digital credentials have taken various forms, such as:
- The usernames and passwords we use to log in to our favorite social network and communicate with friends.
- The same usernames and passwords from our favorite social network, but used to authenticate on an e-commerce website and make a purchase.
- A digital driver’s license in our digital wallet.
Note: The last form of credential, as defined by W3C, has a wider range of use cases than just authentication. One important clarification: it may make sense to use a driver’s license to authenticate only on the issuer’s systems (e.g., it is good to authenticate ourselves on government websites but not on our personal email provider). Furthermore, additional information (claims) on the driver’s license, such as date of birth and, in some cases, home address verified by a trusted entity such as a government, enables interesting use cases.
-Therefore, it’s important to remember that we can have digital credentials that are not identity documents, such as diplomas, which, in this case, are issued by universities. Several projects exist, such as the Digital Credential Consortium (DCC) and Blockcerts, which are committed to building an infrastructure for academic digital credentials.
-We introduce the last topic with the example of credentials that universities can issue. In addition to degree certificates, universities usually have student ID cards containing information such as first name, last name, and photo.
-Why is it the case that of a driver’s license and a student ID card, both having the same attributes and being cryptographically verifiable, only the driver’s license is allowed to be used to open a bank account?
There are several aspects to consider, first of all the context and domain in which the credential lives. The key difference lies in the trust we place in the issuers of these credentials. Trust can be defined as "the belief that someone is good and honest and will not harm you, or that something is safe and reliable" [cambridge-dictionary-trust]. Essentially, trust is a choice we make; we choose to trust or not trust someone or something [OSSTMM-3], and often, it is not a binary question.
Cryptographic trust, such as verifying a credential, is not the same as human trust [self-sovereign-identity]. Cryptographic methods ensure that the credentials haven’t been tampered with. Human trust involves trusting the entity that issued the credential and trusting that the issuer provided the credential to the legitimate user.
-This is why we also need governance frameworks or trust frameworks. These frameworks include business, legal, and technical rules that help establish and maintain trust in the issuers of credentials.
This includes establishing levels of assurance (LOA). As an example, consider the Identity Assurance Levels (IAL) from NIST SP 800-63-3 [NIST-SP-800-63-3]:
- IAL1: No requirement to prove a specific real-life identity, e.g., identity can be self-asserted.
- IAL2: Remote or in-person identity proofing with supporting evidence is required.
- IAL3: Physical presence is required for identity proofing, with proper verification of evidence.
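One way to read these levels is as an ordered scale against which a service states its minimum requirement. The sketch below is illustrative only: the service names and the levels assigned to them are hypothetical and are not taken from NIST SP 800-63-3.

```python
from enum import IntEnum


class IAL(IntEnum):
    IAL1 = 1  # identity can be self-asserted
    IAL2 = 2  # remote or in-person proofing with supporting evidence
    IAL3 = 3  # physical presence required for proofing


# Hypothetical policy table: which level each service asks for (assumption).
REQUIRED_IAL = {
    "newsletter_signup": IAL.IAL1,
    "open_bank_account": IAL.IAL2,
}


def is_sufficient(credential_ial: IAL, service: str) -> bool:
    """A credential is acceptable if it was proofed at the required level or higher."""
    return credential_ial >= REQUIRED_IAL[service]
```

Under this hypothetical policy, a self-asserted identity (`IAL.IAL1`) is enough for the newsletter but not for the bank account.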
Having concluded this roundup of terminology, before we delve into the various digital identity management models that have come and gone over time and that we have used in the previous examples, let us try to understand why identities are so important.
-2.2. Why identity is important
-Human identities are a very special case, particularly those issued by governments. We know that they are not the only type and that the others are also important and have interesting business implications, but human ones have distinctive characteristics. Let us see why.
-2.2.1. Human rights
-Identity is a fundamental human right that underpins personal dignity and autonomy. Article 6 of the Universal Declaration of Human Rights states, "Everyone has the right to recognition everywhere as a person before the law" [UDHR].
-This principle is reinforced by Article 16 of the International Covenant on Civil and Political Rights, which states that "Everyone shall have the right to recognition everywhere as a person before the law" [ICCPR]. Although the term "identity" is not explicitly used, its concept is inherent in recognizing the identity as a person.
-2.2.2. Sustainable development goal
-Despite being a right, much work still needs to be done to provide identities for all the population.
-However, target 16.9 of the 2030 United Nations Sustainable Development Goals (SDGs) aims to achieve "legal identity for all, including birth registration" [SDGS-16].
-2.2.3. Identity for Development (ID4D)
-Achieving legal identity for all is a challenging goal on several fronts. In response, the World Bank has launched the ID4D initiative, aiming to "secure a unique legal identity and enable digital ID-based services for all by 2030" [ID4D-initiative].
-2.2.4. Opportunities and threats
-Digital identities and credentials are powerful business enablers and offer significant opportunities for individuals, governments, and organizations.
-They can guarantee other rights, such as the right to accessibility promoted by the Marrakesh Treaty [marrakesh-treaty], and to "empower refugees, stateless individuals, and forcibly displaced persons" [UNHCR-digital-identity].
These technologies can also be used on a humanitarian level. With regard to NHIs, the International Committee of the Red Cross (ICRC) investigated Digital Emblems [ADEM] to identify ICT assets protected under international law [digitalizing-report].
Note: However, like all innovations, these technologies can have downsides. To paraphrase Paul Watzlawick, these technologies must not become “ultra-solutions” where “operation successful, patient dead” [ultra-solutions].
So, the challenge is enabling this technological innovation while being aware of the threats to privacy, security, and human rights.
-Therefore, it is necessary to analyze the various threats to mitigate them at their root in designing and implementing these technologies and related standards.
-As an example, below is an initial analysis of threats to human rights (harms) concerning government-issued digital identities using Microsoft’s responsible innovation toolkit:
- Opportunity loss (discrimination): This complex issue spans multiple areas. Digital divide: if digital identities are required for access to public services and no alternatives are present, and if they depend on certain hardware, software, or stable connectivity, it can lead to discrimination against people who do not have access to these resources. In addition to discrimination within the same country, there is further discrimination if there is no “cross-border” interoperability between the technologies and implementations used by different governments.
- Economic loss (discrimination): The availability of digital identities and related credentials, which can contain a lot of information regarding wealth status, can be used to discriminate in access to credit. This can also be generalized, as was identified during a W3C breakout session, and relates to the Jevons paradox: the more information is available, the more likely it is that its collection, particularly in greedy data-driven contexts, is abused.
- Dignity loss (dehumanization): For example, if the vocabulary used does not correctly describe people’s characteristics, this can reduce or obscure people’s humanity and characteristics.
- Privacy loss (surveillance): If this technology is not designed and implemented properly, it can lead to surveillance by state and non-state actors such as government and private technology providers. For example, centralized or federated models are more prone to these threats, while decentralized models are less so, but it depends on how they are implemented. Therefore, it is necessary to provide privacy-preserving technologies and implement them properly.
Note: W3C might consider handling this issue with a Threat Model.
-2.3. Digital identity management models
-With these assumptions, before proceeding, it is important to understand how digital identities are managed and how they have evolved over the years.
-Let us start with the example of a person’s identity, and break it down. We had:
- Credentials of a social network that are used on the same site.
- Credentials of a social network that are used on a different site.
- Driver’s license within a digital wallet application.
These examples represent the evolutionary stages of Internet Identity described by Christopher Allen at the Internet Identity Workshop (IIW). From these developmental stages, the community agrees that there are currently three models of identity relationships [three-models-of-digital-identity-relationships]. Let us analyze them.
-2.3.1. Centralized identity model
-In the centralized identity model, also known as siloed or traditional, a single provider offers both the identity (and its credentials, typically a username and password) and the service. This older model was used in the early days of the Internet and the Web and is still used today.
-The centralized identity model is the typical scenario when the user logs in to a social network to use it, and the credentials here are used to authenticate.
Here is the Data Flow:
- Authentication: The user authenticates themselves with the centralized system using their credentials.
- Access granting: This system grants access to the resource.
Perspectives:
- Security: There are different issues. For the user: password reuse when a password is compromised, so the user should use different passwords for different providers; there are also Phishing and Man-in-The-Middle attacks. From the provider’s point of view, as the passwords are stored on their systems, they need to implement proper security measures to protect them at rest and during transport.
- Privacy: The centralized system can completely track the user.
- Standards: Standards intervene at different levels. In how credentials are exchanged and sent: historically, Basic Authentication [RFC1945], Digest Authentication [RFC2069] (and related updates), and HTML forms with input type=password, with Cookies for maintaining the session information on the client. Other standards increase the number of authentication factors, such as HOTP [RFC4226] and TOTP [RFC6238]. SSL/TLS [RFC2246] (and related updates) protects credentials in transport and traffic in general. Other standards protect the credentials at rest, such as the (now obsolete) MD5 [RFC1321] and other hashing algorithms by NIST.
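To illustrate how the additional-factor standards mentioned above work, the following is a minimal Python sketch of HOTP (RFC 4226) and TOTP (RFC 6238) with HMAC-SHA-1. The function names and the `at` parameter are choices made for this example; a deployment would also need secret provisioning, clock-drift windows, and rate limiting, which are not shown.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """One-time password from a shared secret and a counter (RFC 4226)."""
    # HMAC-SHA-1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)
```

With the test secret used in the RFCs, `hotp(b"12345678901234567890", 0)` yields `"755224"`, matching the first test vector in RFC 4226.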
To mitigate security threats, in particular the use of multiple passwords and phishing, FIDO Alliance created Passkeys, "a replacement for passwords that provide faster, easier, and more secure sign-ins to websites and apps across a user’s devices" [passkeys-101].
The W3C Web Authentication Working Group brought this technology to the Web Platform by standardizing Web Authentication Level 2 [webauthn-2], and is developing Level 3 [webauthn-3]. The same underlying technology can also make online transactions more secure by confirming payments, as with the "payment" extension for Secure Payment Confirmation [secure-payment-confirmation].
-2.3.2. Federated identity model
-In the federated identity model, also known as a third-party Identity Provider (IdP), the function of making available identity information is separated from the one which provides a service to the user - the Service Provider (SP) or Relying Party (RP) [ISO-IEC-24760-1].
-The federated identity model is the typical scenario when a user logs into a third-party site using a social network’s "Sign in with..." feature or through Single Sign-On (SSO) in enterprise environments.
-This model allows users to utilize a single Identity Provider (IdP) to authenticate and access multiple Service Providers (SPs) or Relying Parties (RPs) without needing to create separate accounts for each one.
Here is the simplified Data Flow:
- Authentication: The user sends their credentials to the IdP to authenticate.
- Obtaining identity assertions: The IdP then creates an identity assertion, a verifiable confirmation of the user’s identity.
- Sending identity assertions: The user sends their identity assertion to the SP or RP.
- Trust and access: The SP or the RP, trusting the IdP, accepts the Identity Assertion sent by the user and grants access.
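The last three steps of this flow can be sketched as follows. This is a simplified illustration, not SAML or OpenID Connect: the assertion format and function names are invented for the example, and an HMAC key known to both IdP and RP stands in for the asymmetric signatures that real federation protocols typically use.

```python
import base64
import hashlib
import hmac
import json

# Assumption: a demo key standing in for the IdP's signing key.
IDP_KEY = b"demo-key-known-to-idp-and-rp"


def idp_issue_assertion(subject: str, audience: str, now: float, ttl: int = 300) -> str:
    """IdP side: after authenticating the user, create a verifiable assertion."""
    claims = {"sub": subject, "aud": audience, "exp": now + ttl}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    tag = hmac.new(IDP_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + tag


def rp_accept_assertion(assertion: str, expected_audience: str, now: float):
    """RP side: return the subject if the assertion is trustworthy, else None."""
    payload, _, tag = assertion.rpartition(".")
    expected = hmac.new(IDP_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None  # not produced by the trusted IdP, or modified in transit
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["aud"] != expected_audience or claims["exp"] < now:
        return None  # issued for a different RP, or expired
    return claims["sub"]
```

Binding the assertion to an audience and an expiry time is what prevents one RP from replaying it at another; note that the IdP necessarily learns which audience the user is visiting.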
Perspectives:
- Security: This model mitigates the user’s issue of remembering multiple passwords and the fragmentation of identity, and relieves the SP or RP of the need to manage the authentication aspects.
- Privacy: This model still has some implications because the IdP knows which third-party services the user has accessed. Additionally, the technology uses "third-party (cross-site) cookies that are considered harmful to the web and must be removed" [third-party-cookies-must-be-removed].
- Standards: Standards support interoperability between different systems. The most used in this context are OASIS Security Assertion Markup Language (SAML) and OpenID Connect, which builds on OAuth 2.0 for authorization and on different token formats.
The Federated Identity Community Group was born to resolve these privacy concerns, incubating the Federated Credential Management API [FEDCM] and other APIs to implement the Federated Identity Model in the Web Platform. The Federated Identity Working Group also came from this group to proceed with standardization.
2.3.3. Decentralized identity model
In the decentralized model, also known as the Self-Sovereign Identity (SSI) model, the user independently administers their identities. It is the highest expression of user-centric identity. It is also the newest model, and several pilot projects are underway for large-scale implementations.
-Note: We will examine the decentralized identity model more closely, as it is the source of a new set of challenges.
-2.3.3.1. Architecture
-The decentralized identity model introduces a significant shift in the architecture, moving away from federated IdPs, SPs, or RPs.
Instead, it involves a new set of actors and dynamics, described in the W3C Verifiable Credentials Data Model (VCDM) [vc-data-model-2.0]:
- The Holder (the user), who stores their credentials in a Digital Wallet, is at the heart of this architecture. This wallet, whether a native app or a web-based application, operates much like a physical wallet. Just as a physical wallet holds more than just IDs, the digital wallet can store various credentials and information. The transformation from physical to digital doesn’t change the wallet’s fundamental role but enhances its capabilities with digital features.
- The Issuer is the entity that creates and issues credentials to the Holder. This can be a trusted third-party entity such as a government or a university. In some cases, credentials can be self-issued by the user, e.g., to represent informal skills or competencies. This flexibility allows for a broader range of credentials and applications.
- The Verifier in this model is akin to an SP or RP in federated models. It receives the credentials presented by the Holder and verifies them. Importantly, this process does not necessarily involve informing the Issuer. This decoupling is a key aspect of the decentralized identity model, enhancing privacy and control for the user.
- The Verifiable Data Registry (VDR) is a crucial entity of this architecture. This registry holds the data needed to verify credentials and their status. This can be government databases, distributed ledgers, or other services. By maintaining this information, the VDR, depending on its form, enables verification without direct communication between the Issuer and the Verifier.
Note: In this model, the definition of a credential shifts to a set of claims (attributes) linked to identifiers controlled by the user. While credentials represent identities, not all claims within a credential are used for identification. They can describe various characteristics, extending the application of credentials beyond mere identification.
-The VCDM defines two basic concepts: the Verifiable Credentials and the Verifiable Presentation.
- Verifiable Credential is what the holder stores in the Wallet. It contains:
  - Metadata: data about the credential itself.
  - Claim(s): one or more assertions where a characteristic of a subject is described (e.g., the subject is a citizen of a certain state, was born in a certain place on a certain day, month, and year, and can drive cars of this type).
  - Proof(s): cryptographic proof of the integrity of the credential, typically via a digital signature.
- Verifiable Presentation is what the holder sends to the verifier to show their credentials. The basic case is to present the credential as is. However, in many scenarios, the holder may wish to present only a subset of the claims of a credential to the verifier (this is called Selective Disclosure, SD) or a combination of information from different credentials. It contains:
  - Metadata: data about the presentation itself.
  - Credential(s): information derived or combined from one or more credentials.
  - Proof(s): cryptographic proof of the integrity of the credential(s) and the presentation.
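As a rough illustration of this three-part structure, the sketch below writes a credential and a presentation as Python dictionaries shaped like VCDM 2.0 JSON. The top-level property names follow the data model; the credential type `ExampleDrivingLicenceCredential`, the claim names, the `did:example` identifiers, and the proof values are hypothetical placeholders, and no real signature is computed.

```python
# Illustrative values only: identifiers, claim names, and proof values are placeholders.
credential = {
    # Metadata
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential", "ExampleDrivingLicenceCredential"],
    "issuer": "did:example:issuer",
    "validFrom": "2024-01-01T00:00:00Z",
    # Claims about the subject
    "credentialSubject": {
        "id": "did:example:holder",
        "name": "Alice Example",
        "birthDate": "1990-01-01",
        "drivingCategory": "B",
    },
    # Proof (placeholder, not a real signature)
    "proof": {"type": "DataIntegrityProof", "proofValue": "PLACEHOLDER"},
}

presentation = {
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiablePresentation"],
    "holder": "did:example:holder",
    "verifiableCredential": [credential],
    "proof": {"type": "DataIntegrityProof", "proofValue": "PLACEHOLDER"},
}


def split_credential(vc: dict) -> tuple:
    """Separate a credential into the three parts described above."""
    claims = vc["credentialSubject"]
    proofs = vc.get("proof")
    metadata = {k: v for k, v in vc.items() if k not in ("credentialSubject", "proof")}
    return metadata, claims, proofs
```

Here the presentation carries the whole credential, which is the basic case; a selectively disclosed presentation would carry only a derived subset of the claims.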
Note: For a comprehensive overview of Verifiable Credentials (VCs), refer to Ivan Herman’s W3C Verifiable Credentials Overview.
Another important element of this architecture is Decentralized Identifiers (DIDs), a new type of Uniform Resource Identifier (URI) that allows entities and resources to be identified through various methods [did-core]. These methods can rely on various technologies, including blockchains such as Bitcoin or Ethereum, the web, the InterPlanetary File System (IPFS), and the Domain Name System (DNS) [did-spec-registries]. All methods allow cryptographic control over the identifier. Still, they vary in how decentralized they are and the features they support (e.g., key rotation and revocation).
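A DID is a URI of the form `did:<method>:<method-specific-id>`, where the method name selects the technology used to resolve it. The following Python sketch splits a DID into those parts. The regular expression is a loose approximation of the grammar in [did-core] (for instance, it does not validate percent-encoding), and the `ParsedDID` type and `parse_did` function are names invented for this example.

```python
import re
from typing import NamedTuple, Optional


class ParsedDID(NamedTuple):
    method: str
    method_specific_id: str


# Approximation: lowercase letters and digits for the method name, then an
# identifier made of letters, digits, ".", "-", "_", "%", and ":" separators.
_DID_RE = re.compile(
    r"^did:(?P<method>[a-z0-9]+):(?P<id>[A-Za-z0-9._:%-]*[A-Za-z0-9._%-])$"
)


def parse_did(did: str) -> Optional[ParsedDID]:
    """Return the method and method-specific identifier, or None if not DID-shaped."""
    match = _DID_RE.match(did)
    if match is None:
        return None
    return ParsedDID(match.group("method"), match.group("id"))
```

Parsing only tells us which method applies; resolving the DID to its keys and service endpoints is method-specific and not shown here.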
Here’s the Data Flow:
- Credential-Issuing (CI):
  - The Issuer requests a certain authentication mechanism from the Holder.
  - After authentication, the Holder asks the Issuer for the credential, or the Issuer submits it.
  - If both parties agree, the Issuer sends the credential to the Holder in a specific format.
  - The Holder enters their credential into the Wallet.
- Credential-Presentation (CP):
  - The Holder requests access to a specific resource or service from the Verifier.
  - The Verifier then presents a request for proof to the Holder. This can either be done actively (e.g., the Verifier presents a QR code that the Holder has to scan) or passively (e.g., the Holder accesses a web page and is asked for a credential).
  - Through the Wallet, the Holder’s user agent determines if there are credentials to generate the required Proof.
  - The Holder may use the proof explicitly if they possess it.
  - The user agent of the Holder then prepares the Presentation, which can contain the full credential or part of it, and sends it to the Verifier.
- Credential-Verification (CV):
  - The user agent of the Verifier verifies the Presentation (e.g., whether the Presentation and the contained Credentials are signed correctly, issued by an Issuer they trust, and compliant with their policy, whether the Holder is entitled to hold them, and whether they have been revoked or have expired). The revocation check can be done using the methods defined by the specific credential.
  - If the verification is successful, the Verifier gives the Holder access.
- Credential-Revocation (CR):
  - The Issuer can revoke a credential in various ways.
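The checks in the Credential-Verification step can be sketched as a checklist in Python. This is an assumption-laden illustration: the `Credential` fields and the function name are hypothetical, `proof_is_valid` is a placeholder for the cryptographic signature check, and `revoked_ids` stands in for whatever status mechanism the specific credential defines.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Credential:
    id: str
    issuer: str
    valid_until: datetime
    claims: dict
    proof: str


def verify_credential(
    cred: Credential,
    *,
    trusted_issuers: set,
    revoked_ids: set,
    proof_is_valid: Callable[[Credential], bool],
    now: datetime,
) -> list:
    """Return the names of the failed checks; an empty list means acceptance."""
    failures = []
    if not proof_is_valid(cred):
        failures.append("proof")      # not signed correctly
    if cred.issuer not in trusted_issuers:
        failures.append("issuer")     # issuer not trusted by this Verifier
    if now > cred.valid_until:
        failures.append("expired")
    if cred.id in revoked_ids:
        failures.append("revoked")
    return failures
```

Only the first check is cryptographic; the issuer check depends on the Verifier's trust framework, which is why a correctly signed credential can still be rejected. How `revoked_ids` is obtained matters for privacy: asking the Issuer directly on each presentation reveals where the credential is being used.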
2.3.3.2. Security and Privacy
It is interesting to reflect on how this model differs, from a security and privacy perspective, both from the previously described models and from the use of physical identity documents, as credentials can enable this use case:
- Decentralized vs. Federated Model: Let us analyze one of the privacy issues of the federated model: whoever provides the identity can track the user. This is one of the threats this model wants to mitigate, since the identity is in the user’s Wallet, and they use it as they wish.
  Does this guarantee untraceability? The answer is "it depends": on how the architecture is defined and implemented, and on the technologies used. For example, if the verifier contacts the issuer directly to ask whether the credentials are still valid when we present them to log in, we continue to be traceable.
- Decentralized vs. Physical Document: If we instead think about the case where I have to send my passport online to open a bank account, to date, the most used method is to send a file with the passport scan. The bank usually sends the file to third-party services, which often use Machine Learning systems for analysis. In addition, the file could be reused by someone who has access to it, exposing all my data and not only what is needed.
  Presenting a digital credential, which could also support the submission of a subset of the contained claims, can improve the situation. Again, however, we may encounter the problem of the verifier contacting the issuer for verification. This problem is rarer when sending the file, although databases of stolen documents do exist and a verifier could query them with the passport ID to verify the status.
Note: Architectural change can solve some issues but can also generate new ones; this is why a thorough analysis is necessary.
-We can take a step back and understand what privacy properties are needed for digital identity, considering that the higher the level of credential assurance, the more threats can impact the user. Over the years, several proposals have been made to understand the properties of digital identities, such as Kim Cameron’s 7 Laws of Identity and Ben Laurie’s Three Properties. Analyzing the latter [selective-disclosure]:
--
-
-
-
Verifiable: Identities, credentials, and various claims must be verifiable, which is possible through appropriate cryptographic proofs. This is part of the security aspects, which include cryptography.
- -
-
Minimal: This is a privacy aspect. When we send information to the verifier, the information should be minimized as much as possible. For example, if I have to show that I am of age, I should not have to submit the whole credential, or even the specific claim of my date of birth, but simply the fact that I am of age.
- -
-
Unlinkable: This is another privacy aspect, related to the Minimal property. If any party involved in the interaction, such as the Issuer or Verifier, or even a third party, can link and correlate the information we have sent (which can be done through various techniques), privacy is compromised.
-
So, as a starting point, we have a number of properties covering both security and privacy. How can we implement them? First of all, different credential formats have different privacy and security features [verifiable-credentials-flavors-explained], and we can expand the discussion beyond formats to all the other components of the architecture, which must be aligned to ensure security and privacy.
--
Given the levels of complexity, a comprehensive analysis of threats to privacy, security, and human rights is necessary.
-This is especially important for high-assurance credentials, such as those issued by governments. This is highlighted by organizations such as the Electronic Frontier Foundation (EFF) [eff-digital-identification] and Access Now [access-now-whyid].
-W3C recognized the need for rights-respecting digital credentials and started a joint Threat Model for Decentralized Identities with the following groups:
--
-
- - -
- - -
- - -
- - -
- - -
-
-
Threat Modeling Community Group (TMCG) (open to all privacy, security and human rights experts)
-
Threat Modeling is "a family of structured, repeatable processes that allows to make rational decisions to secure applications, software, and systems" [threat-modeling-designing-for-security]. This Threat Model uses different frameworks and toolkits to cover the different threat types, such as:
--
-
-
-
Security: STRIDE (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege), RFC 3552.
- -
-
Privacy: LINDDUN (Linking, Identifying, Non-Repudiation, Detecting, Data Disclosure, Unawareness & Unintervenability, Non-Compliance), RFC 6973.
- -
-
Human rights: Microsoft’s Types of Harm, and Access Now #WhyID.
-
The Threat Model also includes a list of various mitigation techniques, particularly those based on cryptographic techniques such as Zero-Knowledge Proofs (ZKPs), and additional methods for enabling secure and privacy-preserving technology.
-2.3.3.3. Standards
-As noted, VCDM and DID define only certain elements of the architecture. Other Standards Development Organizations (SDOs) define other elements that are essential for the architecture to function.
-Therefore, coordination between these entities is necessary to ensure everything works seamlessly. So, let us look at the standards involved for the various components needed.
-To understand the extent of the various standards, it is possible to refer to Michael Palage’s Digital Identity Galaxy.
-This is why several SDOs such as the World Wide Web Consortium (W3C), the Internet Engineering Task Force (IETF), the OpenID Foundation (OIDF), and the Decentralized Identity Foundation (DIF) are coordinating to standardize the components and how they should communicate:
--
-
-
-
Data models: abstract models for Credentials and Presentation such as the Verifiable Credentials Data Model, and mDL in ISO/IEC 18013-5:2021.
- -
-
Identifiers: DIDs and the DID methods, or WebID.
- -
-
Encoding schemas: JSON, JSON-LD, CBOR, CBOR-LD.
- -
-
Securing mechanisms: Each mechanism may or may not support different privacy features or be quantum-resistant:
--
-
-
-
Enveloped formats (credential formats): The proof wraps around the serialization of the credential.
-JSON is enveloped using JSON Object Signing and Encryption (JOSE), which includes JWT, JWS, and JWK. JOSE is cryptographically agile (it can accommodate different cryptographic primitives) and can also support Selective Disclosure (SD) with SD-JWT (which uses salted hash digests). New securing mechanisms are emerging, such as SD-BLS (which uses BLS), and there are ongoing efforts to fit BBS#.
-CBOR is enveloped using CBOR Object Signing and Encryption (COSE). Other formats include mdoc and SPICE.
-The mechanism to use VCDM with JOSE/COSE is described in Securing Verifiable Credentials using JOSE and COSE.
- -
-
Embedded formats (signature algorithms): The proof is included in the serialization alongside the credentials (e.g., BBS, ECDSA, EdDSA). The mechanism is described in Verifiable Credential Data Integrity 1.0.
-
-
-
-
-
Status information (revocation algorithms): Issuers can implement several ways to keep the credential’s status up to date, such as a Revocation List, a Status List (e.g., Bitstring Status List v1.0), or Cryptographic Accumulators.
- -
-
Communication protocols: for the different phases of issuance and presentation (e.g., OID4VCI, OID4VP, SIOPv2).
-
Note: This list is representative. For more detailed information, please refer to the comparison matrix.
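To make the idea of selective disclosure over an enveloped credential more concrete, the following sketch shows the salted-hash approach in the spirit of SD-JWT. It is a simplified illustration, not an implementation of the specification: the issuer signature over the digest list is omitted, and the claim names and helper functions (`make_disclosure`, `digest`, `revealed_claims`) are hypothetical.

```python
import base64
import hashlib
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_disclosure(name: str, value) -> str:
    """Encode [salt, claim name, claim value] as one disclosure string."""
    salt = b64url(secrets.token_bytes(16))
    return b64url(json.dumps([salt, name, value]).encode())


def digest(disclosure: str) -> str:
    """Hash of a disclosure; only these digests go into the signed credential."""
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


def decode(disclosure: str) -> list:
    padded = disclosure + "=" * (-len(disclosure) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def revealed_claims(presented: list, signed_digests: list) -> dict:
    """Verifier side: accept only disclosures whose digest the issuer signed."""
    claims = {}
    for disclosure in presented:
        if digest(disclosure) not in signed_digests:
            raise ValueError("disclosure not covered by the issuer's digests")
        _salt, name, value = decode(disclosure)
        claims[name] = value
    return claims


# Issuer: one disclosure per claim; the issuer would sign the digest list.
claims = {"given_name": "Alice", "birthdate": "1990-05-17", "nationality": "IT"}
disclosures = {name: make_disclosure(name, value) for name, value in claims.items()}
signed_digests = sorted(digest(d) for d in disclosures.values())

# Holder: reveals only the birthdate disclosure.
presented = [disclosures["birthdate"]]

# Verifier: learns the birthdate and nothing about the other claims.
print(revealed_claims(presented, signed_digests))
```

The random salt is what prevents the Verifier from guessing undisclosed claim values by hashing candidate values. Note that the digest list itself is identical in every presentation, so this approach alone does not provide unlinkability.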
--
One of the significant challenges today is the interoperability of various technologies. This issue becomes particularly evident in the context of government-issued digital credentials when compared with the interoperability of physical documents such as passports.
-The threat here is clear: if a service supports only a specific format, it excludes and discriminates against users who use a different format due to their government’s decisions. Consequently, this limitation affects users and restricts the service’s business potential.
-To address these concerns, there are ongoing discussions about standardizing credential presentation at the web platform level through the Digital Credentials API.
-The Working Group already working on the Federated Identity Model is undertaking this standardization effort, as the API is similar. The goal is to provide an optimal user experience by ensuring user agents operate securely and in privacy-preserving ways.
-Moreover, the Group’s new scope proposal aims to adopt this API with a broader perspective. It suggests incorporating the "Issuance" aspect to enable all possible use cases through the web platform.
-3. Use cases
-The world of Digital Identities is quite broad and has different uses in different industries, where it can enhance the user experience and act as a business enabler.
-To imagine these use cases, we can play a game: see what is inside our physical or digital wallets.
-For example, the driver’s license (and the international one), the passport (which can also hold visas for entry to other countries and, if you have minor children, can include them), payment cards, cash, association cards, tickets (e.g., events, concerts, boarding passes), loyalty cards (from hotels, airlines, the grocery store), the university card, the badge to get into the office, medical insurance card and health card, emergency contacts, some receipts, public transportation card, a business card and the business cards of other people, the card to get into the gym and the library.
-If we extend this concept to include those documents that are often too large to be put inside a physical wallet unless folded but which we use during the day, we also have employment contracts, house contracts, utility bills, the papers of our pet (which, if it travels, has a chip and a passport), marriage certificate (for those who are married), a power of attorney to sign the documents of a company, the tax return, bank statements, amateur radio license or other licenses, medical prescriptions, exam results (both medical and college), degree, professional qualifications (e.g., medical doctor, lawyer, psychologist), warranty certificates of purchased items and much more.
--
-
-
-
We use only some digital credentials to verify our identity (e.g., driver’s license, passport), and these have additional attributes that can be useful beyond identification.
- -
-
Many other credentials are related to our features or entitlements (e.g., degree certificate, work permit), which allow us to do many things but not identify ourselves.
- -
-
We are not the subjects of some credentials, as in the case of pet travel documents.
-
Let us proceed to examine the use cases for organization-related identities.
-3.1. Organizations
-We can look at organizations from different aspects. On the one hand, they can benefit from their government-issued digital identity; on the other hand, they can issue identities themselves to better manage their identification and access systems, both for people and for identities of specific services, software, or processes. To top it off, they can leverage people’s identities for greater assurance, particularly when distributed worldwide.
-3.1.1. Organizational identity
-Organizations can also have a digital identity and related identifiers, such as a registration number with the government where they were established and possibly a VAT number or a legal entity identifier. Although the organization has an identity of its own, it operates through individuals who, under the bylaws, hold various authorizations, delegations, and signing powers. Therefore, any transaction, such as opening a bank account or closing a business deal, requires both the organization’s documentation and the personal documentation of the individuals involved. The use of digital identity in a wallet, with delegation managed through Verifiable Credentials, can streamline transactions with governments, suppliers, and customers, particularly in global transactions where the trust relationship is established digitally. This matters because the Association of Certified Fraud Examiners (ACFE) estimates that organizations lose 5% of revenue to fraud each year [acfe-occupational-fraud-2024].
-3.1.2. Identity and Access Management (IAM)
-The IAM market is thriving, with an estimated growth of 43 billion USD by 2029 [statista-identity-and-access-management]. Such systems enable an employee’s identification, authentication, and authorization on the organization’s platforms according to assigned roles and responsibilities. Decentralized identities enable an additional approach, such as Bring Your Own Identity (BYOI), where users can use their identity to interact with corporate assets and not just for human resource management practices.
-3.1.3. Global workforce
-Digital transformation has been a trend for several years and has played a crucial role, particularly in the workforce, during the COVID-19 pandemic. Organizations had to decide whether to stop operations or to adapt as far as possible by digitizing and enabling remote work. This transformation made it clear that a global workforce is possible.
-Even allowing for the "back to the office" trend of the last year, remote workers are estimated at 67 percent in the technology industry, and this approach is preferred by 91 percent of workers [statista-work-from-home]. In a global context, digital identities can help register employees and contractors by verifying their identities and qualifications, which is particularly challenging for a global workforce. This can speed up hiring, employee management, and other HR processes.
-3.2. Things
-Although applications with identities linked to individuals are the most studied cases and are delicate to handle, identities also find fertile ground in the supply chain and IoT world, which are decentralized and distributed by nature.
-3.2.1. Supply chain
-A particularly common and interesting scenario is the use of identities and the identification of physical assets and other organizations in the supply chain as well as in end-user services:
--
-
-
-
Import-export markets: the "cost of trade" tends to double the cost of goods when exported, creating significant barriers to entry, even for small and medium-sized enterprises (SMEs) [edata-verifiable-credentials-for-cross-border-trade]. Digital Identities for other organizations and goods can support the traceability of the supply chain, especially when there are certifications related to sustainable production.
- -
-
Counterfeit-prone markets: such as luxury goods. Proving a physical good’s digital identity and demonstrating ownership of its Digital Twin, in the form of a credential issued by the producer, can benefit the end-user and mitigate fraud.
-
Identifying physical goods presents unique challenges, such as associating the physical goods with their credentials. Some solutions include using barcodes, DNA fingerprinting of agricultural products, and radio frequency identification (RFID).
-3.2.2. Energy devices (IoT)
-In "Self-Sovereign Identity" [self-sovereign-identity], we find an interesting pilot project in the energy sector initiated by the Austrian Power Grid (APG) and Energy Web Foundation (EWF) to enable small and medium-sized devices, called Distributed Energy Resources (DERs), to participate in frequency regulation of the national power grid [distributed-energy-resources-for-frequency-regulation]. This responds to the UN’s Sustainable Development Goal 7, "Ensure access to affordable, reliable, sustainable and modern energy for all".
-The challenge is that the transmission grid must maintain a consistent frequency to function properly. Power plants typically coordinate to adjust the input frequency in response to changes in energy consumption. However, this becomes particularly complex when integrating small and distributed devices.
-It is necessary to identify small devices correctly to avoid issues throughout the network. Verifiable Credentials are present within the devices' operating systems to ensure the IAM aspect, as well as DIDs to identify them correctly [energy-web-credentials-overview].
-3.2.3. Automotive (IoT)
-An interesting use case for automotive can be found in "Self-Sovereign Identity - Foundations, Applications, and Potentials of Portable Digital Identities" [ssi-foundation-applications-and-potentials].
-A car, identified by the Vehicle Identification Number (VIN), interacts with various entities throughout its lifecycle, including:
--
-
-
-
Manufacturer, vendors and workshops: Tracking maintenance and service history.
- -
-
Governmental entities: Registration and tax payment.
- -
-
Owners and users: Ownership verification and usage rights.
- -
-
Road infrastructure: Toll payments and other interactions during use.
- -
-
Insurance companies: Policy management and claims processing.
-
The car could also be locked and unlocked directly through the owner’s Wallet, making the car a Verifier during unlocking and a Subject in the owner’s Wallet.
-This illustrates the utility of IoT identities and credentials, and their integration with governmental and human identities [self-sovereign-identity]. For example, when buying and selling a used car, several elements must be verified, such as:
--
-
-
-
The vehicle’s characteristics and history.
- -
-
Ownership verification of the seller.
- -
-
The buyer’s creditworthiness.
- -
-
Completion of ownership transfer and insurance paperwork.
-
Note: The automotive case is particularly interesting. Even though a car has a Non-Human Identity, the fact that it is often used by humans could have serious privacy implications, as is currently the case with insurance black boxes.
-3.3. Human identities and governments
-Let us return to the initial example and analyze human identities, focusing on those issued by the government. These have the most assurance and thus expose the user to the most security, privacy, and human rights threats.
-A government issues its citizens a specific set of credentials for the purpose of identification and to attest to their attributes:
--
-
-
-
Travel documents (e.g., passports and entry visas)
- -
-
Personal licenses (e.g., driver’s licenses, amateur radio licenses, professional licenses, marriage licenses)
- -
-
Permits (e.g., residence permit, work permit)
- -
-
Registration of vehicles, ships, and other property
- -
-
Welfare programs
- -
-
Proof of residency
- -
-
Proof of age
-
The following sections trace how these credentials have evolved historically.
-3.3.1. Physical identity
-In the past, individuals were known and acknowledged based on their physical attributes and voices, particularly in small, close-knit communities where mutual familiarity prevailed. Within such contexts, the establishment of trust among acquaintances served as an effective means of identification.
-Note: The assurance of our identity in the social realm often relies on a third party, such as society as a collective entity or directly through government authorities.
-3.3.2. Textual credentials
-Up until the 1700s-1800s, when the parties lacked direct knowledge of each other (and thus trust), such as when traveling, identifying oneself came to require presenting credentials issued by a trusted third party, such as a government, in the form of a paper with written information attested by the authority.
-Note: A particularly well-known example of textual credentials is the first driver’s license, issued in 1888 to Karl Benz so he could use his experimental car [how-might-driver-licensing]. It was a paper signed by the local authority (a trusted party), which was required after neighbors complained about noise generated by his driving, so not for identifying himself.
-These credentials are issued by a trusted entity (e.g., a government), carried or presented by the person in question (e.g., the user with a passport), and then verified by those in charge to authenticate (e.g., the border police) and provide something (e.g., permission to cross the border).
-Even then, there were security problems: on the one hand, counterfeiting, which was mitigated by using stamps, seals, or special paper; on the other hand, the use of documents by persons other than the one for whom the document was issued, which was mitigated by including a written description of the owner’s facial features to bind them to the document, as photography had not yet been invented.
-3.3.3. Photographic credentials
-The first documented use of photography for identification was in 1876, thanks to the photographer William Notman, who had used photographs to identify workers and guests at the Centennial Exposition in Philadelphia [the-world-of-william-notman].
-However, government-wide use was introduced only in 1915 after the U.S. government discovered that a German spy was using a U.S. passport because he had physical characteristics similar to those described in writing in the passport and could speak English [how-have-passport-photos-changed-in-100-years].
-Note: The primary purpose of photography is to associate the passport with the individual to whom it was issued. It is essential to ensure that only the legitimate holder of the credential can utilize it.
-3.3.4. Machine readable credentials
-As the technology evolved, the idea was to use machines to help read the documents. This would speed up the verification process. But it was necessary to make the documents easy for machines to read.
-To address this, particularly for travel documents, ICAO began working on machine-readable travel documents in 1968, and in 1980, it published Document 9303, which contained the specification of a machine-readable code to be printed on documents [doc-9303]. It is the code with many "<" characters in our passports and on some ID cards.
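Part of what makes this code easy for machines to read reliably is that several of its fields carry a check digit. The sketch below computes such a check digit with the repeating 7, 3, 1 weighting used for machine-readable travel documents; the function names and the sample field values are illustrative and not taken from this document.

```python
def mrz_char_value(ch: str) -> int:
    """Map one machine-readable-zone character to its numeric value."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10, B=11, ..., Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"character not allowed in the machine-readable zone: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum of character values with weights 7, 3, 1 repeating, modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


# Illustrative fields: a document number and two dates in YYMMDD form.
print(mrz_check_digit("L898902C3"))  # 6
print(mrz_check_digit("740812"))     # 2
print(mrz_check_digit("120415"))     # 9
```

A reader that computes a different digit from the one printed knows that the field was misread and can rescan it. The check digit detects reading errors; it offers no protection against deliberate forgery.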
As an evolution, in 1998, Document 9303 also included biometric information transmitted via RFID technology. Nowadays other machine-readable techniques include barcodes and QR codes.
-Note: ISO endorsed this document through ISO/IEC 7501-1, making the role of Standards Development Organizations (SDOs) particularly important for interoperability in this field.
-3.3.5. Physical credentials as digital credentials
-While these practices have certainly sped up reading and verification in physical contexts (when the verifier has access to the original physical document), they are inefficient in a digital context, in particular when the verifier has no access to the original document because the physical credential is scanned or photographed and the resulting file is used.
-A classic use of government-issued documents on the Internet and the Web is enrollment in financial services.
-The user must provide these documents, and the financial service provider must verify them to comply with Know Your Customer (KYC) and Anti-Money Laundering (AML) practices to Counter the Financing of Terrorism (CFT).
-Then, the user photographs or scans the documents (rendering ineffective the anti-counterfeiting measures inherent in the physical document) and themselves (to bind with the document) and sends these files to the financial provider.
-Often, the financial provider delegates the process to specialized companies that use Machine Learning and manual control to verify the information.
-Thus, we have at least two problems: the entire document is sent to different places, making a data breach more likely, and Machine Learning systems often analyze it. The problem is well described in "AI & the Web: Understanding and managing the impact of Machine Learning models on the Web".
-Moreover, an additional privacy concern is inherent in this use case - which applies even when the document is used physically. Even if the user uses the document for a specific reason (e.g., proof of address or proof of age), they must send the whole document, thus showing more information than is needed for the specific verification, violating the privacy principle of data minimization.
-3.3.6. Pure digital credentials
-Governments and regulatory bodies have also stepped up to issue digital credentials for citizens.
-Each government has made its own architectural choices and can offer different services, from centralized or federated authentication to decentralized identities that give citizens a wallet to hold their digital credentials.
-Below is a short list with some implementation examples:
--
-
-
-
Estonia: Estonian e-Identity.
- -
-
Europe: EUDI-ARF (electronic IDentification, Authentication, and trust Services, implementation of EU’s eIDAS 2.0 legislation).
- -
-
India: Aadhaar.
- - - -
-
-
Nigeria: Nigeria’s eID.
- -
-
Singapore: Singpass.
- -
-
Spain: Cl@ve.
- -
-
United Arab Emirates: UAE Pass.
- -
-
United States of America: U.S. DHS on Digital Identities and Mobile Driving Licence (e.g., Maryland, Arizona, Utah, California).
-
Some governments are doing pilot projects with Decentralized Identities, providing their citizens with Digital Wallets and IDs.
-Let’s delve into an extensively debated use case requiring a solution: age verification.
-The holder has a digital passport in the form of government-issued credentials; this credential, in its claims, also contains age information.
--
-
-
-
Full credential: It is possible to send the full credential since it also contains the date of birth, from which the verifier can derive the age. This doesn’t meet the principle of data minimization, though, as a lot of other information is sent which can be misused and make us traceable.
- -
-
Selective disclosure: If the credential supports this privacy feature, which allows us to send individual attributes/claims, we can send only the date of birth, from which the verifier can derive the age. This certainly improves data minimization, but it does not fully achieve it: we still release more data than necessary, as the verifier is interested not in the date of birth but in whether the person is of age. To overcome this problem, some credentials have specific attributes with boolean values stating that our age exceeds a certain value (e.g., 16, 18, 21).
- -
-
Range proof: We send the verifier only the boolean result of a computation on the value of a specific attribute (e.g., the verifier asks us if we are older than 21 years old, and we send the result of the computation on the date of birth).
-
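The difference between the three options lies in what the verifier actually receives. The sketch below compares the data released in each case; it models only the data flow and none of the cryptography (in a real range proof, the boolean would come with a proof that the verifier can check). The credential contents and function names are hypothetical.

```python
from datetime import date

# Hypothetical claims inside a government-issued credential.
CREDENTIAL = {
    "given_name": "Alice",
    "family_name": "Rossi",
    "birthdate": "1990-05-17",
    "document_number": "X1234567",
}


def full_credential(cred: dict) -> dict:
    """Option 1: every claim is released."""
    return dict(cred)


def selective_disclosure(cred: dict, names: list) -> dict:
    """Option 2: only the named claims are released."""
    return {name: cred[name] for name in names}


def age_over(cred: dict, years: int, today: date) -> dict:
    """Option 3: only the boolean answer to the verifier's question is released."""
    born = date.fromisoformat(cred["birthdate"])
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    age = today.year - born.year - (0 if had_birthday else 1)
    return {f"age_over_{years}": age >= years}


today = date(2025, 1, 1)
options = {
    "full credential": full_credential(CREDENTIAL),
    "selective disclosure": selective_disclosure(CREDENTIAL, ["birthdate"]),
    "range proof": age_over(CREDENTIAL, 21, today),
}
for label, released in options.items():
    print(f"{label}: {sorted(released)}")
```

Only the last option avoids handing over the date of birth itself, which is an identifying attribute when combined with other data.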
The problem is that, even in the last two cases, we may present information that is potentially linkable to us or to our issuer, which the verifier can use to make correlations. To avoid this, it is necessary, for example, to decouple the signature from the signer and not to use the same identifiers in different sessions.
-Conversely, the verifier will have to somehow prove that they performed the age verification, which further complicates the matter.
-Therefore, even a scenario that may seem trivial requires extensive study.
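The point about not reusing identifiers across sessions can be illustrated with a small sketch in which the wallet derives a different identifier for each verifier and session from a secret that never leaves it. This is one possible technique, shown under assumptions: the function name, the verifier origins, and the use of HMAC-SHA-256 for derivation are illustrative choices, not something prescribed by this document.

```python
import hashlib
import hmac
import secrets


def session_identifier(wallet_secret: bytes, verifier: str, session_nonce: bytes) -> str:
    """Derive an identifier that differs for every verifier and every session."""
    message = verifier.encode() + b"|" + session_nonce
    return hmac.new(wallet_secret, message, hashlib.sha256).hexdigest()


wallet_secret = secrets.token_bytes(32)  # stays inside the holder's wallet

# Two visits to the same verifier and one visit to another verifier.
first = session_identifier(wallet_secret, "https://shop.example", secrets.token_bytes(16))
second = session_identifier(wallet_secret, "https://shop.example", secrets.token_bytes(16))
other = session_identifier(wallet_secret, "https://bar.example", secrets.token_bytes(16))

print(len({first, second, other}))  # three distinct identifiers
```

Fresh identifiers remove only one correlation vector. If the same issuer signature or the same attribute values are presented each time, verifiers can still link the sessions, which is why the signature also needs to be decoupled from the signer.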
--
According to the Trust Over IP Stack, the ecosystem of Decentralized Identities is very broad and combines technological aspects, such as Digital Credentials and Wallets, with those of Governance [introduction-toip].
-Therefore, some threats exist at the technology level and can be managed by SDOs and implementers, but governments must manage others at the governance level. Governments provide the requirements and technology architectures that are then standardized and implemented.
-For example, a centralized identity system is prone to surveillance. Conversely, a decentralized system with certain technological features and cryptographic methods can mitigate surveillance and respect human rights.
-Other issues are related to digital wallets. On the one hand, it is necessary to balance security against hardware and software requirements that could discriminate. On the other hand, it is important to avoid vendor lock-in and prevent what happened with the Digital Markets Act and default browser choice.
-Therefore, it is important to do a risk analysis with both technology and government stakeholders to mitigate threats appropriately.
-If threats cannot be managed at the technology level, they should be managed at the governance level, for example, by banning certain uses or removing features whose threats cannot be technically mitigated. Two-way communication between governments, SDOs and implementers is therefore needed.
-4. Acknowledgments
-Several individuals contributed to the document. The editor especially thanks Pierre-Antoine Champin, Andrea D’Intino, Giuseppe De Marco, Heather Flanagan, Ivan Herman, Philippe Le Hegaret, Tommaso Innocenti, Ian Jacobs, and Denis Roio.
-