DIDComm gets formal
16 October 2024 10 mins read
Input | Output takes the setting of solid work foundations very seriously. A recent example of this is the research on Decentralized Identifier Communication (DIDComm) through the paper ‘What did come out of it? Analysis and improvements of DIDComm messaging.' which is being presented at this week’s CCS’24, ACM’s flagship conference for computer and communications security. The paper takes a formal look into DIDComm.
What is DIDComm all about?
Let’s take a step back before delving into this work. DIDComm is the main communications framework used in Self-Sovereign Identity (SSI) systems today. DIDComm was first defined through a set of Request for Comments (RFCs) in the Hyperledger consortium, but it recently adopted its own specification, DIDComm v2, now hosted by the Decentralized Identity Foundation. This new version aims to provide a flexible framework for establishing secure communication routes between a sender and a (set of) recipient(s), where all participants are identified by W3C decentralized identifiers (DIDs). Following the philosophy of DIDs, the DIDComm framework aims at minimizing trust in third parties without sacrificing security.
In the main use case of DIDComm, which is schematically depicted below, a sender (Bob) who wants to send a message to a recipient (Alice), who possibly owns multiple devices, fetches Alice’s DID document from her DID string (something like 'did:example:alice'). This DID document specifies the public key of each device owned by Alice and the route that messages intended for these devices must follow – which Bob may concatenate to some extra hops chosen by himself.
For this message exchange to happen securely, DIDComm leverages well-known JSON-based standards and specifies some combinations of public and symmetric key encryption and key derivation algorithms, plus digital signatures. While DIDComm is limited to the SSI space, it is becoming increasingly relevant and widely used elsewhere. Thus, it is essential to formally look into DIDComm now, to help identify and fix potential weaknesses before it is too late.
What’s our research about?
In our work, we formally look into whether DIDComm’s combination of encryption and key derivation schemes achieves the claimed security properties. However, we exclude digital signatures from our analysis. Analyzing the encryption and key derivation part is quite some work. This may appear a simplification, but keep in mind that DIDComm – up to our work, that is – lacks a formal statement of what it aims to achieve security-wise. Thus, formally capturing DIDComm’s security goals is a big part of our work. To do this, we studied its specification and related standards and also comprehensively reviewed the main implementations in Python and Java, and even coded our own improvements, as described next.
Once we have a formal model of DIDComm’s messaging, we prove that DIDComm’s encryption modes actually achieve the claimed (and now formally stated) security requirements. In the process, we identify some minor weakness, easily resolved without changing anything in DIDComm. We also realize that, in one of the proposed encryption mode combinations, DIDComm follows a somehow naive (yet, easy to code) approach, and we propose an alternative way that implies an improvement of about 2x in computational and message size costs, per hop in the DIDComm route, while maintaining code simplicity.
Through our formal model, we also concisely capture that DIDComm always leaks the recipients’ identifiers. While this is a trivial observation after reading the specification, we already cast our model in such a way that it allows us to modulate whether or not we want the recipients’ identifiers to be leaked (or anything else beyond identifiers for that matter). Equipped with this capability to model arbitrary leakage, we are in a position to propose additional DIDComm encryption modes that make DIDComm fully privacy preserving by preventing any leakage. Moreover, we not only propose the new modes, but also prove that they are secure according to our model.
Let’s review these contributions one by one.
DIDComm’s formal model
To capture what DIDComm is expected to achieve, we leverage a cryptographic modeling framework known as Constructive Cryptography (CC), which requires defining an ideal version of the protocol at hand – DIDComm, in this case – and comparing it with the actual real world definition of DIDComm. CC also allows us to abstract out some assumed building blocks. In our case, we assume that a secure public key infrastructure (PKI) and an insecure communication network already exist. In the real world, the PKI will be any DID method, whereas for the networking part, any existing network suffices, as we make no security assumption on it. Moreover, we parameterize our network idealization with a leakage function, which allows us to say things like ‘For every message (msg), this network leaks f(msg), but nothing more.’ We also parameterize our idealization of DIDComm with similar leakage functions. Intuitively, if for a given leakage function we can prove that no adversary can tell whether it is operating in the real world DIDComm or in the ideal world DIDComm, then this means that the former is as secure as the latter for the chosen leakage function.
So, what do we expect DIDComm to leak to network observers? Well, this depends on the specific encryption mode. We look into the main two modes (anoncrypt and authcrypt) and their combination, as described in DIDComm’s specification.
In anoncrypt mode, the sender does not reveal its identifier to the recipient(s), and no sender authentication is ensured, but the message contents are confidential with respect to third parties. In this mode, we expect DIDComm to leak only the message length and the recipient(s) identifier(s) to third parties.
In authcrypt mode, the sender is authenticated towards the recipient(s) through authenticated encryption with associated data schemes (such as AES in GCM mode), and the message contents are also confidential to third parties. In this mode, the message length, the sender identifier, and the recipient(s) identifiers are leaked to third parties.
Finally, DIDComm explains how to combine both anoncrypt and authcrypt: first run authcrypt over the message to be sent and run anoncrypt on the result. The expected security of this is: the sender is authenticated towards the recipient(s) and the message contents are confidential to third parties. However, in this case, only the message length and the recipient(s) identifiers are leaked to third party observers. So in some sense, this is expected to provide the best of both modes.
Once we have this idealization, we proceed to prove whether the real world definition, given by the concrete algorithm choices for symmetric encryption and key exchange in the DIDComm spec, is indistinguishable from the idealization we just described. In a nutshell, yes, the real world DIDComm achieves the same security as the ideal world DIDComm, with the caveat that for the combined mode it needs to restrict the symmetric encryption choice to AES in CBC mode with HMACs (AES-CBC+HMAC). The reason why is a bit subtle. Basically, AES-CBC+HMAC is an Authenticated Encryption Scheme with Associated Data (AEAD) that, in addition, has the property of being key-committing. The other alternative offered for AEAD by DIDComm is AES in GCM mode, which is not committing. The consequence of the latter is that an adversary could in theory come up with a ciphertext that enables him to trick different recipients to receive different plaintext messages, which is something that is clearly not the intended behavior.
An efficient alternative for anoncrypt(authcrypt()) composition
While studying how the previous direct composition of anoncrypt and authcrypt works, we notice that this somehow naive composition is actually duplicating some work. Namely, DIDComm is performing two bulk encryption processes. Moreover, a side effect of the direct composition, independent of the underlying cryptography, is that DIDComm is including (and encrypting) the recipient(s) identifier(s) twice. While this may seem innocuous, think that a single identifier in some DID methods may require several hundreds of bytes. And this is done once for each hop in the DIDComm route from the sender to the recipient!
In our alternative proposal, we conflate both algorithms so that only one bulk encryption is needed and no identifier is duplicated. The caveat? Our proposal is not directly compatible with the JSON draft standard upon which DIDComm builds. However, it is still easily implementable (we give a prototype in GitHub) and could be an interesting addition to the RFC draft.
A fully privacy-preserving encryption mode
As stated in the previous overview of our DIDComm ideal world model, the three modes of operation in DIDComm leak the recipient(s) identifiers to third parties. This may be ok in most situations – that is, for most communication networks, such as plain TCP/IP networks. However, this is not ‘privacy preserving’ in general. Specifically, anonymizing networks such as Tor do not leak this. In our model, this would be captured by the leakage function defined for the assumed network. In the case of TCP/IP, many things are leaked. In the case of Tor, only the message size. So, if one were to use DIDComm on top of Tor, we would not be able to prove that the real world is as secure as the ideal world, because the recipient(s) identifier(s) are leaked.
Our proposal is quite simple: run the key derivation algorithm for each recipient as before, but then encrypt a ‘1’ bit with the resulting key. The con is that, for Alice to determine whether a message is intended for her, she must run the key derivation algorithm and attempt at decrypting the encrypted ‘1’. If she gets a ‘1’, the message is indeed intended for her. If she gets anything else, then it is not. While this may seem too much, we note that there are use cases in which this may still be practical. Namely, a set of users can configure a sort of anonymous inbound mailbox. For any incoming message, they only need to download the header, run the previous process, and download the message in full only if there is a match (maybe including some decoy downloads, for increased privacy). With carefully sized inboxes, this may offer a good tradeoff between efficiency and privacy. Of course, more advanced methods may exist, but this one has a very attractive property: it is already implementable via the JSON standards upon which DIDComm builds!
So, what’s next?
We have laid the first stone towards formally analyzing DIDComm and already proposed a few improvements. Now, it is up to the community (of which we are of course part of) to keep building on them.
Beyond our theoretical contributions, we do believe that the improvements we propose are interesting to DIDComm. Both because of the efficiency gains, which would also benefit the wider JSON encryption community, and also because the use cases that our privacy preserving mode may enable. Think for instance of receiving anonymous credentials in a fully anonymous way! Nobody needs to know whether you are the one who is buying a ticket for some movie, or that you are receiving a membership credential for some organization!
Finally, an important aspect to note for future work. The fact that we chose a composable modeling framework facilitates subsequent analysis of higher-level protocols built on top of DIDComm’s encryption modes. For instance, one could analyze the security of DIDComm routing by defining the expected concrete network leakage, and leveraging the basic encryption modes, which we prove secure in this foundational work.