FIP: Direct Casts #99
CassOnMars
started this conversation in
FIP Stage 1: Ideas
Replies: 2 comments 2 replies
-
This reads really nicely |
Beta Was this translation helpful? Give feedback.
0 replies
-
Such a well written and detailed proposal. Wouldn't it make sense to rely on other protocols such as XMTP for that purpose? Instead of investing resources to do this in FC directly, figure out the best way for FC and its clients to integrate with XMTP and other protocols? If they aren't up to par then bring up these consents with them so they can improve. |
Beta Was this translation helpful? Give feedback.
2 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Abstract
Direct Casts has been appropriately field tested and battle-hardened, and should be in a good position to be moved to hubs. There are many significant considerations required to retain the same (if not improved) privacy and secrecy guarantees, which this draft will discuss.
Problem
Direct Casts have been a feature of Warpcast (the client and backend) for some time, however supporting an implementation on hubs and supporting group messaging both would necessitate some rework to make either possible, with great overlap in designs, making it a nice confluence of conditions. See Rationale for deeper insights to the problems being addressed.
Specification
To achieve per-device-per-application messaging but ensure users can choose to revoke access to a given device or application for subsequent messages, this proposal adopts the use of the asynchronous variant of the Triple-Ratchet Protocol. To ensure this specification is self-contained, we describe it fully within Appendix A.
To support Direct Casts, we will need to add two new stores to support their corresponding message types: Keys and DirectCasts.
Key Store
The key store would introduce the new type KeyBody to the MessageData:
and add the new MessageTypes:
This would need the addition of a CRDT 2P-Set for Keys:
Add a Key CRDT which validates and accepts KeyAdd and KeyRemove messages. The CRDT also ensures that the message
m
passes these validations:m.signer
must be a valid Signer in the add-set of the Signer CRDT formessage.fid
A conflict occurs if two messages have the same values for
m.data.fid
,m.data.body.device_id
andm.data.body.type
. Conflicts are resolved with the following rules:m.data.timestamp
is distinct, discard the message with the lower timestamp.m.data.timestamp
is identical andm.data.type
is distinct, discard the KeyAdd message.m.data.timestamp
andm.data.type
are identical, discard the message with the lowest lexicographical order.The Key CRDT has a per-user size limit of 200.
Cutoff
Keys explicitly support cutoffs, where the version of the message is bounded to a cutoff date – one year from the release of Keys, the messages will automatically be no longer supported and purged, unless subsequently extended in a new FIP.
KeyBody Message
PublicKey Message
Signature Message
The structure of Key messages implies a hierarchical POV: the signature on a key is expected to be in relation to a longer-living key – an identity key is signed by the user’s signer or wallet key, a signed pre-key is signed by the device’s identity key. For one time pre-keys, the key’s signature is omitted, as the message itself is signed by the user’s signer for authenticity.
Validations
A Key message
m
must pass the following validations:m.signature_scheme
must beSIGNATURE_SCHEME_ED25519
m.data.body
must beKeyBody
m.data.body.type
must be ≤ 4 bytesm.data.body.public_key
must be a validPublicKey
m.data.body.signature
, if present, must be a validSignature
A PublicKey
p
must pass the following validations:p.type
must be a validPublicKeyType
and not set toPUBLIC_KEY_TYPE_UNSPECIFIED
p.public_key_bytes
must be in a compressed representation if available for the key typep.public_key_proof
must be a succinct proof of knowledge of the corresponding private key (for EC keys, see ZKPoK-DL in Appendix A)A Signature
s
must pass the following validations:s.type
must be a validSignatureType
and not set toSIGNATURE_TYPE_UNSPECIFIED
s.signature
must be a valid DER encoded signatures.signer
must be in a compressed representation if available for the key typeA KeyAdd message
m
is valid only if it passes these validations:m.data.type
must beMESSAGE_TYPE_KEY_ADD
A KeyRemove in a message
m
is valid only if it passes these validations:m.data.type
must beMESSAGE_TYPE_KEY_REMOVE
Key APIs
Hubs will be extended to support the following APIs:
Key types expected for use in Direct Casts are:
"idk"
- Identity Key"spk"
- Signed Pre-Key"ibk"
- Inbox KeyDirectCast Store
Direct Casts are a bit unusual compared to other types – they can be treated as message types, but they cannot be validated like other message types. We add the DirectCastRequestBody and DirectCastBody types to the MessageData body:
and add the new MessageTypes:
This would need the addition of CRDT Grow-only Sets for Direct Cast Requests and Direct Casts:
Add a Direct Cast Request CRDT which validates and accepts DirectCastRequestAdd and DirectCastRequestRemove messages. The CRDT also ensures that the message
m
passes these validations:m.signer
must be empty, deeper data validation will ensure integrity.These requests are conflict free implicitly – a conflicting message cannot be produced.
DirectCastRequest Message
DirectCast Message
The structure of DirectCast message types are fully enveloped, using ring signatures to ensure the message is a viable candidate for delivery. Validation cannot be fully ensured by hubs, but hubs can at least guarantee the sender has the right to send to the recipient.
Validation
A Direct Cast Request differs from a Direct Cast Message both in intention and in a few subtler details.
m.signature_scheme
must beSIGNATURE_SCHEME_NONE
m.data.body
must match the requisite type, beingDirectCastRequestBody
for requests,DirectCastBody
for ongoing conversationsm.data.body.signature.type
must beSIGNATURE_TYPE_X25519_RING
and the associated signature must be comprised of some or all inbox keys that are mutual follow links of the recipient plus the recipient (depending on the degree of privacy the user wishes to obtain) at the time of the request, or the recipient plus a sufficiently random selection of other users if the recipient does not have mutual following required to receive messages. For ongoing conversations, the chosen inbox keys should not be changed unless the recipient has changed their inbox key, although for enhanced privacy this may not be an ideal time to immediately .m.data.fid
on the message should be0
, but for DirectCastRequests should be set to the recipient.The DirectCastRequest CRDT has a per-user size limit of 200. The DirectCast CRDT size limit is harder to ascertain the right figure – DCs are inboxed unique to each conversation and have no traceability to a given user from hub-visible information. Ideally, they should be limited per conversation to a reasonable number of messages to be received, as they should be purged by the client after receipt.
Direct Cast APIs
These RPCs serve to provide a very simple POV of how encrypted messaging should work – the clients necessarily must be more involved to interpret and process messages accordingly. Because this opinionated stance is required to ensure security and secrecy properties, the required behaviors on the client side must also be elaborated to make this document whole.
Client Behaviors
The Javascript client library (and additional libraries in other languages) will need to be updated to handle the full scope of both the Double and Triple Ratchet protocols, but also to properly utilize the RPCs to perform the requisite behaviors that ensure full end-to-end privacy.
Initiating a Conversation
A client must prepare a DirectCastRequest message which contains the following contents:
Preparing an initialization message requires initiating a Triple or Double-Ratchet as a sender, taking the resulting RatchetMessage, relevant Device key bundles, and then encrypting the entire envelope with a symmetric key derived from Diffie-Hellman with the recipient’s inbox key and an ephemeral key. Taking the ciphertext of the envelope, the sender produces a ring signature over the ciphertext using a collection of public keys that would serve to most significantly decrease the identifiability of the sender – For the initial version, maintaining compatibility with existing behaviors where only mutual follows can send messages, this would be maximally derived from the set of mutual follows for the recipient, using their inbox keys. If a sender does not care about a brief period of time in which hubs are aware they are initiating conversation with the recipient (although still blind to the contents), then the ring signature can simply be of size 2, only containing the sender and recipient’s inbox key in the set.
Continuing the Conversation
The ongoing conversation continues at the DirectCast message level, where the sender's declared topic may receive messages from the recipient. The recipient's first message should contain a corollary topic id for the initiating sender to reply on. Messages are simply following either Triple or Double-Ratchet. Critically, it is important that a maximum message size be defined, and messages be padded to that size prior to encryption to ensure cryptanalysis on message contents is infeasible. The ongoing recipients that did not initiate should pick a random set of public inbox keys including their own to form their ring signatures.
Rationale
At a high level, Warpcast currently uses an X25519-based implementation of Double-Ratchet without header encryption, where the identity key is signed by the Farcaster wallet key, the signed pre-key remains signed by the identity key. This configuration, while imperfect, fits well with the experimental nature of Pareto-optimal feature support – and with the nature of React Native's native bridge, such experiment was well warranted to shake out any potential design alterations required to compensate for RN's quirks.
Managing these keys utilizes a simple key store, which is write-access controlled to the user, but read-accessible by all. Messages utilize a simple store with seven day expiration, and are weakly-correlative with the participants, an ability which is in line with the existing unencrypted header data.
Moving this to a decentralized context requires some important lifts:
Further elaboration of these problems follows:
General Problems for E2EE Messengers
To contextualize considerations, we will limit discussion of the problem space to E2EE messengers which offer “Signal-level encryption”.
Unwinding Security Guarantees
Many messengers which claim to offer “Signal-level encryption” make compromises to the original properties that made Signal a world-class direct messaging tool. In particular, these properties are:
The latter two are trivial to implement in a centralized messenger, or a messenger with basic E2EE properties, however the Double-Ratchet Protocol was particularly clever because it provided those two properties safely in the context of having the other three. Here’s why that’s hard:
These first two properties, very succinctly, require frequent key rotation, fresh cryptographic material introduction combined with existing state, and most critically, are deleted after use. Where such constructions frequently get unwound is through a compulsory feature of “portability” (although not impossible to provide without breaking these guarantees).
Central Problem
Any kind of single-key-based encryption scheme for this information unwinds these security guarantees, and if a user cannot opt out of this, the application of the Double-Ratchet protocol or similar approaches is effectively security theatre.
Specific Problems in Decentralization
Degrees of User Unmasking
Correlation attacks are extremely easy to perform on decentralized (whether federated or P2P) networks with public data even when the user-identifying data is encrypted – Typical “non-attributable” encrypted messenger inboxing follows the notion of having a dedicated “rendezvous point” inbox per user, where a sender obtains key info about a recipient, sends a encrypted message to that rendezvous point naming a new non-correlated (frequently random, but otherwise un-derivable) topic name to continue the conversation (bidirectional topics), optionally with a second phase from the recipient to confirm or create a second topic for unidirectional topic queues.
Unmasking Bidirectional Topic Users
If both parties are connecting to the same node, this is a blatantly trivial exercise – associate the IP address of the senders to the topic and other activities (publishing new keys, updating rendezvous topic, etc.). As an intermediary node, this is more difficult, but ultimately obtainable – finding associated events published from the nodes tied to a given user will eventually unmask. If the messaging is not permissioned, this can be made trivial with large broadcast sweeps of new topic requests – the timing of a client fetching messages alongside handling the topic request would be enough.
Unmasking Unidirectional Topic Users
Some decentralized messaging protocols take the opinion that wholly separate topics are sufficient for preventing correlation, but while this helps, this is unfortunately insufficient, as again with same-node considerations, the effort is trivial, but for intermediary nodes simply sniffing messages, the same timing analysis applies.
Additional Pattern Analysis
Assuming the above problems are solved on the timing analysis, a tangential discovery mechanism persists – many “Signal-like” messaging clients encrypt messages using a variant of AES, which is a block cipher that operates on 128-bit blocks of data, resulting in ciphertext sizes always divisible by 16 bytes (or in the case of some variants, also including a tag which is frequently appended to the ciphertext). Due to non-uniform sizes of messages, the ciphertext size can be used as a fingerprint. Additionally, the outboxes must contain valid ring signatures that do not identify the participants.
Central Problem
Stated plainly, a solution to this problem should prevent analysis at the same-node level from successfully unmasking users and correlating conversations.
Degrees of Repudiability
Repudiability is the property in which a user has plausible deniability to the authenticity of a message at any time beyond when it was initially sent – that is, when in conversation with another user, the recipient can be assured the message came from the sender, but after the fact, it is impossible to prove the message originated from the sender. This condition is one of the essential features of Signal, and many messaging applications/protocols which profess to be Signal-like fail to uphold this consideration, sometimes intentionally, frequently unintentionally.
Non-Ephemerality
In a distributed systems context, it can be immediately obvious how ephemerality would be harmful to deliverability, and create a poor user experience, but by taking the naive approach by not supporting ephemerality, repudiability becomes infeasible – a rogue node could simply watch and timestamp all traffic, which would assert provenance of the traffic, eliminating the ability to plausibly deny the origination of a message.
Distribution as Storage
Many networks attempt to solve for long-term storage of user inboxes by simply retaining the messages in a given topic as a set. This is immediately problematic because it implicitly destroys repudiability. To prevent this, moving a message into secondary storage on a given network is often considered – this helps, provided said storage is accessible only to the end user and not by intermediaries, which unfortunately is the case with many implementations.
Central Problem
Message queues must be effectively unidirectional, but also guaranteed to be ephemeral, with any associated long-term storage fulfilled either off network or on network in a wholly oblivious way.
Appendix A: Asynchronous Triple-Ratchet Protocol
The asynchronous variant of the Triple-Ratchet protocol is an extension to the Double-Ratchet protocol, utilizing an asynchronous DKG ratchet to provide a “room key” as the counterparty receiver key plugged into the Double-Ratchet algorithm’s Diffie-Hellman ratchet. The DKG ratchet remains secure in the malicious majority model, provided there exists a trusted PKI authority, which can be assumed via the signing key hierarchy of authorized keys issued from an Ethereum wallet holder.
Terminology
Preliminary Info
To fully understand how Triple-Ratchet is implemented, the individual pieces involved are described below.
ZKPoK-DL (Schnorr Proof)
In this variant of ZKPoK, we are computing a value relative to a threshold sharing of a logical secret key. Further description of that sharing in of itself is provided in the Secret Sharing and Distributed Key Generation sections, so for this section, simply assume the threshold sharing of the secret and public key ($sk_i$ , $PK_i$ ) to be already created.
Prove
Verify
Obtain all commitments, then the ZKPoK, random public points and the threshold sharings of the public key are released. With these values, verify:
Secret Sharing
Shamir’s Secret Sharing (SSS)
Shamir’s Secret Sharing is a technique for encoding a secret in the form of a constant of a randomly sampled polynomial, then distributing evaluations of that polynomial to each participant such that the threshold number of participants in the scheme could perform Lagrange interpolation to reproduce the constant:
Given a threshold of three participants, construct a$t-1$ degree polynomial, randomly sampling coefficients ($A, B$ ) from the finite field, setting the constant $C$ as the secret:
The dealer of these secret shares then evaluates the polynomial where$x$ is the identifier of the participant (notably, $x$ cannot equal zero as it would simply be handing the participant the secret, and likewise, $x$ cannot be the order of the group either, as $x\mod{q} = 0$ where $x = q$ .
The dealer distributes these samples to each participant, and when recombining, the participants calculate:
$C = f(0) = \sum_{j=0}^{t-1} y_j\prod_{\substack{(m=0) \ {m!=j}}}^{t-1} {x_m\over{x_m - x_j}}$
Because mathematics notation is laboriously terse, here is the algorithm (assume operations are modulo$q$ , don’t use this code outright):
Feldman Verifiable Secret Sharing (FVSS)
Feldman Verifiable Secret Sharing is the same premise, but with a twist – how do we know that the shares produced by the dealer are correct? We could have all participants converge after distribution to verify any combination of threshold shares succeed in producing the same value, but oftentimes in secret sharing schemes the recipients of the shares are not in the same proximity at the time of construction, often for security purposes. When we need to trust that the secret sharing scheme was legitimate, we amend the sharing protocol to include a proof:
When computing the shares from the secret, take the secret as an exponent to the group (in the case of elliptic curve secrets, this means use the secret as a scalar to the generator$G$ : $s \cdot G = P$ , and do the same to all the shares: $s_0 \cdot G = P_0, s_1 \cdot G = P_1, ...$
Distribute the public values with the secrets to each participant, and distribute$P$ . All participants may do Lagrange interpolation of the polynomial with the public values of the shares (Shamir in the Exponent) like before, this time iterating through all participants. The resulting output, if the dealer did not cheat, will equal $P$ for all combinations of threshold participants. If the dealer did cheat, some or all of the combinations will not equal $P$ . Again, because this is somewhat terse, the algorithm is as follows:
Because the DLA applies to the class of elliptic curves in prime fields that we are working in, it is computationally infeasible to reconstruct$s$ given $P$ .
Distributed Key Generation (DKG)
Given the above information, we now have enough tools at our disposal to create a Distributed Key Generation algorithm. Each party will do the following in rounds of communication (over a secure channel, which we already have via Double-Ratchet 🙂):
Threshold Diffie-Hellman (Threshold DH)
Note that in the DKG’s second round, we used the generator$G$ to produce the public point sharings $PK_i$ . Assume DKG has been performed. Any threshold number of parties may now perform Distributed Diffie-Hellman:
Verifiable Encryption
There exist many verifiable encryption schemes. For the sake of our needs, consider a DKG scenario in which some parties are offline after round 3. We have verifiability outputs, but they are specifically encrypted in transmission to be retrieved by each party member. We could optionally elect to expose the values in the clear so that there is no way in which individual parties could be mislead by the outputs when they later come online, but this places trust on the hubs that the contents are not manipulated or if some amended protocol produces a jagged broadcast model where the outputs of step 2 and 3 are combined and stashed, a rogue hub could collude with an attacker to invalidate the security of the ZKPoK by allowing the attacker to amend their stashed values to conduct a rushing adversary attack.
Simple Encryption
Leveraging the DLA, we can encrypt the point value by multiplying the secret key share ($sk_i$ ) against the public Encryption Key of the recipient ($ECK_j$ ). The recipient can calculate the modular inverse of their secret Encryption Key ($seck_j^{-1}$ ) and multiply it against the encrypted value, producing the resulting public point share ($PK_i$ ). Augmenting the DKG process to transmit this output instead of $PK_i$ would allow the safe transmission of the public point share as a message header, however other parties would be unable to verify that this encryption was correct.
Simple Verifiable Encryption
Leveraging a simple verifiable encryption scheme a la cut-and-choose techniques, we can offer some variant to allow other parties to verify the encryption was done correctly, but further elaboration is less important for the scope of this document.
Triple-Ratchet Protocol
Problem
The problem with the above DKG approach is that the setup phase requires all parties to not only be online at time of key generation, but that they must also all perform each round before any may proceed to the next. While it is feasible in some scenarios to allow a small collection of users to conduct DKG to fit into the “counterparty” receiver side of the DH ratchet with the expectation that they will remain online (typically, this would be in a decentralized service in which at least a quorum of node operators leave their nodes always online), this is inviable for highly asynchronous communicators who may only have one party online at a time.
Relaxing the Rushing Adversary Threat?
This friction can be reduced by simultaneously reducing the security of the protocol through removal of the ZKPoK requirement, and thus participants merely maintain a cache of polynomials/evaluations ahead of time. Losing the security against Rushing Adversaries may not be problematic for a group chat (indeed, Double-Ratchet itself does not care as the DH ratchet itself is quite susceptible to this attack).
Utilizing an epoch-based approach
We could simply allow users to emit one-time use polynomial fragment bundles in step one into a logical set of queues, each encrypted to the individual peers, and upon an epoch, all parties dequeue the latest polynomial fragments, perform their summations, and distribute public points (omitting the ZKPoK process) so that local Shamir in the exponent can be performed to derive the new public key. This can now be our process as well for the “room key” of the Diffie-Hellman ratchet. This poses two problems, with solutions that break the original strength of Double-Ratchet:
A Grab-bag of Polynomials
Revisiting the initial setup phase of DKG, we already have the following available to us:
Let’s iterate over a theoretical asynchronous DKG protocol which allows for ratcheting of the room key in a way that any party can reasonably verify no bifurcations may occur, nor permit message forgeries (this requires consensus in a P2P model, which does not apply here), without losing the Forward Secrecy, Post-Compromise Secrecy or Repudiation/Deniability properties of Double-Ratchet.
Asynchronous Verifiable DKG Ratchet
Polynomial Verifiable Sharing (PVS)
Recall that in PVSS, we had performed Lagrange interpolation over the set of terms raised to the exponent of the generator (Shamir-in-the-exponent). Performing the same for the fragments produced by any given party can provide a meaningful verifiable sharing mechanism, extending from the simple EC additive homomorphism.
Given a series of fragment ($frag_{i,j}$ ) scalars I wish to send to each party, I can calculate the following:
These fragments are distributed to all parties, who confirm:
Asynchronous DKG Ratchet
Now that we have a means to ensure no party can produce invalid fragments, we can adopt a new ratcheting scheme for DKG. Each party will (upon their need to ratchet):
Diffie-Hellman Ratchet
The Diffie-Hellman Ratchet thus remains roughly the same as before, with the Alice role as any sender, the ADKG ratchet as the receiver, and agreement becomes two phase:
KDF Ratchet
The KDF Ratchet also remains roughly the same, except the initial session key (which becomes the initial root key) is derived from the following:
0x00
bytes (we can go with 320x01
bytes, it just needs to be unique so it is distinct from a Double-Ratchet session derivation). TheWhen admitting a new party into the conversation, the current root key must be shared along with all other public values.
The receiver chain key ratchet is the same as before, except a unique receiver ratchet must be maintained for every active party, as their sending chain key will differ. For every DKG rotation, this receiver chain resets, so it is not a high cost.
Security Properties
Forward Secrecy
Because the secrets from previous ratchets are discarded as in Double-Ratchet, we retain the Forward Secrecy property in that previous messages are not decryptable by adversaries who obtain access to the current keys.
Post-Compromise Secrecy
In the event an attacker gains access to the current keys, new key information added in subsequent ratchet rounds enables post-compromise secrecy when the attacker has lost access to key material.
Repudiation
Because the sender’s chain key becomes the sender-oriented recipient chain key for decryption, once the round of messages has concluded, it is no longer distinguishable that the sender had sent a given message, or if the previous messages were forged, granting repudiation.
Appendix B: Breaking the Rogue Hub Link
Given strong analytics, two conversations ids in rapid exchange could be linked, leading to unmasking of senders. To defeat this analysis, hubs can employ a mixnet – RPM in the single party prover scheme or multi-party scheme can sufficiently defeat this kind of analysis, the latter being more ideal as the former still requires a degree of trust on the connection to not be surveilled.
Appendix C: Existing Behaviors
The flow to construct the root key and sending chain key (and thus subsequent DH ratcheting to produce the next root key and alternating with the receiving chain key) is as follows:
This will be modified in the hub-based approach to utilize different domain separators when in double-ratchet vs triple-ratchet modes.
Additionally, the use of header encryption (and thus header key generation) and ring signatures for checking message validity are also present, which were not in the Warpcast API approach.
Appendix D: Additional Curves
Ideally, we will want to move to stronger curves in the near future as attacks on lower bit strength curves become more tenable. For greatest compatibility with existing features/behaviors, Curve448 is a strong drop in replacement, and would only require the additional PublicKeyTypes
and SignatureTypes:
Beta Was this translation helpful? Give feedback.
All reactions