Wormhole Seeds #2

Open
piegamesde opened this issue Mar 23, 2021 · 42 comments · May be fixed by #17

Comments

@piegamesde
Member

Migrated from magic-wormhole/magic-wormhole#77, as I think it's a good idea to discuss features that are not Python-exclusive here.


Basically, both sides of a Wormhole connection would derive a 128-bit mailbox-id and a 256-bit wormhole secret (from the PAKE session key), and store it for later use as a "Seed". This seed is used exactly like a normal wormhole code, except that the mailbox ID is used directly (instead of being treated as a "nameplate" which then points to a mailbox), and the Seed can be reused.

(we don't strictly need to use PAKE each time, but it happens to provide forward-secrecy, and we already have all the code in place.. it'd actually be more work to use a simple non-PAKE KDF).

(also, the wormhole secret could be considerably shorter, and still be safe, but there's no harm in making it full-sized)
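A minimal sketch of the derivation described above, assuming HKDF-SHA256 as the KDF. The choice of HKDF, the label strings, and the function names are illustrative assumptions; the proposal does not specify them.

```python
import hashlib
import hmac


def hkdf_sha256(key: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and an all-zero salt."""
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_seed(session_key: bytes) -> dict:
    # The labels below are hypothetical; a real spec would pin exact strings.
    return {
        "mailbox_id": hkdf_sha256(session_key, b"seed/mailbox-id", 16).hex(),  # 128 bits
        "secret": hkdf_sha256(session_key, b"seed/secret", 32).hex(),  # 256 bits
    }
```

Both sides hold the same PAKE session key, so both arrive at the same mailbox ID and secret without sending anything extra over the wire.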

@piegamesde
Member Author

Some thoughts about the UX:

  • Every client generates a UUID and also a human-readable username
  • Connections between clients that support Wormhole Seeds will exchange their IDs and other relevant information. It is stored in a local database.
  • After the transfer, show a hint: "to reconnect with this person again, you can also …"
  • The mapping display name -> UUID must be unique within the database on each machine. There needs to be some collision handling.
  • Maybe some timestamp expiry of entries to keep the database from growing forever?
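The local database from the list above could look roughly like this. It is a sketch under assumptions: the class and field names, the "name (2)" suffix rule for collisions, and the one-year expiry are all made up for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    peer_id: str        # the peer's UUID in this proposal
    display_name: str   # locally unique name shown to the user
    last_used: float


@dataclass
class ContactBook:
    max_age_seconds: float = 365 * 24 * 3600  # hypothetical expiry policy
    by_name: dict = field(default_factory=dict)

    def add(self, peer_id: str, suggested_name: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Same peer seen again: refresh the timestamp, keep the existing name.
        for name, contact in self.by_name.items():
            if contact.peer_id == peer_id:
                contact.last_used = now
                return name
        # Collision handling: the name is taken by another peer, so add a suffix.
        name, n = suggested_name, 2
        while name in self.by_name:
            name = f"{suggested_name} ({n})"
            n += 1
        self.by_name[name] = Contact(peer_id, name, now)
        return name

    def expire(self, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self.by_name = {
            name: c for name, c in self.by_name.items()
            if now - c.last_used <= self.max_age_seconds
        }
```

Keying the store by display name makes the "unique within the database" rule structural, and `expire` keeps the database from growing without bound.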

To be fair, this is mostly independent from the protocol, but some details – like exchanging the IDs – are. An alternative would be to make pairings explicit using a wormhole pair command. In that case, the IDs would be user-generated (every user names all their communication partners).

@meejah
Member

meejah commented Mar 26, 2021

Thanks for moving / continuing this discussion! Some thoughts after reading above:

  • it would be good to specify in the protocol a method to allow extra information to flow each way, at first perhaps just a "petname hint" to give a good default to the other side (e.g. mine might send {"preferred_petname": "meejah"}).
  • likely makes sense to keep a suggested list of "relevant information" that implementations should store locally so that APIs etc can stay more in-sync with each other (e.g. mailbox_id, seed, petname, ...)
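As a rough illustration of those two points, a petname hint and a locally stored record might look like the following. Only the `preferred_petname` key and the `mailbox_id`, `seed`, `petname` fields come from the comment above; the JSON framing, helper names, and fallback string are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass
class SeedRecord:
    mailbox_id: str
    seed: str      # hex-encoded wormhole secret
    petname: str   # chosen locally; the peer's hint is only a default


def encode_hint(preferred_petname: str) -> bytes:
    return json.dumps({"preferred_petname": preferred_petname}).encode("utf-8")


def default_petname(hint_message: bytes, fallback: str = "unnamed peer") -> str:
    """Return the peer's suggested name, or a fallback if the hint is unusable."""
    try:
        hint = json.loads(hint_message.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return fallback
    name = hint.get("preferred_petname") if isinstance(hint, dict) else None
    return name if isinstance(name, str) and name.strip() else fallback
```

The hint is treated as untrusted input: it only pre-fills the name, and the local user still decides what is stored.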

@warner
Collaborator

warner commented Mar 27, 2021

One more comment to migrate from the old ticket:

The API I'm thinking of would be like:

w = wormhole.create(stuff)
w.set_code(code)
w.connect() # ...
seed = w.get_seed()

# later
w = wormhole.from_seed(seed)
w.connect()

@piegamesde
Member Author

Working on this again, and some design questions coming up:

  • Should the clients store a nameplate or a mailbox to find each other?
  • If the latter, how does this work client-server-protocol wise? Can the client simply open any mailbox they want without having claimed it beforehand or should there be a new client->server message for this?
  • Generally, should this be part of the core Wormhole protocol, or should this be an application-layer feature (i.e. only for file transfer, and any other application chooses whether they want to have this or not)?

By the way I'm thinking about renaming this to "Wormhole resumption" or something because it's more intuitive.

@meejah
Member

meejah commented Aug 3, 2021

Generally, should this be part of the core Wormhole protocol, or should this be an application-layer feature (i.e. only for file transfer, and any other application chooses whether they want to have this or not)?

It should be a general feature.
"Seeds" is what gives us a long-term way to connect with others and seems immediately useful for use-cases besides file-transfer.

Should the clients store a nameplate or a mailbox to find each other?

99% sure "mailbox" is correct here (but I haven't had time to delve all the way in again).

@meejah
Member

meejah commented Aug 3, 2021

Re: naming, I do kind of like more whimsical names sometimes because it helps reduce preconceived notions.

A "Wormhole Seed" requires you to look up at least a short definition; "wormhole resumption" lets you immediately decide your own expectation for what "resume" means to you...which can be a bad thing if those don't end up lining up with reality.

"Grow a new wormhole from a Seed" might be more accurate than "resume a previous Wormhole" anyway because IIUC the key-material will be different on each wormhole (more like "grow a new one, that's really similar"?)

@warner
Collaborator

warner commented Aug 3, 2021

Clients should store a mailbox, not a nameplate. We only use nameplates because they're short and easy to dictate/transcribe. As such, they're short lived, and mutable (the 2- I use today is pointing to a different mailbox than the 2- that you use tomorrow). Since the clients are able to remember a full-length identifier, and full-length identifiers are not scarce, they don't need a layer of indirection.

And yeah, clients can open any mailbox they want. This is exactly what they do when re-connecting (after they've released the nameplate), such as when their network connection to the mailbox server drops and comes back up, or when their state is saved to disk and the application is restarted. They forget about the nameplate entirely once they've seen evidence that the peer has started using the mailbox.
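To make the distinction concrete, here is a toy in-memory model of the two identifiers. It is not the real mailbox server or its wire protocol; the class and method names are invented for illustration only.

```python
class ToyMailboxServer:
    """In-memory toy, not the real server: shows nameplate indirection only."""

    def __init__(self):
        self.nameplates = {}  # short id -> mailbox id (mutable, short-lived)
        self.mailboxes = {}   # mailbox id -> list of messages
        self._counter = 0

    def claim_nameplate(self, nameplate: str) -> str:
        # A free nameplate gets pointed at a fresh mailbox.
        if nameplate not in self.nameplates:
            self._counter += 1
            self.nameplates[nameplate] = f"mailbox-{self._counter}"
        return self.nameplates[nameplate]

    def release_nameplate(self, nameplate: str) -> None:
        self.nameplates.pop(nameplate, None)

    def open_mailbox(self, mailbox_id: str) -> list:
        # Any mailbox id can be opened directly; no prior claim is needed.
        return self.mailboxes.setdefault(mailbox_id, [])
```

In this toy, claiming nameplate "2" today and again after a release yields two different mailboxes, while a remembered mailbox ID keeps pointing at the same place.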

Agreed that this should be a general feature.

+1 on "seed". Apart from the utility as non-preconceived jargon, it also fits the overall whimsical style of Magic Wormhole. Call it part of our "brand" :).

@piegamesde
Member Author

Thank you for the input. I noticed that handling identifiers is more tricky than I initially thought: since anybody can take anyone's UUID, there is still some collision handling required. I'm evaluating a public/private key pair approach to identify and authenticate wormhole devices. I'm also evaluating the usage of the shared mailbox name or something like that as an identifier.

Generally, making connection pairing explicit would make things a lot easier. However, I really like the concept of doing things automatically, because otherwise a lot of people are going to miss this feature.

@meejah
Member

meejah commented Aug 3, 2021

The "unique thing" is the mailbox-id. They're long enough that there are no collisions.

I thought you'd proposed UUIDs for local serialization; what other purpose might they serve? (Anyway, a long-enough UUID can also be counted on to be unique).

@piegamesde
Member Author

Okay, this is not the uniqueness-problem I am facing right now. The issue is that users need to be able to identify their peers in a reliable way, but we don't have any central naming instance. See the following threat example:

  • A, B and E have exchanged files with each other.
  • E exchanges a file with A, but takes the identity of B
  • A uses send-to B, but actually E will receive the file.

I know it is a rather weird threat model, but it shows that for seeds you need to trust peers more than I'd like to. It would be fine with manual peering, but I don't think this is acceptable for an automatic mechanism (some people send things to randoms on the internet they don't necessarily want to trust).

@meejah
Member

meejah commented Aug 3, 2021

Okay, I agree that there can be issues making sure that users are able to communicate to the software which connection they mean. I think this is up to the frontends though -- that is, a UI/UX issue. I'm not really sure what you mean by "takes the identity of B" above (but I assume some social manipulation to make the other user save that wormhole as a different name...?)

For example, a GUI might choose to display a list with names that the user themselves had previously given those connections.

In the Python client, I'd probably choose something similar: a user-assigned petname for that connection. I would also have this be "opt-in" only and not automatically remember any connections at all unless told. That is, something like --save-user alice off the top of my head.

Whether internally those are serialized in a database with UUIDs or use some other mechanism: don't care. We can certainly give guidance as to the desired UX but ultimately this is all up to the UI designers. We could also give some guidance as to the dangers etc that might arise.

To relate this back to @warner 's python pseudo-code above: what the application program does between seed = ... and w = from_seed(...) is up to the program. For example, I would imagine a phone-based program might want to put the seed into the "Contacts" database.

I guess for more context here is how I imagine this functioning for a file-transfer application. If I'm making a GUI client then the basic version just always gives you a code to exchange. A more-featured version might give you the option to save the contact for future use ("[ ] - remember this connection") at which point it would demand a name (from the user on the local end) or some other mechanism for the user to identify that connection again (maybe an icon makes sense to some designers). Then if you initiate another transfer the user can choose between "create a new code" or "re-establish a previous connection"; in the latter case they have to identify it (e.g. picking the name or touching the icon they remember for the other device). Note that the intended recipient also has to do this for it to be successful. That is, Alice tells her software "do another transfer with Bob" and Bob tells his software "do another transfer with Alice" and because they've both remembered the same mailbox-id, it works.

@piegamesde
Member Author

I think I've further narrowed down the problem I mentioned earlier:

It starts with the question of what happens if the two sides get their key out of sync (because of the two armies problem, we must expect this might happen).

My suggested solution is to do a key update on every wormhole connection between both peers, even without seeds. Thus, a broken seed could simply be repaired by sending a file with a code; the seed would be self-healing in a way.

But that exact update mechanism opens an attack vector for impersonation I mentioned, if not defended by authenticating the connection peers. Adding a bit of public/private key crypto for this is not a big deal IMO, but let me know what you think.


An obvious alternative solution would be to always use the same keys for a seed. We would lose the forward secrecy, but it would also give us the possibility for multi-client support. (Is there a way to have multi-client support and forward secrecy?)

@meejah
Member

meejah commented Aug 26, 2021

Do you mean the Two Generals Problem?

Can you describe what you think might get out of sync (with Seeds)? In the original proposal I don't see any way to update them so I don't see how they can get out of sync...? (That is, my understanding is you'd use the same mailbox-id and the same 256-bit Seed on every new wormhole to the other device for whatever named connection it is -- the actual session key will be different each time though, since the Seed is used as SPAKE2 input not as a key directly).

(I can't even really picture what "multi-client support" means in the context of magic-wormhole -- that should probably be a separate discussion no matter what).

@piegamesde
Member Author

Do you mean the Two Generals Problem?

Yes, I remembered the name wrong somehow.

That is, my understanding is you'd use the same mailbox-id and the same 256-bit Seed on every new wormhole to the other device for whatever named connection it is -- the actual session key will be different each time though, since the Seed is used as SPAKE2 input not as a key directly

That's where our understanding differs and the most important question to resolve. I understand it that the used code must be derived from the previous session's key in order to provide forward secrecy. On the other hand, without that mechanism the two generals problem indeed disappears.

@meejah
Member

meejah commented Aug 26, 2021

That's where our understanding differs and the most important question to resolve. I understand it that the used code must be derived from the previous session's key in order to provide forward secrecy.

Each session is distinct. There's a random value involved in the SPAKE2 protocol, so using the static secret (the Seed) will still result in a new session key each time (i.e. on each connection). So, "breaking" one session won't give you access to any other session. Huge caveat: I am not a cryptographer. I believe this is what "forward secrecy" means here, though..?

(If you manage to steal the Seed itself, then of course you could impersonate one side of the connection and it's game over .. but I think that's outside the scope of what the protocol can provide?)

@piegamesde
Member Author

You are right, the way you understand it kind of gives us forward secrecy.

If you manage to steal the Seed itself, then of course you could impersonate one side of the connection and it's game over .. but I think that's outside the scope of what the protocol can provide?

Hehe, nope. If we regularly do a key rotation by deriving a new code out of a session key (what I was talking about), we get something that is called future secrecy or Post-Compromise Security.

At the moment, my main problem while designing a possible protocol for seeds is that PAKE codes can be used for authentication, but not for identification, meaning that each peer must already know to whom to connect beforehand, and which code to use. For example, if we didn't have to pre-commit to one PAKE code beforehand, we could easily recover from a client failure during key rotation.

@meejah
Member

meejah commented Aug 30, 2021

meaning that each peer must already know to whom to connect beforehand, and which code to use

Isn't that kind of the whole point of Seeds? That you, the client, assign a name to some previously-established connection / other-device and so then when you re-connect in the future you don't need a human-typed code? (But you do need that same other device to also attempt a re-connect around the same time). That is, you're not connecting to like wormhole:meejah@meejah.ca; you're connecting to "the same thing that I called 'meejah's desktop' last time".

At some point, I believe I understood how the signal/double-ratchet worked .. but so are you suggesting putting some sort of addressing + PKI into this..?

With Seeds as-proposed in the original issue, the post-compromise solution is essentially to simply start over: wipe your entire DB of Seeds and re-establish them as-required. (Or just literally start again with a fresh install). This of course takes a fresh human-typed and out-of-band communicated code once again.

I'm not clear on what sort of key-rotation you're suggesting; maybe writing that down would help both of us? (Even if we did some kind of rotation on every connection, wouldn't the post-compromise adversary just have to do a successful connection to become 'the' other party, and make any legitimate attempt fail? -- I don't claim to fully understand that paper you link to above yet, though :) )

@piegamesde
Member Author

Isn't that kind of the whole point of Seeds?

Well, yes and no. I know to whom (as a person) to connect (as in, one UUID and a known mailbox address), but I might not know which PAKE code to use: two peers may diverge and disagree on which common session was their last.

but so are you suggesting putting some sort of addressing + PKI into this..?

I was considering it for a while, but found a better solution in the meantime.

With Seeds as-proposed in the original issue, the post-compromise solution is essentially to simply start over

That's actually a valid point: the seeds database does not contain mission-critical data.

I'm not clear on what sort of key-rotation you're suggesting; maybe writing that down would help both of us?

I think I mentioned it some time earlier, but it's probably lost in the backlog. With "key rotation" I mean "derive a new code from the current session for the next time, the same way the initial seed is generated"


I think I have resolved all my conceptual problems, I'll follow up with an alternative proposal for seeds that does key rotation soon™.

@meejah
Member

meejah commented Aug 30, 2021

Well, yes and no. I know to whom (as a person) to connect (as in, one UUID and a known mailbox address), but I might not know which PAKE code to use: two peers may diverge and disagree on which common session was their last.

Not sure what you mean here? The user doesn't have to know the PAKE code (you mean like 2-word-foo code, right?) after the connection is "remembered" as a Seed -- at that point, the PAKE code is a 256-bit number.

@piegamesde
Member Author

Proposal A (as before)

  • Initialization:
    • Both sides initially generate a shared secret from the session key.
    • They also exchange UUIDs to identify themselves, and also a human readable name
  • Clients need to store a mapping from a human readable name to the UUID+code.
  • Resumption is done by connecting to a mailbox generated from both UUIDs. PAKE is done with the shared secret as code

Proposal B

The key difference is that the shared secret is updated on every resumption (automatic code rotation). This (very probably) gives us post-compromise security and some other nice things at the cost of a bit of additional protocol complexity.

  • Clients store the current and the last shared key with that peer. To differentiate both, they store some session ID alongside. The session ID could be generated from both peer's sides for example.
  • Resumption is mostly done as above. A pre_pake phase is added where the clients exchange the session IDs they know. From this, they'll be able to pick the correct code for the pake phase.
    • Both clients are guaranteed to have one session in common, even if one failed during key rotation. We could also increase the history to account for repeated failure.
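The pre_pake selection step in Proposal B could be sketched as below. This assumes each side announces a short list of session IDs ordered oldest to newest; the function name and the ID format are hypothetical.

```python
from typing import Optional


def pick_common_session(mine: list, theirs: list) -> Optional[str]:
    """Return the newest session id (by my ordering) that both peers know.

    `mine` is ordered oldest to newest; `theirs` is the peer's announced list.
    """
    their_ids = set(theirs)
    for session_id in reversed(mine):
        if session_id in their_ids:
            return session_id
    return None  # no overlap: fall back to a fresh human-exchanged code


# Alice finished the last rotation (s2 -> s3); Bob crashed before storing s3.
alice_history = ["s2", "s3"]
bob_history = ["s1", "s2"]
chosen = pick_common_session(alice_history, bob_history)
```

Here both sides settle on `s2`, whose stored shared key is then used as the code for the pake phase. With a history of two entries, a single failed rotation is recoverable; two failures in a row leave no overlap.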

@meejah
Member

meejah commented Aug 30, 2021

Small clarification: there are no UUIDs in the original, just a mailbox ID (presumably the same one they used for the initial connection) + ('full-strength') PAKE code.

@piegamesde
Member Author

You are right, they aren't directly needed for Proposal A, at least from a cryptographic point of view. Any other fixed mailbox would do as well. I still kept them because they made the mapping from names to devices a lot more manageable. If we decide in favor of proposal A, I will re-evaluate whether they are still useful or not.

@piegamesde
Member Author

piegamesde commented Sep 27, 2021

Okay, so I'd like to be able to tell for a "normal" connection whether or not a seed with that peer has already been established. This is not required, but it would allow applications to give significantly better messages. (Otherwise one would get a generic "this is how to persist a seed" message regardless of whether we already have one with that person or not.) However, this is tricky to do correctly:

We at least need stable client identifiers (UUIDs), but those are problematic because they can be spoofed, which would again prompt me to use public-private key pairs as identifiers. Alternatively, we could always send a list of known seeds, but this would require session identifiers (and I'd have to evaluate the possibilities of metadata tracking first). With either of these, we get a similar complexity as in proposal B even though we do no key rotation.

Edit: After further consideration, I'm kind of giving up on this one for now. I'm still open to discuss the subject if someone wants to tackle it, so feel free. This is still a feature worth having. I may come back to this with some new approaches in the future, once I can lift the requirement of not having to change the server protocol.

@piegamesde piegamesde added the on-hold This is blocked by something else or abandoned label Oct 8, 2021
@meejah
Member

meejah commented Dec 13, 2021

So a use-case for such a feature would be two users who connect with a "normal" code, giving the software an opportunity to say "you already have a connection to this user, called 'Alice'" or similar?

I'd think of Seeds as "device identifiers/keys" or so. It's hard to imagine how you'd identify an entirely new device (that might still conceptually be "Alice" to the user .. e.g. a new phone or laptop) as a user without something akin to keypairs or some other "across device" identifier (as you say above).

Since in exchange for this, you'd essentially be giving up some amount of privacy, I'm thinking that layering this on top of Seeds as an additional (optional) feature is good. That is, you have something that contains a number of Seeds and probably some other unique thing (public/private key makes the most sense to me too) so that you can tie a bunch of things together as "the devices Alice uses", or so. Anyway, might make it easier to reason about if we have two separate things, one building on the other. There'd also have to be some amount of negotiation (or so?) because I'd ideally want the default to be "as anonymous as currently" (that is, not broadcasting your identity-key on every connection .. although I guess you could just make a new, random one and throw it away if you wanted "whatever anonymity there is currently"). I guess the first device could "opt in" by sending their public-key and the second device opts-in by responding with theirs (possibly "only-if they know about the other side's key already").

Separately, it might be also worth thinking about what the UX looks like for each of these. I guess one approximation could be that if the above dance happens, and both sides know about the other side they can ask the human, "new device X detected for user Alice; add it to their {Pod,Garden,Packet,}..?". (Name TBD obviously, but thinking "something that contains Seeds").

Another aspect to consider is if you want separate identity-keys for each pair. I think "probably", because otherwise two users Bob and Carol could determine that the Alice they both know is the same Alice. (This could also be a feature, maybe). It could probably be optional on the client-side; that is any implementation could decide to have "key-pair per Pod" or could decide "consistent key-pair over all Pods".

(Aside: I haven't had any time to put into this project recently, so sorry that I've not been very responsive .. but that should change in the new year)

@piegamesde piegamesde removed the on-hold This is blocked by something else or abandoned label Mar 11, 2022
@piegamesde
Member Author

I think I have solved all my previous problems with the following solution:

Proposal C

Initialization on normal connections:

  • Both sides derive the seed from the shared session key and store it. They also exchange human readable names, but everything besides that is up to the client implementation.
  • This step is skipped when

Recognize known seeds on normal connections:

  • Both sides share a hashed version of all their known seeds. If the intersection between both sides is not empty, they have one in common. This information may then be displayed to the user
  • This exchange is effectively a private set intersection protocol, so that the other side gains no information other than the number of seeds and which seeds both have in common.
  • If previously only one side had stored the seed after a connection, both sides won't recognize each other. They will store fresh seeds again. This also doubles as failure mode recovery if a seed ever "breaks".
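One simple way to approximate the recognition step is to tag each known seed with a MAC keyed by the current session key and compare tags. This is an assumption for illustration, not a full private set intersection protocol and not necessarily what the prototype does; the function names are hypothetical.

```python
import hashlib
import hmac


def seed_tags(session_key: bytes, seeds: list) -> dict:
    """Map tag -> seed, where tag = HMAC-SHA256(session_key, seed)."""
    return {hmac.new(session_key, s, hashlib.sha256).digest(): s for s in seeds}


def common_seeds(session_key: bytes, my_seeds: list, their_tags: set) -> list:
    """Return my seeds whose tags also appear in the peer's announced tags."""
    mine = seed_tags(session_key, my_seeds)
    return [seed for tag, seed in mine.items() if tag in their_tags]


session_key = b"k" * 32
alice_seeds = [b"seed-alice-bob", b"seed-alice-carol"]
bob_seeds = [b"seed-alice-bob", b"seed-bob-dave"]
bob_tags = set(seed_tags(session_key, bob_seeds))   # what Bob would send
shared = common_seeds(session_key, alice_seeds, bob_tags)
```

Because the tags are keyed by the session key, they differ on every connection and cannot be matched across sessions by an observer. The peer still learns how many seeds are stored and which ones are shared, as noted in the list above.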

Resumption:

  • Both sides derive a mailbox and code from their seed
  • They will be able to meet if and only if they share the same seed. No other inconsistencies and failure modes are possible
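The resumption step might be sketched like this, assuming HMAC-SHA256 as the derivation function. The labels, output lengths, and function name are illustrative assumptions rather than part of the proposal.

```python
import hashlib
import hmac


def resumption_parameters(seed: bytes) -> tuple:
    """Derive (mailbox, code) from a stored seed."""
    # Labels are hypothetical; both peers only need to agree on them.
    mailbox = hmac.new(seed, b"resume/mailbox", hashlib.sha256).hexdigest()[:32]
    code = hmac.new(seed, b"resume/code", hashlib.sha256).hexdigest()
    return mailbox, code
```

Two peers holding the same seed compute the same mailbox and the same PAKE code independently; with different seeds they end up in different mailboxes and never meet.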

Advantages of this approach compared to my previous attempts:

  • No UUIDs and related issues. (Clients may use some IDs as an implementation detail for managing the database, but that's different)
  • Automatic key/seed rotation could be implemented without too much effort if we wanted to, although I don't have any plans for that at the moment.

Challenges that I am facing in my prototype implementation:

  • Due to the fixed mailbox, no two independent transfers using the same seed may happen at the same time
  • Unless the rendezvous server connection is encrypted, an eavesdropper may retrieve the used mailbox and block it, rendering the seed unusable
  • The mailbox easily gets in a state where the rendezvous server throws "crowded" errors. This may happen through the above, but furthermore any error between both peers may cause this. In that case, the mailbox – and thus the seed – will be unusable until the server frees the mailbox again, which takes a few minutes

Some of these could be improved by fixing the server behavior. Also, I'm again thinking about fixing a nameplate instead of a mailbox. This would at least alleviate the first issue. An alternative solution would be to add a flag that disables crowded checking for clients that opt-in. Clients that do that would need to always check the sides of their communication partners and also only use high entropy passwords that are secure against brute forcing.

@meejah
Member

meejah commented Mar 11, 2022

I'm worried we've got too many concerns competing here. What is the point of the "auto discovering intersections" between existing seeds? (That does leak privacy information: it tells both sides part of the social graph).

I've always considered Seeds to be just a way to quickly (that is, without humans exchanging codes) re-constitute a previous connection between two devices only if both devices opted-in to that in the first place. This opting-in part is implied by the initial API snippet @warner posted (get_seed() and from_seed()). In a CLI I would imagine this being expressed by some additional option to create a Seed in the first place (e.g. --create-seed "Some Name") or to re-use an existing Seed (e.g. --from-seed "Some Name" in place of --code). Only-if both sides asked for a Seed would one be created. (e.g. both the send and receive sides had a --create-seed option, to express that as a CLI might look).
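To illustrate the opt-in idea, here is a standalone argparse sketch. It is not the real magic-wormhole CLI: the program name, the `--create-seed` and `--from-seed` options, and the `should_save_seed` helper are hypothetical and only mirror the options floated above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wormhole-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("send", "receive"):
        cmd = sub.add_parser(name)
        # --from-seed stands in place of --code, so the two are exclusive.
        how = cmd.add_mutually_exclusive_group()
        how.add_argument("--code", help="human-exchanged wormhole code")
        how.add_argument("--from-seed", metavar="NAME", help="re-use a saved Seed")
        cmd.add_argument("--create-seed", metavar="NAME",
                         help="save this connection as a Seed if the peer also asks")
    return parser


def should_save_seed(mine: argparse.Namespace, peer_wants_seed: bool) -> bool:
    # A Seed is created only if both sides opted in.
    return mine.create_seed is not None and peer_wants_seed


args = build_parser().parse_args(["send", "--create-seed", "Some Name"])
```

With these arguments, `should_save_seed(args, True)` is true and `should_save_seed(args, False)` is false, which matches the "only if both sides asked" rule.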

It's certainly interesting to consider other use-cases and features, especially if they could be layered on top of Seeds; however, I think there are benefits from keeping these layers separate.

For example, any PKI-type things (identities, etc) should live separately because Magic Wormhole itself doesn't include any "accounts" or identities .. although of course applications could already choose to do identity-related things inside the connections. It is desirable to continue to be able to use magic-wormhole without identities at all.

Due to the fixed mailbox, no two independent transfers using the same seed may happen at the same time

I think this shortcoming would be solved once Dilation is available. Then a single "session" can spin up any number of sub-connections for whatever use-cases are required (e.g. file transfer in this example). Dilation currently is the best way to do multiple transfers (that is, one sub-connection per file transferred).

an eavesdropper may retrieve the used mailbox

This is kind of the same as saying that "the Seed" is sensitive, secret information -- and that the mailbox is part of "the Seed". Right?

The mailbox easily gets in a state where the rendezvous server throws "crowded" errors.

This sounds worth enumerating the cases where this happens and considering mitigations. Sounds like one case is where two devices try to start two separate sessions with the same Seed. Would this be mitigated if Dilation was implemented? (i.e. if the file-transfer applications didn't try to start a session for a new transfer, but instead created a new sub-connection?). I guess the more-general case would be a user starting two copies of an application and then re-constituting the same Seed?

@piegamesde
Member Author

I'm worried we've got too many concerns competing here. What is the point of the "auto discovering intersections" between existing seeds? (That does leak privacy information: it tells both sides part of the social graph).

This was motivated by the idea that I'd like to have an "after the fact" possibility to store seeds, i.e. without having a dedicated create-seed command. This would allow applications to go "hey, if you send files regularly to that person, you might want to store them as a contact?" after any transfer.

I can still strip that part out if you don't like it / don't want the complexity. Or we could have both ways, as they are not mutually exclusive.

I think this shortcoming would be solved once Dilation is available. Then a single "session" can spin up any number of sub-connections for whatever use-cases are required (e.g. file transfer in this example).

I have to disagree here. I was talking about independent connections, as in different processes of an application. In order to do what you propose, they'd have to coordinate so that only one of them manages the dilated session. This would effectively force all Wormhole clients to become "single instance" applications, and I'd really dislike that (both as a user and as developer).

This is kind of the same as saying that "the Seed" is sensitive, secret information -- and that the mailbox is part of "the Seed". Right?

The mailbox is derived from the seed, and thus (kind of) part of it. It does not need to be kept secret to ensure security and confidentiality, but for safety.

@piegamesde
Member Author

Oh, while trying to implement your create-seed suggestion, I now remember why I hadn't pursued that idea any further: If both sides use the same sub-command, how do they determine which becomes the "leader" in the protocol and which one enters the code?

@meejah
Member

meejah commented Mar 11, 2022

If both sides use the same sub-command, how do they determine which becomes the "leader" in the protocol and which one enters the code?

Are you asking, "what if both sides do wormhole send with some particular Seed?"

For a normal send/receive they'd determine this the same as before (e.g. "send" creates a code, "receive" enters it). I'm thinking of "make me a seed" as an option here. Semantically, it means "my human wants to use this connection easily in the future" (and if both sides have that set, they both save it as a Seed).

So if you're sending, and ask for a Seed-name, and you already have that Seed then you are the initiator (and the software wouldn't allocate and spit out a code .. instead saying "tell the other human to start their software with the corresponding Seed", approximately).

@meejah
Member

meejah commented Mar 11, 2022

The mailbox is derived from the seed, and thus (kind of) part of it. It does not need to be kept secret to ensure security and confidentiality, but for safety.

I see, you're saying that leaking the mailbox is fine for security/confidentiality but allows a third-party to disrupt particular communication-pairs. This makes sense as a bit of a different concern ("availability" basically). And the way to mitigate this is "use TLS", approximately .. although that still allows the server to censor particular mailboxes / communication-pairs.

@meejah
Member

meejah commented Mar 11, 2022

I have to disagree here. I was talking about independent connections, as in different processes of an application.

Okay, so the problem here is if the same application (hence same configuration / state) re-uses the same Seed at the same time as another instance (and possibly for completely different purposes, i.e. different post-connection protocols).

This feels like an application concern that implementations could mitigate .. but also it sure would be nice to not force every application to do "something" here.

@piegamesde
Member Author

Are you asking, "what if both sides do wormhole send with some particular Seed?"

No, you misread. The question is specifically about your proposed create-seed command to initially connect two peers together. In that case, how do we know who provides the code and who enters it?

although that still allows the server to censor particular mailboxes

Yes, but I operate under the assumption that the server wants to provide some service and if it didn't want to, then there would be other ways regardless.

@meejah
Member

meejah commented Mar 11, 2022

No, you misread. The question is specifically about your proposed create-seed command to initially connect two peers together. In that case, how do we know who provides the code and who enters it?

There's no create-seed command; I meant that to indicate an option to things that would normally either create or consume a code. Something like wormhole send --create-seed 'piegames' --text 'hello world' .. meaning "if there is already a Seed called piegames then use that, otherwise spit out a code". (Maybe the option needs a better name!) ... or I guess that could mean "show me an error if a piegames seed already exists" and you need to use --from-seed piegames to re-use an existing one (which, I guess, is what I meant in the first comment).

@piegamesde
Member Author

I dislike your proposed method for users to create seeds. Having to specify CLI flags (or some equivalent checkbox) beforehand is very likely to lead to a lot of "oops I forgot it again" situations. Also, the way you describe it renders the concept of clients exchanging names pointless.

@meejah
Member

meejah commented Mar 12, 2022

The particulars of the CLI were just meant to be an illustration; different UX could be imagined.
Maybe concentrating on the API itself is better (and imagine different UIs on top).

@piegamesde
Member Author

The UX a client can provide is partially limited by the feature set of the API. The only thing I need to change w.r.t. Warner's initial proposal is that w.get_seed() also returns whether the seed is new or was created in the past.

@meejah
Member

meejah commented Mar 15, 2022

The only thing I need to change w.r.t. Warner's initial proposal is that w.get_seed() also returns whether the seed is new or was created in the past.

To be clear, you mean "on top of the other suggested changes" -- the original leaves "store the Seed" up to the application code whereas the other proposals put that handling into the wormhole implementations (IIUC).

Since both sides need to agree to remember the Seed in order for it to work again in the future, any "after the fact" remembering would need some kind of further communication to the other side (i.e. "I am remembering this connection; will you?").

To summarize my understanding of the original proposal from application-code perspective, it needs to:

  • decide A: "open a fresh wormhole session" or B: "grow a session from an existing Seed"

  • A: "open a fresh wormhole session"

    • proceed as normal BUT somehow indicate "I support Seeds" as well as "I will save a Seed of this session" (perhaps that is just one mechanism -- i.e. saying "I will save a Seed of this session" implies you support Seeds).
    • (I don't have strong opinions on when the above happens in the protocol, but it needs to happen at some point before the session closes)
    • some indication to application code that the other side agreed to save a Seed (could be reflected in the return-value of get_seed() for example)
    • (That is, it's worthless for only one side to save their half of the Seed if the other side will just be discarding it)
    • application-level protocol(s) proceed as normal
  • B: "grow a session from an existing Seed"

    • application code identifies which Seed to deserialize and does that, starting a new session with that mailbox + code
    • the other side has to start up the corresponding Seed for this to work
    • application-level protocol(s) proceed as normal
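The A/B decision above could be sketched from the application's point of view roughly as follows. This is only an illustration under assumptions: `SeedBytes`, `SessionStart`, `SeedOutcome` and both functions are hypothetical names, and `SeedOutcome` stands in for whatever `get_seed()` might end up reporting.

```rust
/// Opaque serialized Seed; the application decides where and how to store it.
#[derive(Debug, Clone, PartialEq)]
struct SeedBytes(Vec<u8>);

/// The two ways an application can start a session.
#[derive(Debug, PartialEq)]
enum SessionStart {
    /// A: normal code-based session, optionally announcing "I will save a Seed".
    Fresh { will_save_seed: bool },
    /// B: re-grow a session from a previously stored Seed.
    FromSeed(SeedBytes),
}

/// Hypothetical shape of what get_seed() could report after a fresh session.
struct SeedOutcome {
    seed: SeedBytes,
    /// Whether the other side agreed to save its half as well.
    peer_will_save: bool,
}

/// Pick A or B: use a stored Seed if the application found one, else start fresh.
fn choose_start(stored: Option<SeedBytes>, want_seed: bool) -> SessionStart {
    match stored {
        Some(seed) => SessionStart::FromSeed(seed),
        None => SessionStart::Fresh { will_save_seed: want_seed },
    }
}

/// Return the Seed to persist, but only if both sides agreed to keep it;
/// saving one half alone is worthless.
fn seed_to_persist(start: &SessionStart, outcome: SeedOutcome) -> Option<SeedBytes> {
    match start {
        SessionStart::Fresh { will_save_seed: true } if outcome.peer_will_save => {
            Some(outcome.seed)
        }
        _ => None,
    }
}
```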

Some notes:

  • applications decide where / how to store the data that constitutes "a Seed" (i.e. an opaque binary string)
  • applications decide how users identify these (e.g. could be "petnames", could be a "history" list, could be ... )
  • it may be useful for applications to know what the "other side" called their corresponding Seed (if the other application even uses names)
  • we do not want long-term globally-unique identifiers like accounts or identities; a Seed is simply some way to re-grow a session w/o doing the nameplate/code/etc dance
  • the two sides still need to co-ordinate somehow out-of-band (just like they would with the code-phrase). "All" Seeds provide here is a shortcut where you no longer have to provide the code.

To take the original "whimsical" phrasing and story (I believe I am channeling @warner correctly here ;): you open a wormhole when two different people speak the same magic phrase at (roughly) the same time. Sometimes, the wormhole spits out a Seed spell. If so, you may sprout the same wormhole later when two people cast their Seed spell at (roughly) the same time. Note that in both cases the two ends of the wormhole still have to "do something" at approximately the same time.


The protocol will need to add two-way communication of:

  • support for Seeds ("I support Seeds")
  • whether this particular Seed will be saved ("I will save this as a Seed named 'foo' if you do as well")
  • if both sides support Seeds and both sides send an "I will save ..." message then they serialize that seed to their long-term state
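A minimal sketch of those two messages and the "both sides agreed" rule, assuming a simple in-memory tracker (the message and field names are made up for illustration and say nothing about the eventual wire format):

```rust
/// The two Seed-related messages, as hypothetical in-memory types.
#[derive(Debug, Clone, PartialEq)]
enum SeedMessage {
    /// "I support Seeds"
    SupportsSeeds,
    /// "I will save this as a Seed named <petname> if you do as well"
    WillSave { petname: String },
}

/// Tracks what each side has announced so far in this session.
#[derive(Debug, Default)]
struct SeedNegotiation {
    we_support: bool,
    peer_supports: bool,
    our_petname: Option<String>,
    peer_petname: Option<String>,
}

impl SeedNegotiation {
    /// Record a message we sent to the peer.
    fn sent(&mut self, msg: &SeedMessage) {
        match msg {
            SeedMessage::SupportsSeeds => self.we_support = true,
            SeedMessage::WillSave { petname } => self.our_petname = Some(petname.clone()),
        }
    }

    /// Record a message we received from the peer.
    fn received(&mut self, msg: &SeedMessage) {
        match msg {
            SeedMessage::SupportsSeeds => self.peer_supports = true,
            SeedMessage::WillSave { petname } => self.peer_petname = Some(petname.clone()),
        }
    }

    /// A GUI should only offer "save this session" when both sides support Seeds.
    fn can_offer_save(&self) -> bool {
        self.we_support && self.peer_supports
    }

    /// Serialize the Seed to long-term state only if both sides said "I will save".
    fn should_persist(&self) -> bool {
        self.can_offer_save() && self.our_petname.is_some() && self.peer_petname.is_some()
    }
}
```

Keeping the two messages separate lets `can_offer_save()` become true early, while `should_persist()` can flip at any later point before the session closes.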

I think that to fully support a GUI that wants to offer "save this session" after, for example, a file-transfer (or other application-level interaction), it makes sense to have two separate messages: one for "I support Seeds" and a different one for "I will save this particular Seed". (This would allow the application to offer to save the session only when both sides support Seeds -- and can support a "decide up front" or a "decide after application-level stuff" workflow, as desired).

I think it makes sense to note what the petname is on either side. Thus, for example, Alice can store her Seed as "workstuff for Bronwen" and inside that seed also store the fact that Bronwen called it "foo". This would be useful when they try to re-establish the wormhole -- instead of saying "transfer with code 1-foo-bar" Alice can say to Bronwen, "re-use the Seed you called 'foo'" whereas if Bronwen started the interaction she could say to Alice "re-use the Seed you called 'workstuff for Bronwen'". The applications should make it clear the name will be shared with the other side. (It may be useful to add an "ack"-type message confirming the save .. this would allow the humans to converge on the same petname if they wanted .. e.g. the application could say "the other side called their Seed 'foo'; do you want to use that?" and so the "ack" message would be "I saved this Seed and called it X" where X might be different from the first name proposed).
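To illustrate the petname bookkeeping, here is a small sketch of what a stored record might look like on Alice's side. `StoredSeed` and its methods are hypothetical, and the fallback wording for a missing peer name is an assumption of this sketch.

```rust
/// One saved Seed as the application might store it (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct StoredSeed {
    /// Opaque Seed bytes.
    seed: Vec<u8>,
    /// What we call this Seed locally.
    local_name: String,
    /// What the other side called its half, once we learn it from an "ack".
    peer_name: Option<String>,
}

impl StoredSeed {
    /// Handle an "I saved this Seed and called it X" ack from the peer.
    fn record_peer_ack(&mut self, peer_name: &str) {
        self.peer_name = Some(peer_name.to_string());
    }

    /// The sentence our human can relay out-of-band to re-establish the wormhole.
    fn reconnect_hint(&self) -> String {
        match &self.peer_name {
            Some(name) => format!("re-use the Seed you called '{}'", name),
            None => "re-use the Seed from our last session".to_string(),
        }
    }
}
```

With this, Alice stores her half as "workstuff for Bronwen", receives the ack naming "foo", and can then tell Bronwen to re-use the Seed she called 'foo'.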

I think this represents a minimal protocol. We could choose to add more complexity in the future. For example, the sides could agree to rotate the code (and/or mailbox) on some schedule (like "on demand" or "every time the session is re-grown") by adding another message for that. Another example could be "I am deleting my Seed" as one way to handle expiry.

@meejah
Member

meejah commented Mar 15, 2022

p.s. re-reading some of the discussion I think I see where our understandings diverged:

  • which code stores "the Seed" (me: the app, you: this library).
  • ...and is it established "once" (my understanding) or "on every session" (yours).

I believe both those are made explicit in my above comment.

@piegamesde
Member Author

piegamesde commented Mar 15, 2022

Actually, I think we agree on more than what you think – most of my previous points were dropped in my "Proposal C" iteration. To make things more clear, maybe have a look at the current prototype implementation (WIP). The public API looks like this:

  • For "normal" connections, provide a SeedAbility and get a SeedResult:
    pub struct SeedAbility {
        /// List of human readable names for a peer
        pub display_names: Vec<String>,
        /// List of known seeds of a peer (blinded by salted hash)
        pub known_seeds: HashSet<xsalsa20poly1305::Key>,
    }
    pub struct SeedResult {
        /// Seed derived from the current session
        pub session_seed: key::WormholeSeed,
        /// We may already have a seed (or multiple) in common with peer
        pub existing_seeds: HashSet<xsalsa20poly1305::Key>,
    }
  • For resumption, simply specify the seed:
      pub async fn connect_with_seed(
          config: AppConfig<impl serde::Serialize>,
          seed: xsalsa20poly1305::Key,
      ) -> Result<Self, WormholeError> {}
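For illustration, a client could use such a result to tell "new peer" from "known peer" roughly like this. The sketch is simplified and self-contained: `Key` is a plain 32-byte array standing in for `xsalsa20poly1305::Key`, `session_seed` is given the same type for brevity, and `PeerStatus` / `classify_peer` are hypothetical names that are not part of the prototype.

```rust
use std::collections::HashSet;

/// Stand-in for xsalsa20poly1305::Key so the sketch needs no external crates.
type Key = [u8; 32];

/// Simplified mirror of the SeedResult struct shown above.
struct SeedResult {
    session_seed: Key,
    existing_seeds: HashSet<Key>,
}

/// Whether we have met this peer before (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum PeerStatus {
    /// At least one Seed in common: the peer is already known.
    Known(Key),
    /// Nothing in common: offer to store the freshly derived session Seed.
    New(Key),
}

fn classify_peer(result: &SeedResult) -> PeerStatus {
    // If several Seeds are shared, pick the smallest one so the choice is deterministic.
    match result.existing_seeds.iter().min() {
        Some(seed) => PeerStatus::Known(*seed),
        None => PeerStatus::New(result.session_seed),
    }
}
```

This is one way to get the "is the seed new or was it created in the past" information without any explicit "I will save" flag.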

Notable features that are missing on purpose:

  • Fixed identities. A seed is unique to a connection and thus two peers.
  • "I will save this code" flagging. If one side doesn't save the code, the two sides won't recognize each other in the future, so the dance will be the same as if they'd never met before.
  • Storage handling in the library. It's up to the client whether they want to track and provide a list of known seeds or not; resumption will work regardless.
  • Key rotation. It could probably be added if we wanted to but I'm not that interested in it anymore.

I hope that this helps to clear up some misunderstandings.

@piegamesde piegamesde linked a pull request Mar 20, 2022 that will close this issue
@meejah
Member

meejah commented Mar 29, 2023

Use-case from magic-wormhole/magic-wormhole#475 : have a cron-job on computer A that computer B can use to connect (at the prearranged time) to do a transfer. This should be possible with Seeds: do a one-time setup procedure to save a Seed for this connection on both A and B and use that Seed for subsequent transfers.

(Also, we should ensure this will work with either "classic" or "Dilated File Transfer")

@piegamesde
Member Author

Seeds should be independent of dilation, and indeed of whichever application-level protocol is used.

@meejah
Member

meejah commented Mar 29, 2023

Yes, for sure .. want to make sure the concrete use-cases work correctly though :)
