MSC3886: Simple client rendezvous capability #3886
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,302 @@ | ||
# MSC3886: Simple client rendezvous capability | ||
|
||
In [MSC3906](https://github.com/matrix-org/matrix-spec-proposals/pull/3906) a proposal is made to allow a user to log in on a new device using an existing device by means of scanning a | ||
QR code. | ||
|
||
In order to facilitate this the two devices need some bi-directional communication channel which they can use to exchange | ||
information such as: | ||
|
||
- the homeserver being used | ||
- the user ID | ||
- facilitation of issuing a new access token | ||
- device ID for end-to-end encryption | ||
- device keys for end-to-end encryption | ||
Review comment: We shouldn't be sending (private) device keys over the wire like this. They should be generated by the new device, which may be the device ID given, but not transmitted over the wire. |
||
|
||
To enable [MSC3906](https://github.com/matrix-org/matrix-spec-proposals/pull/3906) and support any future proposals this MSC proposes a simple HTTP based protocol that can be used to | ||
establish a direct communication channel between two IP connected devices. | ||
|
||
It will work with devices behind NAT. It doesn't require homeserver administrators to deploy a separate server. | ||
|
||
## Proposal | ||
|
||
It is proposed that a general purpose HTTP based protocol be used to establish ephemeral bi-directional communication | ||
channels over which arbitrary data can be exchanged. | ||
|
||
A typical flow might look like this where device A is initiating the rendezvous with device B: | ||
|
||
```mermaid | ||
|
||
sequenceDiagram | ||
participant A as Device A | ||
participant R as Rendezvous Server | ||
participant B as Device B | ||
Note over A: Device A determines which rendezvous server to use | ||
|
||
A->>+R: POST /rendezvous Hello from A | ||
R->>-A: 201 Created Location: /abc-def-123-456 | ||
|
||
A-->>B: Rendezvous URI between clients, perhaps as QR code: e.g. https://rendezvous-server/abc-def-123-456 | ||
|
||
Note over A: Device A starts polling for contact at the rendezvous | ||
|
||
B->>+R: GET <rendezvous URI> | ||
R->>-B: 200 OK Hello from A | ||
|
||
loop Device A polls for rendezvous updates | ||
A->>+R: GET <rendezvous URI> If-None-Match: <ETag> | ||
R->>-A: 304 Not Modified | ||
end | ||
|
||
B->>+R: PUT <rendezvous URI> Hello from B | ||
R->>-B: 202 Accepted | ||
|
||
Note over A,B: Rendezvous now established | ||
``` | ||
Review comment: My understanding of this is that A and B take turns writing to the same rendezvous URI until they're done. So when it's B's turn to write, A keeps polling (using the ETag) until the server says the data has changed, and vice versa. What happens if B tries to write, but gets some sort of network error, or an error from a proxy? If the server got B's data, but B received a network error, then it seems to me what could happen is:
So B will miss a message from A, and A will get a duplicate message.
Review comment: Perhaps we could mitigate against this by using a RFC7232
Review comment: ISTM that every PUT should be required to cite a previous ETag so that the rendezvous server can enforce a linear ordering. (The initial ETag is included in the POST and GET response, so both A and B should be fully aware of it.)
Review comment: 58f1e86 does this. |
||
|
||
Please note that it is intentional that this protocol does nothing to ensure the integrity of the data exchanged at a rendezvous. | ||
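
The flow in the diagram above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not a server implementation: the class `InMemoryRendezvous`, its method names, and the ETag scheme (a hash of the rendezvous ID and a version counter) are hypothetical. The status codes are the ones listed in this proposal.

```python
import hashlib
import secrets


class InMemoryRendezvous:
    """Toy model of a rendezvous server; every name here is hypothetical."""

    def __init__(self):
        # rendezvous ID -> (payload bytes, version counter)
        self._payloads = {}

    def _etag(self, rid):
        _, version = self._payloads[rid]
        return hashlib.sha256(f"{rid}:{version}".encode()).hexdigest()

    def post(self, payload):
        """Create a rendezvous. Returns (status, location, etag)."""
        rid = secrets.token_urlsafe(16)  # random, hard-to-guess ID
        self._payloads[rid] = (payload, 0)
        return 201, f"/{rid}", self._etag(rid)

    def get(self, location, if_none_match=None):
        """Conditional read. Returns (status, payload or None, etag or None)."""
        rid = location.lstrip("/")
        if rid not in self._payloads:
            return 404, None, None
        etag = self._etag(rid)
        if if_none_match == etag:
            return 304, None, etag  # nothing new since the caller's ETag
        return 200, self._payloads[rid][0], etag

    def put(self, location, payload, if_match=None):
        """Conditional write. Returns (status, etag or None)."""
        rid = location.lstrip("/")
        if rid not in self._payloads:
            return 404, None
        etag = self._etag(rid)
        if if_match is not None and if_match != etag:
            return 412, etag  # caller's view of the payload is stale
        _, version = self._payloads[rid]
        self._payloads[rid] = (payload, version + 1)
        return 202, self._etag(rid)


server = InMemoryRendezvous()
status, uri, etag_a = server.post(b"Hello from A")          # A creates
status_b, body_b, etag_b = server.get(uri)                  # B reads
poll_status, _, _ = server.get(uri, if_none_match=etag_a)   # A polls
put_status, new_etag = server.put(uri, b"Hello from B", if_match=etag_b)
final_status, final_body, _ = server.get(uri, if_none_match=etag_a)
```

In this sketch A's poll returns `304` until B's `PUT` changes the ETag, after which the same conditional `GET` returns B's payload.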
|
||
### Protocol | ||
|
||
|
||
#### Create a new rendezvous point: `POST /_matrix/client/rendezvous` | ||
|
||
HTTP request headers: | ||
|
||
- `Content-Length` - required | ||
- `Content-Type` - optional, server should assume `application/octet-stream` if not specified | ||
|
||
HTTP request body: | ||
|
||
- any data up to maximum size allowed by the server | ||
|
||
HTTP response codes, and Matrix error codes: | ||
|
||
- `201 Created` - rendezvous created | ||
- `400 Bad Request` (`M_MISSING_PARAM`) - no `Content-Length` was provided. | ||
- `403 Forbidden` (`M_FORBIDDEN`) - forbidden by server policy | ||
- `413 Payload Too Large` (`M_TOO_LARGE`) - the supplied payload is too large | ||
- `429 Too Many Requests` (`M_UNKNOWN`) - the request has been rate limited | ||
- `307 Temporary Redirect` - if the request should be served from somewhere else specified in the `Location` response header | ||
|
||
|
||
n.b. the relatively unusual [`307 Temporary Redirect`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/307) response | ||
code has been chosen explicitly for the behaviour of ensuring that the method and body will not change whilst the user-agent | ||
follows the redirect. For this reason, no other `30x` response codes are allowed. | ||
|
||
HTTP response headers for `201 Created`: | ||
|
||
|
||
- `Location` - required, the allocated rendezvous URI which can be on a different server | ||
Review comment: Presumably this URI needs to be not guessable, to prevent attackers from guessing it and impersonating the intended recipient? |
||
- `X-Max-Bytes` - required, the maximum allowed bytes for the payload | ||
Review comment (on lines +133 to +134): why are these headers and not response body parameters?
Review comment: On reflection I would agree that if it is going to be part of the C-S API then it would make sense to consider consistency with the rest of the C-S API where headers are not used.
Review comment: Are we concerned with just these two headers? Or do we want all of the response and request headers to be expressed via HTTP bodies?
Review comment: The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON. Therefore I would encourage the current GET response headers (Content-Type, ETag, Expires, Last-Modified) to continue to be expressed via headers. For consistency it makes sense to do so in all other responses. For POST this leaves the two highlighted headers: Location and X-Max-Bytes. We could present them as a JSON-encoded body, but it would seem odd to spread the POST response metadata in two places without any meaningful distinction to justify it. My vote would be to leave things as they are. But I neither have a vote, nor any strong opinions.
Review comment: We could base64 encode the payload, but I tend to agree. Not all things need to be shoehorned through JSON. |
||
- `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag) | ||
- `Expires` - required, the expiry time of the rendezvous as per [RFC7234](https://httpwg.org/specs/rfc7234.html#header.expires) | ||
- `Last-Modified` - required, the last modified date of the payload as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.last-modified) | ||
|
||
Example response headers: | ||
|
||
```http | ||
Location: /abcdEFG12345 | ||
X-Max-Bytes: 10240 | ||
ETag: VmbxF13QDusTgOCt8aoa0d2PQcnBOXeIxEqhw5aQ03o= | ||
Expires: Wed, 07 Sep 2022 14:28:51 GMT | ||
Last-Modified: Wed, 07 Sep 2022 14:27:51 GMT | ||
``` | ||
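
As a rough client-side sketch, the headers of a `201 Created` response could be interpreted as below. The function name `parse_create_response` and the host `example.org` are assumptions for illustration; the header names and values come from the example above. `Location` is resolved against the request URL because it may be relative or may point at a different server.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import urljoin


def parse_create_response(request_url, headers):
    """Extract rendezvous details from the headers of a 201 Created response."""
    return {
        # Location may be a relative path or an absolute URI on another server.
        "uri": urljoin(request_url, headers["Location"]),
        "max_bytes": int(headers["X-Max-Bytes"]),
        "etag": headers["ETag"],
        # HTTP dates such as Expires use the IMF-fixdate format.
        "expires": parsedate_to_datetime(headers["Expires"]),
    }


info = parse_create_response(
    "https://example.org/_matrix/client/rendezvous",  # hypothetical server
    {
        "Location": "/abcdEFG12345",
        "X-Max-Bytes": "10240",
        "ETag": "VmbxF13QDusTgOCt8aoa0d2PQcnBOXeIxEqhw5aQ03o=",
        "Expires": "Wed, 07 Sep 2022 14:28:51 GMT",
    },
)
```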
|
||
#### Update payload at rendezvous point: `PUT <rendezvous URI>` | ||
|
||
HTTP request headers: | ||
|
||
- `Content-Length` - required | ||
- `Content-Type` - optional, server should assume `application/octet-stream` if not specified | ||
- `If-Match` - optional, as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.if-match) server will assume `*` | ||
if not specified | ||
|
||
HTTP request body: | ||
|
||
- any data up to maximum size allowed by the server | ||
|
||
HTTP response codes, and Matrix error codes: | ||
|
||
- `202 Accepted` - payload updated | ||
- `400 Bad Request` (`M_MISSING_PARAM`) - no `Content-Length` was provided. | ||
- `404 Not Found` (`M_NOT_FOUND`) - rendezvous URI is not valid (it could have expired) | ||
- `412 Precondition Failed` (`M_DIRTY_WRITE`, **a new error code**) - when `If-Match` is supplied and the ETag does not match | ||
- `413 Payload Too Large` (`M_TOO_LARGE`) - the supplied payload is too large | ||
- `429 Too Many Requests` (`M_UNKNOWN`) - the request has been rate limited | ||
|
||
HTTP response headers for `202 Accepted` and `412 Precondition Failed`: | ||
|
||
- `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag) | ||
- `Expires` - required, the expiry time of the rendezvous as per [RFC7234](https://httpwg.org/specs/rfc7234.html#header.expires) | ||
Review comment: Is the intention that the expiry time is incremented every time the rendezvous payload is updated?
Review comment: I have assumed so in 65d697c |
||
- `Last-Modified` - required, the last modified date of the payload as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.last-modified) | ||
|
||
#### Get payload from rendezvous point: `GET <rendezvous URI>` | ||
|
||
HTTP request headers: | ||
|
||
- `If-None-Match` - optional, as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.if-none-match) server will | ||
only return data if given ETag does not match | ||
Review comment: Might be nice for servers to have the option to delay responding until it gets content that doesn't match the ETag, so we can do long-polling. |
||
|
||
HTTP response codes, and Matrix error codes: | ||
|
||
- `200 OK` - payload returned | ||
- `304 Not Modified` - when `If-None-Match` is supplied and the ETag matches (the payload has not changed) | ||
- `404 Not Found` (`M_NOT_FOUND`) - rendezvous URI is not valid (it could have expired) | ||
- `429 Too Many Requests` (`M_UNKNOWN`) - the request has been rate limited | ||
|
||
HTTP response headers for `200 OK` and `304 Not Modified`: | ||
|
||
- `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag) | ||
- `Expires` - required, the expiry time of the rendezvous as per [RFC7234](https://httpwg.org/specs/rfc7234.html#header.expires) | ||
- `Last-Modified` - required, the last modified date of the payload as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.last-modified) | ||
|
||
- `Content-Type` - required for `200 OK` | ||
|
||
#### Cancel a rendezvous: `DELETE <rendezvous URI>` | ||
|
||
HTTP response codes: | ||
|
||
- `204 No Content` - rendezvous cancelled | ||
- `404 Not Found` (`M_NOT_FOUND`) - rendezvous URI is not valid (it could have expired) | ||
- `429 Too Many Requests` (`M_UNKNOWN`) - the request has been rate limited | ||
|
||
### Authentication | ||
|
||
These API endpoints do not require authentication. This is because the protocol is explicitly treated as untrusted, | ||
with trust established at a higher level outside the scope of the present proposal. | ||
|
||
### Maximum payload size | ||
|
||
The server should enforce a maximum payload size. It is recommended that this be no less than 10KB. | ||
|
||
### Maximum duration of a rendezvous | ||
|
||
The rendezvous only needs to persist for the duration of the handshake. So a timeout such as 30 seconds is adequate. | ||
|
||
Clients should handle the case of the rendezvous being cancelled or timed out by the server. | ||
|
||
### ETags | ||
|
||
The ETag generated should be unique to the rendezvous point and the last modified time so that two clients can | ||
distinguish between identical payloads sent by either client. | ||
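
One way to meet this requirement is to derive the ETag from the rendezvous identifier and the modification time rather than from the payload. The sketch below is an assumption, not the proposal's mandated scheme: the function `make_etag`, the use of SHA-256, and the nanosecond timestamp are all illustrative choices. A coarse timestamp could collide for two writes in quick succession, so a real server might prefer a per-rendezvous counter.

```python
import base64
import hashlib


def make_etag(rendezvous_id: str, last_modified_ns: int) -> str:
    """Derive an ETag from the rendezvous ID and last-modified time only.

    The payload is deliberately not an input, so two identical payloads
    written at different times still produce different ETags.
    """
    material = f"{rendezvous_id}:{last_modified_ns}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return base64.b64encode(digest).decode("ascii")


first_write = make_etag("abcdEFG12345", 1662560871000000000)
second_write = make_etag("abcdEFG12345", 1662560872000000000)
```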
|
||
### CORS | ||
|
||
To support usage from web browsers, the server should allow CORS requests to the `/rendezvous` endpoint from any | ||
origin and expose the `ETag`, `Location` and `X-Max-Bytes` headers as: | ||
|
||
```http | ||
Access-Control-Allow-Headers: Content-Type,If-Match,If-None-Match | ||
Access-Control-Allow-Methods: GET,PUT,POST,DELETE | ||
Access-Control-Allow-Origin: * | ||
Access-Control-Expose-Headers: ETag,Location,X-Max-Bytes | ||
``` | ||
|
||
Currently the [spec](https://spec.matrix.org/v1.4/client-server-api/#web-browser-clients) specifies a single set of | ||
CORS headers to be used. Therefore, care will be required to make it clear in the spec that the headers will | ||
vary depending on the endpoint. | ||
|
||
### Choice of server | ||
|
||
Ultimately it will be up to the Matrix client implementation to decide which rendezvous server to use. | ||
|
||
However, it is suggested that the following logic is used by the device/client to choose the rendezvous server in order | ||
of preference: | ||
|
||
1. If the client is already logged in: try to use the current homeserver. | ||
1. If the client is not logged in and it is known which homeserver the user wants to connect to: try to use that homeserver. | ||
1. Otherwise use a default server. | ||
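
The order of preference above could be expressed roughly as follows. The function name `rendezvous_candidates` and its parameters are hypothetical, and treating the default server as a fallback when the preferred homeserver cannot be used is an assumed reading of "try".

```python
from typing import List, Optional


def rendezvous_candidates(
    logged_in_homeserver: Optional[str],
    intended_homeserver: Optional[str],
    default_server: str,
) -> List[str]:
    """Return rendezvous servers to try, most preferred first."""
    candidates: List[str] = []
    if logged_in_homeserver is not None:
        # Case 1: already logged in, so prefer the current homeserver.
        candidates.append(logged_in_homeserver)
    elif intended_homeserver is not None:
        # Case 2: not logged in, but the target homeserver is known.
        candidates.append(intended_homeserver)
    # Case 3 (and assumed fallback): the client's default server.
    if default_server not in candidates:
        candidates.append(default_server)
    return candidates


order = rendezvous_candidates(None, "https://hs.example", "https://default.example")
```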
|
||
## Potential issues | ||
|
||
Because this is an entirely new set of functionality it should not cause issues with any existing Matrix functions or capabilities. | ||
|
||
The proposed protocol requires the devices to have IP connectivity to the server which might not be the case in P2P scenarios. | ||
|
||
Review comment: One potential issue here is that if A sends a message to B, then waits for a message from B using the ETag, but the message that B sends to A happens to be exactly the same as the message that A sent, then A will get the
Review comment: Is this still a problem? From the current text, it sounds like |
||
## Alternatives | ||
|
||
### Send-to-Device messaging | ||
|
||
The combination of this proposal and [MSC3903](https://github.com/matrix-org/matrix-spec-proposals/pull/3903) looks similar in | ||
some regards to the existing [Send-to-device messaging](https://spec.matrix.org/v1.6/client-server-api/#send-to-device-messaging) | ||
capability. | ||
|
||
Whilst to-device messaging already provides a mechanism for secure communication | ||
between two Matrix clients/devices, a key consideration for the anticipated | ||
login with QR capability is that one of the clients is not yet authenticated with | ||
a Homeserver. | ||
|
||
Furthermore the client might not know which Homeserver the user wishes to | ||
connect to. | ||
|
||
Conceptually, one could create a new type of "guest" login that would allow the | ||
unauthenticated client to connect to a Homeserver for the purposes of | ||
communicating with an existing authenticated client via to-device messages. | ||
|
||
Some considerations for this: | ||
|
||
Where the "actual" Homeserver is not known then the "guest" Homeserver nominated | ||
by the new client would need to be federated with the "actual" Homeserver. | ||
|
||
The "guest" Homeserver would probably want to automatically clean up the "guest" | ||
accounts after a short period of time. | ||
|
||
The "actual" Homeserver operator might not want to open up full "guest" access | ||
so a second type of "guest" account might be required. | ||
|
||
Does the new device/client need to accept the T&Cs of the "guest" Homeserver? | ||
|
||
### Other existing protocols | ||
|
||
An existing protocol such as STUN, TURN or [CoAP](http://coap.technology/) could be used instead. | ||
|
||
### Implementation details | ||
|
||
Rather than requiring the devices to poll for updates, "long-polling" could be used instead similar to `/sync`. | ||
|
||
## Security considerations | ||
|
||
### Confidentiality of data | ||
|
||
Whilst the data transmitted can be encrypted in transit via HTTP/TLS the rendezvous server does have visibility over the | ||
data and can also perform man-in-the-middle attacks. | ||
|
||
As such, for the purposes of authentication and end-to-end encryption the channel should be treated as untrusted and some | ||
form of secure layer should be used on top of the channel such as a Diffie-Hellman key exchange. | ||
|
||
### Denial of Service attack surface | ||
|
||
Because the protocol allows for the creation of arbitrary channels and storage of arbitrary data, it is possible to use | ||
it as a denial of service attack surface. | ||
|
||
As such, standard mitigations such as the following may be deemed appropriate by homeserver implementations | ||
and administrators: | ||
|
||
- rate limiting of requests | ||
- imposing a low maximum payload size (e.g. kilobytes not megabytes) | ||
- limiting the number of concurrent channels | ||
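
As an illustration of the first mitigation, a token bucket is one common way to rate limit requests. The proposal does not prescribe any algorithm; the class `TokenBucket` and its parameters are assumptions made for this sketch. A server would typically keep one bucket per client address and answer `429 Too Many Requests` when `allow` returns `False`.

```python
class TokenBucket:
    """Minimal token-bucket rate limiter (hypothetical helper)."""

    def __init__(self, rate_per_second: float, capacity: int, now: float = 0.0):
        self.rate = rate_per_second      # tokens added per second
        self.capacity = capacity         # maximum burst size
        self.tokens = float(capacity)    # start with a full bucket
        self.updated = now               # time of the last refill

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=1.0, capacity=2)
burst = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0)]
after_wait = bucket.allow(1.0)
```

The clock is passed in explicitly so the behaviour is deterministic; a real server would supply a monotonic clock reading.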
|
||
## Unstable prefix | ||
|
||
While this feature is in development the new endpoint should be exposed using the following unstable prefix: | ||
|
||
- `/_matrix/client/unstable/org.matrix.msc3886/rendezvous` | ||
|
||
Additionally, the feature is to be advertised as an unstable feature in the `GET /_matrix/client/versions` | ||
response, with the key `org.matrix.msc3886` set to `true`. The response could then look as | ||
follows: | ||
|
||
```json | ||
{ | ||
"versions": ["r0.6.0"], | ||
"unstable_features": { | ||
"org.matrix.msc3886": true | ||
} | ||
} | ||
``` | ||
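
A client could check for the unstable feature before using the endpoint, along these lines. The helper name `supports_msc3886` is hypothetical; the feature key and the unstable path are the ones given above.

```python
UNSTABLE_RENDEZVOUS_PATH = "/_matrix/client/unstable/org.matrix.msc3886/rendezvous"


def supports_msc3886(versions_response: dict) -> bool:
    """Return True if a /versions response advertises the unstable feature."""
    features = versions_response.get("unstable_features", {})
    return features.get("org.matrix.msc3886") is True


example = {
    "versions": ["r0.6.0"],
    "unstable_features": {"org.matrix.msc3886": True},
}
endpoint = UNSTABLE_RENDEZVOUS_PATH if supports_msc3886(example) else None
```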
|
||
## Dependencies | ||
|
||
None, although it's intended to be used with [MSC3906](https://github.com/matrix-org/matrix-spec-proposals/pull/3906). | ||
|
||
## Credits | ||
|
||
This proposal was influenced by https://wiki.mozilla.org/Services/KeyExchange which also has some helpful discussion | ||
around DoS mitigation. |
Review comment: On one hand, this is a really simple and elegant standalone function. On the other hand, I'm a bit worried that it duplicates the semantics of to-device API (i.e. basic store & forward between devices), albeit with short-polling rather than long-polling.
I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.
My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.
Review comment: Understood. I'll work up an alternative based on to-device messages and see how that feels.
Review comment: I have started some discussion on the to-device based alternative as part of #3903
Review comment: ugh, the complexity of this feels horrible to me.
Sure, having two store-and-forward APIs is rather less than ideal, but this one is so simple and easy to use that I don't really buy that it's a meaningful amount of extra code for clients compared to having to grab a temporary access token and then start /syncing.
For me, the simplicity of this proposal outweighs the fact it looks a bit like to-device messaging. (Or even matrix rooms, if you squint hard enough and invent "ephemeral rooms".)
The only thing I'd say here is that it would be good if the "Alternatives" section in this MSC said something about this idea (even if it's just a link to MSC3903's alternatives section).
Review comment: I think I was broadly coming to a similar conclusion to Rich. Adding guest access to to-device feels about as complex as this separate impl.
Review comment: I've moved the section from MSC3903's alternatives section into this proposal as there is much more feedback here than on MSC3903 itself.
Review comment: Given the above, it appears we've settled on using a new channel rather than exposing to-device to guests. @matrix-org/spec-core-team if you disagree then please raise comments :)