Presence: should send less presence when joining a room #15818

Conversation
…hen joining a room. This is based on the premise that a user sharing another room with different users is already receiving presence and can therefore be excluded from this list. Care was taken to maintain the split between previous users and newly joined users, as this split is crucial to a filter used further down to remove 'old presence' data from what is sent.
synapse/handlers/presence.py
Outdated

```python
for other_user_id in prev_users_excluding_current_user:
    # If this host has been seen in a list already, skip it.
    if host not in prev_remote_hosts:
        if await is_only_one_room_shared(user_id, other_user_id):
```
I'm not following this, why are we comparing all users previously in the room to see if they share a room? Notably, I think both `user_id` and `other_user_id` can be remote users?

I'd have thought that at least one iteration here should be across the newly joined users or hosts?
> I'm not following this, why are we comparing all users previously in the room to see if they share a room?

Excellent question. We want to build a full list of all remote servers in the room, not the actual users. (We short-circuit so as not to hit the database more than necessary.) But we want the list to exclude some servers, hence the comparison. The list needs to be both as full as possible and as empty as possible at the same time.
> Notably, I think both `user_id` and `other_user_id` can be remote users?

Not just can be, they are. It means that we know the host from `user_id` is in this room, but if `other_user_id`'s host (which we know is in this room, obviously) is in yet another room that we know about because it's federated to this server, then `user_id`'s host can be excluded, since it is already receiving presence. For example:
- `@me:myserver`, `@alice:server1` and `@bob:server2` are already in `Synapse Admins`.
- `@me:myserver`, `@bob:server2` and `@charlie:server2` are in `Matrix HQ`.
- `@alice:server1` is joining `Matrix HQ`.

Because we know `server1` is already receiving presence info about `@bob:server2`, it can be excluded from the list of `prev_remote_hosts`.
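A minimal sketch of that exclusion rule, under stated assumptions: `is_only_one_room_shared` from the excerpt above is modeled here as a `get_mutual_rooms` callable, and the function name, argument shapes, and surrounding loop are illustrative rather than Synapse's actual handler code.

```python
from typing import Awaitable, Callable, Iterable, Set


async def hosts_needing_presence(
    joining_user_id: str,
    prev_remote_users: Iterable[str],
    new_room_id: str,
    get_mutual_rooms: Callable[[str, str], Awaitable[Set[str]]],
) -> Set[str]:
    """Collect the remote hosts that still need this join's presence.

    A host is left out once one of its users is known to share some
    *other* room with the joining user, because that host is then
    already receiving the joining user's presence.
    """
    hosts: Set[str] = set()
    for other_user_id in prev_remote_users:
        host = other_user_id.split(":", 1)[1]
        if host in hosts:
            # Host already queued; skip the database lookup entirely.
            continue
        shared = await get_mutual_rooms(joining_user_id, other_user_id)
        # If the only shared room is the one being joined, this host has
        # no other source of the joiner's presence and must be included.
        if shared <= {new_room_id}:
            hosts.add(host)
    return hosts
```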
In retrospect, perhaps an additional list maintaining all the hosts we know can be excluded would also be a good idea, to cut down on database hits? However, I see a potential gotcha here: what happens to `@charlie:server2`? If we maintain an excluded list as well, he would never be checked against and therefore (at least at first) would not receive the new presence data. (Depending, of course, on whether `@bob` or `@charlie` is checked first.)
> I'd have thought that at least one iteration here should be across the newly joined users or hosts?

Newly joined users (either local or remote) are dealt with in the next block below, starting at about line 1496.

It could actually be argued that presence would be retrieved on the first (next?) sync that occurs after the room is fully joined anyway, so why are we bothering with this at all?
I had an epiphany while at work tonight; the `prev_remote_hosts` list only needs to be built if one of the newly joined users is local. If that's what you were getting at with your question, mission accomplished 😉. I'll do a refactor in a bit.
> I had an epiphany while at work tonight; the `prev_remote_hosts` list only needs to be built if one of the newly joined users is local. If that's what you were getting at with your question, mission accomplished 😉. I'll do a refactor in a bit.
~~I still want to make this optimization a thing, but perhaps it's best left to another PR. If that's ok?~~ Nope, never mind. Still need it after all. Testing is great. Can't whittle down the list if we don't have the data. I'm still gonna ponder on this, but right now it's not easily actionable.
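For what it's worth, a minimal sketch of that guard, assuming hypothetical names: `is_mine_id` is a simplified stand-in for Synapse's usual "is this user local" check, and `build_prev_remote_hosts` stands in for the existing list-building code.

```python
from typing import Awaitable, Callable, Set


def is_mine_id(user_id: str, server_name: str = "myserver") -> bool:
    # Simplified stand-in for Synapse's `self.is_mine_id` check.
    return user_id.split(":", 1)[1] == server_name


async def maybe_build_prev_remote_hosts(
    newly_joined_users: Set[str],
    room_id: str,
    build_prev_remote_hosts: Callable[[str], Awaitable[Set[str]]],
) -> Set[str]:
    # Only a local join obliges this server to fan presence out to the
    # hosts previously in the room, so only then is the list worth building.
    if any(is_mine_id(user_id) for user_id in newly_joined_users):
        return await build_prev_remote_hosts(room_id)
    return set()
```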
…the batching version of `get_rooms_for_users()` instead and do the processing by hand.
@realtyem is this something that you still think will help after the recent refactoring?
TL;DR: in the category of reducing federation spam caused by presence, this is a 3 on a priority scale of 1-10 (10 being highest). You may close it for a redo, because of bitrot.

A fair question. In regards to the other work done recently, this has no bearing either way. However, I know a lot more about the internals of the presence system now compared to when this was written, and it stands to reason there may be a better way, especially after the data came along showing that the entire batch of state from a room didn't come in at once but in stages. Since the whole point of this bit of work was deduplicating the presence sent to a server for the same join, it may not hold relevance in its current form (truthfully, I had rather forgotten about this at all). Either way, age has not treated it well; it's fair to close and I'll try to revisit it later.
Duh, it's my own PR. I'll close it myself 🤦♂️
Currently, when a local user joins a room, presence is sent to all other servers in that room.
From my understanding, if this local user is already in another room, then any other user there (implying a remote server) is already receiving presence from this local user. If that is correct, then removing this remote server from the list of servers that should be sent new presence data is as simple as determining if other rooms this local user shares with other users in this new room are known on the local server.
~~There exists a function already to do this very thing (with the additional caveat that it will return this new room as well as other existing rooms). `get_mutual_rooms_between_users()` was added as part of MSC2666 (which is still marked as unstable, I believe) and uses `get_rooms_for_user()` under the hood.~~ This turned out to be a hugely inefficient way to get rooms for any two users at a time. Let's just use the batching version, `get_rooms_for_users()`, to collect all the currently relevant users' room data in one go, which turned out to be better by almost two orders of magnitude (probably because of the internal caching).

As part of the existing process for determining if a host should be sent presence:
1. Users are split into two lists:
   a. previous occupants
   b. newly joined
2. Each list is run through a `for` loop (so, both of these will be done twice), splitting users further (see the sketch after this list):
   a. local users - added to a list to be dealt with separately, as this is 'in-house' (and not part of this work)
   b. remote users - the host is added (from the user_id) to a list of servers to send presence to (this is where we will focus this work)
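As a rough, hedged paraphrase of that double loop (not the literal handler code; `prev_users`, `newly_joined_users`, and the hardcoded server name are stand-ins):

```python
from typing import Iterable, Set, Tuple

MY_SERVER = "myserver"  # stand-in for this homeserver's name


def classify_users(
    prev_users: Iterable[str], newly_joined_users: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    local_users: Set[str] = set()
    remote_hosts: Set[str] = set()
    # Both lists (1a. previous occupants, 1b. newly joined) pass through
    # the same split, so this effectively runs twice.
    for group in (prev_users, newly_joined_users):
        for user_id in group:
            host = user_id.split(":", 1)[1]
            if host == MY_SERVER:
                # 2a: local users are dealt with separately, 'in-house'.
                local_users.add(user_id)
            else:
                # 2b: remote users contribute their host to the send list.
                remote_hosts.add(host)
    return local_users, remote_hosts
```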
Items under 2b can be checked with ~~the `get_mutual_rooms_between_users()`~~ the `Dict[user_id, Set[room_ids]]` returned from `get_rooms_for_users()` to verify if this `host` should be in this list at all. ~~This feels like it could be a lot of processing, as it would mean checking every single user against every other user. This is rather inefficient and has been optimized further by not processing if the host being evaluated is already in the list of remote servers to send presence to.~~ (This is much better now that we are doing the processing on the spot instead of leaning on the database.)

For example, say we have 10 users, 7 of which are on m.org and the other 3 on different remote servers. Instead of checking every single m.org user in the database for whether they share another room with one of these users, once m.org is in the host list, just bypass it. ~~This is not a perfect solution, as the one user who actually shares another room might be the last one checked, but at least has the potential to be the first as well. Such is the randomness of unordered sets.~~ This optimization is still in place, but as we are not doing individual database hits for pairs of users, this particular situation is no longer relevant.
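A hedged sketch of that batched check follows. `get_rooms_for_users()` is the store method named above, but `store`, `joining_user_id`, `prev_remote_users`, `new_room_id`, and the exact return type are assumptions for illustration, not the PR's actual code.

```python
from typing import AbstractSet, Iterable, Mapping, Set


async def filter_prev_remote_hosts(
    store,  # assumed to expose the batched get_rooms_for_users()
    joining_user_id: str,
    prev_remote_users: Iterable[str],
    new_room_id: str,
) -> Set[str]:
    prev_remote_users = list(prev_remote_users)
    # One batched call replaces a database hit per pair of users.
    rooms_by_user: Mapping[str, AbstractSet[str]] = await store.get_rooms_for_users(
        [joining_user_id, *prev_remote_users]
    )
    joiner_rooms = rooms_by_user.get(joining_user_id, frozenset())

    prev_remote_hosts: Set[str] = set()
    for other_user_id in prev_remote_users:
        host = other_user_id.split(":", 1)[1]
        if host in prev_remote_hosts:
            # The m.org-style bypass: this host is already queued.
            continue
        other_rooms = rooms_by_user.get(other_user_id, frozenset())
        # A shared room other than the one being joined means this host
        # already receives the joiner's presence, so it can be left out.
        if not (joiner_rooms & other_rooms) - {new_room_id}:
            prev_remote_hosts.add(host)
    return prev_remote_hosts
```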
As part of this, I added 3 tests to `PresenceJoinTestCase`:

- One to test the behavior I sought to change
- And 2 more to verify I didn't break the 'join-leave-join' presence behavior (plus I didn't see these modeled, so why not?)
I base this summary of how to handle this on information gleaned from all these issues (and possibly others), as well as things linked inside them:
~~My gut tells me there is still a potential for a leak here. If verifying a remote user who found one other remote user that didn't share a room and required sending presence, but then would have found (but won't now, because we bailed early) a third remote user on the second user's server later that meant it would not need to, it does not get removed. Without checking every single user (which means a lot more database hits), I'm not sure there is a way around this.~~ Again, no longer the case.

UPDATES:
Pictures of metrics (not included in PR):
To get these, I left, then purged, then rejoined `Matrix HQ`. Rejoining took approximately 11 hours (I think; my event persister seems to still be processing what it got 5 hours after that, but it could be something else).

Here is where my join to the room got processed:
As you can see, it is only processing myself. I am a newly joined user, so I am processed... but then nothing is actually sent out to any remote hosts (because 'previous host after filtered' is empty). I think this is because I've received the state of my own join, but other users' join state is not determined/received yet.
While waiting for `Matrix HQ` to continue its join, in a different room... You can see here how some of the hosts were filtered out.

Interesting thing here: we are only processing this at all because someone else joined the room. None of those hosts will be receiving anything but our own `PresenceState`, as the newly joined user isn't local to this server. But it's nice the filtering appears to work.

(Different room again) Worst case scenario: no remote hosts are filtered out.
Again, it's not a local user that just joined, so they will only receive my `PresenceState`.

Sometime later (different room again, I believe), the local homeserver processes a remote user joining. Again, only my own `PresenceState` is sent.

Here looks to be the final processing of the join to `Matrix HQ`. This looks correct: lots of newly joined users and hosts, and filtering is working.

Pull Request Checklist
(run the linters)
Signed-off-by: Jason Little realtyem@gmail.com