When a client receives remote input and flags a `first_incorrect` frame, `adjust_gamestate` pushes load + advance requests to perform the rollback. It then calls `self.sync_layer.reset_prediction()`, which resets the tracked `first_incorrect` frame.
In `advance_frame`, when applying local input, if a `PredictionThreshold` error is returned, the function exits with the error and drops the requests for the correction. Because the first incorrect frame has already been reset, the correction is never requested again, which causes a desync.
I'm still reasoning about exactly where the issue is / what the solution is. I believe the example games are vulnerable to this bug too.
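To make the sequence above concrete, here is a minimal toy model of the control flow. It is only a sketch under stated assumptions, not ggrs code: `ToySession`, its fields, and the `max_prediction` check are hypothetical stand-ins, and only the names `adjust_gamestate`, `advance_frame`, and `PredictionThreshold` are taken from the description above.

```rust
// Toy model of the described control flow. NOT the ggrs implementation:
// every type here is a hypothetical stand-in for illustration.

#[derive(Debug, PartialEq)]
enum Request {
    LoadGameState { frame: i32 },
    AdvanceFrame,
}

#[derive(Debug, PartialEq)]
enum SessionError {
    PredictionThreshold,
}

struct ToySession {
    current_frame: i32,
    last_confirmed_frame: i32,
    /// Assumed limit on how far ahead of confirmed input we may predict.
    max_prediction: i32,
    /// Earliest frame whose predicted remote input turned out to be wrong.
    first_incorrect_frame: Option<i32>,
}

impl ToySession {
    /// Queues the rollback (load + resimulate) and then forgets the
    /// incorrect frame, mirroring the reset described in the report.
    fn adjust_gamestate(&mut self, first_incorrect: i32, requests: &mut Vec<Request>) {
        requests.push(Request::LoadGameState { frame: first_incorrect });
        for _ in first_incorrect..self.current_frame {
            requests.push(Request::AdvanceFrame);
        }
        self.first_incorrect_frame = None;
    }

    fn advance_frame(&mut self) -> Result<Vec<Request>, SessionError> {
        let mut requests = Vec::new();
        if let Some(frame) = self.first_incorrect_frame {
            self.adjust_gamestate(frame, &mut requests);
        }
        // Adding local input fails when we are too far ahead. Returning
        // here drops `requests`, including the queued rollback.
        if self.current_frame - self.last_confirmed_frame >= self.max_prediction {
            return Err(SessionError::PredictionThreshold);
        }
        self.current_frame += 1;
        requests.push(Request::AdvanceFrame);
        Ok(requests)
    }
}
```

In this model, a session at frame 504 with frame 498 confirmed, frame 500 flagged as incorrect, and an assumed `max_prediction` of 6 returns `PredictionThreshold` on the first call. Once frame 500 is confirmed, the next call succeeds but contains no `LoadGameState` request, because the incorrect frame was already cleared.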
Additional Context
Here is an example of logs from a client that missed the correction. This is a 3-client game: one client missed the correction for player 1's input and the second applied it correctly, so players 0 and 2 are now desynced. (The frame listed in the input log is the predicted frame, in case that is confusing.) This might be more confusing than helpful, but it demonstrates that both clients were notified of player 1's input and ggrs pushed requests for corrections, yet the client that hit the prediction threshold did not roll back, causing the desync.
Player 0's logs (Missed correction on player 1's input due to prediction threshold)
We confirm frame 500 without the rollback being applied. (There is no log here for bones performing a rollback like in the second block, only the prediction threshold error.)
2024-04-10T02:42:31.164609Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 504 input: DensePlayerControl { .0: 992, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(1.0, 0.0)) }
2024-04-10T02:42:31.180776Z WARN ggrs::input_queue: Setting first incorrect frame: 500
2024-04-10T02:42:31.180807Z WARN ggrs::sessions::p2p_session: Requesting rollback to frame: 500
2024-04-10T02:42:31.180818Z WARN ggrs::sessions::p2p_session: Set last confirmed frame: 498
2024-04-10T02:42:31.180823Z WARN ggrs::sync_layer: Prediction threshold error
2024-04-10T02:42:31.180829Z WARN bones_framework::networking: Freezing game while waiting for network to catch-up.
2024-04-10T02:42:31.214065Z WARN ggrs::sessions::p2p_session: Set last confirmed frame: 500
2024-04-10T02:42:31.216185Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 505 input: DensePlayerControl { .0: 0, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(0.0, 0.0)) }
Player 2's logs (correction applied correctly on player 1's input):
Rollback in game is logged, and player 1's next input is confirmed.
I can repro this in jumpy pretty easily and can provide steps, but I will try to write a test that reproduces it, both for testing and for preventing regressions, once I figure out what to do here.
One thing that helps repro is having 4 clients open at once (through a relay server, so not p2p locally) with 200+ ping, which produces lots of prediction threshold errors :) (A silly way of saying that poor conditions definitely bring this to light; high ping plus a lower prediction window might be enough.)
I haven't gotten to exploring what kind of fix might make the most sense, but I wrote a test that reproduces this.
I implemented a fake DebugSocket mechanism that lets the test control when messages are actually delivered between clients, so it can reproducibly enter a state in which a correction happens at the same time as a prediction threshold error.
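The idea of a socket whose delivery is controlled by the test can be sketched roughly as below. This is not the DebugSocket from the actual test, which is not shown here: `HeldNetwork`, `Msg`, and all method names are hypothetical. A real version would presumably wrap ggrs messages and plug into the session's socket interface rather than carry strings.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a network message.
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    from: usize,
    payload: String,
}

/// In-memory "network" that holds every sent message until the test
/// explicitly releases it. All names here are illustrative assumptions.
struct HeldNetwork {
    /// held[to]: messages sent to client `to` but not yet delivered.
    held: Vec<VecDeque<Msg>>,
    /// inbox[to]: messages delivered to client `to` but not yet read.
    inbox: Vec<Vec<Msg>>,
}

impl HeldNetwork {
    fn new(clients: usize) -> Self {
        Self {
            held: vec![VecDeque::new(); clients],
            inbox: vec![Vec::new(); clients],
        }
    }

    /// Queues a message; nothing reaches the receiver yet.
    fn send(&mut self, from: usize, to: usize, payload: &str) {
        self.held[to].push_back(Msg { from, payload: payload.to_string() });
    }

    /// Releases all held messages from `from` to `to`, preserving order.
    /// Messages from other senders stay held.
    fn deliver(&mut self, from: usize, to: usize) {
        let (release, keep): (VecDeque<Msg>, VecDeque<Msg>) =
            self.held[to].drain(..).partition(|m| m.from == from);
        self.held[to] = keep;
        self.inbox[to].extend(release);
    }

    /// Returns and clears everything delivered to `to` so far.
    fn receive_all(&mut self, to: usize) -> Vec<Msg> {
        std::mem::take(&mut self.inbox[to])
    }
}
```

With clients A, B, and C as indices 0, 1, and 2, a test can call `deliver(0, 2)` while leaving B's messages to C held, so that C sees A's new input but is still missing B's input.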
The simplest form I have found to repro this is with 3 clients.
Client A changes its input, but B/C do not receive it yet.
Client C gets A's new input, which triggers a rollback, but it has not yet received B's input, so it also gets a prediction threshold error.
Because of the error, C does not get the requests for the rollback and misses the correction.
The test then ticks a couple more normal frames so that desync detection messages are exchanged, and fails due to the desync.
Hi! Thanks for posting this bug report, and sorry for not responding sooner. I have just posted a PR that slightly alters the logic for handling inputs. I am not 100% sure it fixes this issue, but it might. Would you be able to test this again on the lockstep branch?
-> #79