PredictionThreshold error drops requests missing correction #75

MaxCWhitehead · 2024-04-10T03:07:18Z

Describe the bug

When a client receives remote input and flags a first_incorrect frame, adjust_gamestate pushes load + advance requests to perform the rollback. It then calls self.sync_layer.reset_prediction(), which resets the tracked first_incorrect frame.

In advance_frame, if applying local input returns a PredictionThreshold error, the function exits with the error and drops the requests for the correction. Because the first incorrect frame has already been reset, the correction is missed, which causes a desync.
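To make the sequence above concrete, here is a minimal toy model of it. This is a sketch, not the actual ggrs code: `ToySession`, its fields, and the simplified `Request` and `SessionError` enums are hypothetical stand-ins, and the threshold check is an assumed simplification. It only shows how queuing the rollback, resetting the tracked frame, and then returning early loses the correction.

```rust
// Toy model only: the names echo the issue text, but none of this is the real ggrs API.
#[derive(Debug, PartialEq)]
enum Request {
    LoadGameState { frame: i32 },
    AdvanceFrame,
}

#[derive(Debug, PartialEq)]
enum SessionError {
    PredictionThreshold,
}

struct ToySession {
    current_frame: i32,
    last_confirmed_frame: i32,
    max_prediction: i32, // assumed value, chosen by the caller
    first_incorrect_frame: Option<i32>,
}

impl ToySession {
    /// Queue the rollback (load + resimulate), then forget the incorrect frame.
    fn adjust_gamestate(&mut self, first_incorrect: i32, requests: &mut Vec<Request>) {
        requests.push(Request::LoadGameState { frame: first_incorrect });
        for _ in first_incorrect..self.current_frame {
            requests.push(Request::AdvanceFrame);
        }
        // Stands in for self.sync_layer.reset_prediction().
        self.first_incorrect_frame = None;
    }

    fn advance_frame(&mut self) -> Result<Vec<Request>, SessionError> {
        let mut requests = Vec::new();
        if let Some(frame) = self.first_incorrect_frame {
            self.adjust_gamestate(frame, &mut requests);
        }
        // Assumed simplification of the threshold check on local input.
        if self.current_frame - self.last_confirmed_frame >= self.max_prediction {
            // Early return: `requests`, including the rollback, is dropped here.
            return Err(SessionError::PredictionThreshold);
        }
        requests.push(Request::AdvanceFrame);
        self.current_frame += 1;
        Ok(requests)
    }
}
```

With a session at frame 504, last confirmed frame 498, an assumed `max_prediction` of 6, and `first_incorrect_frame` set to `Some(500)`, the first `advance_frame` call returns `Err(PredictionThreshold)` and leaves `first_incorrect_frame` as `None`. Once the confirmed frame catches up, the next call returns only a plain `AdvanceFrame`, with no `LoadGameState` for frame 500.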

I'm still reasoning about exactly where the issue is / what the solution is. I believe the example games are vulnerable to this bug too.

Additional Context

Here are example logs from a 3-client game in which one client missed a correction. Player 0 missed the correction for player 1's input, player 2 applied it correctly, and players 0 and 2 are now desynced. (The frame listed in each input log line is the predicted frame, in case that is confusing.) The logs might be more confusing than helpful, but they demonstrate that both clients were notified of player 1's input and ggrs pushed requests for corrections on both, yet the client that hit the prediction threshold did not roll back, causing the desync.

Player 0's logs (Missed correction on player 1's input due to prediction threshold)

Frame 500 is confirmed without the rollback being applied. (There is no log here for bones performing a rollback like in the second block, only the prediction threshold error.)

2024-04-10T02:42:31.164609Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 504 input: DensePlayerControl { .0: 992, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(1.0, 0.0)) }
2024-04-10T02:42:31.180776Z  WARN ggrs::input_queue: Setting first incorrect frame: 500
2024-04-10T02:42:31.180807Z  WARN ggrs::sessions::p2p_session: Requesting rollback to frame: 500
2024-04-10T02:42:31.180818Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 498
2024-04-10T02:42:31.180823Z  WARN ggrs::sync_layer: Prediction threshold error
2024-04-10T02:42:31.180829Z  WARN bones_framework::networking: Freezing game while waiting for network to catch-up.
2024-04-10T02:42:31.214065Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 500
2024-04-10T02:42:31.216185Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 505 input: DensePlayerControl { .0: 0, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(0.0, 0.0)) }

Player 2's logs (correction applied correctly on player 1's input):

The rollback is logged in game, and player 1's next input is confirmed.

2024-04-10T02:42:31.182435Z DEBUG bones_framework::networking: Net player(1) local: false, status: Predicted, frame: 503 input: DensePlayerControl { .0: 992, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(1.0, 0.0)) }
2024-04-10T02:42:31.189945Z  WARN ggrs::input_queue: Setting first incorrect frame: 500
2024-04-10T02:42:31.189965Z  WARN ggrs::sessions::p2p_session: Requesting rollback to frame: 500
2024-04-10T02:42:31.189974Z  WARN ggrs::sessions::p2p_session: Set last confirmed frame: 499
2024-04-10T02:42:31.190980Z DEBUG bones_framework::networking: Loading (rollback) frame: 500
2024-04-10T02:42:31.190999Z DEBUG bones_framework::networking: Net player(1) local: false, status: Confirmed, frame: 504 input: DensePlayerControl { .0: 0, jump_pressed: false, shoot_pressed: false, grab_pressed: false, slide_pressed: false, ragdoll_pressed: false, move_direction: DenseMoveDirection(Vec2(0.0, 0.0)) }

To Reproduce

I can repro this in jumpy pretty easily and can provide steps, but I will try to write a test that reproduces it, for testing and preventing regressions, once I figure out what to do here.

One thing that helps repro is having 4 clients open at once (through a relay server, so not p2p locally) with 200+ ping, which produces lots of prediction threshold errors :) (A silly way of saying that poor network conditions definitely bring this to light; high ping plus a lower prediction window might also do it.)

MaxCWhitehead · 2024-04-11T02:03:13Z

Haven't gotten to exploring what kind of fix might make the most sense, but I wrote a test that reproduces this.

I implemented a fake DebugSocket mechanism that allows the test implementation to control when messages are actually delivered between clients, to help reproducibly enter a state in which a correction happens at same time as prediction threshold error.
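For readers who have not seen the branch, the idea of a test-controlled socket can be sketched roughly as follows. This is only an illustration under my own assumptions: `HeldSocket` and its methods are hypothetical names and are not the DebugSocket from the linked test or any ggrs trait implementation.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a test-controlled socket: messages that are sent
/// stay invisible to the receiver until the test explicitly releases them.
struct HeldSocket<M> {
    held: VecDeque<M>,        // sent, but not yet released by the test
    deliverable: VecDeque<M>, // released, waiting to be received
}

impl<M> HeldSocket<M> {
    fn new() -> Self {
        HeldSocket {
            held: VecDeque::new(),
            deliverable: VecDeque::new(),
        }
    }

    /// Called by the sending client; the message is only queued.
    fn send(&mut self, msg: M) {
        self.held.push_back(msg);
    }

    /// Called by the test: release up to `n` held messages in send order.
    /// Returns how many messages were actually released.
    fn release(&mut self, n: usize) -> usize {
        let count = n.min(self.held.len());
        self.deliverable.extend(self.held.drain(..count));
        count
    }

    /// Called by the receiving client; returns only released messages.
    fn receive_all(&mut self) -> Vec<M> {
        self.deliverable.drain(..).collect()
    }
}
```

Holding one peer's input back while releasing another's is what lets a test reach the state where a correction and a prediction threshold error land on the same frame.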

The simplest form I have found to repro this is with 3 clients.

Client A changes its input, but B and C do not receive it yet.
Client C gets A's new input and triggers a rollback, but has not yet received B's input, so it also gets a prediction threshold error.
Because of the error, C does not get the requests for the rollback and misses the correction.
Then tick a couple more normal frames so that desync detection messages are exchanged, and the test fails due to the desync.

Here's the test for reference: main...MaxCWhitehead:ggrs:prediction-error-rollback-test

Will look into how we might fix and test against this.

gschup · 2024-06-01T06:53:03Z

Hi! Thanks for posting this bug report, and sorry for not responding sooner. I have just posted a PR that slightly alters the logic for handling inputs. I am not 100% sure it fixes this issue, but it might. Would you be able to test this again on the lockstep branch?
-> #79

gschup · 2024-06-01T09:55:16Z

#70 has additional information on this issue.

MaxCWhitehead added the bug label Apr 10, 2024

This was referenced Apr 11, 2024:

Possible fix for PredictionThreshold error dropping rollback requests causing desync #77 (Draft)

Rollback requests ignored when max_predictions number of local inputs is saved #70 (Closed)
