Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReconnectingConnection: Add heartbeat check. #264

Merged
merged 2 commits into from
Jun 25, 2023

Conversation

shachlanAmazon
Copy link
Contributor

No description provided.

@shachlanAmazon
Copy link
Contributor Author

@barshaul moved the heartbeat logic to ClientCMD.

Copy link
Collaborator

@barshaul barshaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments

};
log_debug("ClientCMD", "performing heartbeat");
if connection
.send_packed_command(&redis::cmd("PING"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to run it with timeout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and what should we do on a timeout?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would retry the PING for 2-3 times if a timeout occur, then if timeout still persist close the connection and try to reconnect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why retry the ping?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that for a response timeout we can retry before we decide that the connection needs to be re-established

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't one timeout indicative?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we might fell on a time where the node was busy. But it doesnt matter, we can do it without retrying

tokio::time::sleep(super::HEARTBEAT_SLEEP_DURATION).await;
if reconnecting_connection.is_dropped() {
log_debug("ClientCMD", "heartbeat stopped after client was dropped");
// Client was dropped, heartbeat can stop.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to be that the client was dropped, we can drop the connection for topology changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, replace "client" with "connection"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

})
});

let _new_server = receiver.await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we check before that the connection is indeed in 'reconnecting' state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean that the CMD client is in reconnecting state? It will contain multiple internal connections.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can check it throws a connection error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tests the client, not the connection. what does it mean that a client is in a reconnecting state?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that we expect it to raise a connection error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then how will the test differ from test_report_disconnect_and_reconnect_after_temporary_disconnect?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so leave it as is

@shachlanAmazon
Copy link
Contributor Author

ping @barshaul

@shachlanAmazon shachlanAmazon merged commit 5f6cc1f into valkey-io:main Jun 25, 2023
8 checks passed
@shachlanAmazon shachlanAmazon deleted the heartbeat branch June 25, 2023 15:38
@shachlanAmazon shachlanAmazon mentioned this pull request Jul 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants