Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edit httpd balancer member host and port #365

Closed
wants to merge 2 commits into from
Closed

Edit httpd balancer member host and port #365

wants to merge 2 commits into from

Conversation

Mrida
Copy link

@Mrida Mrida commented Jun 29, 2023

The goal of this pr is to change the balancer member url (host and port) dynamically.

In order to edit a member, we defined 2 params to pass:
b_ewyes: a boolean that needs to be set to 1 for the edit to work
b_ewrkr: the new url for the worker

The UI was updated to account for these 2 fields.

worker->s->port, 0,
worker->cp->dns_pool);
worker->cp->addr = addr;
}
Copy link
Member

@ylavic ylavic Jun 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coincidentally I just started a discussion there https://lists.apache.org/thread/jftfd6mo6p3b0tw694rlh09gdl7189hq about the unsafety of changing worker->cp->addr like this. We need a thread-safe way to do it, probably outside the scope of this PR though..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to talk about changing worker->s->hostname_ex/port (even under the worker lock) while there are concurrent readers like here which don't take the lock.

@ylavic
Copy link
Member

ylavic commented Jun 30, 2023

Is it not possible to create a new worker and delete/disable the old one instead? This might be easier from a synchronisation perspective.

@Mrida
Copy link
Author

Mrida commented Jun 30, 2023

Disabling the old worker is not possible because we will eventually run out of workers.
I see two issues with deleting the worker:
1- The worker might be already working on something
2- We are not aware of a way to delete a worker safely

I propose to use a lock to make sure that read and write operations are seen by every thread.
If you have performance consideration, we can use an atomic CAS operation to lock only when a worker is being updated.

@Mrida
Copy link
Author

Mrida commented Jul 20, 2023

Hello ylavic,
I would really appreciate your feedback on my last comment in order to proceed forward with this pr.

@ylavic
Copy link
Member

ylavic commented Jul 20, 2023

The problem is that the worker shared data are used in multiple place during runtime, just look at worker->s->name/hostname usages, we can't change the ->name (identifier) of the worker or the ->hostname/port underneath the threads using them for current connections, and we can't (won't) synchronize everywhere neither.
If you look at #367 you can see the kind of synchronization that's needed to "simply" cope with an external DNS record change, the shared hostname does not even change, and it's from a single place/step of connection processing..

The only sane way I can think of to replace a proxy worker is:

  1. disable/shutdown the old worker
  2. create the new worker
  3. delete the old worker (and refuse to do so while there are active users/connections, hence require 1. first possibly)

(actually order between 2. and 3. is business decision, not functional)

So implementing 3. is what's missing, for that we'd need to keep track of the used and unused worker shared slots to reuse them, and of the users. Not a trivial change, but probably doable. Worth it I don't know, one can also over-reserve worker slots and schedule a restart from time to time too.

@Mrida
Copy link
Author

Mrida commented Jul 21, 2023

Hello @ylavic, thank you for the reply.

Our need can be fullfilled by updating the url for a disabled worker only, so what we intend to do is the following:
1- Disable the worker
2- Edit the worker url
3- Re-enable the worker

From httpd side, this means that we need to add a check that a worker is disabled before allowing to update its url, otherwise the post will return an error.
The frontend is to be adapted as well to reflect this change.

Do you think we can proceed with this suggestion ?

@ylavic
Copy link
Member

ylavic commented Jul 21, 2023

Unfortunately this is still racy, some requests might still arrive and reach the disabled worker, and httpd still needs a stable worker name/hostname to behave correctly (return 503) in this case.
This can't be achieved by replacing the worker name/hostname inline I think, and it's why I proposed to add+delete, where delete could wait for the last user of the worker to be effective for instance (the proxy_conn_recs using the worker could be tracked by atomic refcounting).

@Mrida Mrida closed this by deleting the head repository Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants