Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: scale Juju controllers according to anvil cluster size #30

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

wyattrees
Copy link
Contributor

Add controllers to machines missing them once the cluster size is at least 3, increasing the number of controllers for every odd machine added

@SK1Y101
Copy link
Member

SK1Y101 commented Jul 9, 2024

Should we add information in the README the explains that juju controllers will grow with the cluster, rather than leaving it transparent to the user?

anvil-python/anvil/utils.py Outdated Show resolved Hide resolved
anvil-python/anvil/utils.py Outdated Show resolved Hide resolved
anvil-python/anvil/utils.py Outdated Show resolved Hide resolved
anvil-python/anvil/jobs/juju.py Outdated Show resolved Hide resolved
anvil-python/anvil/provider/local/commands.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@skatsaounis skatsaounis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just finished another round of review. Apart from the comments, I think that we should also add a scale down process. That can be handled on the same step with extra logic or the already implemented step should be renamed to something like ScaleUpJujuStep and a second one should be introduced: ScaleDownJujuStep. The scale down will just be remove-unit controller/<unit> and has to run when nodes are removed from the anvil cluster.

anvil-python/anvil/jobs/manifest.py Outdated Show resolved Hide resolved
anvil-python/anvil/provider/local/commands.py Outdated Show resolved Hide resolved
anvil-python/anvil/commands/juju.py Outdated Show resolved Hide resolved
anvil-python/anvil/jobs/juju.py Outdated Show resolved Hide resolved
@wyattrees
Copy link
Contributor Author

I just finished another round of review. Apart from the comments, I think that we should also add a scale down process. That can be handled on the same step with extra logic or the already implemented step should be renamed to something like ScaleUpJujuStep and a second one should be introduced: ScaleDownJujuStep. The scale down will just be remove-unit controller/<unit> and has to run when nodes are removed from the anvil cluster.

juju remove-unit controller/<unit> is not supported--it errors with "ERROR removing unit controller/2 failed: removing units from the controller application not supported." AFAIK you can only remove controller machines, which would also force us to remove the other applications on those machines

@skatsaounis
Copy link
Collaborator

This is blocked by: https://bugs.launchpad.net/juju/+bug/2073986

MAX_JUJU_CONTROLLERS - len(self.controller_machines),
)

cmd = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command should be wrapped with try/expect since it can fail and we need to know why. This is a failure than can happen because of the linked bug:

ubuntu@maas-1:~$ juju enable-ha -n 3 --to 3
ERROR juju-ha-space is not set and a unique usable address was not found for machines: 0

With the current code, we fail in an unexpected way:

EBUG    Command '['/snap/maas-anvil/x1/juju/bin/juju', 'enable-ha', '-n', '3', '--to', '3']' returned non-zero exit status 1.                                                        utils.py:38
                    Traceback (most recent call last):                                                                                                                                                      
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/anvil/utils.py", line 32, in __call__                                                                                          
                        return self.main(*args, **kwargs)                                                                                                                                                   
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/core.py", line 1078, in main                                                                                             
                        rv = self.invoke(ctx)                                                                                                                                                               
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/core.py", line 1688, in invoke                                                                                           
                        return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                                             
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/core.py", line 1688, in invoke                                                                                           
                        return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                                             
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/core.py", line 1434, in invoke                                                                                           
                        return ctx.invoke(self.callback, **ctx.params)                                                                                                                                      
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/core.py", line 783, in invoke                                                                                            
                        return __callback(*args, **kwargs)                                                                                                                                                  
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func                                                                                     
                        return f(get_current_context(), *args, **kwargs)                                                                                                                                    
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/anvil/provider/local/commands.py", line 599, in remove                                                                         
                        run_plan(plan, console)                                                                                                                                                             
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/sunbeam/jobs/common.py", line 277, in run_plan                                                                                 
                        result = step.run(status)                                                                                                                                                           
                      File "/snap/maas-anvil/x1/lib/python3.10/site-packages/anvil/commands/juju.py", line 186, in run                                                                                      
                        process = subprocess.run(                                                                                                                                                           
                      File "/usr/lib/python3.10/subprocess.py", line 526, in run                                                                                                                            
                        raise CalledProcessError(retcode, process.args,                                                                                                                                     
                    subprocess.CalledProcessError: Command '['/snap/maas-anvil/x1/juju/bin/juju', 'enable-ha', '-n', '3', '--to', '3']' returned non-zero exit status 1.                                    
           WARNING  An unexpected error has occurred. Please run 'maas-anvil inspect' to generate an inspection report.                                                                          utils.py:43
           ERROR    Error: Command '['/snap/maas-anvil/x1/juju/bin/juju', 'enable-ha', '-n', '3', '--to', '3']' returned non-zero exit status 1.                                                 utils.py:44
           ERROR    Task was destroyed but it is pending!                                                                                                                                base_events.py:1758
                    task: <Task pending name='Task-59' coro=<Connection._pinger.<locals>._do_ping() done, defined at                                                                                        
                    /snap/maas-anvil/x1/lib/python3.10/site-packages/juju/client/connection.py:599> wait_for=<Future cancelled>                                                                             
                    cb=[create_task_with_handler.<locals>._task_result_exp_handler(task_name='tmp', logger=<Logger juju....ction (ERROR)>)() at                                                             
                    /snap/maas-anvil/x1/lib/python3.10/site-packages/juju/jasyncio.py:39]>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants