Multiple vCenters at the same time can cause errors #364

Open
MallocArray opened this issue Nov 15, 2022 · 1 comment

SUMMARY

This may be a duplicate of #336: playbooks with multiple tasks that use the vmware.vmware_rest modules against multiple vCenters at the same time may fail. The failure is not fully reproducible, but it happens often, and a retry sometimes fixes it. I have been unable to reproduce it with a single vCenter. Running cloud.common 2.1.2 and vmware.vmware_rest 2.2.0.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

Observed in
appliance_access_ssh
appliance_networking_proxy
and possibly others

ANSIBLE VERSION
ansible [core 2.13.6]
  config file = /runner/ansible.cfg
  configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.8/site-packages/ansible
  ansible collection location = /runner/collections
  executable location = /usr/local/bin/ansible
  python version = 3.8.13 (default, Jun 24 2022, 15:27:57) [GCC 8.5.0 20210514 (Red Hat 8.5.0-13)]
  jinja version = 3.1.2
  libyaml = True
COLLECTION VERSION
Collection         Version
------------------ -------
vmware.vmware_rest 2.2.0
cloud.common       2.1.2
CONFIGURATION
COLLECTIONS_PATHS(/runner/ansible.cfg) = ['/runner/collections']
DEFAULT_FILTER_PLUGIN_PATH(/runner/ansible.cfg) = ['/runner/custom_filters']
DEFAULT_ROLES_PATH(/runner/ansible.cfg) = ['/runner/roles']
DEFAULT_STRATEGY_PLUGIN_PATH(/runner/ansible.cfg) = ['/runner/custom_plugins/mitogen-0.3.0-rc.0/ansible_mitogen/plugins/strategy']
DEFAULT_TIMEOUT(/runner/ansible.cfg) = 40
HOST_KEY_CHECKING(/runner/ansible.cfg) = False
PARAMIKO_LOOK_FOR_KEYS(/runner/ansible.cfg) = False
OS / ENVIRONMENT

Ansible Execution Environment from ansible-builder
vCenter 7.0 U3

STEPS TO REPRODUCE

Run a playbook with several consecutive tasks that all use the same collection against multiple vCenters at once:

  tasks:
    - name: DNS Servers
      vmware.vmware_rest.appliance_networking_dns_servers:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        servers: "{{ dns_servers }}"
        mode: is_static
      delegate_to: localhost
      retries: 12
      delay: 10
      tags:
        - dns

    - name: Content Library
      community.vmware.vmware_content_library_manager:
        hostname: "{{ inventory_hostname }}"
        username: "{{ vcenter_user }}"
        password: "{{ vcenter_password }}"
        validate_certs: "{{ vmware_validate_certs }}"
        library_name: "{{ item.library_name }}"
        library_description: "{{ item.library_description | default(omit) }}"
        library_type: "{{ item.library_type | default(omit) }}"
        datastore_name: "{{ item.datastore_name | default(omit) }}"
        subscription_url: "{{ item.subscription_url | default(omit) }}"
        ssl_thumbprint: "{{ item.ssl_thumbprint | default(omit) }}"
        update_on_demand: "{{ item.update_on_demand | default(omit) }}"
        state: "{{ item.state | default(omit) }}"
      delegate_to: localhost
      loop: "{{ content_libraries }}"
      loop_control:
        label: "{{ item.library_name }}"
      when: content_libraries is defined
      retries: 12
      delay: 10
      tags:
        - content_library

    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
      retries: 12
      delay: 10
      tags:
        - timezone

    - name: NTP configuration
      vmware.vmware_rest.appliance_ntp:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        servers: "{{ ntp_servers }}"
      delegate_to: localhost
      retries: 12
      delay: 10
      tags:
        - ntp
      notify: Restart the ntpd service

    - name: Enable NTP time sync
      vmware.vmware_rest.appliance_timesync:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        mode: NTP
      delegate_to: localhost
      retries: 12
      delay: 10
      tags:
        - ntp

    - name: Network Proxy - HTTP
      vmware.vmware_rest.appliance_networking_proxy:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        enabled: true
        server: "{{ http_proxy }}"
        port: "{{ http_proxy_port }}"
        protocol: http
      delegate_to: localhost
      when: http_proxy is defined
      retries: 12
      delay: 10
      tags:
        - proxy
        - http

    - name: Network Proxy - HTTPS
      vmware.vmware_rest.appliance_networking_proxy:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        enabled: true
        server: "{{ https_proxy }}"
        port: "{{ https_proxy_port }}"
        protocol: https
      delegate_to: localhost
      when: https_proxy is defined
      retries: 12
      delay: 10
      tags:
        - proxy
        - https
EXPECTED RESULTS

All tasks complete the same way they do when the play is run with the option serial: 1 set.
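
For comparison, a serialized run of the kind referred to above could look like the following minimal sketch. The group name `vcenters` is a hypothetical placeholder for whatever inventory group holds the vCenter hosts; the task itself is copied from the playbook in this report.

```yaml
# Sketch only: "vcenters" is a hypothetical inventory group name.
- name: Configure vCenter appliances one host at a time
  hosts: vcenters
  gather_facts: false
  serial: 1  # process one vCenter per batch instead of all hosts in parallel
  tasks:
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
```

With serial: 1 the whole play finishes for one vCenter before the next one starts, which avoids the parallel access but makes the run take longer across 12 vCenters.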

ACTUAL RESULTS

On the 7th task, one of the 12 vCenters in the inventory failed with the error below while all of the others completed successfully. Retrying may succeed on all of them, or fail with the same error on a different vCenter.

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible_collections.cloud.common.plugins.module_utils.turbo.exceptions.EmbeddedModuleUnexpectedFailure: Cannot decode plugin answer: b''
fatal: [randomvcenter.domain.com -> localhost]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/cloud/common/plugins/module_utils/turbo/common.py\", line 106, in communicate\n  File \"/usr/lib64/python3.8/json/__init__.py\", line 357, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/lib64/python3.8/json/decoder.py\", line 337, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/lib64/python3.8/json/decoder.py\", line 355, in raw_decode\n    raise JSONDecodeError(\"Expecting value\", s, err.value) from None\njson.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/root/.ansible/tmp/ansible-tmp-1668545865.656959-11481-114100221361235/AnsiballZ_appliance_access_ssh.py\", line 107, in <module>\n    _ansiballz_main()\n  File \"/root/.ansible/tmp/ansible-tmp-1668545865.656959-11481-114100221361235/AnsiballZ_appliance_access_ssh.py\", line 99, in _ansiballz_main\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n  File \"/root/.ansible/tmp/ansible-tmp-1668545865.656959-11481-114100221361235/AnsiballZ_appliance_access_ssh.py\", line 47, in invoke_module\n    runpy.run_module(mod_name='ansible_collections.vmware.vmware_rest.plugins.modules.appliance_access_ssh', init_globals=dict(_module_fqn='ansible_collections.vmware.vmware_rest.plugins.modules.appliance_access_ssh', _modlib_path=modlib_path),\n  File \"/usr/lib64/python3.8/runpy.py\", line 207, in run_module\n    return _run_module_code(code, init_globals, run_name, mod_spec)\n  File \"/usr/lib64/python3.8/runpy.py\", line 97, in _run_module_code\n    _run_code(code, mod_globals, init_globals,\n  File \"/usr/lib64/python3.8/runpy.py\", line 87, 
in _run_code\n    exec(code, run_globals)\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/vmware/vmware_rest/plugins/modules/appliance_access_ssh.py\", line 261, in <module>\n  File \"/usr/lib64/python3.8/asyncio/base_events.py\", line 616, in run_until_complete\n    return future.result()\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/vmware/vmware_rest/plugins/modules/appliance_access_ssh.py\", line 177, in main\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/cloud/common/plugins/module_utils/turbo/module.py\", line 119, in __init__\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/cloud/common/plugins/module_utils/turbo/module.py\", line 154, in run_on_daemon\n  File \"/tmp/ansible_vmware.vmware_rest.appliance_access_ssh_payload_ypjpjcua/ansible_vmware.vmware_rest.appliance_access_ssh_payload.zip/ansible_collections/cloud/common/plugins/module_utils/turbo/common.py\", line 109, in communicate\nansible_collections.cloud.common.plugins.module_utils.turbo.exceptions.EmbeddedModuleUnexpectedFailure: Cannot decode plugin answer: b''\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
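
A note on the retries in the playbook above: as far as I know, on ansible-core releases before 2.16 (this report uses 2.13.6), retries and delay only take effect when the task also has an until condition. The sketch below shows one way to make a task actually retry on this intermittent failure. The variable name `proxy_result` is an assumption introduced for illustration; the rest of the task is taken from the playbook in this report.

```yaml
# Sketch only: "proxy_result" is a hypothetical variable name.
- name: Network Proxy - HTTP
  vmware.vmware_rest.appliance_networking_proxy:
    vcenter_hostname: "{{ inventory_hostname }}"
    vcenter_username: "{{ vcenter_user }}"
    vcenter_password: "{{ vcenter_password }}"
    vcenter_validate_certs: "{{ vmware_validate_certs }}"
    enabled: true
    server: "{{ http_proxy }}"
    port: "{{ http_proxy_port }}"
    protocol: http
  delegate_to: localhost
  when: http_proxy is defined
  register: proxy_result
  until: proxy_result is not failed  # retry while the module call fails
  retries: 12
  delay: 10
```

This only masks the intermittent failure by retrying; it does not address the underlying cause.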
goneri added a commit to goneri/vmware.vmware_rest that referenced this issue Dec 1, 2022
Test to validate we can run the same module in parallel.

See: ansible-collections#364
softwarefactory-project-zuul bot pushed a commit that referenced this issue Mar 14, 2023
appliance-multi-hosts tests

Test to validate we can run the same module in parallel.
See: #364

Reviewed-by: Alina Buzachis
mikemorency (Collaborator) commented Dec 2, 2024

I believe this is an issue with the turbo server. If you try to switch clusters but the turbo server doesn't reset (its reset is time-based), then the old cluster will be used instead of the new one. This will probably be fixed by #499.

It's not ideal, but as a workaround you can add a pause between the tasks: run all of the cluster A tasks first, pause for 15 seconds, then run all of the cluster B tasks.
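
The workaround could be laid out roughly as in the sketch below, using one play per target with a pause play in between. The group names `vcenter_a` and `vcenter_b` are hypothetical, the Timezone task is borrowed from the original report as a stand-in for the real task list, and the 15 second value is the one suggested above rather than a verified setting.

```yaml
# Sketch only: "vcenter_a" and "vcenter_b" are hypothetical inventory groups.
- name: Configure the first target
  hosts: vcenter_a
  gather_facts: false
  tasks:
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost

- name: Give the turbo server time to reset
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Pause between targets
      ansible.builtin.pause:
        seconds: 15

- name: Configure the second target
  hosts: vcenter_b
  gather_facts: false
  tasks:
    - name: Timezone
      vmware.vmware_rest.appliance_system_time_timezone:
        vcenter_hostname: "{{ inventory_hostname }}"
        vcenter_username: "{{ vcenter_user }}"
        vcenter_password: "{{ vcenter_password }}"
        vcenter_validate_certs: "{{ vmware_validate_certs }}"
        name: America/Chicago
      delegate_to: localhost
```

Splitting the targets into separate plays keeps all tasks for one target together, so the pause is only paid once per switch rather than once per task.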
