pylibssh unable to handle a long stream of output from command execution #549

Open
NilashishC opened this issue Dec 19, 2023 · 0 comments

SUMMARY
  • Related to ansible-collections/cisco.nxos#736 (File copy hangs until timeout is reached regardless of transfer time)

  • We are currently encountering a situation with pylibssh where execution consistently halts once a certain percentage of the data has been received.

  • The following command is sent to a Cisco Nexus switch (using pylibssh transport) to initiate copying a file (~2GB) from a remote server to itself.
    copy scp://test@192.168.1.100/ansible/nxos64.10.1.1.bin bootflash:/nxos64.10.1.1.bin vrf management use-kstack

  • On every run, the data suddenly stops arriving once the transfer is roughly 28% complete. Ansible then waits until persistent_command_timeout is reached and fails the task. Because of how the network_cli connection works, the subsequent tasks also fail (unless the connection is explicitly reset): it cannot identify a command prompt in the last received response window, which is still stuck at the "28%" output. Note that the actual file pull continues on the device and completes successfully.

  • We have tried bumping command_timeout to a much bigger value than is required for the file pull to complete. The result is still the same.
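
The prompt-detection failure described above can be illustrated with a minimal sketch. It is not the network_cli or pylibssh implementation: the prompt pattern, the window size, and the names `PROMPT_RE`, `last_window`, and `prompt_seen` are all hypothetical, and `switch# ` is a made-up prompt. It only shows why a response window that ends in a `\r`-terminated progress line never looks like a prompt, so a reader waiting for one would sit until the timeout.

```python
import re

# Hypothetical prompt pattern; real terminal plugins define their own regexes.
PROMPT_RE = re.compile(rb"[\r\n]?[\w\-.]+#\s?$")


def last_window(buffer: bytes, size: int = 256) -> bytes:
    """Return the tail of the receive buffer that a prompt check would inspect."""
    return buffer[-size:]


def prompt_seen(buffer: bytes) -> bool:
    """True if the tail of the buffer ends with something that looks like a prompt."""
    return PROMPT_RE.search(last_window(buffer)) is not None


# Last data received before the stream stalls (taken from the progress output).
stalled = b"nxos64.10.1.1.bin  ...  28%  438MB   5.2MB/s   03:21 ETA \r"
# What the tail would look like if a prompt eventually arrived (illustrative only).
finished = stalled + b"\r\nswitch# "

print(prompt_seen(stalled))   # no prompt in the window, so the caller keeps waiting
print(prompt_seen(finished))  # a prompt at the end of the window ends the wait
```

Under these assumptions, the first check fails and the second succeeds, which matches the reported symptom: as long as the last window holds only progress output, later tasks on the same connection have no prompt to match.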

A small snippet of the output that this command generates and that is sent over the wire:

nxos64.10.1.1.bin                           ...               20%  309MB   1.1MB/s   18:31 ETA '
  
nxos64.10.1.1.bin                           ...            22%  340MB   2.7MB/s   07:11 ETA \r' 

nxos64.10.1.1.bin                           ...          25%  379MB   3.8MB/s   04:57 ETA \r'

nxos64.10.1.1.bin                           ...        28%  438MB   5.2MB/s   03:21 ETA \r' 
  • This does not happen if we switch to paramiko.
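
To pin down where the stream stalls when comparing the two transports, the progress output shown above can be parsed for the last reported percentage. This is only a sketch: `PROGRESS_RE` and `last_progress` are hypothetical helpers written against the line format in the snippet, not part of pylibssh, paramiko, or Ansible.

```python
import re

# Matches one progress update in the format shown in the snippet above.
PROGRESS_RE = re.compile(
    r"(?P<name>\S+)\s+.*?(?P<pct>\d+)%\s+(?P<size>\d+)MB\s+"
    r"(?P<rate>[\d.]+)MB/s\s+(?P<eta>\d+:\d+) ETA"
)


def last_progress(raw: str):
    """Return (percent, transferred MB) from the last progress update, or None."""
    # Progress updates are separated by carriage returns, not newlines.
    updates = [m for chunk in raw.split("\r") if (m := PROGRESS_RE.search(chunk))]
    if not updates:
        return None
    last = updates[-1]
    return int(last["pct"]), int(last["size"])


raw = (
    "nxos64.10.1.1.bin   ...   25%  379MB   3.8MB/s   04:57 ETA \r"
    "nxos64.10.1.1.bin   ...   28%  438MB   5.2MB/s   03:21 ETA \r"
)
print(last_progress(raw))
```

Running the same helper over the raw output captured with each transport would show whether the last update seen under pylibssh is consistently the same one.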
ISSUE TYPE
  • Bug Report
PYLIBSSH and LIBSSH VERSION
Name: ansible-pylibssh
Version: 1.1.0
Summary: Python bindings for libssh client specific to Ansible use case
Home-page: https://github.com/ansible/pylibssh
Author: Ansible, Inc.
Author-email: info+github/ansible/pylibssh@ansible.com
License: LGPLv2+
Location: /home/nchakrab/.virtualenvs/core/lib/python3.10/site-packages
Requires: 
Required-by: 
bash-4.4# rpm -qa| grep libssh
libssh-0.9.6-3.el8.x86_64
python39-ansible-pylibssh-1.0.0-1.el8ap.x86_64
libssh-config-0.9.6-3.el8.noarch
OS / ENVIRONMENT
bash-4.4# cat /etc/os-release 
NAME="Red Hat Enterprise Linux"
VERSION="8.6 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.6 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.6"
STEPS TO REPRODUCE
---
- hosts: nxos
  gather_facts: no
  tasks:
    - name: initiate file copy from device (will take 10 minutes)
      cisco.nxos.nxos_file_copy:
        file_pull: true
        local_file: nxos64-cs.10.2.5.M.bin
        local_file_directory: /
        remote_file: /tmp/nxos64-cs.10.2.5.M.bin
        remote_scp_server: 192.168.1.10
        remote_scp_server_user: admin
        remote_scp_server_password: admin
        vrf: management
      ignore_errors: true
    - name: "SCP Copying file {{ file_to_copy }} to device"
      ansible.netcommon.cli_command:
        check_all: true
        command: "copy scp://{{ https_scp_servers[copy_server]['user'] }}@{{ https_scp_servers[copy_server]['ip'] }}{{ https_scp_servers[copy_server]['path'] }}/{{ file_to_copy }} bootflash:/{{ file_to_copy }} vrf {{ copy_vrf }}"
        prompt: "password"
        answer: "{{ https_scp_servers[copy_server]['pass'] | string }}"
#      register: scp_output
EXPECTED RESULTS
  • Task ends successfully once the file pull operation is complete.
ACTUAL RESULTS
  • Execution halts at a certain stage, causing Ansible to wait until the command timeout is reached; it then fails the task and the subsequent ones.