-
Notifications
You must be signed in to change notification settings - Fork 405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
failed assertion in common-channel.c:705: `!channel->sent_close' #321
Comments
Thanks for the debug log. It looks like something is going wrong when a channel is opened around the same time as another is being closed (which should work fine). I'll try and reproduce it here to debug, if you have any details of the libssh call sequence that would help. |
I've reproduced it here so don't need any more details. |
Great, thanks for looking into it! |
FTR, i'm also seeing this while using the guix bug report: https://issues.guix.gnu.org/73306 |
I've figured what's going on and have a fix here, will push it after a bit of cleaning up later this week. |
If check_close() ran prior to a server channel exec/shell request, it would send a close immediately. This fix changes it to exclude write_fd==FD_UNINIT from being closed there. When a channel was closed by the time shell/exec request was received, then data sent hits an assertion. This fixes #321 on Github. The "pid == 0" check was initially added to avoid waiting to close a channel when a process has never been launched (which is correct), but that isn't correct in the case of the closed-fd test. Fixes: 8e6f73e ("- Remove "flushing" handling for exited processes)
I think #326 should fix this, I haven't yet added the testcase for it. |
Good news! I'm going to apply this on top of 2024.85 next week and will retry it based on our use case. |
The change appears to be good. |
This reproduces the problem reported in github #321 asyncssh is used to drive the connection for this test.
This reproduces the problem reported in github #321 asyncssh is used to drive the connection for this test.
If check_close() ran prior to a server channel exec/shell request, it would send a close immediately. This fix changes it to exclude write_fd==FD_UNINIT from being closed there. When a channel was closed by the time shell/exec request was received, then data sent hits an assertion. This fixes #321 on Github. The "pid == 0" check was initially added to avoid waiting to close a channel when a process has never been launched (which is correct), but that isn't correct in the case of the closed-fd test. Fixes: 8e6f73e ("- Remove "flushing" handling for exited processes)
This reproduces the problem reported in github #321 asyncssh is used to drive the connection for this test.
This reproduces the problem reported in github #321 asyncssh is used to drive the connection for this test.
@mkj I'm going to suggest this fix for Yocto, trying to also get it into its current LTS version, which is still using dropbear-2082.83. Is there anything wrong with applying it to that version? |
It should be right to apply for 2082.83 |
Sorry for the typo, of course I meant 2022.83. |
Hi,
dropbear is used on the server side to control an application that allows to transfers files.
Sporadically the session breaks entirely as part of an automated test that continuously fetches files from the server.
The mentioned assert is triggered in dropbear-2024.85 server on linux/x86_64 (linux-6.1.86 / glibc-2.38) and is no longer reproducible when commits a7ef149 and 8e6f73e are reverted (related to issue #85).
Issue first occurred after having updated to 2022.83.
2020.81 (and likely earlier versions) do not show this behavior.
Server is invoked using a systemd socket unit and the call
/usr/sbin/dropbear -vvvv -i -r /etc/dropbear/dropbear_rsa_host_key -w -W 1048576
stderr is redirected to a file:
dropbear-2024.85_assert.log
It seems a "channel" is tried to be used, which has just been closed.
The used build of dropbear is based on Yocto, which slightly patches the source (these changes are assumed to be not relevant).
As much as I understand the upstream default build options are used, localoptions.h contains
#define DEBUG_TRACE 5
only.
The remote client triggering the assert is based on libssh and intends to copy a couple of files. Most of the files are a few hundred bytes in size, largest is about 100K. It usually happens during a sequence of smaller files.
(Yet there is little information as to how this is implemented, I could dig into it if it helps to understand the general flow. I mostly have a proprietary CLI tool that allows to initiate the copy operation.)
The text was updated successfully, but these errors were encountered: