Add a delay between killing teamd processes #3325

saiarcot895 · 2024-10-14T01:13:49Z

What I did

When killing 10 or more teamd processes, add a delay of 0.1 seconds after every 10 kill signals/processes. The LAG scale tests (in `ecmp/inner_hashing/test_inner_hashing_lag.py` in sonic-mgmt) may create 100 LAGs, and when they are all destroyed, some of those LAGs may fail to be properly destroyed, leaving stale port channels around. This seems to happen because the netlink socket buffers on which the teamd processes receive notifications fill up with events from the other port channels/interfaces going down.

Why I did it

As a workaround, add some delays when killing the teamd processes, so that the netlink buffers don't fill up and cause messages to be dropped.

This delay was chosen arbitrarily, and it seems to work well with 100 LAGs on a KVM. It can probably be made a bit more aggressive if needed (e.g. maybe 0.05 seconds every 20 processes).
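The batching described above can be sketched in a few lines. This is a minimal Python illustration of the idea, not the code from this PR; the function name `terminate_in_batches` and its parameters are hypothetical, and the defaults simply mirror the 0.1 seconds / 10 processes mentioned in the description.

```python
import os
import signal
import time
from typing import Callable, Iterable


def terminate_in_batches(
    pids: Iterable[int],
    batch_size: int = 10,       # assumption: mirrors "every 10 kill signals"
    delay_s: float = 0.1,       # assumption: mirrors "0.1 seconds"
    kill: Callable[[int, int], None] = os.kill,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send SIGTERM to each pid, pausing after every `batch_size` signals.

    Returns the number of signals that were actually sent.
    """
    sent = 0
    for pid in pids:
        try:
            kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            # The process already exited; nothing to signal.
            continue
        sent += 1
        if sent % batch_size == 0:
            # Give the receivers time to drain their netlink buffers.
            sleep(delay_s)
    return sent
```

Passing `kill` and `sleep` as parameters is only a convenience so the pacing can be checked without signalling real processes. The more aggressive variant suggested above would correspond to `batch_size=20, delay_s=0.05`.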

How I verified it

On a KVM testbed with t0-116 topology with a bit more than 100 LAGs, stop teamd using `sudo systemctl stop teamd`, and verify that all of the LAGs were deleted, and there were no messages from the kernel similar to the following:

Oct 12 21:33:03 vlab-04 kernel: PortChannel41 (unregistering): Failed to send options change via netlink (err -105)
Oct 12 21:33:03 vlab-04 kernel: PortChannel17 (unregistering): Failed to send options change via netlink (err -105)
Oct 12 21:33:03 vlab-04 kernel: PortChannel22: Failed to send options change via netlink (err -105)
Oct 12 21:33:03 vlab-04 kernel: PortChannel22: Failed to send port change of device Ethernet136 via netlink (err -105)
Oct 12 21:33:03 vlab-04 kernel: PortChannel22: Port device Ethernet136 removed
Oct 12 21:33:03 vlab-04 kernel: PortChannel43: Failed to send options change via netlink (err -105)
Oct 12 21:33:03 vlab-04 kernel: PortChannel43: Failed to send port change of device Ethernet174 via netlink (err -105)
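
On Linux, errno 105 is `ENOBUFS` ("No buffer space available"), which is consistent with the full netlink buffers described above. A small sketch for scanning a syslog excerpt for these messages follows; the regular expression is written against the sample lines above only, and the names `NetlinkFailure` and `parse_netlink_failure` are hypothetical.

```python
import errno
import re
from typing import NamedTuple, Optional

# Pattern derived from the sample kernel messages above (an assumption:
# other kernel versions may format these messages differently).
NETLINK_FAIL_RE = re.compile(
    r"kernel: (?P<dev>PortChannel\d+)(?: \(unregistering\))?: "
    r"Failed to send (?P<what>.+?) via netlink \(err (?P<err>-?\d+)\)"
)


class NetlinkFailure(NamedTuple):
    device: str
    what: str
    errno_name: str


def parse_netlink_failure(line: str) -> Optional[NetlinkFailure]:
    """Return details of a netlink send failure, or None for other lines."""
    m = NETLINK_FAIL_RE.search(line)
    if m is None:
        return None
    code = abs(int(m.group("err")))
    # errno.errorcode maps numbers to names for the platform running this code.
    name = errno.errorcode.get(code, f"errno {code}")
    return NetlinkFailure(m.group("dev"), m.group("what"), name)
```

A clean shutdown in the sense of this verification would be one where no line in the kernel log yields a non-`None` result.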

Details if related

Partial fix for sonic-net/sonic-buildimage#19310.

saiarcot895 · 2024-10-22T01:20:05Z

/azpw run

mssonicbld · 2024-10-22T01:20:07Z

/AzurePipelines run

azure-pipelines · 2024-10-22T01:20:17Z

Azure Pipelines successfully started running 1 pipeline(s).

saiarcot895 · 2024-10-31T20:57:05Z

Comparing the time needed to send SIGTERM to the teamd processes before and after this change, it appears that the time is roughly the same for about 70 LAGs, as tested on a physical device.

Before:

2024 Oct 31 20:33:56.338480 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel1 pid 26
2024 Oct 31 20:33:56.345896 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel10 pid 35
2024 Oct 31 20:33:56.349179 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel101 pid 43
2024 Oct 31 20:33:56.354644 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel102 pid 51
2024 Oct 31 20:33:56.360318 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel103 pid 59
2024 Oct 31 20:33:56.400292 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel104 pid 67
2024 Oct 31 20:33:56.400309 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel11 pid 75
2024 Oct 31 20:33:56.400867 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel12 pid 83
2024 Oct 31 20:33:56.401188 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel13 pid 91
2024 Oct 31 20:33:56.401362 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel14 pid 100
2024 Oct 31 20:33:56.402117 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel15 pid 109
2024 Oct 31 20:33:56.411100 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel16 pid 117
2024 Oct 31 20:33:56.411989 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel17 pid 125
2024 Oct 31 20:33:56.441357 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel18 pid 133
2024 Oct 31 20:33:56.486524 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel19 pid 141
2024 Oct 31 20:33:56.486781 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel2 pid 149
2024 Oct 31 20:33:56.486951 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel20 pid 157
2024 Oct 31 20:33:56.487095 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel21 pid 165
2024 Oct 31 20:33:56.487985 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel22 pid 173
2024 Oct 31 20:33:56.487985 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel23 pid 181
2024 Oct 31 20:33:56.488143 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel24 pid 189
2024 Oct 31 20:33:56.491583 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel25 pid 197
2024 Oct 31 20:33:56.498010 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel26 pid 205
2024 Oct 31 20:33:56.501587 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel27 pid 213
2024 Oct 31 20:33:56.504982 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel28 pid 221
2024 Oct 31 20:33:56.560632 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel29 pid 229
2024 Oct 31 20:33:56.604924 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel3 pid 237
2024 Oct 31 20:33:56.604950 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel30 pid 245
2024 Oct 31 20:33:56.604974 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel31 pid 253
2024 Oct 31 20:33:56.605128 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel32 pid 261
2024 Oct 31 20:33:56.608329 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel33 pid 269
2024 Oct 31 20:33:56.646533 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel34 pid 277
2024 Oct 31 20:33:56.651903 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel35 pid 285
2024 Oct 31 20:33:56.656102 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel36 pid 293
2024 Oct 31 20:33:56.660620 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel37 pid 301
2024 Oct 31 20:33:56.677031 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel38 pid 309
2024 Oct 31 20:33:56.679521 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel39 pid 317
2024 Oct 31 20:33:56.685786 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel4 pid 325
2024 Oct 31 20:33:56.689406 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel40 pid 333
2024 Oct 31 20:33:56.692990 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel41 pid 341
2024 Oct 31 20:33:56.795228 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel42 pid 349
2024 Oct 31 20:33:56.802910 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel43 pid 357
2024 Oct 31 20:33:56.809630 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel44 pid 365
2024 Oct 31 20:33:56.843699 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel45 pid 373
2024 Oct 31 20:33:56.881881 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel46 pid 381
2024 Oct 31 20:33:56.897540 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel47 pid 389
2024 Oct 31 20:33:56.935467 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel48 pid 397
2024 Oct 31 20:33:56.937797 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel49 pid 405
2024 Oct 31 20:33:56.942456 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel5 pid 413
2024 Oct 31 20:33:56.943508 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel50 pid 421
2024 Oct 31 20:33:56.945620 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel51 pid 429
2024 Oct 31 20:33:56.968744 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel52 pid 437
2024 Oct 31 20:33:56.969017 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel53 pid 445
2024 Oct 31 20:33:56.969215 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel54 pid 453
2024 Oct 31 20:33:56.969315 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel55 pid 461
2024 Oct 31 20:33:56.972646 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel56 pid 469
2024 Oct 31 20:33:56.973366 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel57 pid 477
2024 Oct 31 20:33:56.974690 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel58 pid 485
2024 Oct 31 20:33:56.975190 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel59 pid 493
2024 Oct 31 20:33:56.975761 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel6 pid 501
2024 Oct 31 20:33:57.013460 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel60 pid 509
2024 Oct 31 20:33:57.017120 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel61 pid 517
2024 Oct 31 20:33:57.020771 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel62 pid 525
2024 Oct 31 20:33:57.024415 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel63 pid 533
2024 Oct 31 20:33:57.028257 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel64 pid 541
2024 Oct 31 20:33:57.034454 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel65 pid 549
2024 Oct 31 20:33:57.035288 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel66 pid 557
2024 Oct 31 20:33:57.039678 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel67 pid 565
2024 Oct 31 20:33:57.051380 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel68 pid 573
2024 Oct 31 20:33:57.060741 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel69 pid 581

After:

2024 Oct 31 20:42:29.550813 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel1 pid 27
2024 Oct 31 20:42:29.550813 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel10 pid 35
2024 Oct 31 20:42:29.550813 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel101 pid 43
2024 Oct 31 20:42:29.550861 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel102 pid 51
2024 Oct 31 20:42:29.550861 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel103 pid 59
2024 Oct 31 20:42:29.550885 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel104 pid 68
2024 Oct 31 20:42:29.550907 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel11 pid 77
2024 Oct 31 20:42:29.550907 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel12 pid 85
2024 Oct 31 20:42:29.550946 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel13 pid 93
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel14 pid 101
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel15 pid 109
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel16 pid 117
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel17 pid 125
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel18 pid 133
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel19 pid 141
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel2 pid 149
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel20 pid 157
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel21 pid 165
2024 Oct 31 20:42:29.653453 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel22 pid 173
2024 Oct 31 20:42:29.767222 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel23 pid 181
2024 Oct 31 20:42:29.767674 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel24 pid 189
2024 Oct 31 20:42:29.767744 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel25 pid 197
2024 Oct 31 20:42:29.767805 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel26 pid 205
2024 Oct 31 20:42:29.767866 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel27 pid 213
2024 Oct 31 20:42:29.767946 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel28 pid 221
2024 Oct 31 20:42:29.768005 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel29 pid 229
2024 Oct 31 20:42:29.768067 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel3 pid 237
2024 Oct 31 20:42:29.768125 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel30 pid 245
2024 Oct 31 20:42:29.768182 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel31 pid 253
2024 Oct 31 20:42:29.858922 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel32 pid 261
2024 Oct 31 20:42:29.858922 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel33 pid 269
2024 Oct 31 20:42:29.858931 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel34 pid 277
2024 Oct 31 20:42:29.858938 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel35 pid 285
2024 Oct 31 20:42:29.858938 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel36 pid 293
2024 Oct 31 20:42:29.858947 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel37 pid 301
2024 Oct 31 20:42:29.858954 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel38 pid 309
2024 Oct 31 20:42:29.858954 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel39 pid 317
2024 Oct 31 20:42:29.858965 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel4 pid 325
2024 Oct 31 20:42:29.858965 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel40 pid 333
2024 Oct 31 20:42:29.964442 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel41 pid 341
2024 Oct 31 20:42:29.964474 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel42 pid 349
2024 Oct 31 20:42:29.964499 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel43 pid 357
2024 Oct 31 20:42:29.964523 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel44 pid 365
2024 Oct 31 20:42:29.964548 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel45 pid 373
2024 Oct 31 20:42:29.964574 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel46 pid 381
2024 Oct 31 20:42:29.964599 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel47 pid 389
2024 Oct 31 20:42:29.964625 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel48 pid 397
2024 Oct 31 20:42:29.964651 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel49 pid 405
2024 Oct 31 20:42:29.964676 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel5 pid 413
2024 Oct 31 20:42:30.115429 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel50 pid 421
2024 Oct 31 20:42:30.115523 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel51 pid 429
2024 Oct 31 20:42:30.115592 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel52 pid 437
2024 Oct 31 20:42:30.115660 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel53 pid 445
2024 Oct 31 20:42:30.115725 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel54 pid 453
2024 Oct 31 20:42:30.115980 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel55 pid 461
2024 Oct 31 20:42:30.116046 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel56 pid 469
2024 Oct 31 20:42:30.116113 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel57 pid 477
2024 Oct 31 20:42:30.116178 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel58 pid 485
2024 Oct 31 20:42:30.116550 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel59 pid 493
2024 Oct 31 20:42:30.167335 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel6 pid 501
2024 Oct 31 20:42:30.167365 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel60 pid 509
2024 Oct 31 20:42:30.167365 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel61 pid 517
2024 Oct 31 20:42:30.167375 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel62 pid 525
2024 Oct 31 20:42:30.167375 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel63 pid 533
2024 Oct 31 20:42:30.167384 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel64 pid 541
2024 Oct 31 20:42:30.167384 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel65 pid 549
2024 Oct 31 20:42:30.167415 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel66 pid 557
2024 Oct 31 20:42:30.167459 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel67 pid 565
2024 Oct 31 20:42:30.167469 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel68 pid 573
2024 Oct 31 20:42:30.274461 str2-7260cx3-acs-9 NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel69 pid 581

In both cases, for 70 LAGs, it took about 0.7 seconds to send SIGTERM to all of the teamd processes, but the distribution of the SIGTERMs over time is different. However, on this device, some netlink messages are still getting dropped, resulting in the cleanup not being complete.
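
The comparison above can be reproduced from the log excerpts with a short script. This is a sketch only: the helper names and the 50 ms pause threshold are assumptions, and it reads just the time-of-day field, so it assumes all lines fall on the same day.

```python
import re
from typing import Iterable, List, Tuple

# Matches the HH:MM:SS.ffffff field of the syslog lines shown above.
TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2}\.\d+)")


def sigterm_times(lines: Iterable[str]) -> List[float]:
    """Return seconds since midnight for each 'Sent SIGTERM' log line."""
    times = []
    for line in lines:
        if "Sent SIGTERM" not in line:
            continue
        m = TIME_RE.search(line)
        if m:
            h, mi, s = m.groups()
            times.append(int(h) * 3600 + int(mi) * 60 + float(s))
    return times


def summarize(lines: Iterable[str], pause_s: float = 0.05) -> Tuple[float, int]:
    """Return (total span in seconds, number of gaps of at least pause_s)."""
    times = sigterm_times(lines)
    if len(times) < 2:
        return 0.0, 0
    gaps = [b - a for a, b in zip(times, times[1:])]
    return times[-1] - times[0], sum(1 for g in gaps if g >= pause_s)
```

Running `summarize` over the "Before" and "After" excerpts gives the total time taken to send all of the signals and how many distinct pauses occurred in each run.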

dgsudharsan · 2024-11-11T02:50:27Z

@saiarcot895 Can you please run your changes with `test_po_cleanup` and `test_po_cleanup_after_reload`? We are noticing these tests statically fail with your changes.

saiarcot895 requested a review from judyjoseph as a code owner October 14, 2024 01:13

dgsudharsan added the Request for 202405 Branch label Oct 16, 2024

saiarcot895 added 2 commits October 21, 2024 17:45

Update LAG removal code to use the same logic as cleaning up all LAGs

f4fd3ab

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Update tests to test LAG cleanup and to test with the new code

7b6fc53

This requires overriding some libc functions and capturing information about kill signals sent or intercepting file open operations.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
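
The idea of capturing kill signals in a test, rather than delivering them, can be illustrated with a Python analogue. This is not the test code from this PR: it swaps libc-level overrides for `unittest.mock.patch`, and `stop_processes` and `capture_kill_calls` are hypothetical names.

```python
import os
import signal
from typing import Iterable, List, Tuple
from unittest import mock


def stop_processes(pids: Iterable[int]) -> None:
    """Code under test: send SIGTERM to every pid."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)


def capture_kill_calls(pids: Iterable[int]) -> List[Tuple[int, int]]:
    """Run stop_processes with os.kill replaced, returning the recorded calls.

    No real signal is delivered while the patch is active.
    """
    with mock.patch("os.kill") as fake_kill:
        stop_processes(pids)
    return [call.args for call in fake_kill.call_args_list]
```

A test can then assert on the recorded `(pid, signal)` pairs, for example that every expected process was signalled exactly once and in the expected order.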

saiarcot895 requested a review from prsunny as a code owner October 22, 2024 00:47

Merge remote-tracking branch 'origin/master' into teamd-delay-kill

27f6d3c

saiarcot895 and others added 3 commits October 22, 2024 15:59

Merge remote-tracking branch 'origin/master' into teamd-delay-kill

bdd47c7

Add more tests to cover more cases

c5d84cf

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

Merge branch 'master' into teamd-delay-kill

1dd20a0

dgsudharsan added 2 commits November 4, 2024 07:59

Merge branch 'master' into teamd-delay-kill

8f71480

Merge branch 'master' into teamd-delay-kill

f39d60f
