SecondaryNetwork breaks ServiceExternalIP feature #6623

Closed
meibensteiner opened this issue Aug 21, 2024 · 17 comments · Fixed by #6700
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

@meibensteiner

meibensteiner commented Aug 21, 2024

Describe the bug
Using both the ServiceExternalIP and SecondaryNetwork features with a single host interface breaks the ServiceExternalIP feature. Services of type LoadBalancer are no longer accessible.

To Reproduce

  • Deploy Antrea with the SecondaryNetwork and ServiceExternalIP feature gates enabled.
  • Attach the node's single host interface to br-ext.
  • Create a Service of type LoadBalancer.

Helm chart values:

antrea:
  featureGates:
    Multicast: true
    ServiceExternalIP: true
    SecondaryNetwork: true
  multicast:
    enable: true
  tunnelType: "vxlan"
  logVerbosity: 2
  secondaryNetwork:
    ovsBridges: [{bridgeName: "br-ext", physicalInterfaces: ["enp0s1"]}]

Expected
The host interface should be moved to br-ext
Service should still be accessible

Actual behavior
The host interface is moved to br-ext.
The Service is no longer accessible from outside the cluster a few minutes after the interface is attached to br-ext.
Oddly, it works at first and only becomes inaccessible after a few minutes.
The outage seems to start after the following log lines, although I have no way of correlating these logs with the problem.

antrea-agent I0821 09:16:24.556996       1 proxier.go:1108] "Processing Service UPDATE event" Service="kube-system/rke2-metrics-server"                                                       
antrea-agent I0821 09:16:24.557029       1 proxier.go:1108] "Processing Service UPDATE event" Service="networking/antrea"                                                                     
antrea-agent I0821 09:16:24.557036       1 proxier.go:1108] "Processing Service UPDATE event" Service="default/kubernetes"                                                                    
antrea-agent I0821 09:16:24.557056       1 proxier.go:1108] "Processing Service UPDATE event" Service="internal-nginx/internal-nginx-ingress-nginx-controller"                                
antrea-agent I0821 09:16:24.557066       1 proxier.go:1108] "Processing Service UPDATE event" Service="internal-nginx/internal-nginx-ingress-nginx-controller-admission"                      
antrea-agent I0821 09:16:24.557070       1 proxier.go:1108] "Processing Service UPDATE event" Service="kube-system/rke2-coredns-rke2-coredns"                                                 
antrea-agent I0821 09:16:24.560144       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="default/kubernetes"                                                        
antrea-agent I0821 09:16:24.560169       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="internal-nginx/internal-nginx-ingress-nginx-controller-admission-cw2dw"    
antrea-agent I0821 09:16:24.560179       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="internal-nginx/internal-nginx-ingress-nginx-controller-q4qw9"              
antrea-agent I0821 09:16:24.560184       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="kube-system/rke2-coredns-rke2-coredns-qm5c6"                               
antrea-agent I0821 09:16:24.560189       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="kube-system/rke2-metrics-server-f7q2v"                                     
antrea-agent I0821 09:16:24.560192       1 proxier.go:1081] "Processing EndpointSlice UPDATE event" EndpointSlice="networking/antrea-xgf45"                                                   
antrea-agent I0821 09:16:24.563275       1 mcast_route_linux.go:152] "Updating multicast route statistics and removing stale multicast routes"                                                
antrea-agent I0821 09:16:25.472802       1 agent.go:95] "Updating agent monitoring CRD" name="node1" partial=true                                                                             
antrea-agent I0821 09:16:29.562420       1 mcast_discovery.go:98] "Sent packetOut for IGMP query" group="0.0.0.0" version=1 outPort=0                                                         
antrea-agent I0821 09:16:29.562448       1 mcast_discovery.go:98] "Sent packetOut for IGMP query" group="0.0.0.0" version=2 outPort=0                                                         
antrea-agent I0821 09:16:29.562464       1 mcast_discovery.go:98] "Sent packetOut for IGMP query" group="0.0.0.0" version=3 outPort=0                                                         
antrea-agent I0821 09:17:06.507994       1 cluster.go:264] "Processed Node UPDATE event, labels not changed" nodeName="node1" 

Versions:

  • Antrea version: antrea/antrea-agent-ubuntu:v2.1.0
  • Kubernetes version: v1.30.3+rke2r1
  • Container runtime: containerd://1.7.17-k3s1
  • Linux kernel version: 6.8.0-39-generic

Additional context
Since the node is accessible this time, I can actually create a supportbundle :)

support-bundles_20240821T105205+0200.zip

@meibensteiner meibensteiner added the kind/bug Categorizes issue or PR as related to a bug. label Aug 21, 2024
@meibensteiner
Author

Deleting the antrea-agent pod makes the service accessible again for a while.

@jianjuns
Contributor

This is probably about external IP traffic forwarding from host interface to the primary OVS bridge when the interface is moved to the secondary bridge. @tnqn and @xliuxu should have some ideas.

@antoninbas
Contributor

This is probably because the ARP responder which is in charge of responding to ARP requests for Service LoadBalancerIPs is initialized before the secondary bridge creation. When the ARP responder is initialized, it "resolves" the transport interface by name (enp0s1), and uses the interface to listen for ARP requests and send ARP replies. The interface handle used for this is never updated after that. However, once the secondary bridge is created and the host interface (transport interface) is assigned to the bridge, the interface handle held by the ARP responder is no longer "valid" (it now points to enp0s1~ instead of the new enp0s1 host interface).

Because secondary bridge initialization can be delayed by up to ~10s after the agent starts (see #6504), and probably because we send a gratuitous ARP for the Service LoadBalancerIP when we first process it, you would probably have connectivity to the Service until the ARP entry expires in the client / router, which seems to be what you are observing.

If we want to support this case (ServiceExternalIP + transport interface assigned to SecondaryNetwork bridge), we can do one of the following:

  1. periodically resolve the interface by name in the ARP responder
  2. introduce a communication channel so that the ARP responder can resolve the interface by name again after the secondary bridge is initialized
  3. order things so that the ARP responder does not resolve the interface by name until the secondary bridge has been initialized

The first option may be the simplest and the least risky. It would also be a good way to confirm that my analysis is correct.
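
As a rough illustration of option 1, the Go sketch below polls the interface by name and reports when its index changes, which is the point at which a responder would need to rebind. It is not Antrea code: `ifaceTracker`, `resolveFunc`, `sequenceResolver`, and the index values are all hypothetical names and numbers chosen for the example. The only real API it relies on is the standard library's `net.InterfaceByName`, which is what one would plug in as the resolver on a real node.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// resolveFunc looks up an interface by name. net.InterfaceByName has this
// signature and would be used on a real node.
type resolveFunc func(name string) (*net.Interface, error)

// ifaceTracker is a hypothetical helper that remembers the last index seen
// for a named interface.
type ifaceTracker struct {
	name    string
	resolve resolveFunc
	index   int
}

// refresh re-resolves the interface by name and reports whether its index
// differs from the one seen on the previous call.
func (t *ifaceTracker) refresh() (index int, changed bool, err error) {
	iface, err := t.resolve(t.name)
	if err != nil {
		return t.index, false, err
	}
	changed = iface.Index != t.index
	t.index = iface.Index
	return t.index, changed, nil
}

// run polls at the given interval until stop is closed and calls rebind
// every time the interface index changes.
func (t *ifaceTracker) run(interval time.Duration, stop <-chan struct{}, rebind func(index int)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if index, changed, err := t.refresh(); err == nil && changed {
				rebind(index)
			}
		}
	}
}

// sequenceResolver is a test double that returns the given indexes in order
// and then keeps returning the last one.
func sequenceResolver(indexes ...int) resolveFunc {
	call := 0
	return func(name string) (*net.Interface, error) {
		if len(indexes) == 0 {
			return nil, fmt.Errorf("no such interface: %s", name)
		}
		i := call
		if i >= len(indexes) {
			i = len(indexes) - 1
		}
		call++
		return &net.Interface{Name: name, Index: indexes[i]}, nil
	}
}

func main() {
	// Simulate enp0s1 being recreated with a new index (made-up values)
	// after the physical interface is moved to the secondary bridge.
	t := &ifaceTracker{name: "enp0s1", resolve: sequenceResolver(2, 2, 7)}
	for i := 0; i < 3; i++ {
		index, changed, _ := t.refresh()
		fmt.Println(index, changed)
	}
}
```

The first call to `refresh` always reports a change, because no real interface has index 0. The cost of this approach is a delay of up to one polling interval before a change is noticed.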

@antoninbas
Contributor

@meibensteiner If you can capture traffic, you can also confirm that ARP requests are not being answered, which would explain the connectivity issue.

@meibensteiner
Author

Can confirm. Unanswered ARP requests.

10:57:01.819248 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:02.842307 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:06.365280 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:06.465441 IP _gateway.57621 > 192.168.66.255.57621: UDP, length 44
10:57:07.383679 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:08.408941 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:18.832761 ARP, Request who-has 192.168.66.251 tell _gateway, length 28
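
For longer captures, the same check can be automated. The Go sketch below is only an illustration under stated assumptions: it scans tcpdump text output like the lines above, counts ARP requests per target IP, and drops any target for which at least one `Reply ... is-at ...` line appears. The function name `unansweredARP` and the constant `sampleCapture` are made up for the example, and treating a single reply as "answered" is a deliberate simplification.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleCapture holds the tcpdump lines posted in this thread.
const sampleCapture = `10:57:01.819248 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:02.842307 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:06.365280 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:06.465441 IP _gateway.57621 > 192.168.66.255.57621: UDP, length 44
10:57:07.383679 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:08.408941 ARP, Request who-has 192.168.66.251 tell node2, length 28
10:57:18.832761 ARP, Request who-has 192.168.66.251 tell _gateway, length 28`

// unansweredARP returns the number of ARP requests seen per target IP,
// omitting every target that received at least one ARP reply.
func unansweredARP(capture string) map[string]int {
	requests := map[string]int{}
	replied := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(capture))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		for i := 0; i+1 < len(fields); i++ {
			switch {
			case fields[i] == "who-has":
				// "who-has <target> tell <sender>"
				requests[fields[i+1]]++
			case fields[i] == "Reply" && i+2 < len(fields) && fields[i+2] == "is-at":
				// "Reply <target> is-at <mac>"
				replied[fields[i+1]] = true
			}
		}
	}
	for ip := range replied {
		delete(requests, ip)
	}
	return requests
}

func main() {
	// Prints: map[192.168.66.251:6]
	fmt.Println(unansweredARP(sampleCapture))
}
```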

@xliuxu
Contributor

xliuxu commented Aug 23, 2024

> This is probably because the ARP responder which is in charge of responding to ARP requests for Service LoadBalancerIPs is initialized before the secondary bridge creation. When the ARP responder is initialized, it "resolves" the transport interface by name (enp0s1), and uses the interface to listen for ARP requests and send ARP replies. The interface handle used for this is never updated after that. However, once the secondary bridge is created and the host interface (transport interface) is assigned to the bridge, the interface handle held by the ARP responder is no longer "valid" (it now points to enp0s1~ instead of the new enp0s1 host interface).
>
> Because secondary bridge initialization can be delayed by up to ~10s after the agent starts (see #6504), and probably because we send a gratuitous ARP for the Service LoadBalancerIP when we first process it, you would probably have connectivity to the Service until the ARP entry expires in the client / router, which seems to be what you are observing.
>
> If we want to support this case (ServiceExternalIP + transport interface assigned to SecondaryNetwork bridge), we can do one of the following:
>
>   1. periodically resolve the interface by name in the ARP responder
>   2. introduce a communication channel so that the ARP responder can resolve the interface by name again after the secondary bridge is initialized
>   3. order things so that the ARP responder does not resolve the interface by name until the secondary bridge has been initialized
>
> The first option may be the simplest and the least risky. It would also be a good way to confirm that my analysis is correct.

Option 1 should be ok to resolve the issue as the transport interface will not be changed frequently. I can work on a fix for this issue.

@xliuxu xliuxu self-assigned this Aug 23, 2024
@meibensteiner
Author

Happy to test it if you provide me with an image! 😬😄

@meibensteiner
Author

Will this make it into the 2.2 release?

@xliuxu
Contributor

xliuxu commented Sep 11, 2024

@meibensteiner We should be able to ship the fix with 2.2. Btw could you help to confirm if a manual restart of the agent could help to fix the issue as a workaround?

@meibensteiner
Author

It fixes it only for a few seconds

@xliuxu
Contributor

xliuxu commented Sep 12, 2024

Ah, I see. This is expected because antrea-agent will revert the bridging of host interfaces upon exit.
I am currently testing the fix; the IPv6 implementation still needs more testing.

@meibensteiner
Author

Sorry if I'm pushy, but could we give this some attention? I have the automation to banish multus from our clusters ready, and I'm just waiting for this fix.

@xliuxu
Contributor

xliuxu commented Sep 26, 2024

Sorry for the delayed response. While testing the new implementation, I encountered a subtle bug related to IPv6 (context: mdlayher/ndp#32). With this issue addressed, I will submit the PR later this week.

xliuxu added a commit to xliuxu/antrea that referenced this issue Sep 29, 2024
For secondary-network scenarios, the transport interface can be
changed after the agent is started. The ARP/NDP responders should
be started after the initialization of secondary-network to bind
to the transport interface of the new index.

Besides, this change also addresses the following issues:
- NDP responder may fail to bind to the new interface due to the
Duplicate Address Detection process.
- Golang caches the zone index for the interface, which may result
in NDP responder binding on the stale interface

Fixes: antrea-io#6623

Signed-off-by: Xu Liu <xu.liu@broadcom.com>
@xliuxu
Contributor

xliuxu commented Sep 29, 2024

@antoninbas I pushed the fix in #6700, but it is implemented with option 3:

> order things so that the ARP responder does not resolve the interface by name until the secondary bridge has been initialized

I think option 1 could also be added to the current implementation easily, but I am not sure whether it is still necessary:

> periodically resolve the interface by name in the ARP responder

@antoninbas
Contributor

@xliuxu thanks for working on this. I left a comment on the PR. I would say that in retrospect, I am not a huge fan of option 3, as managing initialization dependencies between Antrea Agent components is tricky and error-prone, and the fewer "hidden" dependencies we have between components, the better.

I also understand that the other options are not ideal: option 1 has some lag and querying system interfaces periodically is not a "free" operation; option 2 introduces code complexity. I was thinking that another approach could be possible. We could subscribe to link events (e.g., using https://pkg.go.dev/github.com/vishvananda/netlink#LinkSubscribe ?) in the responder, hopefully with the appropriate filter(s). Then the responder could react to link changes, and we would not introduce any new dependency between components. Do you think that's a viable option?
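
To make the event-driven idea concrete, here is a minimal Go sketch of a responder-side loop that reacts to link changes instead of polling. It is only an illustration: `linkEvent` is a hypothetical, simplified stand-in for the updates a netlink subscription would deliver (it is not the netlink library's type), `watchTransportLink` is a made-up name, and the interface indexes are invented. The filtering by name and the "rebind only when the index changes" rule are assumptions about how such a responder might behave, not a description of the actual fix.

```go
package main

import "fmt"

// linkEvent is a hypothetical stand-in for a link update notification.
type linkEvent struct {
	Name  string
	Index int
}

// watchTransportLink consumes link events until the channel is closed and
// calls rebind whenever the named interface appears with a new index.
func watchTransportLink(events <-chan linkEvent, name string, current int, rebind func(index int)) {
	for ev := range events {
		if ev.Name != name || ev.Index == current {
			continue // unrelated link, or nothing changed
		}
		current = ev.Index
		rebind(current)
	}
}

func main() {
	events := make(chan linkEvent, 4)
	events <- linkEvent{Name: "lo", Index: 1}      // ignored: different link
	events <- linkEvent{Name: "enp0s1", Index: 2}  // ignored: same index
	events <- linkEvent{Name: "enp0s1~", Index: 2} // ignored: renamed physical interface
	events <- linkEvent{Name: "enp0s1", Index: 9}  // a new interface takes over the name
	close(events)

	// Prints: rebinding responder to interface index 9
	watchTransportLink(events, "enp0s1", 2, func(index int) {
		fmt.Println("rebinding responder to interface index", index)
	})
}
```

Compared with polling, this reacts as soon as the event arrives and needs no knowledge of when the secondary bridge is initialized, at the cost of maintaining a subscription.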

@xliuxu
Contributor

xliuxu commented Oct 3, 2024

> I also understand that the other options are not ideal: option 1 has some lag and querying system interfaces periodically is not a "free" operation; option 2 introduces code complexity. I was thinking that another approach could be possible. We could subscribe to link events (e.g., using https://pkg.go.dev/github.com/vishvananda/netlink#LinkSubscribe ?) in the responder, hopefully with the appropriate filter(s). Then the responder could react to link changes, and we would not introduce any new dependency between components. Do you think that's a viable option?

I think it is possible to watch link events and reconfigure responders on transport interface index changes. If so, do you think we still need to change the ordering of the controller start sequence?

@antoninbas
Contributor

> I think it is possible to watch link events and reconfigure responders on transport interface index changes. If so, do you think we still need to change the ordering of the controller start sequence?

No, the main advantage would be that we do not need to change that ordering, and we don't introduce a new dependency between the SecondaryNetwork initialization and other features such as Egress.

xliuxu added a commit to xliuxu/antrea that referenced this issue Oct 23, 2024
xliuxu added a commit that referenced this issue Oct 30, 2024
xliuxu added a commit that referenced this issue Nov 6, 2024
xliuxu added a commit that referenced this issue Nov 7, 2024
xliuxu added a commit to xliuxu/antrea that referenced this issue Nov 8, 2024
antoninbas pushed a commit that referenced this issue Nov 9, 2024