Failed to start instances on new cluster members #247
Comments
@roosterfish whenever you're reporting a (potential) cross-snap issue (or really anytime you're reporting a microcloud issue) it would be useful to see the output of
For now a workaround is to reload the LXD daemon on the affected cluster member using
@roosterfish what do the LXD logs show for the error/reason for the network not being startable?
I have updated the description.
@roosterfish @masnax the
I managed to reproduce this too. I'm having a look at it. I have one question though: for a 4-node configuration, we have 3
Looks like this is an issue with LXD cluster joins. It seems joining a cluster after the fact by using
I'm able to reproduce this only when adding nodes to an existing cluster, whereas using the same nodes and initializing the whole cluster at that size results in the network working fine. I'm still trying to figure out what LXD's doing exactly, but what I've gathered from the request payloads so far is that when creating the network on init, the payloads look like
{NetworkPut:{Config:map[parent:enp6s0] Description:} Name:UPLINK Type:physical}
{NetworkPut:{Config:map[] Description:Uplink for OVN networks} Name:UPLINK Type:physical}
{NetworkPut:{Config:map[parent:enp6s0 volatile.last_state.created:false] Description:} Name:UPLINK Type:physical}
{NetworkPut:{Config:map[network:UPLINK] Description:Default OVN network} Name:default Type:ovn}
{NetworkPut:{Config:map[bridge.mtu:1442 ipv4.address:10.18.8.1/24 ipv4.nat:true ipv6.address:fd42:cbc4:cc49:8d30::1/64 ipv6.nat:true network:UPLINK parent:enp6s0] Description:} Name:default Type:ovn}
and when adding a node, it's
{NetworkPut:{Config:map[parent:enp6s0 volatile.last_state.created:false] Description:} Name:UPLINK Type:physical}
{NetworkPut:{Config:map[bridge.mtu:1442 ipv4.address:10.18.8.1/24 ipv4.nat:true ipv6.address:fd42:cbc4:cc49:8d30::1/64 ipv6.nat:true network:UPLINK] Description:} Name:default Type:ovn}
Main difference is that the
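For anyone comparing these by eye, here is a minimal sketch that diffs the config maps of the two final `default` (type `ovn`) payloads quoted above. The dictionaries are transcribed by hand from those payloads, and `diff_config` is a hypothetical helper written for this comment, not part of LXD or MicroCloud. It only shows which keys differ between the quoted payloads; it does not by itself establish why the network fails to start.

```python
# Config of the final "default" OVN payload sent when the cluster is
# initialized at full size (transcribed from the payload quoted above).
init_config = {
    "bridge.mtu": "1442",
    "ipv4.address": "10.18.8.1/24",
    "ipv4.nat": "true",
    "ipv6.address": "fd42:cbc4:cc49:8d30::1/64",
    "ipv6.nat": "true",
    "network": "UPLINK",
    "parent": "enp6s0",
}

# Config of the "default" OVN payload sent when a node is added to an
# existing cluster (also transcribed from the payload quoted above).
join_config = {
    "bridge.mtu": "1442",
    "ipv4.address": "10.18.8.1/24",
    "ipv4.nat": "true",
    "ipv6.address": "fd42:cbc4:cc49:8d30::1/64",
    "ipv6.nat": "true",
    "network": "UPLINK",
}


def diff_config(a, b):
    """Hypothetical helper: return (only_in_a, only_in_b, changed)."""
    only_a = {k: a[k] for k in a.keys() - b.keys()}
    only_b = {k: b[k] for k in b.keys() - a.keys()}
    changed = {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}
    return only_a, only_b, changed


only_init, only_join, changed = diff_config(init_config, join_config)
print("only in init payload:", only_init)
print("only in join payload:", only_join)
print("same key, different value:", changed)
```

With the values above, the only key that differs is `parent`, which is present in the init payload and absent from the join payload.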
Hm, indeed that actually was it. If that final payload has
From my initial testing, this appears to be fixed in LXD now. The default network is still created on the new cluster members without
@roosterfish If you remember the initial setup you used to replicate this, could you please give it a shot to ensure I'm not missing an edge case?
It looks I suspect LXD I have deployed the following set of snaps:
The same error occurs with MicroCloud
And when using LXD
The reproducer steps:
@roosterfish is 5.21/edge affected?
Mh, Using all the stable snaps:
Great, so it's been fixed in a backport and will be in 5.21.3.
Version
Same versions of the snaps on all cluster members.
Description
After adding a new member to the MicroCloud cluster using microcloud add, existing instances can be moved to the new cluster member but fail when getting started:
The network's status on the new member is also marked as Unavailable:
In the logs of m4 you can see the following message every minute: