azuredisk-node-win fails to mount disk: requested access path is already in use #2690
Comments
@ps610 what is your Windows VM SKU? Is it a Hyper-V Gen2 VM?
Hi @andyzhangx, we are running on …
Does the same disk volume get mounted and unmounted on the node frequently? You could run …
nvm, this PR should fix the issue: #2691. This is the testing image: …
The root cause is that the first disk format process can take more than 2 minutes and thus times out; another mount process is then called, so you hit this error. I could sometimes repro it in e2e tests.
Is that fixed with your PRs? Just asking because you reopened this issue here.
@lippertmarkus not yet, but I found how to fix it; stay tuned.
Adding that we have the same issue with Windows 2019 nodes on AKS 1.29. The pods are created from a KEDA ScaledJob and they use an Azure Disk ephemeral volume. We are not re-using paths, but the node pool is using the Deallocate scale-down mode.
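For context, a KEDA ScaledJob with a generic ephemeral Azure Disk volume might look roughly like the following. This is a minimal sketch: the trigger type, names, image, and storage class are illustrative assumptions, not the reporter's actual configuration (managed-csi is the built-in Azure Disk CSI storage class on AKS).

```yaml
# Sketch of a ScaledJob whose pods request a generic ephemeral volume
# backed by the Azure Disk CSI driver. All names/values are assumed.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: example-scaledjob              # hypothetical name
spec:
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        nodeSelector:
          kubernetes.io/os: windows
        containers:
          - name: worker
            image: example.azurecr.io/worker:latest   # placeholder image
            volumeMounts:
              - name: scratch
                mountPath: C:\scratch
        volumes:
          - name: scratch
            ephemeral:                  # generic ephemeral volume: the PVC
              volumeClaimTemplate:      # is created and deleted with the pod
                spec:
                  accessModes: ["ReadWriteOnce"]
                  storageClassName: managed-csi
                  resources:
                    requests:
                      storage: 64Gi
  triggers:
    - type: azure-servicebus            # placeholder trigger
      metadata:
        queueName: example-queue
```

Because the PVC is created and deleted together with each job pod, this pattern implies frequent attach/mount and detach/unmount activity on the node.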
Thank you, @andyzhangx.
@ps610 I will publish a new CSI driver version this week. Please email me your AKS cluster FQDN, and I will upgrade your CSI driver version on Windows directly after the new version is released.
What happened:
We have a cluster with several Windows nodes (2022), on which Windows pods are executed depending on the demand of our users. The Windows (application) pods are Microsoft Business Central containers with at least one volume (PVC) containing the application's database (see base/example helm chart).
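A minimal sketch of that shape of setup follows; the names, image, size, and storage class are illustrative assumptions rather than values from the actual chart.

```yaml
# Hypothetical PVC holding the application's database; managed-csi is
# the built-in Azure Disk CSI storage class on AKS (assumed here).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bc-database
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: managed-csi
  resources:
    requests:
      storage: 128Gi
---
# Hypothetical Windows pod mounting the claim; the real workload is a
# Business Central container created from the helm chart.
apiVersion: v1
kind: Pod
metadata:
  name: bc-app
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
    - name: bc
      image: example.azurecr.io/businesscentral:latest   # placeholder image
      volumeMounts:
        - name: database
          mountPath: C:\databases
  volumes:
    - name: database
      persistentVolumeClaim:
        claimName: bc-database
```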
In times of high demand, and therefore with many pods starting in parallel, it sporadically happens that a pod's volume cannot be mounted, which means that the pod cannot start and remains in the “ContainerCreating” state. As a workaround, the “stuck” pod can be deleted manually, and mounting then works for the automatically recreated pod (based on the deployment).
The error first appeared under Kubernetes version 1.28.5; we upgraded via 1.29.9 to 1.30.5 yesterday in the hope that this would fix the problem. Unfortunately, it actually seems to occur more frequently: in the past roughly 2% of starting pods were affected, but this morning it was almost 10%. Error from the csi-azuredisk-node-win log: …
What you expected to happen:
Mounting the volume always works and the pods are able to start.
How to reproduce it:
Can't provide a reproduction scenario as it happens very sporadically and in situations with high load.
Anything else we need to know?:
We're using autoscaling for our node pools. The application pods may automatically be removed at the end of the day (depending on the users' needs) and only the volume (with the database) is kept. The next day, a new pod can/will be created with the existing volume attached, which means that for the user it is the "same" environment.
Environment:
Kubernetes version (use kubectl version): v1.30.5
Kernel (e.g. uname -a): …