[BUG]full restore failed on GKE for pod CrashLoopBackOff #2924

Closed
ahjing99 opened this issue Apr 25, 2023 · 4 comments
Labels: bug, kind/bug (Something isn't working), severity/minor (It is better to fix the problem for a better user experience)

Comments

@ahjing99
Collaborator

ahjing99 commented Apr 25, 2023

➜ ~ kbcli version
Kubernetes: v1.25.7-gke.1000
KubeBlocks: 0.5.0-beta.9
kbcli: 0.5.0-beta.9

  1. Create the PV and PVC
kubectl apply -f -<< EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: backup-data-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 20Gi
  claimRef:
    name: backup-data
    namespace: default
  hostPath:
    path: /tmp/backup-data-pv
    type: DirectoryOrCreate
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-path
  volumeMode: Filesystem
EOF

kubectl apply -f -<< EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
EOF
  2. Create the cluster, edit the backup policy, and create a backup
➜  ~ kbcli cluster create mycluster --termination-policy=Delete --cluster-definition=apecloud-mysql --cluster-version=ac-mysql-8.0.30

Cluster mycluster created

➜  ~ kbcli cluster edit-backup-policy mycluster-mysql-backup-policy
backuppolicy.dataprotection.kubeblocks.io/mycluster-mysql-backup-policy edited

➜  ~ kbcli cluster backup mycluster --backup-type=full
Backup backup-default-mycluster-20230425103029 created successfully, you can view the progress:
	kbcli cluster list-backups --name=backup-default-mycluster-20230425103029 -n default

➜  ~ kbcli cluster list-backups mycluster
NAME                                      CLUSTER     TYPE   STATUS       TOTAL-SIZE   DURATION   CREATE-TIME                  COMPLETION-TIME
backup-default-mycluster-20230425103029   mycluster   full   InProgress                           Apr 25,2023 10:30 UTC+0800

➜  ~ kbcli cluster list-backups mycluster
NAME                                      CLUSTER     TYPE   STATUS      TOTAL-SIZE   DURATION   CREATE-TIME                  COMPLETION-TIME
backup-default-mycluster-20230425103029   mycluster   full   Completed                82s        Apr 25,2023 10:30 UTC+0800   Apr 25,2023 10:31 UTC+0800
  3. Restore
➜  ~ kbcli cluster restore new-mysql-cluster --backup backup-default-mycluster-20230425103029
Cluster new-mysql-cluster created
➜  ~ k get pod
NAME                                                    READY   STATUS                  RESTARTS      AGE
csi-attacher-s3-0                                       1/1     Running                 0             27m
csi-provisioner-s3-0                                    2/2     Running                 0             27m
csi-s3-dnzs8                                            2/2     Running                 0             27m
csi-s3-g2lxw                                            2/2     Running                 0             27m
csi-s3-v994m                                            2/2     Running                 0             27m
kb-addon-alertmanager-webhook-adaptor-b8df446b6-twx2z   2/2     Running                 0             16h
kb-addon-grafana-847ffd849-f8qdh                        3/3     Running                 0             16h
kb-addon-prometheus-alertmanager-0                      2/2     Running                 0             16h
kb-addon-prometheus-server-0                            2/2     Running                 0             16h
kb-addon-snapshot-controller-65b6db596-x2lsm            1/1     Running                 0             16h
kubeblocks-545466d9cb-s8t77                             1/1     Running                 0             16h
mycluster-mysql-0                                       4/4     Running                 0             9m45s
mycluster-mysql-1                                       4/4     Running                 0             9m45s
mycluster-mysql-2                                       4/4     Running                 0             9m45s
new-mysql-cluster-mysql-0                               4/4     Running                 0             4m13s
new-mysql-cluster-mysql-1                               0/4     Init:CrashLoopBackOff   4 (72s ago)   4m13s
new-mysql-cluster-mysql-2                               0/4     Init:CrashLoopBackOff   4 (61s ago)   4m13s

➜  ~ k logs new-mysql-cluster-mysql-1
Defaulted container "mysql" out of: mysql, metrics, kb-checkrole, config-manager, restore-backup-default-mycluster-20230425103029 (init)
Error from server (BadRequest): container "mysql" in pod "new-mysql-cluster-mysql-1" is waiting to start: PodInitializing
➜  ~  k logs new-mysql-cluster-mysql-1 -c restore-backup-default-mycluster-20230425103029
sh: line 8: /backup-default-mycluster-20230425103029/default/mycluster-450c5d09-f2ba-485a-859b-37d10435e66e/mysql/backup-default-mycluster-20230425103029/backup-default-mycluster-20230425103029.xbstream: No such file or directory
➜  ~ k exec -it new-mysql-cluster-mysql-0 -c mysql -- bash
[root@new-mysql-cluster-mysql-0 /]# mysql -p$MYSQL_ROOT_PASSWORD
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
[root@new-mysql-cluster-mysql-0 /]#

➜  ~ k get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                       STORAGECLASS   REASON   AGE
persistentvolume/backup-data-pv                             20Gi       RWO            Delete           Bound    default/backup-data                                         local-path              16h
persistentvolume/pvc-2414b446-6ffb-4f49-bd7c-5691efe9888c   20Gi       RWO            Delete           Bound    default/data-new-mysql-cluster-mysql-2                      standard-rwo            22m
persistentvolume/pvc-4b48254c-3649-453c-b9c9-0bbf5edf9149   20Gi       RWO            Delete           Bound    default/data-mycluster-mysql-1                              standard-rwo            27m
persistentvolume/pvc-4d27f922-5a6f-4bca-8ac1-f923423b4ae3   20Gi       RWO            Delete           Bound    default/data-mycluster-mysql-2                              standard-rwo            27m
persistentvolume/pvc-9fd9ab40-0fcb-40db-ad1e-527640790d20   20Gi       RWO            Delete           Bound    default/data-new-mysql-cluster-mysql-1                      standard-rwo            22m
persistentvolume/pvc-c23785c9-c947-44d8-9e50-c340788c89e9   4Gi        RWO            Delete           Bound    default/storage-volume-kb-addon-prometheus-alertmanager-0   standard-rwo            20h
persistentvolume/pvc-c412ee6b-88f6-402b-9786-168b318bfe94   10Gi       RWO            Delete           Bound    default/storage-volume-kb-addon-prometheus-server-0         standard-rwo            20h
persistentvolume/pvc-d1ec99f6-6fc4-40e7-87b2-ae18f35003b3   20Gi       RWO            Delete           Bound    default/data-new-mysql-cluster-mysql-0                      standard-rwo            22m
persistentvolume/pvc-e44825ea-8ee3-4336-a2e2-c5eb679c787e   20Gi       RWO            Delete           Bound    default/data-mycluster-mysql-0                              standard-rwo            27m

NAME                                                                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/backup-data                                         Bound    backup-data-pv                             20Gi       RWO            standard-rwo   16h
persistentvolumeclaim/data-mycluster-mysql-0                              Bound    pvc-e44825ea-8ee3-4336-a2e2-c5eb679c787e   20Gi       RWO            standard-rwo   27m
persistentvolumeclaim/data-mycluster-mysql-1                              Bound    pvc-4b48254c-3649-453c-b9c9-0bbf5edf9149   20Gi       RWO            standard-rwo   27m
persistentvolumeclaim/data-mycluster-mysql-2                              Bound    pvc-4d27f922-5a6f-4bca-8ac1-f923423b4ae3   20Gi       RWO            standard-rwo   27m
persistentvolumeclaim/data-new-mysql-cluster-mysql-0                      Bound    pvc-d1ec99f6-6fc4-40e7-87b2-ae18f35003b3   20Gi       RWO            standard-rwo   22m
persistentvolumeclaim/data-new-mysql-cluster-mysql-1                      Bound    pvc-9fd9ab40-0fcb-40db-ad1e-527640790d20   20Gi       RWO            standard-rwo   22m
persistentvolumeclaim/data-new-mysql-cluster-mysql-2                      Bound    pvc-2414b446-6ffb-4f49-bd7c-5691efe9888c   20Gi       RWO            standard-rwo   22m
persistentvolumeclaim/storage-volume-kb-addon-prometheus-alertmanager-0   Bound    pvc-c23785c9-c947-44d8-9e50-c340788c89e9   4Gi        RWO            standard-rwo   20h
persistentvolumeclaim/storage-volume-kb-addon-prometheus-server-0         Bound    pvc-c412ee6b-88f6-402b-9786-168b318bfe94   10Gi       RWO            standard-rwo   20h

➜  ~ k get BackupPolicy mycluster-mysql-backup-policy -o yaml
apiVersion: dataprotection.kubeblocks.io/v1alpha1
kind: BackupPolicy
metadata:
  annotations:
    apps.kubeblocks.io/backup-policy-template: apecloud-mysql-backup-policy-template
    dataprotection.kubeblocks.io/is-default-policy: "true"
    dataprotection.kubeblocks.io/path-prefix: /mycluster-450c5d09-f2ba-485a-859b-37d10435e66e/mysql
  creationTimestamp: "2023-04-25T02:27:11Z"
  finalizers:
  - cluster.kubeblocks.io/finalizer
  - dataprotection.kubeblocks.io/finalizer
  generation: 2
  labels:
    app.kubernetes.io/instance: mycluster
    apps.kubeblocks.io/component-def-ref: mysql
  name: mycluster-mysql-backup-policy
  namespace: default
  ownerReferences:
  - apiVersion: apps.kubeblocks.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: mycluster
    uid: 450c5d09-f2ba-485a-859b-37d10435e66e
  resourceVersion: "736169"
  uid: 90d8817c-57a6-47d4-9655-449164445ad5
spec:
  full:
    backupToolName: xtrabackup-for-apecloud-mysql
    backupsHistoryLimit: 7
    persistentVolumeClaim:
      createPolicy: IfNotPresent
      initCapacity: 100Gi
      name: backup-data
    target:
      labelsSelector:
        matchLabels:
          app.kubernetes.io/instance: mycluster
          apps.kubeblocks.io/component-name: mysql
      secret:
        name: mycluster-conn-credential
        passwordKey: password
        usernameKey: username
  schedule:
    baseBackup:
      cronExpression: 0 18 * * 0
      enable: false
      type: snapshot
  snapshot:
    backupsHistoryLimit: 7
    hooks:
      containerName: mysql
      postCommands:
      - rm -f /data/mysql/data/.restore_new_cluster; sync
      preCommands:
      - touch /data/mysql/data/.restore_new_cluster; sync
    target:
      labelsSelector:
        matchLabels:
          app.kubernetes.io/instance: mycluster
          apps.kubeblocks.io/component-name: mysql
          kubeblocks.io/role: leader
      secret:
        name: mycluster-conn-credential
        passwordKey: password
        usernameKey: username
  ttl: 7d
status:
  phase: Available
@ahjing99 ahjing99 added kind/bug Something isn't working severity/major Great chance user will encounter the same problem labels Apr 25, 2023
@ahjing99 ahjing99 added this to the Release 0.5.0 milestone Apr 25, 2023
@wangyelei
Contributor

wangyelei commented Apr 25, 2023

  1. If the persistent volume's access mode is ReadWriteOnce, restore is not supported when the pods are scheduled to different nodes.

  2. If a local PV is used, the restore pod should be scheduled to the same node as the source cluster pod.
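
To check whether either case applies, it can help to see which node each restored pod was scheduled to. The following is only a minimal sketch: `pod_nodes` and `pods_off_node` are hypothetical helper names, the label `app.kubernetes.io/instance=new-mysql-cluster` is assumed by analogy with the `mycluster` label in the BackupPolicy above, and `NODE_WITH_BACKUP` is a placeholder for whichever node actually holds `/tmp/backup-data-pv`.

```shell
# Print "POD NODE" for every pod matching a label selector.
# $1 = namespace, $2 = label selector (both supplied by the caller).
pod_nodes() {
  kubectl get pods -n "$1" -l "$2" --no-headers \
    -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName
}

# Read "POD NODE" lines on stdin; print the pods that are NOT on node $1.
pods_off_node() {
  awk -v node="$1" '$2 != node { print $1 }'
}

# Usage (needs cluster access; the label and node name are assumptions):
#   pod_nodes default app.kubernetes.io/instance=new-mysql-cluster \
#     | pods_off_node "$NODE_WITH_BACKUP"
```

Any pod this prints runs on a node that does not have the hostPath directory containing the backup, which would be consistent with the `No such file or directory` error from the restore init container.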

@wangyelei wangyelei added severity/minor It is better to fix the problem for a better user experience and removed severity/major Great chance user will encounter the same problem labels Apr 25, 2023
@github-actions

This issue has been marked as stale because it has been open for 30 days with no activity

@github-actions github-actions bot added the Stale label May 29, 2023
@ahjing99 ahjing99 removed the Stale label Jun 9, 2023
@ahjing99
Collaborator Author

ahjing99 commented Jun 9, 2023

Removed the Stale label since this has been moved to 0.7.0.

@wangyelei
Contributor

Cloud disks are supported as of #4917, but a local PV is not necessarily supported as backup storage.
