[BUG] PG restored cluster is always in Creating status due to Readiness probe failed #8152

Open
tianyue86 opened this issue Sep 14, 2024 · 2 comments

tianyue86 commented Sep 14, 2024

Describe the bug

Kubernetes: v1.29.7-gke.1274000
KubeBlocks: 0.9.1-beta.25
kbcli: 0.9.1-beta.10

To Reproduce
Steps to reproduce the behavior:

  1. Create pg cluster
    kbcli cluster create postgres-icpzcz --termination-policy=DoNotTerminate --cluster-definition=postgresql --enable-all-logs=false --cluster-version=postgresql-14.8.0 --set cpu=100m,memory=0.5Gi,replicas=2,storage=3Gi --namespace default

  2. Create backup
    kbcli cluster backup postgres-icpzcz --method wal-g --namespace default

  3. Restore cluster
    kbcli cluster restore postgres-icpzcz-backup --backup backup-default-postgres-icpzcz-20240914124229 --namespace default

  4. Check cluster status

tianyue@localhost kbcli % k get cluster -A | grep postgres
default     postgres-icpzcz          postgresql            postgresql-14.8.0     DoNotTerminate       Running    61m
default     postgres-icpzcz-backup   postgresql            postgresql-14.8.0     DoNotTerminate       **Creating**   38m

  5. See error
tianyue@localhost kbcli % k describe cluster postgres-icpzcz-backup
Name:         postgres-icpzcz-backup
Namespace:    default
Labels:       clusterdefinition.kubeblocks.io/name=postgresql
              clusterversion.kubeblocks.io/name=postgresql-14.8.0
Annotations:  kubeblocks.io/ops-request: [{"name":"postgres-icpzcz-backup","type":"Restore"}]
              kubeblocks.io/reconcile: 2024-09-14T05:19:49.986719383Z
              kubeblocks.io/restore-from-backup:
                {"postgresql":{"connectionPassword":"EHhYeZrgFEC+x5rv7D+WRo9kZNvT2sIqM40QfqndWwQQIx94","doReadyRestoreAfterClusterRunning":"false","name":...
API Version:  apps.kubeblocks.io/v1alpha1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2024-09-14T04:44:04Z
  Finalizers:
    cluster.kubeblocks.io/finalizer
  Generation:        1
  Resource Version:  9186555
  UID:               c84b3271-c513-450b-b3ef-55c9d37dd378
Spec:
  Affinity:
    Pod Anti Affinity:     Preferred
    Tenancy:               SharedNode
  Cluster Definition Ref:  postgresql
  Cluster Version Ref:     postgresql-14.8.0
  Component Specs:
    Component Def Ref:  postgresql
    Disable Exporter:   true
    Enabled Logs:
      running
    Name:      postgresql
    Replicas:  2
    Resources:
      Limits:
        Cpu:     100m
        Memory:  512Mi
      Requests:
        Cpu:               100m
        Memory:            512Mi
    Service Account Name:  kb-postgres-icpzcz
    Switch Policy:
      Type:  Noop
    Volume Claim Templates:
      Name:  data
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  3Gi
  Resources:
    Cpu:     0
    Memory:  0
  Storage:
    Size:              0
  Termination Policy:  DoNotTerminate
Status:
  Cluster Def Generation:  2
  Components:
    Postgresql:
      Phase:       Creating
      Pods Ready:  false
  Conditions:
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               The operator has started the provisioning of Cluster: postgres-icpzcz-backup
    Observed Generation:   1
    Reason:                PreCheckSucceed
    Status:                True
    Type:                  ProvisioningStarted
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               Successfully applied for resources
    Observed Generation:   1
    Reason:                ApplyResourcesSucceed
    Status:                True
    Type:                  ApplyResources
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               pods are not ready in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason:                ReplicasNotReady
    Status:                False
    Type:                  ReplicasReady
    Last Transition Time:  2024-09-14T04:44:04Z
    Message:               pods are unavailable in Components: [postgresql], refer to related component message in Cluster.status.components
    Reason:                ComponentsNotReady
    Status:                False
    Type:                  Ready
  Observed Generation:     1
  Phase:                   Creating
Events:
  Type     Reason                    Age                   From                  Message
  ----     ------                    ----                  ----                  -------
  Normal   PreCheckSucceed           38m                   cluster-controller    The operator has started the provisioning of Cluster: postgres-icpzcz-backup
  Normal   ApplyResourcesSucceed     38m                   cluster-controller    Successfully applied for resources
  Normal   NeedWaiting               38m (x6 over 38m)     component-controller  waiting for restore "postgres-icpzcz-backup-postgresql-c84b3271-preparedata" successfully
  Normal   ComponentPhaseTransition  38m                   cluster-controller    component is Creating
  Warning  Unhealthy                 2m48s (x14 over 37m)  event-controller      Pod postgres-icpzcz-backup-postgresql-0: **Readiness probe failed**: 127.0.0.1:5432 - no response
  Warning  Unhealthy                 2m41s (x12 over 36m)  event-controller      Pod postgres-icpzcz-backup-postgresql-1: **Readiness probe failed**: 127.0.0.1:5432 - no response
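
The events above only say that nothing answers on 127.0.0.1:5432, which is the message format printed by `pg_isready`. A minimal sketch for collecting more detail from both pods follows. It is not part of the original report. The container name `postgresql`, the `run` helper, and the `DRY_RUN` switch are assumptions made for illustration; the namespace, cluster name, and pod names are taken from the output above.

```shell
#!/usr/bin/env bash
# Sketch: gather diagnostics for a restored cluster stuck in Creating.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.

DRY_RUN="${DRY_RUN:-1}"
NAMESPACE="${NAMESPACE:-default}"
CLUSTER="${CLUSTER:-postgres-icpzcz-backup}"
COMPONENT="${COMPONENT:-postgresql}"
CONTAINER="${CONTAINER:-postgresql}"   # ASSUMPTION: name of the database container
REPLICAS="${REPLICAS:-2}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Read the phase that "describe" shows under Status (Creating, Running, ...).
cluster_phase() {
  run kubectl -n "$NAMESPACE" get cluster "$CLUSTER" -o 'jsonpath={.status.phase}'
}

# Inspect one pod: its events, recent database logs, and a manual readiness check.
diagnose_pod() {
  pod="$1"
  run kubectl -n "$NAMESPACE" describe pod "$pod"
  run kubectl -n "$NAMESPACE" logs "$pod" -c "$CONTAINER" --tail=100
  run kubectl -n "$NAMESPACE" exec "$pod" -c "$CONTAINER" -- pg_isready -h 127.0.0.1 -p 5432
}

# Pods are named <cluster>-<component>-<ordinal>, as in the events above.
diagnose_all() {
  cluster_phase
  i=0
  while [ "$i" -lt "$REPLICAS" ]; do
    diagnose_pod "${CLUSTER}-${COMPONENT}-${i}"
    i=$((i + 1))
  done
}

diagnose_all
```

Run it once as-is to review the commands, then with `DRY_RUN=0` to execute them. The PostgreSQL log is usually the quickest way to see why the server is not accepting connections after a restore.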

@tianyue86 tianyue86 added the kind/bug Something isn't working label Sep 14, 2024
@tianyue86 tianyue86 assigned shanshanying and ldming and unassigned shanshanying Sep 14, 2024
@tianyue86 tianyue86 added this to the Release 0.9.1 milestone Sep 14, 2024
@tianyue86 tianyue86 changed the title from "[BUG]pg cluster" to "[BUG] PG restored cluster is always in Creating status due to Readiness probe failed" Sep 14, 2024
shanshanying (Contributor) commented

Hi @tianyue86,

To back up and restore a PG cluster using wal-g, you should:

  1. Configure wal-g
    kbcli cluster backup <clusterName> --method config-wal-g

  2. Update the parameters using an OpsRequest
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      generateName: pg-cluster-reconfiguring-
    spec:
      clusterRef: <clusterName>
      reconfigure:
        componentName: postgresql
        configurations:
          - keys:
              - key: postgresql.conf
                parameters:
                  - key: archive_command
                    value: "'envdir /home/postgres/pgdata/wal-g/env /home/postgres/pgdata/wal-g/wal-g wal-push %p'"
            name: postgresql-configuration
      type: Reconfiguring

  3. Back up the cluster
    kbcli cluster backup <clusterName> --method wal-g

  4. Restore
    kbcli cluster restore <clusterName> --backup <backupName>
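
The four steps in this comment can be strung together roughly as follows. This is a sketch under stated assumptions, not the maintainers' script. The `run` helper, the `DRY_RUN` switch, the script name in the usage line, and the default `NEW_CLUSTER` name are hypothetical. The kbcli commands and the manifest are copied from the comment, with `--namespace` added as in the original report. The manifest is submitted with `kubectl create` rather than `kubectl apply` because it uses `generateName`.

```shell
#!/usr/bin/env bash
# Sketch: wal-g backup and restore flow, one step per invocation.
# DRY_RUN=1 (the default) only prints what would be done; DRY_RUN=0 executes it.

DRY_RUN="${DRY_RUN:-1}"
NAMESPACE="${NAMESPACE:-default}"
CLUSTER="${CLUSTER:-postgres-icpzcz}"
NEW_CLUSTER="${NEW_CLUSTER:-postgres-icpzcz-restored}"   # hypothetical name for the restored cluster
BACKUP_NAME="${BACKUP_NAME:-<backupName>}"                # placeholder: name reported by the backup step

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: configure wal-g on the source cluster.
config_walg() {
  run kbcli cluster backup "$CLUSTER" --method config-wal-g --namespace "$NAMESPACE"
}

# Step 2: render the OpsRequest that sets archive_command (manifest from the comment above).
render_opsrequest() {
  cat <<EOF
apiVersion: apps.kubeblocks.io/v1alpha1
kind: OpsRequest
metadata:
  generateName: pg-cluster-reconfiguring-
spec:
  clusterRef: $CLUSTER
  reconfigure:
    componentName: postgresql
    configurations:
      - keys:
          - key: postgresql.conf
            parameters:
              - key: archive_command
                value: "'envdir /home/postgres/pgdata/wal-g/env /home/postgres/pgdata/wal-g/wal-g wal-push %p'"
        name: postgresql-configuration
  type: Reconfiguring
EOF
}

# Submit the OpsRequest; in dry-run mode just show the manifest.
apply_opsrequest() {
  if [ "$DRY_RUN" = "1" ]; then
    render_opsrequest
  else
    render_opsrequest | kubectl create -n "$NAMESPACE" -f -
  fi
}

# Step 3: take the wal-g backup.
backup_cluster() {
  run kbcli cluster backup "$CLUSTER" --method wal-g --namespace "$NAMESPACE"
}

# Step 4: restore the backup into a new cluster.
restore_cluster() {
  run kbcli cluster restore "$NEW_CLUSTER" --backup "$BACKUP_NAME" --namespace "$NAMESPACE"
}

# Each step should finish before the next one starts, so run them one at a time.
case "${1:-}" in
  config)      config_walg ;;
  reconfigure) apply_opsrequest ;;
  backup)      backup_cluster ;;
  restore)     restore_cluster ;;
  *)           echo "usage: $0 {config|reconfigure|backup|restore}" ;;
esac
```

Compared with the steps in the original report, the difference is that `config-wal-g` and the `archive_command` change both happen before the wal-g backup is taken.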

shanshanying (Contributor) commented

image
