
Pods, Stateful sets and Daemon sets not running #72

Open
SimoneStarace opened this issue Jun 27, 2019 · 4 comments
@SimoneStarace

Introduction

I was simply trying to run all the charts as explained in the readme file, but every time I try I get some errors.

Tools

Before I show what errors I get, I want to let you all know what tools I'm using:

  • Kubectl
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
  • Minikube
$ minikube version
minikube version: v1.2.0
  • Virtual Box version 6.0.8

Everything is installed on my computer, which runs Ubuntu 18.04.2 LTS on an HDD with more than 500 GB of space.

Execution

This section lists the commands I execute to run the charts.

First I create a new virtual machine with minikube.

$ minikube start
😄  minikube v1.2.0 on linux (amd64)
🔥  Creating virtualbox VM (CPUs=4, Memory=4096MB, Disk=375000MB) ...
🐳  Configuring environment for Kubernetes v1.15.0 on Docker 18.09.6
🚜  Pulling images ...
🚀  Launching Kubernetes ... 
⌛  Verifying: apiserver proxy etcd scheduler controller dns
🏄  Done! kubectl is now configured to use "Name of the profile"
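
If the VM sizes above ever turn out to be too small, they can be set explicitly when the VM is created; a quick sketch using minikube's standard flags:

$ minikube delete                                        # remove the existing VM first
$ minikube start --cpus 4 --memory 8192 --disk-size 40g  # recreate it with more resources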

I didn't include the first two helm commands explained in the readme file, because neither of them gave me any errors.

When I execute this command it shows the first error.

$ helm install -n my-hdfs charts/hdfs-k8s
Error: could not find tiller

To solve this problem I had to run the following command and wait a few minutes before running the previous command again.
$ helm init
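
Rather than waiting an arbitrary few minutes, the install can block until Tiller is ready; a minimal sketch, assuming Helm v2's standard tiller-deploy deployment in the kube-system namespace:

$ helm init --wait                                                  # block until Tiller is running
$ kubectl -n kube-system rollout status deployment/tiller-deploy    # or watch the rollout directly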

Now here is where I always get stuck. Every time, some elements won't run.

$ kubectl get pod -l release=my-hdfs
NAME                              READY   STATUS             RESTARTS   AGE
my-hdfs-client-544d894fc7-gp4zl   1/1     Running            0          15m
my-hdfs-datanode-b9wbx            0/1     Running            5          15m
my-hdfs-journalnode-0             1/1     Running            0          15m
my-hdfs-journalnode-1             0/1     Pending            0          5m8s
my-hdfs-namenode-0                0/1     CrashLoopBackOff   5          15m
my-hdfs-namenode-1                0/1     Pending            0          5m12s
my-hdfs-zookeeper-0               1/1     Running            0          15m
my-hdfs-zookeeper-1               1/1     Running            0          2m19s
my-hdfs-zookeeper-2               1/1     Running            0          108s

$ kubectl get statefulset -l release=my-hdfs
NAME                  READY   AGE
my-hdfs-journalnode   1/3     49m
my-hdfs-namenode      0/2     49m
my-hdfs-zookeeper     3/3     49m

$ kubectl get daemonset -l release=my-hdfs
NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
my-hdfs-datanode   1         1         0       1            0           <none>          49m

Here are the links where you can see the errors I get:

  • Daemon Set errors
  • Pods errors
  • Stateful sets errors
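
For anyone who cannot follow the links, the same error output can be gathered locally; a sketch using standard kubectl commands:

$ kubectl describe daemonset my-hdfs-datanode        # DaemonSet events
$ kubectl describe statefulset -l release=my-hdfs    # StatefulSet events
$ kubectl logs my-hdfs-namenode-0 --previous         # logs from the last crashed container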

A guess at one of the errors

I think one of these errors happens because no storage class is specified in the stateful sets.

$ kubectl describe statefulsets my-hdfs-namenode
Name:               my-hdfs-namenode
Namespace:          default
CreationTimestamp:  Thu, 27 Jun 2019 11:38:19 +0200
Selector:           app=hdfs-namenode,release=my-hdfs
Labels:             app=hdfs-namenode
                    chart=hdfs-namenode-k8s-0.1.0
                    release=my-hdfs
Annotations:        <none>
Replicas:           2 desired | 2 total
Update Strategy:    OnDelete
Pods Status:        1 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=hdfs-namenode
           release=my-hdfs
  Containers:
   hdfs-namenode:
    Image:       uhopper/hadoop-namenode:2.7.2
    Ports:       8020/TCP, 50070/TCP
    Host Ports:  8020/TCP, 50070/TCP
    Command:
      /bin/sh
      -c
    Args:
      /entrypoint.sh "/nn-scripts/format-and-run.sh"
    Environment:
      HADOOP_CUSTOM_CONF_DIR:  /etc/hadoop-custom-conf
      MULTIHOMED_NETWORK:      0
      MY_POD:                   (v1:metadata.name)
      NAMENODE_POD_0:          my-hdfs-namenode-0
      NAMENODE_POD_1:          my-hdfs-namenode-1
    Mounts:
      /etc/hadoop-custom-conf from hdfs-config (ro)
      /hadoop/dfs/name from metadatadir (rw,path="name")
      /nn-scripts from nn-scripts (ro)
  Volumes:
   nn-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-namenode-scripts
    Optional:  false
   hdfs-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-config
    Optional:  false
Volume Claims:
  Name:          metadatadir
  StorageClass:  
  Labels:        <none>
  Annotations:   <none>
  Capacity:      100Gi
  Access Modes:  [ReadWriteOnce]
Events:
  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  59m   statefulset-controller  create Claim metadatadir-my-hdfs-namenode-0 Pod my-hdfs-namenode-0 in StatefulSet my-hdfs-namenode success
  Normal  SuccessfulCreate  59m   statefulset-controller  create Pod my-hdfs-namenode-0 in StatefulSet my-hdfs-namenode successful
  Normal  SuccessfulCreate  49m   statefulset-controller  create Claim metadatadir-my-hdfs-namenode-1 Pod my-hdfs-namenode-1 in StatefulSet my-hdfs-namenode success
  Normal  SuccessfulCreate  49m   statefulset-controller  create Pod my-hdfs-namenode-1 in StatefulSet my-hdfs-namenode successful
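
An empty StorageClass in the claim template normally means the claims fall back to the cluster's default StorageClass, and minikube ships one named standard (a hostpath provisioner), so the claims should still bind. Whether they actually did can be checked directly; a quick sketch:

$ kubectl get storageclass                               # is a class marked "(default)"?
$ kubectl get pvc                                        # are the metadatadir claims Bound?
$ kubectl describe pvc metadatadir-my-hdfs-namenode-0    # events explain a Pending claim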

How can I solve those errors?

Have a nice day everyone.

@SimoneStarace (Author)

Update 1

After doing some tests I got different output, but still not every node runs correctly. I thought the pods weren't running because no storage class is specified in the stateful set, but that's not the problem.
I simply deleted the chart with this command:

$ helm delete --purge my-hdfs
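
Worth noting here: helm delete --purge removes the release, but PersistentVolumeClaims created from a StatefulSet's volumeClaimTemplates survive it, so old namenode metadata can carry over into the next install. A cleanup sketch, using the claim names from the events above:

$ kubectl delete pvc metadatadir-my-hdfs-namenode-0 metadatadir-my-hdfs-namenode-1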

After this I ran the installation again, and this time I got different errors:
Pods errors

These are the descriptions of namenodes 0 and 1:

$ kubectl describe pod my-hdfs-namenode-0
Name:           my-hdfs-namenode-0
Namespace:      default
Priority:       0
Node:           minikube/10.0.2.15
Start Time:     Fri, 28 Jun 2019 14:37:13 +0200
Labels:         app=hdfs-namenode
                controller-revision-hash=my-hdfs-namenode-65b74c4cfc
                release=my-hdfs
                statefulset.kubernetes.io/pod-name=my-hdfs-namenode-0
Annotations:    <none>
Status:         Running
IP:             10.0.2.15
Controlled By:  StatefulSet/my-hdfs-namenode
Containers:
  hdfs-namenode:
    Container ID:  docker://7f9f08a34c86333d48eeb6b81bf457fcf25c79956b624c7f1f88ed432473b996
    Image:         uhopper/hadoop-namenode:2.7.2
    Image ID:      docker-pullable://uhopper/hadoop-namenode@sha256:c78c6b3e97a01ce09dd4b0bc23e9885dee9658982c5d358554cad7657be06686
    Ports:         8020/TCP, 50070/TCP
    Host Ports:    8020/TCP, 50070/TCP
    Command:
      /bin/sh
      -c
    Args:
      /entrypoint.sh "/nn-scripts/format-and-run.sh"
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 28 Jun 2019 14:48:07 +0200
      Finished:     Fri, 28 Jun 2019 14:48:09 +0200
    Ready:          False
    Restart Count:  7
    Environment:
      HADOOP_CUSTOM_CONF_DIR:  /etc/hadoop-custom-conf
      MULTIHOMED_NETWORK:      0
      MY_POD:                  my-hdfs-namenode-0 (v1:metadata.name)
      NAMENODE_POD_0:          my-hdfs-namenode-0
      NAMENODE_POD_1:          my-hdfs-namenode-1
    Mounts:
      /etc/hadoop-custom-conf from hdfs-config (ro)
      /hadoop/dfs/name from metadatadir (rw,path="name")
      /nn-scripts from nn-scripts (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v8fxs (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  metadatadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  metadatadir-my-hdfs-namenode-0
    ReadOnly:   false
  nn-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-namenode-scripts
    Optional:  false
  hdfs-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-config
    Optional:  false
  default-token-v8fxs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v8fxs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  14m                   default-scheduler  Successfully assigned default/my-hdfs-namenode-0 to minikube
  Normal   Pulled     12m (x5 over 14m)     kubelet, minikube  Container image "uhopper/hadoop-namenode:2.7.2" already present on machine
  Normal   Created    12m (x5 over 14m)     kubelet, minikube  Created container hdfs-namenode
  Normal   Started    12m (x5 over 14m)     kubelet, minikube  Started container hdfs-namenode
  Warning  BackOff    4m16s (x46 over 14m)  kubelet, minikube  Back-off restarting failed container
$ kubectl describe pod my-hdfs-namenode-1
Name:           my-hdfs-namenode-1
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=hdfs-namenode
                controller-revision-hash=my-hdfs-namenode-65b74c4cfc
                release=my-hdfs
                statefulset.kubernetes.io/pod-name=my-hdfs-namenode-1
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  StatefulSet/my-hdfs-namenode
Containers:
  hdfs-namenode:
    Image:       uhopper/hadoop-namenode:2.7.2
    Ports:       8020/TCP, 50070/TCP
    Host Ports:  8020/TCP, 50070/TCP
    Command:
      /bin/sh
      -c
    Args:
      /entrypoint.sh "/nn-scripts/format-and-run.sh"
    Environment:
      HADOOP_CUSTOM_CONF_DIR:  /etc/hadoop-custom-conf
      MULTIHOMED_NETWORK:      0
      MY_POD:                  my-hdfs-namenode-1 (v1:metadata.name)
      NAMENODE_POD_0:          my-hdfs-namenode-0
      NAMENODE_POD_1:          my-hdfs-namenode-1
    Mounts:
      /etc/hadoop-custom-conf from hdfs-config (ro)
      /hadoop/dfs/name from metadatadir (rw,path="name")
      /nn-scripts from nn-scripts (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-v8fxs (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  metadatadir:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  metadatadir-my-hdfs-namenode-1
    ReadOnly:   false
  nn-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-namenode-scripts
    Optional:  false
  hdfs-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      my-hdfs-config
    Optional:  false
  default-token-v8fxs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-v8fxs
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  80s (x24 over 15m)  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
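
This message lines up with the pod spec above: each namenode requests host ports 8020/TCP and 50070/TCP, a host port can be bound only once per node, and minikube provides a single node, so the second namenode can never be scheduled. A sketch of how to confirm the conflict, plus a possible workaround (the value name in the last command is an assumption; verify the exact option in the chart's values.yaml):

$ kubectl get pods -l app=hdfs-namenode -o wide    # namenode-0 already holds the host ports on the only node
# Hypothetical non-HA install with a single namenode; check values.yaml for the real flag
$ helm install -n my-hdfs charts/hdfs-k8s --set global.namenodeHAEnabled=false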

I really don't know how to solve this problem.

Can someone help me about this?

@drametoid commented May 11, 2020

I'm on Kubernetes v1.15.3 and also facing a similar issue.

I am doing the same setup as described in the readme, except that I lowered the storage requested by the namenodes and a few other components.

Here is the log for my namenode-0 pod, which is the first one to go into the Error state right after all the pods reach the Running state.

Any solutions so far?

@SimoneStarace (Author)

Hi. I'm sorry, but I didn't solve this problem, and I'm not looking into it anymore because I'm working on a different project right now.
I was thinking of closing this issue, but since I never solved the problem I've left it open.

@Laziz-data

Still facing the same problem here; anyone able to help? 😔
