Troubleshooting OpenShift Clusters and Workloads

If you are a cluster admin, a cluster operator, or the only developer on the team who actually knows what's going on in your OpenShift cluster, then you know that terrible things happen from time to time. It's inevitable, and it's better to be prepared for the moment when shit hits the fan. So, here is a collection of commands that should be part of your arsenal when it comes to debugging broken deployments, resource consumption, missing privileges, unreachable workloads and more...

Don't Use The Web Console

Yeah, it looks nice, it's easy to navigate, it makes some tasks easy to perform... and it will also become unreachable if there's a problem with the router, if a deployment loses its Endpoint, or if an operator has some issues...

If you get a little too used to doing everything in the web console, you might end up unable to solve problems with the CLI at the moment when every minute matters. My recommendation, therefore, is to get comfortable with the oc tool and, at least for the duration of this article, forget that the web console even exists.

Monitoring Node Resources

It's good to check the memory and CPU available on your worker nodes from time to time, especially if your pods are stuck in the Pending state or are being OOMKilled. You can do that with:


~ $ oc adm top nodes
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
<IP_1>        620m         15%    3791Mi          28%       
<IP_2>        418m         10%    2772Mi          21%

This displays CPU and memory stats for the nodes. If you want to filter out master nodes and see only workers, you can use -l node-role.kubernetes.io/worker, like so:


~ $ oc adm top nodes -l node-role.kubernetes.io/worker
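On a cluster with more than a handful of nodes, scanning that table by eye gets tedious. Here is a minimal sketch of a filter, assuming the column layout shown above (MEMORY% in the fifth column). The name hot_nodes is a hypothetical helper, not part of oc, and the default threshold of 80 is arbitrary:

```shell
# hot_nodes: hypothetical helper, not part of oc.
# Reads `oc adm top nodes` output on stdin and prints the nodes whose
# MEMORY% (assumed to be column 5) is at or above the given threshold.
hot_nodes() {
  threshold="${1:-80}"
  awk -v t="$threshold" '
    NR > 1 {                     # skip the header row
      pct = $5
      sub(/%/, "", pct)          # strip the percent sign
      if (pct + 0 >= t) print $1, $5
    }'
}

# Usage (needs a logged-in oc session):
#   oc adm top nodes -l node-role.kubernetes.io/worker | hot_nodes 75
```

Because the function only reads stdin, it can also be fed saved output when the cluster itself is hard to reach.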

Troubleshooting Node Units

If you are running a bare-metal cluster, there is a good chance that you will eventually run into problems related to things running on the nodes themselves. To see what's happening with specific systemd units (e.g. crio or kubelet) running on worker nodes, you can use:


~ $ oc adm node-logs NODE_NAME -u unit

This command retrieves logs from a specific unit. Running it with -u crio gives us the following:


~ $ oc adm node-logs NODE_NAME -u crio
May 21 16:45:25.218904 ... crio[1180294]: 2020-05-21T11:45:25-05:00 [verbose] Del: <namespace>:<pod>:k8s-pod-network:eth0 {"cniVersion":"0.3.1","name":"k8s-pod-network","plugins":[{"datastore_type":"kubernetes","ipam":{"type":"calico-ipam"},"kubernetes":{"kubeconfig":"/var/run/multus/cni/net.d/calico-kubeconfig"},"mtu":1480,"nodename_file_optional":false,"policy":{"type":"k8s"},"type":"calico"},{"capabilities":{"portMappings":true},"snat":true,"type":"portmap"}]}
May 21 16:45:25.438991 ... crio[1180294]: 2020-05-21 11:45:25.438 [WARNING][3352471] workloadendpoint.go 77: Operation Delete is not supported on WorkloadEndpoint type

When logs are not enough and you need to actually poke around inside the worker node, you can run oc debug nodes/node-name. This gives you a shell on that specific node by creating a privileged pod on it. An example of this kind of interactive session:


~ $ oc debug nodes/node-name
...
sh # chroot /host
sh # crictl ps
...
sh # systemctl is-active ...
sh # systemctl restart ...

In the session above we use crictl to inspect containers running directly on the worker node. This is the place where we could start/restart/delete some containers or system services if needed. Needless to say, be very careful when touching things running on the nodes themselves.

As a side note, if you are running your cluster on a managed public cloud, you most likely won't have permission to access nodes directly, as that would be a security issue, so the last command might fail for you.

Monitoring Cluster Updates

When you decide that it's time to update your cluster to a newer version, you will probably want to monitor the progress. Alternatively, if some operators are breaking without any clear reason, you might also want to check on your clusterversion resource to see whether the cluster is progressing towards a newer version, which might be the reason for temporary service degradation:


~ $ oc describe clusterversion
Name:         version
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterVersion
Metadata:
   ...
Spec:
  Cluster ID:  ...
Status:
  Available Updates:  <nil>
  Conditions:
    Last Transition Time:  2020-04-24T06:58:32Z
    Message:               Done applying 4.3.18
    Status:                True
    Type:                  Available
    Last Transition Time:  2020-05-04T21:59:03Z
    Status:                False
    Type:                  Failing
    ...

~ $ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.18    True        False         8d      Cluster version is 4.3.18

Both commands above show the current cluster version, whether an upgrade is in progress, and the general state the cluster reports.
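If you want to poll for the end of an upgrade from a script, the PROGRESSING column is the one to watch. A small sketch, assuming the column order shown in the oc get clusterversion output above (PROGRESSING in the fourth column); upgrade_state is a hypothetical helper name, not an oc subcommand:

```shell
# upgrade_state: hypothetical helper, not part of oc.
# Reads `oc get clusterversion` output on stdin and reports whether the
# PROGRESSING column (assumed to be column 4) is True.
upgrade_state() {
  awk 'NR == 2 { print ($4 == "True" ? "upgrading" : "not upgrading") }'
}

# Usage (needs a logged-in oc session):
#   oc get clusterversion | upgrade_state
```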

All The Ways to Debug Pods

The thing that's going to break most often is, of course, a pod and/or a deployment (DeploymentConfig). There are quite a few commands that can give you insight into what is wrong with your application, so let's start with the most high-level ones:


~ $ oc status
In project <namespace> on server https://<host:port>

https://<appname>.<host> (redirects) (svc/appname)
  dc/appname deploys de.icr.io/project/appname:latest 
    deployment #1 deployed 6 days ago - 1 pod

https://<appname2>.<host> (redirects) (svc/appname2)
  dc/appname2 deploys de.icr.io/project/appname2:latest 
    deployment #13 deployed 10 hours ago - 1 pod
    deployment #12 deployed 11 hours ago
    deployment #11 deployed 34 hours ago

4 infos identified, use 'oc status --suggest' to see details.

oc status is the easiest way to get an overview of the resources deployed in a project, including their relationships and state, as shown above.

Next one, you already know - oc describe. The reason I mention it is the Events: section at the bottom, which shows only events related to this specific resource. That is much nicer than trying to find anything useful in the output of oc get events.


~ $ oc describe pod/podname
Name:         appname-14-6vsq7
Namespace:    ...
Priority:     0
Node:         10.85.34.38/10.85.34.38
Start Time:   Thu, 21 May 2020 20:55:23 +0200
Labels:       app=appname
Annotations:  cni.projectcalico.org/podIP: 172.30.142.149/32
              cni.projectcalico.org/podIPs: 172.30.142.149/32
              openshift.io/scc: restricted
Status:       Running
IP:           172.30.142.149
IPs:
  IP:           172.30.142.149
Controlled By:  ReplicationController/appname-14
Containers:
  ...

Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  ...

Events:
  Type    Reason     Age        From                  Message
  ----    ------     ----       ----                  -------
  Normal  Scheduled  <unknown>  default-scheduler     Successfully assigned namespace/appname to ...
  Normal  Pulling    16s        kubelet, ...          Pulling image "de.icr.io/..."
  Normal  Pulled     16s        kubelet, ...          Successfully pulled image "de.icr.io/..."
  Normal  Created    16s        kubelet, ...          Created container appname
  Normal  Started    16s        kubelet, ...          Started container appname

Another common command is oc logs. One thing you might not know about it, though, is that it can target a specific container of the pod using the -c argument. For example:


~ $ oc logs podname -c cont_name1
~ $ oc logs podname -c cont_name2

# Example
~ $ oc logs test-pod
Error from server (BadRequest): a container name must be specified for pod test-pod, choose one of: [1st 2nd]

# We need to choose a container to get logs from

~ $ oc logs test-pod -c 1st
...  # Some logs
~ $ oc logs test-pod -c 2nd
...  # Some more logs

This might come in handy if you are debugging a single container in a multi-container pod and want to filter only the relevant logs.

Now, for the slightly less known command(s). oc debug was already shown in the section about debugging nodes, but it can be used to debug deployments or pods too:


~ $ oc debug pods/podname
Starting pod/podname-debug ...
Pod IP: 172.30.142.144
If you don't see a command prompt, try pressing enter.

$  # And we're in the debug pod!
$ ...

# CTRL+D
Removing debug pod ...

Unlike the example with nodes, this won't give you a shell in the running pod, but rather creates an exact replica of the existing pod in debug mode. This means that labels will be stripped and the command changed to /bin/sh.

One reason you might need to debug a pod in OpenShift is an issue with security policies. In that case you can add --as-root to the command to stop it from crashing during startup.

A nice thing about this command is that it can be used with any OpenShift resource that creates pods, for example Deployment, Job, ImageStreamTag, etc.

Running Ad-hoc Commands Inside Pods and Containers

Even though creating debugging pods can be very convenient, sometimes you just need to poke around in the actual pods. You can use oc exec for that. These are the variants you can take advantage of:


~ $ oc exec podname -- command --options
~ $ oc exec -it podname -c cont_name -- bash

# Example
~ $ oc exec podname -- ls -la
total 96
drwxr-xr-x.   1 root    root 4096 May 21 12:25 .
drwxr-xr-x.   1 root    root 4096 May 21 12:25 ..
drwxr-xr-x.   2 root    root 4096 Oct 29  2019 bin
...

The first of the commands above runs a one-off command inside podname, with extra options if necessary. The second one gets you a shell in a specific container in the pod, though you should probably use the shorthand for that - oc rsh.

One command that you could use for troubleshooting, but should never use in production environments, is oc cp, which copies files to or from a pod. This command can be useful if you need to get some file out of a container so you can analyze it further. Another use case would be to copy files into the pod (container) to quickly fix some issue during testing, before fixing it properly in the Docker image (Dockerfile) or source code.


~ $ oc cp local/file podname:/file/in/pod  # copy "local/file" to "/file/in/pod" in "podname"

# Examples:
# Copy config into pod:
~ $ oc cp config.yaml podname:/somedir/config.yaml

# Copy logs from pod:
~ $ oc cp podname:/logs/messages.log messages.log
tar: Removing leading `/' from member names

Inspect Broken Images

I think that was enough about debugging pods and containers, but what about debugging application images? For that you should turn to skopeo:


~ $ skopeo inspect docker://someimage
~ $ skopeo list-tags docker://someimage

# Example
# Inspect repository
~ $ skopeo inspect docker://quay.io/buildah/stable
{
    "Name": "quay.io/buildah/stable",
    "Digest": "sha256:4006c153bc76f2a98436d2de604d5bab0393618d7e2ab81ccb1272d3fe56e683",
    "RepoTags": [
        "v1.9.0",
        "v1.9.1",
        ...
        "master",
        "latest"
    ],
    "Created": "2020-05-18T21:13:55.180304657Z",
    "DockerVersion": "18.02.0-ce",
    "Labels": {
        "license": "MIT",
        "name": "fedora",
        "vendor": "Fedora Project",
        "version": "32"
    },
    "Architecture": "amd64",
    "Os": "linux",
    "Layers": [
        "sha256:03c837e31708e15035b6c6f9a7a4b78b64f6bc10e6daec01684c077655becf95",
        ...
        "sha256:2d8f327dcfdd9f50e6ee5f31b06676c027b9e3b1ae7592ac06808bfc59527530"
    ],
    "Env": [
        "DISTTAG=f32container",
        "FGC=f32",
        "container=oci",
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "BUILDAH_ISOLATION=chroot"
    ]
}

# List image tags
~ $ skopeo list-tags docker://docker.io/library/fedora
{
    "Repository": "docker.io/library/fedora",
    "Tags": [
        "20",
        "21",
        "22",
        "23",
        "24",
        "25",
        "26-modular",
        ...
        "32",
        "33",
        "branched",
        "heisenbug",
        "latest",
        "modular",
        "rawhide"
    ]
}

The first command inspects an image repository, which can be useful, for example, when an image can't be pulled. That might happen if the tag doesn't exist or the image name is misspelled. The second one just gives you a list of available tags without you needing to open the registry website, which is pretty convenient.

Gathering All Information Available

When everything else fails, you might try running oc adm must-gather to get all the information available from the cluster that could be useful for debugging. The output produced by this command can be used for your own debugging or can be sent to Red Hat support in case you need assistance.

Debugging Unreachable Applications

It's not uncommon (at least for me) that applications/deployments seem to work fine but cannot reach each other. There are a couple of reasons why that might be. Let's take the following scenario - you have an application and a database. Both are running just fine, but for some reason your application can't communicate with the database. Here's one way you could go about troubleshooting this:


~ $ oc get svc/db-pod -o jsonpath="{.spec.clusterIP}{'\n'}"
172.21.180.177  # ClusterIP of the database Service
~ $ oc debug -t dc/app-pod
Starting pod/app-pod-debug ...
Pod IP: 172.30.142.160
If you don't see a command prompt, try pressing enter.

$ curl -v http://172.21.180.177:5432
...
* Connected to 172.21.180.177... (172.21.180.177) port 5432 (#0)
...
$ exit
Removing debug pod ...

# Now the same thing the other way around
~ $ oc get svc/app-pod -o jsonpath="{.spec.clusterIP}{'\n'}"
172.21.80.251
~ $ oc debug -t dc/db-pod
Starting pod/db-pod-debug ...
Pod IP: 172.30.142.143
If you don't see a command prompt, try pressing enter.

$ curl -v http://172.21.80.251:9080
...
curl: (28) Connection timed out after ... milliseconds
...
$ exit
Removing debug pod ...

~ $ oc get svc
NAME        TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
db-pod      NodePort   172.21.180.177   <none>        80:31544/TCP   8d
app-pod     NodePort   172.21.80.251    <none>        80:30146/TCP   12d

~ $ oc get endpoints
NAME       ENDPOINTS             AGE
db-pod     172.30.142.191:9080   8d
app-pod    <none>                12d       # <- Missing Endpoint

~ $ oc describe svc/app-pod
Name:                     app-pod
Namespace:                some-namespace
Labels:                   app=app-pod
...
Selector:                 app=not-app-pod  # <- Fix this
...
Endpoints:                <none>           # <- Missing Endpoint
...

The snippet above assumes that we already have the application and database running, as well as their respective Services. We can start debugging by trying to access the database from the application. The first thing we need for that, though, is the IP of the database, which we look up using the first command. Next, we create a carbon copy of the application using oc debug and try reaching the database with curl, which is successful.

After that we repeat the test the other way around and we can see that curl times out, meaning that the database cannot reach the application IP. We then check the previously created Services, and there is nothing weird there. Finally, we check Endpoints and we can see that the application Service doesn't have one. This is most likely caused by misconfiguration of the respective Service, as shown by the last command, where we clearly have the wrong selector. After fixing this mistake (with oc edit svc/...), the Endpoint gets created automatically and the application is reachable.
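The Endpoints check in that walkthrough is easy to automate when a namespace has many Services. A minimal sketch, assuming the two-column layout of oc get endpoints shown above; missing_endpoints is a hypothetical helper name, not part of oc:

```shell
# missing_endpoints: hypothetical helper, not part of oc.
# Reads `oc get endpoints` output on stdin and prints the name of every
# entry whose ENDPOINTS column (assumed to be column 2) is <none>.
missing_endpoints() {
  awk 'NR > 1 && $2 == "<none>" { print $1 }'
}

# Usage (needs a logged-in oc session):
#   oc get endpoints | missing_endpoints
```

Every name it prints is a Service whose selector matches no ready pod, so that is where to run oc describe svc/... next. If you prefer a non-interactive fix over oc edit, a patch such as oc patch svc/app-pod -p '{"spec":{"selector":{"app":"app-pod"}}}' should work too, assuming app=app-pod really is the label on the application pods.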

Fix Missing Security Context Constraints

If your pod is failing because of any issue related to copying/accessing files, running binaries, modifying resources on the node, etc., then it's most likely a problem with Security Context Constraints. Based on the specific error you are getting, you should be able to determine the right SCC for your pod. If it's not clear, though, there are a few pointers that might help you decide:

If your pod can't run because of the UID/GID it's using, you can check the UID and GID range for each SCC:


~ $ oc describe scc some-scc
  Run As User Strategy: MustRunAsNonRoot	
    UID:					...
    UID Range Min:				...
    UID Range Max:				...

If these fields are set to <none>, though, you should look at the project annotations:


~ $ oc get project my-project -o yaml
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  annotations:
    openshift.io/sa.scc.supplemental-groups: 1001490000/10000
    openshift.io/sa.scc.uid-range: 1001490000/10000
...

These annotations tell you the effective UID range for your pod. The format is start/size, so 1001490000/10000 means a block of 10000 IDs starting at 1001490000, that is 1001490000 to 1001499999. If that doesn't satisfy your needs, you would have to set spec.securityContext.runAsUser: SOME_UID to force a specific UID. If your pod fails after these changes, then you have to switch SCC or modify it to have a different UID range.

One neat trick to determine which SCC a Service Account needs to be able to run a pod is to use the oc adm policy scc-subject-review command:
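To check quickly whether a UID you plan to force falls inside the project's range, the arithmetic can be scripted. A small sketch, assuming the annotation value is in start/size form (a block of size IDs beginning at start); uid_in_range is a hypothetical helper name, not part of oc:

```shell
# uid_in_range: hypothetical helper, not part of oc.
# Usage: uid_in_range UID START/SIZE
# Succeeds (exit status 0) when START <= UID < START + SIZE.
uid_in_range() {
  uid="$1"
  start="${2%/*}"   # text before the slash
  size="${2#*/}"    # text after the slash
  [ "$uid" -ge "$start" ] && [ "$uid" -lt "$((start + size))" ]
}

# Usage, with the annotation value copied from the project YAML:
#   uid_in_range 1001495000 1001490000/10000 && echo "UID fits the project range"
```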


~ $ oc get pod podname -o yaml | oc adm policy scc-subject-review -f -
RESOURCE           ALLOWED BY   
Pod/podname        anyuid

What this command does is check whether a user or Service Account can create the pod passed in as YAML. When the output of this command shows <none>, it means that the resource is not allowed. If the name of an SCC is displayed instead, like anyuid above, it means that this resource can be created thanks to this SCC.

To use this command with a Service Account instead of a user, add the -z parameter, e.g. oc adm policy scc-subject-review -z builder.

When the output of this command shows anything but <none>, you know that you are good to go.

Conclusion

The biggest takeaway from this article should be that if something doesn't work in your OpenShift cluster, then it's probably RBAC, and if not, then it's SCC. If that's also not the case, then it's networking (DNS). In all seriousness, I hope at least some of these commands will save you some time the next time you need to troubleshoot something in OpenShift. It's also good to know the more common ones by heart, because you never know when you're really gonna need them. 😉