Troubleshooting
Cluster API Troubleshooting
Verification of the pods
When a problem occurs with Cluster API, the first thing to do is to check that all the pods involved are running.
Check that these 4 pods have the STATUS Running:
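The status check can be narrowed down to the pods that need attention. The sketch below is only an illustration: `not_running` is a hypothetical helper name, and the `capi|capvcd` pattern in the usage comment is an assumption about how the namespaces are named in your installation.

```shell
# Print "NAMESPACE/NAME STATUS" for every pod whose STATUS column is not Running.
# Expects the output of `kubectl get pods -A --no-headers` on stdin
# (columns: NAMESPACE NAME READY STATUS RESTARTS AGE).
not_running() {
  awk '$4 != "Running" { print $1 "/" $2, $4 }'
}

# Usage (the namespace pattern is an assumption, adjust it to your installation):
#   kubectl get pods -A --no-headers | grep -E 'capi|capvcd' | not_running
```

An empty result means every matching pod reports Running.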
If one of these pods does not have a Running status, two things can be checked to get information about the issue:
- Describe the pod that has the error
- Get the logs of the pod that has the error
Example:
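A minimal sketch of these two checks, wrapped in a hypothetical helper named `inspect_pod`. The namespace and pod name in the usage comment are placeholders; take the real values from the pod listing. The `--tail=100` limit is an arbitrary choice to keep the output short.

```shell
# Hypothetical helper: describe a pod, then print its most recent log lines.
# $1 = namespace, $2 = pod name (both taken from `kubectl get pods -A`).
inspect_pod() {
  kubectl describe pod -n "$1" "$2"
  kubectl logs -n "$1" "$2" --tail=100
}

# Usage (placeholder names):
#   inspect_pod <namespace> <pod-name>
```

In the describe output, the Events section at the bottom is usually the quickest place to spot scheduling, image pull or crash-loop problems.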
Check the VCD provider
If an issue occurs with the actions performed on vCloud Director during a cluster creation, upgrade or scaling operation, the problem may come from the Cluster API provider.
The corresponding pod is capvcd-controller-manager, in the capvcd-system namespace.
Errors can be found in its logs:
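One way to surface the relevant lines is to filter the provider logs. This is a sketch under assumptions: `only_errors` is a hypothetical helper, the `error|fail` pattern is a guess at what matters and may need widening, and the namespace is the one used by the commands in this section.

```shell
# Keep only the log lines that mention an error or a failure (case-insensitive).
only_errors() {
  grep -iE 'error|fail'
}

# Usage (the one-hour window is an arbitrary choice):
#   kubectl logs -n capvcd-system deployment/capvcd-controller-manager --since=1h | only_errors
```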
An option exists to display more logs about the communication between the provider and vCloud Director.
To enable it, run this command:
kubectl set env -n capvcd-system deployment/capvcd-controller-manager GOVCD_LOG_ON_SCREEN=true -oyaml
This option can be very verbose, so do not forget to disable it once your diagnosis is finished:
kubectl set env -n capvcd-system deployment/capvcd-controller-manager GOVCD_LOG_ON_SCREEN-
Check the Cluster API Objects
Cluster API uses different types of objects to describe the Kubernetes cluster it manages.
The idea is to explore these objects step by step to find the one that shows an error, either in its status or in the output of describing it.

Depending on whether the issue concerns worker nodes, master nodes or the overall cluster, choose the objects involved using the diagram above.
- Get the objects to find their exact names and check their status
- Describe the object that has an issue
Repeat steps 1 and 2 for the other objects until you find the error.
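The two steps can be sketched as a pair of small helpers. The list of kinds is an assumption: it names the standard Cluster API resources plus the CAPVCD infrastructure resources, and should be trimmed to the objects relevant to the issue. `CAPI_KINDS`, `list_capi_objects` and `describe_capi_object` are hypothetical names. If a short resource name is ambiguous on your cluster, use the fully qualified form (for example `clusters.cluster.x-k8s.io`).

```shell
# Resource kinds to walk through, from the cluster down to the machines.
# Assumption: standard Cluster API kinds plus the CAPVCD infrastructure kinds.
CAPI_KINDS="clusters vcdclusters kubeadmcontrolplanes machinedeployments machinesets machines vcdmachines"

# Step 1: list every object of each kind to get its exact name and status.
list_capi_objects() {
  for kind in $CAPI_KINDS; do
    echo "== $kind =="
    kubectl get "$kind" -A
  done
}

# Step 2: describe one object. $1 = kind, $2 = namespace, $3 = name.
describe_capi_object() {
  kubectl describe "$1" -n "$2" "$3"
}

# Usage (placeholder names):
#   list_capi_objects
#   describe_capi_object machines <namespace> <machine-name>
```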
Export logs script
VMware provides a script that exports the logs, together with some information about the cluster, into a tarball.
Node deployment troubleshooting
Check kubelet status
systemctl status kubelet
Journalctl
journalctl -xeu containerd
journalctl -xeu kubelet
Several files can be inspected to troubleshoot an issue during deployment:
Cloud-init
/var/log/cloud-init-output.log
/var/log/capvcd/customization/status.log
/var/log/capvcd/customization/error.log
containerd
/var/log/containers/*
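To read the cloud-init and customization files in one pass on a node, a small helper such as the following could be used. It is a sketch only: `show_node_logs` is a hypothetical name, and `LOG_ROOT` is a hypothetical prefix (empty on a real node) that exists only so the helper can be pointed at a copy of the logs.

```shell
# Print the last lines of the cloud-init and CAPVCD customization logs.
# $1 = number of lines per file (defaults to 50).
# LOG_ROOT is a hypothetical prefix, empty on a real node.
show_node_logs() {
  root="${LOG_ROOT:-}"
  for f in \
    /var/log/cloud-init-output.log \
    /var/log/capvcd/customization/status.log \
    /var/log/capvcd/customization/error.log
  do
    if [ -f "$root$f" ]; then
      echo "== $f =="
      tail -n "${1:-50}" "$root$f"
    else
      echo "== $f (missing) =="
    fi
  done
}

# Usage on the node (reading these files may require root):
#   show_node_logs 100
```

A file reported as missing is itself a hint: it suggests the corresponding step never started on that node.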