In this article, I will describe the disaster recovery procedure for the loss of a single master node.

To begin, remember that in any disaster, a backup of the platform is of paramount importance for recovery.

So before going ahead, check out part 1 of the disaster recovery series, in which I explain how to set up an automated procedure for generating ETCD backups:

OCP Disaster Recovery Part 1 - How to create Automated ETCD Backup in OpenShift 4.x

With the backup of ETCD done, the next steps will be essential for a successful recovery.

NOTE: It is only possible to recover an OpenShift cluster if there is still at least one healthy master left. If you have lost all master nodes, the following steps cannot be performed successfully.

When you lose more than one master node, the OpenShift API will be completely offline. The following steps cover recovering a single master node. Here is the fully online state of the cluster before we begin:

The cluster is functional with all the machines in the deployment:

````
$ oc get nodes
NAME                         STATUS   ROLES    AGE    VERSION
zmaciel-f9fbb-master-0       Ready    master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-master-1       Ready    master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-master-2       Ready    master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-worker-52tds   Ready    worker   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-worker-nxhw8   Ready    worker   2d8h   v1.20.0+551f7b2
````

Online machines are validated:

````
$ oc get machines -A -ojsonpath='{range .items[*]}{@.status.nodeRef.name}{"\t"}{@.status.providerStatus.instanceState}{"\n"}' | grep -v running
zmaciel-f9fbb-master-0    poweredOn
zmaciel-f9fbb-master-1    poweredOn
zmaciel-f9fbb-master-2    poweredOn
zmaciel-f9fbb-worker-52tds    poweredOn
zmaciel-f9fbb-worker-nxhw8    poweredOn
````

And the cluster operators are available:

````
$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.3     True        False         False      39h
baremetal                                  4.7.3     True        False         False      2d8h
cloud-credential                           4.7.3     True        False         False      2d8h
cluster-autoscaler                         4.7.3     True        False         False      2d8h
config-operator                            4.7.3     True        False         False      2d8h
console                                    4.7.3     True        False         False      24h
csi-snapshot-controller                    4.7.3     True        False         False      2d8h
dns                                        4.7.3     True        False         False      2d8h
etcd                                       4.7.3     True        False         False      2d8h
image-registry                             4.7.3     True        False         False      2d8h
ingress                                    4.7.3     True        False         False      24h
insights                                   4.7.3     True        False         False      2d8h
kube-apiserver                             4.7.3     True        False         False      2d8h
kube-controller-manager                    4.7.3     True        False         False      2d8h
kube-scheduler                             4.7.3     True        False         False      2d8h
kube-storage-version-migrator              4.7.3     True        False         False      24h
machine-api                                4.7.3     True        False         False      2d8h
machine-approver                           4.7.3     True        False         False      2d8h
machine-config                             4.7.3     True        False         False      2d8h
marketplace                                4.7.3     True        False         False      24h
monitoring                                 4.7.3     True        False         False      2d8h
network                                    4.7.3     True        False         False      2d8h
node-tuning                                4.7.3     True        False         False      2d8h
openshift-apiserver                        4.7.3     True        False         False      2d8h
openshift-controller-manager               4.7.3     True        False         False      2d8h
openshift-samples                          4.7.3     True        False         False      2d8h
operator-lifecycle-manager                 4.7.3     True        False         False      2d8h
operator-lifecycle-manager-catalog         4.7.3     True        False         False      2d8h
operator-lifecycle-manager-packageserver   4.7.3     True        False         False      2d8h
service-ca                                 4.7.3     True        False         False      2d8h
storage                                    4.7.3     True        False         False      2d8h
````

NOTE: For this article, we used an OpenShift 4.7.3 IPI cluster on vSphere.

Verifications

Let’s look at the state of the cluster before starting the recovery.

The master-1 machine was lost:

````
$ oc get nodes
NAME                         STATUS     ROLES    AGE    VERSION
zmaciel-f9fbb-master-0       Ready      master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-master-1       NotReady   master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-master-2       Ready      master   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-worker-52tds   Ready      worker   2d8h   v1.20.0+551f7b2
zmaciel-f9fbb-worker-nxhw8   Ready      worker   2d8h   v1.20.0+551f7b2
````

Operators that run pods on the master nodes start to become degraded because they expect three replicas running:

````
$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.3     False       False         True       13m
baremetal                                  4.7.3     True        False         False      2d8h
cloud-credential                           4.7.3     True        False         False      2d8h
cluster-autoscaler                         4.7.3     True        False         False      2d8h
config-operator                            4.7.3     True        False         False      2d8h
console                                    4.7.3     True        False         False      24h
csi-snapshot-controller                    4.7.3     True        False         False      2d8h
dns                                        4.7.3     True        False         False      2d8h
etcd                                       4.7.3     True        False         True       2d8h
image-registry                             4.7.3     True        False         False      2d8h
ingress                                    4.7.3     True        False         False      24h
insights                                   4.7.3     True        False         False      2d8h
kube-apiserver                             4.7.3     True        False         True       2d8h
kube-controller-manager                    4.7.3     True        False         True       2d8h
kube-scheduler                             4.7.3     True        False         True       2d8h
kube-storage-version-migrator              4.7.3     True        False         False      24h
machine-api                                4.7.3     True        False         False      2d8h
machine-approver                           4.7.3     True        False         False      2d8h
machine-config                             4.7.3     False       False         True       113s
marketplace                                4.7.3     True        False         False      24h
monitoring                                 4.7.3     False       True          True       6m45s
network                                    4.7.3     True        False         False      2d8h
node-tuning                                4.7.3     True        False         False      2d8h
openshift-apiserver                        4.7.3     True        False         True       11m
openshift-controller-manager               4.7.3     True        False         False      2d8h
openshift-samples                          4.7.3     True        False         False      2d8h
operator-lifecycle-manager                 4.7.3     True        False         False      2d8h
operator-lifecycle-manager-catalog         4.7.3     True        False         False      2d8h
operator-lifecycle-manager-packageserver   4.7.3     True        False         False      9m35s
service-ca                                 4.7.3     True        False         False      2d8h
storage                                    4.7.3     True        False         False      6m38s
````

The OpenShift cluster remains available, but the control plane operators mentioned above are degraded and begin to generate critical alerts.

The master MachineConfigPool enters the updating process to try to recover the lost machine. However, it is unsuccessful, and the machine-config operator becomes degraded:

````
$ oc get machineconfigpool
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-8ee9ac3bef7c772854cb539086c44835   False     True       False      3              2                   3                     0                      2d8h
worker   rendered-worker-1f0162bfc17dded5a238424783fb5b36   True      False      False      2              2                   2                     0                      2d8h
````

Determine if the machine is not running:

````
$ oc get machines -A -ojsonpath='{range .items[*]}{@.status.nodeRef.name}{"\t"}{@.status.providerStatus.instanceState}{"\n"}' | grep -v running
zmaciel-f9fbb-master-0    poweredOn
zmaciel-f9fbb-master-1    poweredOff
zmaciel-f9fbb-master-2    poweredOn
zmaciel-f9fbb-worker-52tds    poweredOn
zmaciel-f9fbb-worker-nxhw8    poweredOn
````

A machine with a state other than poweredOn or Running is the machine that is not working.
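The same filter can be tested offline. Here is a minimal sketch of the state check used above (the input lines are illustrative, not live cluster data):

```shell
# Flag machines whose state is neither "poweredOn" nor "running";
# input lines are name<TAB>state, as in the jsonpath output above.
printf 'master-0\tpoweredOn\nmaster-1\tpoweredOff\n' \
  | awk -F'\t' '$2 != "poweredOn" && $2 != "running" { print $1 }'
# → master-1
```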

Identifying an unhealthy etcd member:

````
$ oc get etcd -o=jsonpath='{range .items[0].status.conditions[?(@.type=="EtcdMembersAvailable")]}{.message}{"\n"}'
2 of 3 members are available, zmaciel-f9fbb-master-1 is unhealthy
````

From the result above, we can identify that one of the ETCD members is unavailable.

Determine if the node is not ready:

````
$ oc get nodes -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{"\t"}{range .spec.taints[*]}{.key}{" "}' | grep unreachable
zmaciel-f9fbb-master-1    node-role.kubernetes.io/master node.kubernetes.io/unreachable node.cloudprovider.kubernetes.io/shutdown node.kubernetes.io/unreachable
````

Recovering the Failed Master Node

After verifying everything above, you can begin the failed master node recovery procedure.

First, remove the unhealthy member.

a. In a terminal that has access to the cluster as a cluster-admin user, run the following command:

````
$ oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd
etcd-zmaciel-f9fbb-master-0                3/3     Running     0          2d8h
etcd-zmaciel-f9fbb-master-1                3/3     Running     0          2d8h
etcd-zmaciel-f9fbb-master-2                3/3     Running     0          2d8h
````

b. Connect to the running ETCD container, and pass in the name of a pod that is not on the affected node:

````
$ oc project openshift-etcd
Now using project "openshift-etcd" on server "https://api.zmaciel.rhbr-lab.com:6443".
$ oc rsh etcd-zmaciel-f9fbb-master-0
Defaulting container name to etcdctl.
Use 'oc describe pod/etcd-zmaciel-f9fbb-master-0 -n openshift-etcd' to see all of the containers in this pod.
````

c. View the member list:

````
sh-4.4# etcdctl member list -w table
+------------------+---------+------------------------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |          NAME          |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+------------------------+----------------------------+----------------------------+------------+
| 4319119f2850cd6a | started | zmaciel-f9fbb-master-0 |  https://10.36.250.63:2380 |  https://10.36.250.63:2379 |      false |
| 654b8780898910de | started | zmaciel-f9fbb-master-1 | https://10.36.250.177:2380 | https://10.36.250.177:2379 |      false |
| 88d623c9d503fcb1 | started | zmaciel-f9fbb-master-2 |  https://10.36.250.77:2380 |  https://10.36.250.77:2379 |      false |
+------------------+---------+------------------------+----------------------------+----------------------------+------------+
````
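The member ID can also be pulled out programmatically rather than read from the table. This is a sketch using the plain (comma-separated) `member list` output; the sample line is illustrative, and in practice you would pipe `etcdctl member list` into the same filter:

```shell
# Fields in plain `etcdctl member list` output are comma-separated:
# ID, status, name, peer URLs, client URLs, is-learner.
printf '654b8780898910de, started, zmaciel-f9fbb-master-1, https://10.36.250.177:2380, https://10.36.250.177:2379, false\n' \
  | awk -F', ' -v name="zmaciel-f9fbb-master-1" '$3 == name { print $1 }'
# → 654b8780898910de
```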

d. Remove the unhealthy ETCD member by providing the ID:

````
sh-4.4# etcdctl member remove 654b8780898910de
Member 654b8780898910de removed from cluster 9e399fdce41c910d
````

e. View the member list again and verify that the member was removed:

````
sh-4.4# etcdctl member list -w table
+------------------+---------+------------------------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |          NAME          |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+------------------------+---------------------------+---------------------------+------------+
| 4319119f2850cd6a | started | zmaciel-f9fbb-master-0 | https://10.36.250.63:2380 | https://10.36.250.63:2379 |      false |
| 88d623c9d503fcb1 | started | zmaciel-f9fbb-master-2 | https://10.36.250.77:2380 | https://10.36.250.77:2379 |      false |
+------------------+---------+------------------------+---------------------------+---------------------------+------------+
````

f. Remove any old secrets from the unhealthy ETCD member that was removed:

````
$ oc get secrets -n openshift-etcd | grep zmaciel-f9fbb-master-1
etcd-peer-zmaciel-f9fbb-master-1              kubernetes.io/tls                     2      2d9h
etcd-serving-metrics-zmaciel-f9fbb-master-1   kubernetes.io/tls                     2      2d9h
etcd-serving-zmaciel-f9fbb-master-1           kubernetes.io/tls                     2      2d9h
$ oc delete secrets etcd-peer-zmaciel-f9fbb-master-1 -n openshift-etcd
secret "etcd-peer-zmaciel-f9fbb-master-1" deleted
$ oc delete secrets etcd-serving-zmaciel-f9fbb-master-1 -n openshift-etcd
secret "etcd-serving-zmaciel-f9fbb-master-1" deleted
$ oc delete secrets etcd-serving-metrics-zmaciel-f9fbb-master-1 -n openshift-etcd
secret "etcd-serving-metrics-zmaciel-f9fbb-master-1" deleted
````

With the above commands you will remove the peer, serving, and metrics secrets.
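Since the three deletions follow one naming pattern, they can be generated in a loop. A hedged sketch (NODE is an assumption based on the naming shown above; the loop only prints the commands, so pipe its output to `sh` to actually run them):

```shell
# Print the oc delete commands for the removed member's etcd secrets.
NODE="zmaciel-f9fbb-master-1"   # the failed node's name (assumption)
for prefix in etcd-peer etcd-serving etcd-serving-metrics; do
  echo oc delete secret "${prefix}-${NODE}" -n openshift-etcd
done
```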

g. Obtain the machine configuration for the unhealthy member:

````
$ oc get machines -n openshift-machine-api -o wide
NAME                         PHASE     TYPE   REGION   ZONE   AGE    NODE                         PROVIDERID                                       STATE
zmaciel-f9fbb-master-0       Running                          2d9h   zmaciel-f9fbb-master-0       vsphere://420156fd-d64a-ac6c-fcd0-0bb30524d146   poweredOn
zmaciel-f9fbb-master-1       Running                          2d9h   zmaciel-f9fbb-master-1       vsphere://4201f3ac-b3d4-384d-ec97-f3daf96b062f   poweredOff
zmaciel-f9fbb-master-2       Running                          2d9h   zmaciel-f9fbb-master-2       vsphere://4201243a-a689-57be-50a1-6cc62aad599f   poweredOn
zmaciel-f9fbb-worker-52tds   Running                          2d9h   zmaciel-f9fbb-worker-52tds   vsphere://4201bf12-613c-dda2-b877-34b504fd7622   poweredOn
zmaciel-f9fbb-worker-nxhw8   Running                          2d9h   zmaciel-f9fbb-worker-nxhw8   vsphere://4201f344-8f77-d579-e5cc-dc33d05ac7f7   poweredOn
````

h. Save the machine configuration to a file:

````
$ oc get machines zmaciel-f9fbb-master-0 -n openshift-machine-api -o yaml > new-master-machine.yml
````

i. Edit the new-master-machine.yml file that was created in the previous step to assign a new name and remove unnecessary fields.

When editing the file, you must remove the following parameters:

The status section:

````
status:
  addresses:
  - address: 10.36.250.63
    type: InternalIP
  - address: fe80::fac:27e1:13f5:1645
    type: InternalIP
  - address: zmaciel-f9fbb-master-0
    type: InternalDNS
  lastUpdated: "2021-04-23T00:55:05Z"
  nodeRef:
    kind: Node
    name: zmaciel-f9fbb-master-0
    uid: 8d3e0a21-c41d-4b4e-9910-59a364bb6008
  phase: Running
  providerStatus:
    conditions:
    - lastProbeTime: "2021-04-20T16:35:49Z"
      lastTransitionTime: "2021-04-20T16:35:49Z"
      message: Machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: 420156fd-d64a-ac6c-fcd0-0bb30524d146
    instanceState: poweredOn
````

spec.providerID:

````
spec:
  metadata: {}
  providerID: vsphere://420156fd-d64a-ac6c-fcd0-0bb30524d146
````

metadata.annotations:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    machine.openshift.io/instance-state: poweredOn
...
````

metadata.generation:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
...
  generation: 2
...
````

metadata.resourceVersion:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
...
  resourceVersion: "871091"
...
````

metadata.uid:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
...
  uid: 310d6108-b46c-4d3c-a61e-95fa3f2ad07a
...
````

Once complete, you will need to update two parameters in the file:

Change the metadata.name field to a new name:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
...
  name: zmaciel-f9fbb-master-3
...
````

Update the metadata.selfLink:

````
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
...
  selfLink: /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machines/zmaciel-f9fbb-master-3
...
````
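For what it's worth, steps h and i can be collapsed into a single pass. This is a hedged sketch, not part of the official procedure: it assumes jq is installed, and it emits JSON, which oc apply also accepts in place of the YAML file:

```shell
# Fetch the healthy machine, strip the same fields listed above, and
# set the new name and selfLink in one jq filter.
oc get machine zmaciel-f9fbb-master-0 -n openshift-machine-api -o json \
  | jq --arg name "zmaciel-f9fbb-master-3" '
      del(.status, .spec.providerID, .metadata.annotations,
          .metadata.generation, .metadata.resourceVersion, .metadata.uid)
      | .metadata.name = $name
      | .metadata.selfLink =
          ("/apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machines/" + $name)
    ' > new-master-machine.json
```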

j. Delete the machine of the unhealthy member:

````
$ oc delete machine zmaciel-f9fbb-master-1 -n openshift-machine-api
machine.machine.openshift.io "zmaciel-f9fbb-master-1" deleted
````

At this point, you will lose communication with the API for a few seconds.

k. Verify that the machine was deleted:

````
$ oc get machines -n openshift-machine-api -o wide
NAME                         PHASE     TYPE   REGION   ZONE   AGE    NODE                         PROVIDERID                                       STATE
zmaciel-f9fbb-master-0       Running                          2d9h   zmaciel-f9fbb-master-0       vsphere://420156fd-d64a-ac6c-fcd0-0bb30524d146   poweredOn
zmaciel-f9fbb-master-2       Running                          2d9h   zmaciel-f9fbb-master-2       vsphere://4201243a-a689-57be-50a1-6cc62aad599f   poweredOn
zmaciel-f9fbb-worker-52tds   Running                          2d9h   zmaciel-f9fbb-worker-52tds   vsphere://4201bf12-613c-dda2-b877-34b504fd7622   poweredOn
zmaciel-f9fbb-worker-nxhw8   Running                          2d9h   zmaciel-f9fbb-worker-nxhw8   vsphere://4201f344-8f77-d579-e5cc-dc33d05ac7f7   poweredOn
````

l. Create the new machine using the new-master-machine.yml file:

````
$ oc apply -f new-master-machine.yml
machine.machine.openshift.io/zmaciel-f9fbb-master-3 created

$ oc get machines -n openshift-machine-api

NAME                         PHASE          TYPE   REGION   ZONE   AGE
zmaciel-f9fbb-master-0       Running                               2d9h
zmaciel-f9fbb-master-2       Running                               2d9h
zmaciel-f9fbb-master-3       Provisioning                          117s
zmaciel-f9fbb-worker-52tds   Running                               2d9h
zmaciel-f9fbb-worker-nxhw8   Running                               2d9h

$ oc get machines -n openshift-machine-api

NAME                         PHASE     TYPE   REGION   ZONE   AGE
zmaciel-f9fbb-master-0       Running                          2d9h
zmaciel-f9fbb-master-2       Running                          2d9h
zmaciel-f9fbb-master-3       Running                          8m34s
zmaciel-f9fbb-worker-52tds   Running                          2d9h
zmaciel-f9fbb-worker-nxhw8   Running                          2d9h

$ oc get nodes

NAME                         STATUS   ROLES    AGE     VERSION
zmaciel-f9fbb-master-0       Ready    master   2d9h    v1.20.0+551f7b2
zmaciel-f9fbb-master-2       Ready    master   2d9h    v1.20.0+551f7b2
zmaciel-f9fbb-master-3       Ready    master   2m51s   v1.20.0+551f7b2
zmaciel-f9fbb-worker-52tds   Ready    worker   2d9h    v1.20.0+551f7b2
zmaciel-f9fbb-worker-nxhw8   Ready    worker   2d9h    v1.20.0+551f7b2
````

m. Check that all ETCD pods are working correctly:

````
$ oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd
etcd-zmaciel-f9fbb-master-0                3/3     Running     0          82s
etcd-zmaciel-f9fbb-master-2                3/3     Running     0          6m21s
etcd-zmaciel-f9fbb-master-3                3/3     Running     0          2m31s
````

n. If the output from the previous command lists only two pods, you can manually force a redeployment of ETCD:

````
$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
````

NOTE: The “forceRedeploymentReason” value must be unique, which is why a timestamp is appended.

o. During the ETCD redeployment, the Kube API Server will also redeploy its pods:

````
$ oc get clusteroperator kube-apiserver
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
kube-apiserver   4.7.3     True        True          False      2d9h
````

Once the Kube API Server pods finish redeploying, your cluster will be available again.
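If you prefer to script the wait instead of re-running the command by hand, PROGRESSING is the fourth column of the default output. Below is a sketch of the check on an illustrative sample row; a real loop would feed it `oc get clusteroperator kube-apiserver --no-headers` and sleep between attempts:

```shell
# Check the PROGRESSING column (4th field) of a clusteroperator row.
echo 'kube-apiserver   4.7.3   True   False   False   2d9h' \
  | awk '{ print ($4 == "False" ? "finished" : "still progressing") }'
# → finished
```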

p. Final verification of cluster operators:

````
$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.3     True        False         False      9m24s
baremetal                                  4.7.3     True        False         False      2d10h
cloud-credential                           4.7.3     True        False         False      2d10h
cluster-autoscaler                         4.7.3     True        False         False      2d10h
config-operator                            4.7.3     True        False         False      2d10h
console                                    4.7.3     True        False         False      26h
csi-snapshot-controller                    4.7.3     True        False         False      12m
dns                                        4.7.3     True        False         False      2d10h
etcd                                       4.7.3     True        False         False      2d10h
image-registry                             4.7.3     True        False         False      2d10h
ingress                                    4.7.3     True        False         False      26h
insights                                   4.7.3     True        False         False      2d9h
kube-apiserver                             4.7.3     True        False         False      2d10h
kube-controller-manager                    4.7.3     True        False         False      2d10h
kube-scheduler                             4.7.3     True        False         False      2d10h
kube-storage-version-migrator              4.7.3     True        False         False      26h
machine-api                                4.7.3     True        False         False      2d10h
machine-approver                           4.7.3     True        False         False      2d10h
machine-config                             4.7.3     True        False         False      22m
marketplace                                4.7.3     True        False         False      26h
monitoring                                 4.7.3     True        False         False      14m
network                                    4.7.3     True        False         False      2d10h
node-tuning                                4.7.3     True        False         False      2d10h
openshift-apiserver                        4.7.3     True        False         False      109m
openshift-controller-manager               4.7.3     True        False         False      2d9h
openshift-samples                          4.7.3     True        False         False      2d9h
operator-lifecycle-manager                 4.7.3     True        False         False      2d10h
operator-lifecycle-manager-catalog         4.7.3     True        False         False      2d10h
operator-lifecycle-manager-packageserver   4.7.3     True        False         False      10m
service-ca                                 4.7.3     True        False         False      2d10h
storage                                    4.7.3     True        False         False      104m
````


Final Thoughts

This concludes part 2 of the OpenShift disaster recovery series. Note that all of the checks described in this article are important, because they give you the true status of the cluster.

If you encounter a cluster that has lost two master nodes, do not worry: part 3 of the series will focus on recovering a cluster that has lost two masters.

I hope I have contributed to your knowledge.

