Page History

Versions Compared

Old Version 4

Changes made by Sara Lambert

Saved on Feb 10, 2017

compared with

New Version Current

Changes made by Craig Willis

Saved on Feb 13, 2017

...

From NDS-728 (JIRA):

On several occasions, we've had nodes that will not reboot (e.g., because of a corrupt disk image). There are two approaches to resolving this problem:

Option 1: Re-run Ansible

Shut down the affected nodes, then rename them to node-dead or delete them.
Detach the volumes, but do not delete them.
Re-run ansible openstack-provision and k8s-install.
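
The Option 1 steps can be sketched as a list of CLI calls. This is a minimal sketch, not the team's actual tooling: it assumes the `openstack` and `ansible-playbook` command-line clients are in use, and the function name, node names, volume names, and playbook paths are all hypothetical. It only builds the commands so they can be reviewed before anything is run.

```python
from typing import List


def option1_commands(
    node: str, dead_name: str, volumes: List[str], playbooks: List[str]
) -> List[List[str]]:
    """Build (but do not run) the CLI calls for Option 1."""
    cmds = [
        # Shut down the broken node.
        ["openstack", "server", "stop", node],
        # Rename it so the original name is free for the replacement.
        ["openstack", "server", "set", "--name", dead_name, node],
    ]
    # Detach each volume; the server is addressed by its new name after the
    # rename. The volumes themselves are left in place, not deleted.
    for volume in volumes:
        cmds.append(["openstack", "server", "remove", "volume", dead_name, volume])
    # Re-run provisioning; the playbook paths are supplied by the caller.
    for playbook in playbooks:
        cmds.append(["ansible-playbook", playbook])
    return cmds


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    for cmd in option1_commands(
        "node1", "node1-dead", ["node1-vol"], ["provision.yml", "install.yml"]
    ):
        print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch safe to run anywhere; each list could be passed to `subprocess.run` once it has been checked.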

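Option 2: Restore from a snapshot

Shut down the affected nodes, then rename them to node-dead or delete them.
In OpenStack, make a snapshot of a good node of a similar type.
Create a new instance from the snapshot.
Change the instance name to the name of the node being replaced.
Edit /etc/kubelet/kubelet.config and change the node name to the correct name.
Re-attach the volumes.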
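
The OpenStack side of Option 2 can be sketched the same way. This assumes the `openstack` CLI; the function name and every argument value are hypothetical, and a real `openstack server create` call will usually need further options (network, key pair, security groups) that depend on the deployment. The sketch creates the instance directly under its final name, which folds the separate rename step into instance creation. Shutting down the dead node and editing the kubelet config remain manual steps.

```python
from typing import List


def option2_commands(
    good_node: str, snapshot: str, flavor: str, new_node: str, volumes: List[str]
) -> List[List[str]]:
    """Build (but do not run) the OpenStack CLI calls for Option 2."""
    cmds = [
        # Snapshot a healthy node of a similar type.
        ["openstack", "server", "image", "create", "--name", snapshot, good_node],
        # Boot the replacement from that snapshot, already using the final name.
        ["openstack", "server", "create", "--image", snapshot,
         "--flavor", flavor, new_node],
    ]
    # Manual step, not automated here: on the new instance, edit
    # /etc/kubelet/kubelet.config so the node name matches new_node.
    # Re-attach the volumes that were detached from the dead node.
    for volume in volumes:
        cmds.append(["openstack", "server", "add", "volume", new_node, volume])
    return cmds


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    for cmd in option2_commands("node2", "node2-snap", "m1.medium", "node1", ["node1-vol"]):
        print(" ".join(cmd))
```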

Drain + Cordon

Drain automatically cordons the node first, meaning the scheduler will no longer place any new pods there.
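
As a rough illustration, the sketch below builds a `kubectl drain` call and checks whether a node is cordoned. It assumes `kubectl` is the client in use; the helper names are hypothetical. A cordoned node has `spec.unschedulable` set to true in its Node object, so the check reads that field from the parsed output of `kubectl get node <name> -o json`.

```python
from typing import Any, Dict, List


def drain_command(node: str) -> List[str]:
    """Build the drain call; draining cordons the node before evicting pods."""
    # --ignore-daemonsets lets the drain proceed when DaemonSet-managed
    # pods are running on the node.
    return ["kubectl", "drain", node, "--ignore-daemonsets"]


def is_cordoned(node_obj: Dict[str, Any]) -> bool:
    """Return True if a parsed Node object is marked unschedulable."""
    return bool(node_obj.get("spec", {}).get("unschedulable", False))


if __name__ == "__main__":
    # "node1" is a placeholder name.
    print(" ".join(drain_command("node1")))
    # A Node object as it might look after a cordon or drain.
    print(is_cordoned({"spec": {"unschedulable": True}}))
```

Running `kubectl uncordon <name>` makes the node schedulable again once it is healthy.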

...