Working with InstanceGroups

The kops InstanceGroup is a declarative model of a group of nodes. By modifying the object, you can change the instance type you're using, the number of nodes you have, the OS image you're running - essentially all the per-node configuration is in the InstanceGroup.

We'll assume you have a working cluster - if not, you probably want to read how to get started on GCE.

Changing the number of nodes

If you kops get ig you should see that you have InstanceGroups for your nodes and for your master:

> kops get ig
master-us-central1-a    Master  n1-standard-1   1   1   us-central1
nodes           Node    n1-standard-2   2   2   us-central1

Let's change the number of nodes to 3. We'll edit the InstanceGroup configuration using kops edit (which should be very familiar to you if you've used kubectl edit). kops edit ig nodes will open the InstanceGroup in your editor, looking a bit like this:

kind: InstanceGroup
  creationTimestamp: 2017-10-03T15:17:31Z
  labels: simple.k8s.local
  name: nodes
  image: cos-cloud/cos-stable-57-9202-64-0
  machineType: n1-standard-2
  maxSize: 2
  minSize: 2
  role: Node
  - us-central1
  - us-central1-a

Edit minSize and maxSize, changing both from 2 to 3, save and exit your editor. If you wanted to change the image or the machineType, you could do that here as well. There are actually a lot more fields, but most of them have their default values, so won't show up unless they are set. The general approach is the same though.

On saving you'll note that nothing happens. Although you've changed the model, you need to tell kops to apply your changes to the cloud.

We use the same kops update cluster command that we used when initially creating the cluster; when run without --yes it should show you a preview of the changes, and now there should be only one change:

> kops update cluster
Will modify resources:
    TargetSize               2 -> 3

This is saying that we will alter the TargetSize property of the InstanceGroupManager object named us-central1-a-nodes-simple-k8s-local, changing it from 2 to 3.

That's what we want, so we kops update cluster --yes.

kops will resize the GCE managed instance group from 2 to 3, which will create a new GCE instance, which will then boot and join the cluster. Within a minute or so you should see the new node join:

> kubectl get nodes
NAME                        STATUS    AGE       VERSION
master-us-central1-a-thjq   Ready     10h       v1.7.2
nodes-g2v2                  Ready     10h       v1.7.2
nodes-tmk8                  Ready     10h       v1.7.2
nodes-z2cz                  Ready     1s       v1.7.2

nodes-z2cz just joined our cluster!

Changing the image

That was a fairly simple change, because we didn't have to reboot the nodes. Most changes though do require rolling your instances - this is actually a deliberate design decision, in that we are aiming for immutable nodes. An example is changing your image. We're using cos-stable, which is Google's Container OS. Let's try Debian Stretch instead.

If you run gcloud compute images list to list the images available to you in GCE, you should see a debian-9 image:

> gcloud compute images list
debian-9-stretch-v20170918                        debian-cloud             debian-9                              READY

So now we'll do the same kops edit ig nodes, except this time change the image to debian-cloud/debian-9-stretch-v20170918:

Now kops update cluster will show that you're going to create a new GCE Instance Template, and that the Managed Instance Group is going to use it:

Will create resources:
    Network                 name:default id:default
    Tags                    [simple-k8s-local-k8s-io-role-node]
    Preemptible             false
    BootDiskImage           debian-cloud/debian-9-stretch-v20170918
    BootDiskSizeGB          128
    BootDiskType            pd-standard
    CanIPForward            true
    Scopes                  [compute-rw, monitoring, logging-write, storage-ro]
    Metadata                {cluster-name: <resource>, startup-script: <resource>}
    MachineType             n1-standard-2

Will modify resources:
    InstanceTemplate         id:nodes-simple-k8s-local-1507043948 -> name:nodes-simple-k8s-local

Note that the BootDiskImage is indeed set to the debian 9 image you requested.

kops update cluster --yes will now apply the change, but if you were to run kubectl get nodes you would see that the instances had not yet been reconfigured. There's a hint at the bottom:

Changes may require instances to restart: kops rolling-update cluster`

These changes require your instances to restart (we'll remove the COS images and replace them with Debian images). kops can perform a rolling update to minimize disruption, but even so you might not want to perform the update right away; you might want to make more changes or you might want to wait for off-peak hours. You might just want to wait for the instances to terminate naturally - new instances will come up with the new configuration - though if you're not using preemptible/spot instances you might be waiting for a long time.

Performing a rolling-update of your cluster

When you're ready to force your instances to restart, use kops rollling-update cluster:

> kops rolling-update cluster
Using cluster from kubectl context: simple.k8s.local

master-us-central1-a    Ready       0       1   1   1   1
nodes           NeedsUpdate 3       0   3   3   3

Must specify --yes to rolling-update.

You can see that your nodes need to be restarted, and your masters do not. A kops rolling-update cluster --yes will perform the update. It will only restart instances that need restarting (unless you --force a rolling-update).

When you're ready, do kops rolling-update cluster --yes. It'll take a few minutes per node, because for each node we cordon the node, drain the pods, shut it down and wait for the new node to join the cluster and for the cluster to be healthy again. But this procedure minimizes disruption to your cluster - a rolling-update cluster is never going to be something you do during your superbowl commercial, but ideally it should be minimally disruptive.

After the rolling-update is complete, you can see that the nodes are now running a new image:

> kubectl get nodes -owide
NAME                        STATUS    AGE       VERSION   EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION
master-us-central1-a-8fcc   Ready     48m       v1.7.2   Container-Optimized OS from Google   4.4.35+
nodes-9cml                  Ready     17m       v1.7.2   Container-Optimized OS from Google   4.4.35+
nodes-km98                  Ready     11m       v1.7.2   Container-Optimized OS from Google   4.4.35+
nodes-wbb2                  Ready     5m        v1.7.2   Container-Optimized OS from Google   4.4.35+

Next steps: learn how to perform cluster-wide operations, like upgrading kubernetes.