Service Account Issuer (SAI) migration
In the past, changing the Service Account Issuer was a disruptive process. Since Kubernetes v1.22, however, you can specify multiple Service Account Issuers in the Kubernetes API Server (Docs here).
As noted in the Kubernetes Docs, when the --service-account-issuer flag is specified multiple times, the first is used to generate tokens and all are used to determine which issuers are accepted.
So with this feature we can migrate to a new Service Account Issuer without disruption to cluster operations.
Note: These procedures can be adapted to enable IAM Roles for Service Accounts (IRSA) without disruption to cluster operations.
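Because the first issuer is the one used to generate tokens, reading the `iss` claim of a freshly issued token is one way to see which issuer is currently primary. The following is a minimal sketch, not part of the original procedure: the function names `jwt_payload` and `jwt_iss` are made up for illustration, the token signature is not verified, and GNU `base64 -d` is assumed.

```shell
#!/usr/bin/env bash
# Decode the payload (second dot-separated segment) of a JWT.
# No signature verification is done; this is for inspection only.
jwt_payload() {
  local payload
  # JWTs use base64url; map it back to the standard base64 alphabet.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that the JWT encoding strips.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  printf '%s' "$payload" | base64 -d
}

# Print only the "iss" claim of a JWT.
jwt_iss() {
  jwt_payload "$1" | sed -n 's/.*"iss"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Demo with a locally built, unsigned token (example.com is a placeholder).
demo_claims='{"iss":"https://api.internal.example.com","sub":"system:serviceaccount:default:default"}'
demo_token="e30.$(printf '%s' "$demo_claims" | base64 | tr -d '=\n' | tr '/+' '_-').sig"
jwt_iss "$demo_token"
```

On a cluster with kubectl v1.24 or later, `jwt_iss "$(kubectl create token default)"` should print the issuer that the API server currently signs with.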
Migrate using kOps additionalServiceAccountIssuers
In kubernetes/kops#16497, kOps added support for specifying an additional Service Account Issuer. This was released in kOps v1.28.5, so you need at least that version to follow this procedure. If you are using an earlier version, follow the Migrate using Instancegroup Hooks procedure instead.
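The version requirement can be checked before starting. This is a minimal sketch, assuming GNU `sort -V` is available and that `kops version --short` prints a plain version string; `version_ge` is a hypothetical helper, not a kOps command.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings; the smaller of the two comes first.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Only query kops when it is installed, so the sketch is safe to run anywhere.
if command -v kops >/dev/null 2>&1; then
  installed="$(kops version --short)"
  installed="${installed#v}"   # tolerate a leading "v", if any
  if version_ge "$installed" "1.28.5"; then
    echo "kOps $installed supports additionalServiceAccountIssuers"
  else
    echo "kOps $installed is too old; use the Instancegroup Hooks procedure"
  fi
fi
```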
Warning: This procedure is manual. We recommend testing this on a staging cluster before proceeding on a production cluster.
In this example we are switching from master.[cluster-name].[domain] to api.internal.[cluster-name].[domain].
- Add new SAI as additional (existing SAI as primary) to the Cluster resource
    kubeAPIServer:
      serviceAccountIssuer: https://master.[cluster-name].[domain]
      additionalServiceAccountIssuers:
      - https://api.internal.[cluster-name].[domain]
- Apply the changes to the cluster
- Roll the control-plane nodes
- Switch the primary/secondary SAI on the Cluster resource
    kubeAPIServer:
      serviceAccountIssuer: https://api.internal.[cluster-name].[domain]
      additionalServiceAccountIssuers:
      - https://master.[cluster-name].[domain]
- Apply the changes to the cluster
- Roll the control-plane nodes
- Wait 24 hours until the dynamic SA tokens have refreshed
- Remove the additionalServiceAccountIssuers from the Cluster resource
- Apply the changes to the cluster
- Roll the control-plane nodes
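Each "Apply the changes to the cluster" and "Roll the control-plane nodes" pair in the steps above might be run roughly as follows. This is a sketch, not the documented kOps procedure: it assumes the Cluster resource was already edited (for example with `kops edit cluster`), that `KOPS_STATE_STORE` is set, and that the installed kOps release accepts `control-plane` as an instance group role (older releases call it `master`). `CLUSTER_NAME`, `DRY_RUN`, `run`, and `apply_and_roll` are illustrative names.

```shell
#!/usr/bin/env bash
# One "apply, then roll the control plane" cycle.
# CLUSTER_NAME is a placeholder; DRY_RUN=1 (the default) only prints the commands.
CLUSTER_NAME="${CLUSTER_NAME:-cluster.example.com}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

apply_and_roll() {
  # Apply the edited Cluster spec.
  run kops update cluster --name "$CLUSTER_NAME" --yes
  # Replace only the control-plane instances so kube-apiserver restarts with the new flags.
  run kops rolling-update cluster --name "$CLUSTER_NAME" --instance-group-roles control-plane --yes
}

apply_and_roll
```

Running it with `DRY_RUN=0` executes the commands instead of printing them, so it is worth reviewing the printed commands against a staging cluster first.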
Migrate using Instancegroup Hooks
Warning: This procedure is manual and involves some tricky modification of manifest files. We recommend testing this on a staging cluster before proceeding on a production cluster.
In this example we are switching from master.[cluster-name].[domain] to api.internal.[cluster-name].[domain].
- Add the modify-kube-api-manifest (existing SAI as primary) hook to the control-plane InstanceGroup resources
    hooks:
    - name: modify-kube-api-manifest
      before:
      - kubelet.service
      manifest: |
        User=root
        Type=oneshot
        ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/i\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"
- Apply the changes to the cluster
- Roll the control-plane nodes
- Update the modify-kube-api-manifest (switch the primary/secondary SAI) hook on the control-plane InstanceGroup resources
    hooks:
    - name: modify-kube-api-manifest
      before:
      - kubelet.service
      manifest: |
        User=root
        Type=oneshot
        ExecStart=/bin/bash -c "until [ -f /etc/kubernetes/manifests/kube-apiserver.manifest ];do sleep 5;done;sed -i '/- --service-account-issuer=https:\/\/api.internal.[cluster-name].[domain]/a\ \ \ \ - --service-account-issuer=https:\/\/master.[cluster-name].[domain]' /etc/kubernetes/manifests/kube-apiserver.manifest"
- Apply the changes to the cluster
- Roll the control-plane nodes
- Wait 24 hours until the dynamic SA tokens have refreshed
- Remove the modify-kube-api-manifest hook on the control-plane InstanceGroup resources
- Apply the changes to the cluster
- Roll the control-plane nodes
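The two hooks above differ only in the sed command: `i` inserts the old issuer before the matched line and `a` appends it after. The following sketch replays both edits on a scratch file so the resulting flag order can be inspected without touching a live manifest. It assumes GNU sed; `example.com` and the three-line manifest fragment are placeholders, not taken from a real cluster.

```shell
#!/usr/bin/env bash
# Replay the two hook edits on a scratch file (GNU sed assumed).
# example.com stands in for [cluster-name].[domain]; the fragment is made up.
scratch="$(mktemp)"

# Write a tiny stand-in for the args section of kube-apiserver.manifest.
reset_fragment() {
  printf '%s\n' \
    '    - --secure-port=443' \
    '    - --service-account-issuer=https://api.internal.example.com' \
    '    - --service-cluster-ip-range=100.64.0.0/13' \
    > "$scratch"
}

# List the issuer flags in file order; the first one is the signing issuer.
issuers() {
  grep -o 'service-account-issuer=.*' "$scratch" | cut -d= -f2
}

# First hook: "i" inserts the old issuer before the new one (old stays primary).
reset_fragment
sed -i '/- --service-account-issuer=https:\/\/api.internal.example.com/i\ \ \ \ - --service-account-issuer=https:\/\/master.example.com' "$scratch"
step1="$(issuers | tr '\n' ' ')"

# Updated hook: "a" appends the old issuer after the new one (new becomes primary).
reset_fragment
sed -i '/- --service-account-issuer=https:\/\/api.internal.example.com/a\ \ \ \ - --service-account-issuer=https:\/\/master.example.com' "$scratch"
step4="$(issuers | tr '\n' ' ')"

echo "hook with i: $step1"
echo "hook with a: $step4"
rm -f "$scratch"
```

Printing the issuers in file order makes it easy to confirm which one the API server would treat as primary after each hook runs.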
This procedure was originally posted in a GitHub issue here with inspiration from this comment.