
Release notes for kOps 1.36 series

⚠ kOps 1.36 has not been released yet! ⚠

This is a document to gather the release notes prior to the release.

kOps 1.36 adds Kubernetes 1.36 support, completes the move away from the in-tree cloud providers, and refreshes the bundled components (containerd v2.3, etcd-manager, AWS Load Balancer Controller, EBS CSI driver, Cilium options, CoreDNS, and others). It also reworks how the kops-channels addon manager is built and deployed, introduces a hybrid bootstrap path for gossip clusters, expands Azure and Hetzner support, and lays the groundwork for Linode (Akamai) as a new cloud provider.

Significant changes

Kubernetes

  • Kubernetes 1.36 support, including integration tests (#18267, #18202)
  • kube-apiserver: align deleteCollectionWorkers, maxRequestsInflight and compactionInterval with kube-up defaults for large clusters (#18081, #18086)
  • kubelet: migrate deprecated CLI flags into the kubelet configuration file passed via --config (#18280)
  • kubelet: add new tunables (kubeAPIQPS/kubeAPIBurst, nodeAllocatableUpdatePeriodSeconds and friends) and only set nodeAllocatableUpdatePeriodSeconds on Kubernetes 1.35+ (#18085, #18153, #18305)
  • kube-proxy: bind-mount the kubeconfig directory instead of the file so kubeconfig rotations survive (#18344)
  • kube-scheduler: documentation clarification for KubeSchedulerConfiguration (#18151)
  • kops-controller: validate instance group names (#18391)
  • kops-controller: delete the legacy node reconciler and route DigitalOcean through the non-legacy Identifier (#18227)
  • Remove the unused in-tree cloud config plumbing (#18347)

Container Runtime

  • Default containerd to v2.3.0 (with runc 1.4.2) for Kubernetes 1.36+; v2.2.3 remains the default for 1.32–1.35, and v1.7.31 for older clusters (#18364)
  • Support containerd config TOML v3, with clearer config-file version behavior and dead-code cleanup from 1.6 (#18291). Users who set spec.containerd.configAdditions on a cluster or instance group must update those entries to match the TOML v3 schema before upgrading.
  • Map the containerd registry mirror wildcard to the _default directory (#18291)
  • Stream verified container image bytes directly into ctr import from nodeup and drop the containerized-mounter Archive task (#18278, #18277)
  • Dump containerd config files in kops toolbox dump for troubleshooting (#18313)
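
The containerd defaults listed above depend only on the cluster's Kubernetes minor version. The following is a minimal sketch of that mapping, useful for checking which default a cluster will receive after an upgrade. The function name is hypothetical and this is not the kOps implementation; only the version thresholds come from these notes.

```python
def default_containerd_version(kubernetes_version: str) -> str:
    """Return the default containerd version for a Kubernetes version.

    Hypothetical helper: the thresholds mirror the release notes
    (v2.3.0 for 1.36+, v2.2.3 for 1.32-1.35, v1.7.31 for older clusters).
    """
    # Accept both "1.36.0" and "v1.36.0"; only major.minor matters here.
    parts = kubernetes_version.lstrip("v").split(".")
    major, minor = int(parts[0]), int(parts[1])

    if (major, minor) >= (1, 36):
        return "v2.3.0"
    if (major, minor) >= (1, 32):
        return "v2.2.3"
    return "v1.7.31"


if __name__ == "__main__":
    for version in ("1.31.4", "1.32.0", "1.35.2", "1.36.0"):
        print(version, "->", default_containerd_version(version))
```

An explicit containerd version set in the cluster spec would take precedence over this default; the sketch covers only the default selection.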

Networking

  • Update Cilium with several new tunables:
    • Schedule cilium-operator on control-plane nodes (#18375)
    • Allow setting arbitrary cilium-config entries via Cilium.ExtraConfig (#18285)
    • Add EnableHostFirewall field to CiliumNetworkingSpec (#18152)
    • Add bpf-lb-sock and bpf-lb-sock-hostns-only flags (#18375)
    • Require k8s-connectivity in the liveness probe (#18237)
  • Calico: disable kube-proxy when running in eBPF mode (#18334)
  • Update AWS VPC CNI to v1.21.2 (#18410)
  • CoreDNS: update to v1.14.3, then pin back to v1.14.2 to stay on the supported branch; bump CoreDNS memory on large clusters (#18368, #18369, #18361)
  • dns-controller: make priorityClassName configurable and default Provider when ExternalDNS is partially set (#18298, #18302)
  • Drop deprecated GCS-based CNI plugin mirrors (#17987, #17976)

AWS

  • Drop the in-tree cloud-provider-aws dependency from kOps (#18336)
  • AWS Load Balancer Controller refresh:
    • Upgrade to v3.3.0, switching the manifest to a Helm + Kustomize pipeline (#18221, #18276)
    • Prune the bundled Deployment and drop the ALBTargetControlConfig CRD (#18222, #18233)
    • Bypass the LBC webhook for cert-manager so circular bootstrap issues are avoided (#17999)
    • Add elasticloadbalancing:SetRulePriorities and ec2:DescribeSubnets permissions (#17999)
  • Update EBS CSI driver to v1.58.0, refresh the upstream policy, and gate MutableCSINodeAllocatableCount for Kubernetes 1.35+ (#18220)
  • kops-controller is now served over HTTPS with a /healthz target on the API NLB target group (#18236, #18174)
  • Abort rolling updates when load-balancer deregister fails instead of marching on with degraded targets (#18338)
  • NodeTerminationHandler:
    • Add EnableOutOfServiceTaint field (#18140)
    • Allow disabling enableScheduledEventDraining in Queue Processor mode (#18339)
  • Mixed instances policy: apply onDemandAllocationStrategy to the ASG and propagate taints without value as ASG tags (#18342, #18343)
  • instanceRequirements: add excludedInstanceTypes and fix a memory-assignment bug (#18113, #18123)
  • NLB: add a security-group mode option (#18211)
  • Also consider private subnets that already have an IPv6 CIDR as having a CIDR assigned (#18089)
  • Disable nm-cloud-setup on RHEL 9 for AWS VPC CNI (#18264)
  • Tighten and trim instance-role IAM permissions to match what each component actually uses (#18251, #18355, #18362, #18363, #18372)
  • Use HeadBucket to resolve the S3 bucket region, pass the VFS scheme and provider-specific options to the S3 client, and silence warnings when the S3 provider has no supported checksum (#18335, #18129, #18132, #18128)

Azure

  • Deploy cloud-controller-manager for node lifecycle and load-balancer support (#18197)
  • Add experimental Terraform target support (#18149)
  • Add support for the Azure Disk CSI Driver (#18141)
  • Use HTTPS for the kops-controller probe and load-balancer health check, and move probe / rule configuration into the model using SDK types (#18182, #18183, #18190)
  • Use /etc/kubernetes/azure.json for the cloud config, load CCM/CSI config from a Secret via a new azure-cloud-config addon and stop writing the cloud-config file on nodes (#18345)
  • Set the provider ID when starting kubelet (#18155)
  • Enable CCM cloud routes for kubenet and Kindnet (#18262)
  • Encode storage account in azureblob:// URLs (#18260)
  • List VMSS NICs in protokube gossip seed discovery and match VMSS VM/NIC ARM IDs case-insensitively in the dumper (#18319, #18315)
  • Cluster deletion robustness: block disk deletion on the parent VMSS, delete RoleAssignment after the VM scale set, handle missing resource groups in disk listing, fix nil pointer dereferences and ordering issues (#18196, #18186, #18191, #18184, #18185)
  • Retry tasks with failed provisioning state, fix Terraform LoadBalancer task dependencies on PublicIP, and use a larger default VM SKU (Standard_D4ls_v6) in tests (#18157, #18154, #18194)
  • Restrict VMSS role assignments to the control plane and fix a control-plane role tag spelling (#18357, #18353)
  • Add regenerate.sh for addons (#18225)

GCP

  • Add kops-controller to the GCE internal load balancer and expose it on the internal LB for gossip clusters (#18169, #18307)
  • Use SSL health check for kops-controller on GCE (#18171)
  • Support role=apiserver on GCE with dns=none and fix instance tags for role=apiserver (#18159, #18175)
  • Allow BGP from nodes to the control plane for Calico (#18351)
  • Don't request live migration on instance types that don't support it (#18004)
  • Wait for InstanceManagers/InstanceTemplates deletion to complete and include MIG scaling errors when instances are not found (#18013, #18247)
  • Allow N4A instance type (#18330)
  • Drop the cloud-provider-gcp dependency and switch clouddns to a forked gcetokensource (#18274)
  • Allow pods to reach metric ports running on control-plane nodes when using GCE alias IPs (#18052)
  • Use Kubernetes 1.36 for the apiserver role e2e template, move it to dns=none, add e2e templates that combine internal load balancer + Cilium etcd, and reject role=apiserver + dns=none only on GCE when not supported (#18147, #18146, #18162, #18214)
  • GCE: shrink the etcd-cluster disk label to fit the 63-character limit (#18292)
  • GCE: support control-plane volume type configuration (#17955)
  • GCE: fix instance-group deletion (#18148) and nil-panic during deletion (#18195)
  • Migrate the GCS bucket discovery store handling and reject GCS VFS paths without buckets (#18360)

Hetzner

  • Enable Cluster Autoscaler (#18226, #18135)
  • Upgrade hcloud-cloud-controller-manager to v1.31.0 (#18281, #18317)
  • Upgrade hcloud-csi-driver to v2.20.2 and reorder the CSI driver Deployment before the DaemonSet (#18318)
  • Split the hcloud Secret into its own addon and let the CSI driver consume the CCM-provided secret (#18317, #18318)

DigitalOcean

  • Default machine type to s-2vcpu-4gb-amd (#18227)
  • Migrate node identifier to the non-legacy Identifier interface and tag droplets with the instance group role (#18227)

OpenStack

  • Resolve the floatingip TODO from kOps 1.21 (#18314)
  • Enable hybrid bootstrap mode for gossip clusters (#18245)

Linode (experimental)

  • Initial Linode (Akamai) cloud provider support, including:
    • Cloud provider API registration (#18166)
    • VFS object storage schema (#18138)
    • VPC cloudup task (#18316)
    • nodeup configuration and node identity (#18177)
  • The Linode provider is not yet ready for production use.

Bare metal

  • Don't try to use protobuf for bare-metal tooling (#18068)
  • Unpin Kubernetes version for the metal provider (#17944)

Etcd

  • Update etcd-manager to v3.0.20260512 (#18323)
  • Bump etcd to latest patches (3.5.30, 3.6.11) and drop support for etcd 3.4 (#18290)
  • Generate etcd-manager patch symlinks dynamically (#18290)
  • Decouple EtcdEvents HTTP from main etcd cluster traffic in scalability tests (#18370)

Channels and addons

  • The kops-channels addon manager is now built as a kops-managed image and rendered as a static pod on control-plane nodes; protokube no longer applies channels or labels control-plane nodes (#18215, #18328, #18373, #18374)
  • kops-channels gains --node-name, --interval, multi-URL apply support, and quicker retries until the first reconcile succeeds (#18328)
  • kops-channels: move the node labeler from protokube (#18215)
  • kops-channels: fix region detection and the discovery cache permission noise (#18390)
  • Drop the standalone channels binary from the kOps release artifacts (#18374)
  • addons: render addons as fi-tasks so addon templates can reference the finalized task graph, and drop the legacy 9.99.0 version shim and the deprecated master nodeAffinity term (#18215, #18257)

Gossip

  • Introduce hybrid worker bootstrap for gossip clusters on AWS, Azure, GCE, and OpenStack: control-plane nodes keep using gossip while workers bootstrap directly against the API NLB / internal LB, so workers no longer need protokube (#18245, #18307)
  • Restrict gossip seed discovery to control-plane nodes and stop exporting unused cloud credentials to worker nodes (#18352, #18354)
  • New cluster creation now defaults to dns=none and logs a deprecation warning if gossip is requested (#18245)
  • Validation: enforce supported DNS topology per cloud provider (#18255)
  • Migrate protokube mesh gossip protobuf from gogo (#18230)
  • Add minimal gossip create/update integration tests and an upgrade e2e test (#18256, #18296)

Operating System Support

  • Drop support for Amazon Linux 2 (#17943)
  • Drop support for Ubuntu 20.04 (#18235)
  • Drop support for Debian 10 (#18235)
  • Add experimental support for Ubuntu 26.04 (#18232)
  • Load the nf_tables module and install iptables-nft on RHEL 10+ (#18179)
  • Enable the nf_conntrack kernel module on Rocky 9 (#17968)
  • Skip ImageVolume tests on Debian 11 and prevent cloud-ifupdown-helper from hijacking CNI veths on Debian 11 (#18261)
  • Set E2E --node-os-arch=arm64 for Rocky 10 (#18192)

Other components

  • Update cluster-autoscaler to v1.35.0 (#18110)
  • Run metrics-server in insecure mode for AI Conformance tests (#18067)
  • Update Go to v1.25.7 / v1.25.8 / v1.25.9 / v1.26.2 / v1.26.3 (#17956, #18058, #18173, #18267, #18395)
  • Build kOps binaries with gcr.io/distroless/static as the base image and strip release binaries by default (#18403, #18263)
  • Drop the in-tree helm dependency from kops toolbox; switch to a forked helmstrvals (#18272)
  • Switch structured-merge-diff from v4 to v6 (#18273)
  • Upgrade hashicorp/memberlist to v0.5.4 (#18230)

Other changes of note

  • kops reconcile cluster accepts --use-kubeconfig to reuse an existing kubeconfig instead of regenerating it (#18126)
  • kops create cluster: accept --node-volume-type flags (#18145)
  • kops create instancegroup: apply the same node label that kops create cluster applies (#18341)
  • kops get assets: fix lookup when spec.dnsZone is a DNS name (#18384)
  • kops update cluster: validate assets.fileRepository and reject non-http(s) URLs (#18340)
  • kops upgrade-ab: allow kOps downgrades for upgrade-AB scenarios (#18219)
  • kops toolbox dump: time out per-node log dumping after one minute, improve reliability and skip not-found nodes on GCE (#18349, #18311, #18049)
  • nodeup: add experimental hybrid-bootstrap workers, skip protokube/channels assets on workers, populate DefaultMachineType for Cilium-ENI clusters, and use shared system-component env vars for kops-channels (#18245, #18358, #18365)
  • nodeup: fix protokube skip on hybrid-bootstrap workers (#18378)
  • kops-controller/nodeup: ensure files have the desired permissions before close and rename, and fix PrivateKey.WriteTo returning zero length (#18379)
  • dns-controller: honor klog -stderrthreshold even when -logtostderr is true (#18231)
  • Fix HasHighlyAvailableControlPlane to use AllInstanceGroups when an instance-group filter is in use (#17740)
  • Fix kops panic on send to a closed results channel (#18326)
  • AssetBuilder is now concurrency-safe (#18181)
  • Side-loading uses the KOPS_BASE_URL image version (#18200)
  • Verify the config server IPs with a DNS name (#18241)
  • Remove the explicit fs.inotify.max_user_watches sysctl setting (#17556)
  • Update actions/upload-artifact, actions/setup-go and actions/dependency-review-action to their latest releases, pinned by commit SHA (#18114)
  • Replace shipbot with a gh-based script for promoting binaries (#18095)
  • Build: add a kops-channels image build and CI push, and run make apimachinery updates as needed (#18328)
  • gomod: tidy and verify all modules dynamically (#18401)
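
Since kops update cluster now rejects non-http(s) URLs for assets.fileRepository, it may help to check the value before running an update. The following is a rough sketch of such a check, not the validation kOps itself performs. The function name is hypothetical, and the extra requirement that the URL contain a host is an assumption made for this example.

```python
from urllib.parse import urlparse


def validate_file_repository(url: str) -> None:
    """Raise ValueError unless url looks like an http(s) file repository.

    Hypothetical pre-flight check; kOps' own validation may differ.
    """
    parsed = urlparse(url)
    # The release notes state that only http and https URLs are accepted.
    if parsed.scheme not in ("http", "https"):
        raise ValueError(
            f"assets.fileRepository must be an http(s) URL, got {url!r}"
        )
    # Assumption for this sketch: a usable repository URL also names a host.
    if not parsed.netloc:
        raise ValueError(f"assets.fileRepository has no host: {url!r}")


if __name__ == "__main__":
    for candidate in ("https://mirror.example.com/kops/", "s3://bucket/kops/"):
        try:
            validate_file_repository(candidate)
            print("ok:      ", candidate)
        except ValueError as err:
            print("rejected:", err)
```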

AI Conformance (experimental)

Breaking changes

  • Support for Amazon Linux 2, Ubuntu 20.04 and Debian 10 has been removed; existing clusters running those distributions must be migrated to a supported OS before upgrading to kOps 1.36 (#17943, #18235)
  • Support for Kubernetes 1.30 has been removed.
  • Support for etcd 3.4 has been removed; clusters must be running etcd 3.5 or 3.6 (#18290)
  • The standalone channels binary is no longer distributed; the kops-channels addon manager now runs as a static pod on control-plane nodes (#18374)
  • The legacy 9.99.0 addons-version shim has been removed; addons set up by kOps versions prior to 1.22 must be re-applied before upgrading (#18257)
  • The in-tree cloud-provider-aws and cloud-provider-gcp dependencies have been dropped from kOps; external cloud providers are required (already mandatory for Kubernetes 1.33+) (#18336, #18274)
  • The legacy azureblob://{container}/{key} URL form (paired with the AZURE_STORAGE_ACCOUNT environment variable) is no longer accepted; state-store paths must use the new azureblob://{account}/{container}/{key} form (#18260)
  • Users who set spec.containerd.configAdditions must update those entries to the containerd config TOML v3 schema before upgrading to kOps 1.36 (#18291)
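
For the azureblob:// change above, the new path is the old one with the storage account inserted as the first component. The following is a minimal sketch of that rewrite, assuming the account name is the value previously supplied through AZURE_STORAGE_ACCOUNT. The function name is hypothetical. A legacy URL and a new-form URL look the same syntactically, so the helper must only be given paths known to be in the legacy form.

```python
PREFIX = "azureblob://"


def to_new_azureblob_url(legacy_url: str, storage_account: str) -> str:
    """Convert azureblob://{container}/{key} to
    azureblob://{account}/{container}/{key}.

    Hypothetical helper; it cannot detect URLs that are already migrated.
    """
    if not legacy_url.startswith(PREFIX):
        raise ValueError(f"not an azureblob URL: {legacy_url!r}")
    if not storage_account:
        raise ValueError("storage account name is required")

    # Split the remainder into the container and the (possibly empty) key.
    container, _, key = legacy_url[len(PREFIX):].partition("/")
    if not container:
        raise ValueError(f"missing container in {legacy_url!r}")

    return f"{PREFIX}{storage_account}/{container}/{key}"


if __name__ == "__main__":
    # Example values are placeholders, not real resources.
    print(to_new_azureblob_url("azureblob://kops-state/my.cluster.example.com",
                               "mystorageaccount"))
```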

Known Issues

  • None at this time

Deprecations

  • Support for Kubernetes version 1.30 is removed in kOps 1.36.

  • Support for Kubernetes version 1.31 is deprecated and will be removed in kOps 1.37.

  • Support for Amazon Linux 2, Ubuntu 20.04 and Debian 10 is removed in kOps 1.36.

  • Support for etcd 3.4 is removed in kOps 1.36.

  • The standalone channels binary is no longer distributed in kOps 1.36; kops-channels runs as a static pod on the control plane.

  • Support for AWS Classic Load Balancer (CLB) for the API has been deprecated since kOps 1.26. kOps 1.37 will reject CLB for new clusters, and kOps 1.38 will remove it entirely, so existing clusters must migrate before then. See the CLB to NLB migration guide for the upgrade procedure.

  • Support for gossip DNS is deprecated. kOps 1.37 will reject new gossip DNS clusters, and kOps 1.38 will require existing gossip DNS clusters to migrate before upgrading. This affects clusters whose name ends in .k8s.local and that were not created with --dns=none; clusters using --dns=none, even with a .k8s.local name, are not affected. Existing gossip DNS clusters should migrate to --dns=none or a hosted DNS zone. kOps 1.36 introduces hybrid bootstrap to make that migration easier.
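
The rule for which clusters the gossip DNS deprecation affects can be written as a one-line predicate. The following is a small sketch with a hypothetical function name. Whether a cluster was created with --dns=none has to be supplied by the caller; the sketch does not read it from a cluster spec.

```python
def is_affected_by_gossip_deprecation(cluster_name: str,
                                      created_with_dns_none: bool) -> bool:
    """Return True if the cluster uses gossip DNS as described above.

    A cluster is affected when its name ends in .k8s.local and it was
    not created with --dns=none. Hypothetical helper for illustration.
    """
    return cluster_name.endswith(".k8s.local") and not created_with_dns_none


if __name__ == "__main__":
    print(is_affected_by_gossip_deprecation("dev.k8s.local", False))      # affected
    print(is_affected_by_gossip_deprecation("dev.k8s.local", True))       # dns=none, not affected
    print(is_affected_by_gossip_deprecation("dev.example.com", False))    # hosted DNS, not affected
```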