Horizontal Pod Autoscaling
With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization (or, with alpha support, on other application-provided metrics).
The autoscaling/v2 API reached general availability (GA) in Kubernetes 1.23. The previous stable version, which supports only CPU autoscaling, is autoscaling/v1. The beta versions, which add support for scaling on memory and custom metrics, are autoscaling/v2beta1 (Kubernetes 1.8 - 1.24) and autoscaling/v2beta2 (Kubernetes 1.12 - 1.25).
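As a minimal sketch of what an autoscaling/v2 object can look like, the manifest below scales a Deployment on both CPU and memory. The Deployment name `web`, the replica bounds, and the target values are hypothetical placeholders, and memory-based scaling assumes the resource metrics API is available in the cluster.

```yaml
# Hypothetical example: names and thresholds are placeholders, not kOps defaults.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  # The workload whose replica count the autoscaler adjusts.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Scale on average CPU utilization, relative to the pods' CPU requests.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    # Scale on average memory usage per pod, as an absolute quantity.
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 500Mi
```

When several metrics are listed, the autoscaler evaluates each one and uses the largest resulting replica count. Utilization targets only work if the target pods declare resource requests.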
kOps sets up HPA out of the box. Relevant reading to go through:
- Extending the Kubernetes API with the aggregation layer
- Configure The Aggregation Layer
- Horizontal Pod Autoscaling
While the links above go into detail on how Kubernetes must be configured to work with HPA, kOps already does this work for you. Specifically:
- Enable the Aggregation Layer via the following kube-apiserver flags:
--requestheader-client-ca-file=<path to aggregator CA cert>
--proxy-client-cert-file=<path to aggregator proxy cert>
--proxy-client-key-file=<path to aggregator proxy key>
- Enable Horizontal Pod Scaling ... set the appropriate flags for
Cluster Configuration
Support For Multiple Metrics
To enable the resource metrics API for scaling on CPU and memory, install metrics-server (see the metrics-server installation instructions). The compatibility matrix is as follows:
| Metrics Server | Metrics API group/version | Supported Kubernetes version |
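On kOps, one possible way to install metrics-server is through the managed addon in the cluster spec rather than applying the upstream manifests by hand. The fragment below is a sketch that assumes your kOps release ships this addon; check the kOps addon documentation for your version before relying on it.

```yaml
# Fragment of a kOps Cluster spec; only the relevant field is shown.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
spec:
  metricsServer:
    # Assumption: the installed kOps release supports the metrics-server addon.
    enabled: true
```

The change takes effect once the edited spec has been applied to the cluster with the usual kOps update workflow.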
Support For Custom Metrics
To enable the custom metrics API, register it via the API aggregation layer. If you're using Prometheus, check out the custom metrics adapter for Prometheus.
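Registration through the aggregation layer is done with an APIService object, which an HPA can then consume through a Pods-type metric. The sketch below illustrates both pieces under stated assumptions: the Service name `custom-metrics-apiserver`, the namespace `monitoring`, the Deployment `web`, and the metric `http_requests_per_second` are all hypothetical. Adapters typically ship their own APIService manifest, so in practice you would rarely write this by hand.

```yaml
# Registers a custom metrics backend with the API aggregation layer.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  # The name must be <version>.<group>.
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  # For illustration only; a real deployment should set caBundle instead.
  insecureSkipTLSVerify: true
  service:
    name: custom-metrics-apiserver   # hypothetical adapter Service
    namespace: monitoring            # hypothetical namespace
---
# An HPA that scales on a per-pod custom metric served by that backend.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second   # hypothetical metric name
        target:
          type: AverageValue
          averageValue: "100"
```

The metric name must match whatever the adapter actually exposes; with the Prometheus adapter that mapping is defined in the adapter's own rules configuration.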