Horizontal Pod Autoscaling
With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization (or, with the newer API version described below, on memory and application-provided metrics).
The current stable version, which only includes support for CPU autoscaling, can be found in the autoscaling/v1 API version. The beta version, which includes support for scaling on memory and custom metrics, can be found in autoscaling/v2beta2 (in 1.12 and later).
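As a minimal sketch of the stable API, an autoscaling/v1 manifest that scales a deployment on CPU utilization might look like the following. The names example-hpa and example-app, the replica bounds, and the 70% target are hypothetical placeholders, not values taken from this guide.

```yaml
# Sketch only: names and thresholds are illustrative assumptions.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa            # hypothetical name
spec:
  scaleTargetRef:              # the workload whose replica count is managed
    apiVersion: apps/v1
    kind: Deployment
    name: example-app          # hypothetical deployment
  minReplicas: 2
  maxReplicas: 10
  # autoscaling/v1 supports only this single CPU-based target.
  targetCPUUtilizationPercentage: 70
```

The CPU target is a percentage of the pods' CPU requests, so the target pods need CPU requests set for this to work.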
Kops can assist in setting up HPA. Relevant background reading:
- Extending the Kubernetes API with the aggregation layer
- Configure The Aggregation Layer
- Horizontal Pod Autoscaling
While the above links go into details on how Kubernetes needs to be configured to work with HPA, a lot of that work is already done for you by Kops. Specifically:
- Enable the Aggregation Layer via the following kube-apiserver flags:
  --requestheader-client-ca-file=<path to aggregator CA cert>
  --proxy-client-cert-file=<path to aggregator proxy cert>
  --proxy-client-key-file=<path to aggregator proxy key>
- Enable Horizontal Pod Scaling ... set the appropriate flag:
  --horizontal-pod-autoscaler-use-rest-clients should be true.
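Kops normally handles these flags for you. If you need to set the autoscaler flag explicitly, a cluster spec fragment along these lines may work. This is a sketch under the assumption that your kops version exposes the flag as horizontalPodAutoscalerUseRestClients under kubeControllerManager; verify the field name against your kops release before relying on it.

```yaml
# Fragment of a kops cluster spec, e.g. as edited with `kops edit cluster`.
# Assumption: this field maps to the
# --horizontal-pod-autoscaler-use-rest-clients flag in your kops version.
spec:
  kubeControllerManager:
    horizontalPodAutoscalerUseRestClients: true
```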
Cluster Configuration
Support For Multiple Metrics
To enable the resource metrics API for scaling on CPU and memory, install metrics-server (see its installation instructions). The compatibility matrix is as follows:
|Metrics Server|Metrics API group/version|Supported Kubernetes version|
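Once metrics-server is serving the resource metrics API, an autoscaling/v2beta2 manifest can combine CPU and memory targets. The following is a sketch; the resource names and target values are hypothetical and should be adapted to your workload.

```yaml
# Sketch only: names and targets are illustrative assumptions.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app          # hypothetical deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  # Scale on average CPU utilization relative to pod requests.
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Scale on average memory usage per pod.
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 500Mi
```

When several metrics are listed, the autoscaler computes a desired replica count for each and uses the largest.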
Support For Custom Metrics
To enable the custom metrics API, register it via the API aggregation layer. If you're using Prometheus, check out the custom metrics adapter for Prometheus.
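Registration through the aggregation layer is typically done with an APIService object that points at the adapter's service. The sketch below assumes a hypothetical adapter service named custom-metrics-apiserver in a custom-metrics namespace, and a hypothetical per-pod metric called http_requests_per_second; a real adapter's install manifests will define their own names.

```yaml
# Sketch only: service name, namespace, and metric name are assumptions.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  service:
    name: custom-metrics-apiserver   # hypothetical adapter service
    namespace: custom-metrics        # hypothetical namespace
  # Skipping TLS verification keeps the sketch short; prefer caBundle
  # in a real cluster.
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-custom-hpa           # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app                # hypothetical deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  # Scale on a per-pod custom metric served by the adapter.
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric
      target:
        type: AverageValue
        averageValue: "100"
```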