The Vertical Pod Autoscaler (VPA) can both manage your pods' resource requests and recommend what a pod's requests and limits should be. Recently, the kube-state-metrics project removed built-in support for VPA recommendation metrics, so the VPA now needs additional configuration before its recommendations show up in Prometheus. This blog post covers how to expose the recommendation metrics and how to visualize them in Grafana.
This blog post doesn’t go into detail on how to install kube-state-metrics. It assumes that you already have it installed and only covers how to add the VPA recommendation metrics.
Installing the Vertical Pod Autoscaler
The first step is to install the VPA, which we will do using the Fairwinds Helm chart. We'll set the values to enable Prometheus-operator's PodMonitors and also configure the VPA to use Prometheus as a history provider, ensuring that recommendations are based on historical data. Set the following values in your values.yaml:
recommender:
  podMonitor:
    enabled: true
  extraArgs:
    # Adjust according to your Prometheus address. The comment sits outside the
    # block scalar so that it does not become part of the argument value.
    prometheus-address: |
      http://prometheus-k8s.monitoring:9090
    storage: prometheus
updater:
  podMonitor:
    enabled: true
Generic VPA metrics (performance and activity) should now be available in Prometheus through the PodMonitors, and the VPA recommendations will be based on historical data.
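Recommendations only exist for workloads that have a VerticalPodAutoscaler object pointing at them. As a minimal sketch, the manifest below creates a VPA in recommendation-only mode for a Deployment; the names my-app and default are hypothetical placeholders, not something from this setup.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app            # hypothetical VPA name
  namespace: default      # hypothetical namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # hypothetical Deployment to get recommendations for
  updatePolicy:
    # "Off" means the VPA only writes recommendations to its status
    # and never evicts or resizes pods.
    updateMode: "Off"
```

With updateMode set to "Off" the recommender still fills in status.recommendation.containerRecommendations, which is the field the metrics in the next section are built from.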
Adding VPA recommendation metrics
As mentioned previously, the kube-state-metrics project removed built-in support for VPA recommendation metrics, which means that you need to configure kube-state-metrics to add them.
First, adjust the ClusterRole for kube-state-metrics to include the following rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
  name: kube-state-metrics
rules:
  # ... other rules
  # Add the following rules which allow kube-state-metrics to read VPA resources
  - apiGroups:
      - autoscaling.k8s.io
    resources:
      - verticalpodautoscalers
    verbs:
      - list
      - watch
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - list
      - watch
Next, we'll convert the status of the VPA resource to Prometheus metrics using kube-state-metrics with the help of a CustomResourceStateMetrics configuration. We can set the config using the --custom-resource-state-config argument when starting kube-state-metrics:
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.13.0
  name: kube-state-metrics
  namespace: monitoring
spec:
  ...
      containers:
        - args:
            ...
            - --custom-resource-state-config
            - |
              kind: CustomResourceStateMetrics
              spec:
                resources:
                  - groupVersionKind:
                      group: autoscaling.k8s.io
                      kind: "VerticalPodAutoscaler"
                      version: "v1"
                    labelsFromPath:
                      verticalpodautoscaler: [metadata, name]
                      namespace: [metadata, namespace]
                      target_api_version: [spec, targetRef, apiVersion]
                      target_kind: [spec, targetRef, kind]
                      target_name: [spec, targetRef, name]
                    metrics:
                      # Labels
                      - name: "verticalpodautoscaler_labels"
                        help: "VPA container recommendations. Kubernetes labels converted to Prometheus labels"
                        each:
                          type: Info
                          info:
                            labelsFromPath:
                              name: [metadata, name]
                      # Memory Information
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                        help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container."
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [target, memory]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "memory"
                          unit: "byte"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                        help: "VPA container recommendations for memory. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [lowerBound, memory]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "memory"
                          unit: "byte"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                        help: "VPA container recommendations for memory. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [upperBound, memory]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "memory"
                          unit: "byte"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                        help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [uncappedTarget, memory]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "memory"
                          unit: "byte"
                      # CPU Information
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                        help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container."
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [target, cpu]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "cpu"
                          unit: "core"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                        help: "VPA container recommendations for cpu. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [lowerBound, cpu]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "cpu"
                          unit: "core"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                        help: "VPA container recommendations for cpu. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [upperBound, cpu]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "cpu"
                          unit: "core"
                      - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                        help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                        each:
                          type: Gauge
                          gauge:
                            path: [status, recommendation, containerRecommendations]
                            valueFrom: [uncappedTarget, cpu]
                            labelsFromPath:
                              container: [containerName]
                        commonLabels:
                          resource: "cpu"
                          unit: "core"
The above configuration converts the VPA recommendation status to metrics that Prometheus can scrape. They are exposed on the kube-state-metrics metrics endpoint, so once Prometheus scrapes kube-state-metrics we have all the metrics required to visualize VPA recommendations.
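To illustrate how these metrics can be queried, here is a hedged sketch of a Prometheus-operator PrometheusRule that records the recommended memory target per container. It assumes kube-state-metrics applies its default kube_customresource_ prefix to custom resource metrics, so check the exact metric name on your own metrics endpoint first. The rule name, group name, and recorded series name are hypothetical.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vpa-recommendations-example    # hypothetical name
  namespace: monitoring
spec:
  groups:
    - name: vpa-recommendations.example    # hypothetical group name
      rules:
        # Recommended memory target (bytes) per VPA and container.
        # Assumes the default "kube_customresource_" metric prefix.
        - record: namespace_vpa_container:vpa_memory_target_bytes    # hypothetical series name
          expr: |
            max by (namespace, verticalpodautoscaler, container) (
              kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="memory"}
            )
```

Swapping the resource matcher to "cpu", or the metric suffix to lowerbound, upperbound, or uncappedtarget, gives the other series defined in the configuration above.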
Visualizing VPA recommendations in Grafana
I've created a Grafana dashboard that visualizes the VPA recommendations. You can find the dashboard here. I've also written a blog post on comprehensive Kubernetes autoscaling monitoring with Prometheus and Grafana here and created a kubernetes-autoscaling-mixin that includes all the dashboards and alerts for Kubernetes autoscaling components.
The Grafana dashboard provides an overview of the VPA recommendations for both memory and CPU. It includes the following panels:
- Namespace Summary - Provides an overview of the VPA recommendations per namespace. See the memory and CPU target and lower and upper bounds for each VPA in the selected namespace.
- VPA Summary - Provides a history of recommendations for the selected VPA. See the historical memory and CPU target and lower and upper bounds for each container in the selected VPA. It also summarizes the resource configuration that would be required for the Guaranteed and Burstable QoS classes.