The Vertical Pod Autoscaler (VPA) can both manage your pods’ resource requests and recommend what a pod’s requests and limits should be. Recently, the kube-state-metrics project removed built-in support for VPA recommendation metrics, so additional configuration is now required to make the VPA’s recommendations visible. This blog post covers how to expose the recommendation metrics through kube-state-metrics and how to visualize them in Grafana.
This post doesn’t go into detail on installing kube-state-metrics itself; it assumes you already have it running and only covers how to add the VPA recommendation metrics.
Installing the Vertical Pod Autoscaler
The first step is to install the VPA, which you can do using Fairwinds’ Helm chart. Set the values to enable the Prometheus Operator’s PodMonitor resources and configure the VPA to use Prometheus as a history provider, so that recommendations are based on historical data. The following values should be set in your values.yaml:
recommender:
  podMonitor:
    enabled: true
  extraArgs:
    storage: prometheus
    prometheus-address: http://prometheus-k8s.monitoring:9090 # Adjust according to your Prometheus address
    # https://github.com/kubernetes/autoscaler/issues/5031#issuecomment-1450583325
    prometheus-cadvisor-job-name: 'kubelet'
    container-pod-name-label: 'pod'
    container-namespace-label: 'namespace'
    container-name-label: 'container'
    metric-for-pod-labels: 'kube_pod_labels{job="kube-state-metrics"}[8d]'
    pod-namespace-label: 'namespace'
    pod-name-label: 'pod'
    pod-label-prefix: 'label_'
updater:
  podMonitor:
    enabled: true
The extraArgs related to the Prometheus configuration are necessary because the default values of these arguments have become outdated and no longer match kube-state-metrics’ default labels.
With this in place, generic VPA metrics that show the autoscaler’s performance and activity are available in Prometheus by scraping the pod metrics, and the recommendations are based on historical data.
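For reference, when recommender.podMonitor.enabled is set, the chart renders a PodMonitor for the recommender (and a similar one for the updater) that looks roughly like the sketch below. The selector labels, port name, and namespace are assumptions based on common chart conventions, so check the rendered chart output for the exact values; the recommender exposes its metrics on port 8942 by default.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: vpa-recommender # assumed name
  namespace: vpa        # assumed namespace where the VPA is installed
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: vpa               # assumed chart labels
      app.kubernetes.io/component: recommender
  podMetricsEndpoints:
    - port: metrics # assumed port name; the recommender listens on :8942 by default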
Update 2025-03-21: the Prometheus history integration doesn’t seem to work well. The recommender fetches the metrics and stores them in a ClusterState, then uses the default Kubernetes recommended labels to match the stored metrics to the VPA objects, which doesn’t work reliably. A discussion exists here.
Adding VPA recommendation metrics
As mentioned previously, the kube-state-metrics project removed built-in support for VPA recommendation metrics, which means you need to configure kube-state-metrics to expose the recommendation metrics yourself.
First, adjust the ClusterRole for kube-state-metrics to include the following rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
  name: kube-state-metrics
rules:
  # ... other rules
  # Add the following rules, which allow kube-state-metrics to read VPA resources
  - apiGroups:
      - autoscaling.k8s.io
    resources:
      - verticalpodautoscalers
    verbs:
      - list
      - watch
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - list
      - watch
Next, we need to convert the status of the VPA resource into Prometheus metrics using kube-state-metrics’ custom resource state feature, which is configured with a CustomResourceStateMetrics document (a kube-state-metrics configuration format, not a CRD installed in the cluster). Set the configuration using the --custom-resource-state-config argument when starting kube-state-metrics:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.13.0
  name: kube-state-metrics
  namespace: monitoring
spec:
  # ...
  template:
    spec:
      containers:
        - # ... other container fields (name, image, etc.)
          args:
            # ... other args
            - --custom-resource-state-config
            - |
                kind: CustomResourceStateMetrics
                spec:
                  resources:
                    - groupVersionKind:
                        group: autoscaling.k8s.io
                        kind: "VerticalPodAutoscaler"
                        version: "v1"
                      labelsFromPath:
                        verticalpodautoscaler: [metadata, name]
                        namespace: [metadata, namespace]
                        target_api_version: [spec, targetRef, apiVersion]
                        target_kind: [spec, targetRef, kind]
                        target_name: [spec, targetRef, name]
                      metrics:
                        # Labels
                        - name: "verticalpodautoscaler_labels"
                          help: "VPA container recommendations. Kubernetes labels converted to Prometheus labels"
                          each:
                            type: Info
                            info:
                              labelsFromPath:
                                name: [metadata, name]
                        # Memory Information
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                          help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container."
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [target, memory]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "memory"
                            unit: "byte"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                          help: "VPA container recommendations for memory. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [lowerBound, memory]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "memory"
                            unit: "byte"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                          help: "VPA container recommendations for memory. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [upperBound, memory]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "memory"
                            unit: "byte"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                          help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [uncappedTarget, memory]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "memory"
                            unit: "byte"
                        # CPU Information
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                          help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container."
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [target, cpu]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "cpu"
                            unit: "core"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                          help: "VPA container recommendations for cpu. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [lowerBound, cpu]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "cpu"
                            unit: "core"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                          help: "VPA container recommendations for cpu. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [upperBound, cpu]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "cpu"
                            unit: "core"
                        - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                          help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                          each:
                            type: Gauge
                            gauge:
                              path: [status, recommendation, containerRecommendations]
                              valueFrom: [uncappedTarget, cpu]
                              labelsFromPath:
                                container: [containerName]
                          commonLabels:
                            resource: "cpu"
                            unit: "core"
The preceding configuration converts the VPA recommendation status into metrics that can be scraped by Prometheus. They are exposed on the kube-state-metrics metrics endpoint, by default under the kube_customresource_ prefix (since no metricNamePrefix is set above), e.g. kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_target.
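If the inline argument becomes unwieldy, kube-state-metrics can also read the same configuration from a file through the --custom-resource-state-config-file flag. The following is a minimal sketch assuming the configuration is mounted from a ConfigMap; the ConfigMap name and mount path are made up for illustration.
# Hypothetical ConfigMap holding the same CustomResourceStateMetrics document
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-state-metrics-custom-resource-state # assumed name
  namespace: monitoring
data:
  config.yaml: |
    kind: CustomResourceStateMetrics
    spec:
      resources:
        - groupVersionKind:
            group: autoscaling.k8s.io
            kind: "VerticalPodAutoscaler"
            version: "v1"
          # ... same labelsFromPath and metrics as in the Deployment args above
Mount this ConfigMap into the kube-state-metrics container and start it with --custom-resource-state-config-file=/etc/customresourcestate/config.yaml (the path depends on where you mount it).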
Creating a Vertical Pod Autoscaler
To create a VPA, you can use the following example:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  labels:
    app.kubernetes.io/instance: hodovi-cc
    app.kubernetes.io/name: hodovi-cc
    app.kubernetes.io/version: 9ec45b512c915bfe2fabc1671713935890602534
  name: hodovi-cc
  namespace: apps
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hodovi-cc
  updatePolicy:
    updateMode: "Off" # Disable automatic updates
After creating the VPA, its status field is updated with the recommendations, and kube-state-metrics converts that status into metrics that Prometheus can scrape. An example of the status for the hodovi-cc VPA is:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  labels:
    app.kubernetes.io/instance: hodovi-cc
    app.kubernetes.io/name: hodovi-cc
    app.kubernetes.io/version: 9ec45b512c915bfe2fabc1671713935890602534
  name: hodovi-cc
  namespace: apps
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hodovi-cc
  updatePolicy:
    updateMode: "Off" # Disable automatic updates
status:
  conditions:
    - lastTransitionTime: "2024-08-25T20:22:15Z"
      status: "True"
      type: RecommendationProvided
  recommendation:
    containerRecommendations:
      - containerName: hodovi-cc
        lowerBound:
          cpu: 15m
          memory: "246562508"
        target:
          cpu: 23m
          memory: "297164212"
        uncappedTarget:
          cpu: 23m
          memory: "297164212"
        upperBound:
          cpu: 97m
          memory: "1254364671"
Now, when Prometheus scrapes kube-state-metrics, all the metrics required to visualize VPA recommendations are available.
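As a quick sanity check, and to illustrate how the recommendations relate to the requests you currently set, the following PrometheusRule is a minimal sketch. It assumes the Prometheus Operator is in use, that the metrics carry kube-state-metrics’ default kube_customresource_ prefix (no metricNamePrefix was set above), and the recording rule names are made up for illustration.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vpa-recommendations # assumed name
  namespace: monitoring
spec:
  groups:
    - name: vpa-recommendations
      rules:
        # Highest memory target the VPA recommends per namespace/container.
        - record: namespace_container:vpa_memory_target_bytes:max
          expr: |
            max by (namespace, container) (
              kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="memory"}
            )
        # Ratio of current memory requests to the VPA target; values well above 1
        # hint at over-provisioned requests, values below 1 at under-provisioned ones.
        - record: namespace_container:memory_requests_to_vpa_target:ratio
          expr: |
            max by (namespace, container) (
              kube_pod_container_resource_requests{resource="memory"}
            )
            /
            max by (namespace, container) (
              kube_customresource_verticalpodautoscaler_status_recommendation_containerrecommendations_target{resource="memory"}
            )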
Visualizing VPA recommendations in Grafana
I’ve created a Grafana dashboard that visualizes the VPA recommendations; you can find the dashboard here. I’ve also written a blog post on comprehensive Kubernetes autoscaling monitoring with Prometheus and Grafana here, and I’ve created a kubernetes-autoscaling-mixin that includes dashboards and alerts for all the Kubernetes autoscaling components.
The Grafana dashboard provides an overview of the VPA recommendations for both memory and CPU. It includes the following panels:
- Namespace Summary - Provides an overview of the VPA recommendations per namespace, showing the memory and CPU targets as well as the lower and upper bounds for each VPA in the selected namespace.
- VPA Summary - Provides a history of recommendations for the selected VPA, showing the historical memory and CPU targets and the lower and upper bounds for each container. It also summarizes the resource configuration that would be required for the guaranteed and burstable QoS classes.