
Configuring VPA to Use Historical Metrics for Recommendations and Expose Them in Kube-state-metrics

1 month ago

The Vertical Pod Autoscaler (VPA) can not only manage your pods' resource requests but also recommend what the requests and limits for a pod should be. Recently, the kube-state-metrics project removed built-in support for VPA recommendation metrics, so getting value out of those recommendations now requires additional configuration. This blog post covers how to configure the VPA to use historical metrics, how to expose the recommendation metrics through kube-state-metrics, and how to visualize them in Grafana.

This blog post doesn't cover how to install kube-state-metrics. It assumes that you already have it installed and only explains how to add the VPA recommendation metrics.

Installing the Vertical Pod Autoscaler

The first step is to install the VPA, which we will do using the Fairwinds Helm chart. We'll set the values to enable Prometheus-operator's PodMonitors and configure the VPA recommender to use Prometheus as its history provider, ensuring that recommendations are based on historical data. Set the following values in your values.yaml:

recommender:
  podMonitor:
    enabled: true
  extraArgs:
    # Adjust according to your Prometheus address
    prometheus-address: http://prometheus-k8s.monitoring:9090
    storage: prometheus
updater:
  podMonitor:
    enabled: true

The general VPA metrics (performance and activity) should now be available in Prometheus through the PodMonitors, and the VPA recommendations will be based on historical data.
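
The recommender only produces recommendations for workloads that have a VPA object pointing at them. As a minimal sketch, assuming a Deployment named my-app in the default namespace (both names are hypothetical), a VPA in recommendation-only mode could look like this:

```yaml
# Hypothetical example: "my-app" and "default" are placeholder names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
  namespace: default
spec:
  # The workload whose pods the VPA should analyze
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    # "Off" only computes recommendations and never evicts or resizes pods
    updateMode: "Off"
```

With updateMode set to "Off", the VPA writes recommendations to the object's status without changing running pods, which is enough for the metrics and dashboards described in this post.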

Adding VPA recommendation metrics

As mentioned previously, the kube-state-metrics project removed built-in support for VPA recommendation metrics, which means that you need to configure kube-state-metrics to generate the recommendation metrics yourself.

First, adjust the ClusterRole for kube-state-metrics to include the following rules:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
  name: kube-state-metrics
rules:
    # ... other rules
    # Add the following rules which allow kube-state-metrics to read VPA resources
    - apiGroups:
      - autoscaling.k8s.io
      resources:
      - verticalpodautoscalers
      verbs:
      - list
      - watch
    - apiGroups:
      - apiextensions.k8s.io
      resources:
      - customresourcedefinitions
      verbs:
      - list
      - watch

Next, we'll convert the status of the VPA resource to Prometheus metrics using the custom resource state feature of kube-state-metrics, which is driven by a configuration of kind CustomResourceStateMetrics. We can pass the configuration using the --custom-resource-state-config argument when starting kube-state-metrics:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.13.0
  name: kube-state-metrics
  namespace: monitoring
spec:
    ...
      containers:
      - args:
        ...
        - --custom-resource-state-config
        - |
          kind: CustomResourceStateMetrics
          spec:
            resources:
              - groupVersionKind:
                  group: autoscaling.k8s.io
                  kind: "VerticalPodAutoscaler"
                  version: "v1"
                labelsFromPath:
                  verticalpodautoscaler: [metadata, name]
                  namespace: [metadata, namespace]
                  target_api_version: [spec, targetRef, apiVersion]
                  target_kind: [spec, targetRef, kind]
                  target_name: [spec, targetRef, name]
                metrics:
                  # Labels
                  - name: "verticalpodautoscaler_labels"
                    help: "VPA container recommendations. Kubernetes labels converted to Prometheus labels"
                    each:
                      type: Info
                      info:
                        labelsFromPath:
                          name: [metadata, name]
                  # Memory Information
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                    help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container."
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [target, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                    help: "VPA container recommendations for memory. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [lowerBound, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                    help: "VPA container recommendations for memory. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [upperBound, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                    help: "VPA container recommendations for memory. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [uncappedTarget, memory]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "memory"
                      unit: "byte"
                  # CPU Information
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_target"
                    help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container."
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [target, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_lowerbound"
                    help: "VPA container recommendations for cpu. Minimum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [lowerBound, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_upperbound"
                    help: "VPA container recommendations for cpu. Maximum resources the container can use before the VerticalPodAutoscaler updater evicts it"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [upperBound, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"
                  - name: "verticalpodautoscaler_status_recommendation_containerrecommendations_uncappedtarget"
                    help: "VPA container recommendations for cpu. Target resources the VerticalPodAutoscaler recommends for the container ignoring bounds"
                    each:
                      type: Gauge
                      gauge:
                        path: [status, recommendation, containerRecommendations]
                        valueFrom: [uncappedTarget, cpu]
                        labelsFromPath:
                          container: [containerName]
                    commonLabels:
                      resource: "cpu"
                      unit: "core"

The above configuration converts the VPA recommendation status into metrics that are exposed on the kube-state-metrics metrics endpoint. Once Prometheus scrapes kube-state-metrics, we will have all the metrics required to visualize VPA recommendations.
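
To make the path expressions in the configuration more concrete, here is a sketch of the part of a VPA object's status that they read. The container name and all quantities are made-up illustrative values, not output from a real cluster:

```yaml
# Illustrative VPA status; the container name and quantities are hypothetical.
status:
  recommendation:
    # Reached via path: [status, recommendation, containerRecommendations]
    containerRecommendations:
      - containerName: my-app   # becomes the "container" label
        lowerBound:             # valueFrom: [lowerBound, cpu] and [lowerBound, memory]
          cpu: 25m
          memory: 128Mi
        target:                 # valueFrom: [target, cpu] and [target, memory]
          cpu: 50m
          memory: 256Mi
        uncappedTarget:         # valueFrom: [uncappedTarget, cpu] and [uncappedTarget, memory]
          cpu: 50m
          memory: 256Mi
        upperBound:             # valueFrom: [upperBound, cpu] and [upperBound, memory]
          cpu: 200m
          memory: 512Mi
```

Each gauge in the configuration picks one of these values per container, and the resource label tells the memory and CPU series apart even though they share a metric name.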

Visualizing VPA recommendations in Grafana

I've created a Grafana dashboard that visualizes the VPA recommendations. You can find the dashboard here. I've also written a blog post on comprehensive Kubernetes autoscaling monitoring with Prometheus and Grafana here and created a kubernetes-autoscaling-mixin that includes all the dashboards and alerts for Kubernetes autoscaling components.

The Grafana dashboard provides an overview of the VPA recommendations for both memory and CPU. It includes the following panels:

  • Namespace Summary - Provides an overview of the VPA recommendations per namespace. See the memory and CPU target and lower and upper bounds for each VPA in the selected namespace.
  • VPA Summary - Provides a history of recommendations for the selected VPA. See the historical memory and CPU target and lower and upper bounds for each container in the selected VPA. It also summarizes the resource configuration that would be required for the Guaranteed and Burstable QoS classes.
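
As a rough illustration of the difference between the two QoS classes, the sketch below turns a recommendation into container resources. The quantities are hypothetical, the top-level keys are only labels for the two variants, and this is one possible mapping rather than necessarily how the dashboard computes its summary:

```yaml
# Hypothetical recommendation: target cpu 50m / memory 256Mi,
# upper bound cpu 200m / memory 512Mi.

# Guaranteed QoS: requests and limits are equal for both CPU and memory.
guaranteed:
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: 50m
      memory: 256Mi

# Burstable QoS: requests are set, but limits are higher than the requests.
burstable:
  resources:
    requests:
      cpu: 50m
      memory: 256Mi
    limits:
      cpu: 200m
      memory: 512Mi
```

Kubernetes only assigns the Guaranteed class when every container in the pod sets equal requests and limits for both CPU and memory, so a single container with diverging values makes the whole pod Burstable.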


