Keda Monitoring With Prometheus and Grafana

Published on August 20, 2025, 15:00 UTC · 5 minutes

KEDA is a tool that provides event-driven autoscaling for Kubernetes, allowing you to scale your applications based on external metrics. It builds on the Kubernetes Horizontal Pod Autoscaler (HPA) to adjust the number of pods in a deployment based on metrics like CPU usage, memory usage, or custom metrics from external sources, configured through a Custom Resource Definition (CRD) called ScaledObject. It also supports running workloads as jobs in response to event sources like message queues and databases, and defines another CRD called ScaledJob to configure that scaling behavior. Monitoring KEDA effectively is crucial to ensure that your autoscaling policies are working as expected and that your applications are performing optimally.
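
To make the ScaledObject CRD concrete, here is a minimal sketch of one that scales a Deployment based on a Prometheus query. The Deployment name, Prometheus address, query, and threshold are illustrative placeholders, not part of any setup described in this post:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app
  namespace: default
spec:
  scaleTargetRef:
    name: my-app # the Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    # Scale on a Prometheus query; KEDA creates and manages the underlying HPA.
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{app="my-app"}[2m]))
        threshold: "100"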

I’ve written multiple blog posts on Kubernetes autoscaling monitoring, including Comprehensive Kubernetes Autoscaling Monitoring with Prometheus and Grafana, Karpenter Monitoring with Prometheus and Grafana, and Configuring VPA to Use Historical Metrics for Recommendations and Expose Them in Kube-state-metrics. This post focuses on monitoring KEDA with Prometheus and Grafana, but it is still part of the same GitHub repository as the other posts, so you can find all the code and configuration files in one place.

The repository is the kubernetes-autoscaling-mixin on GitHub. It contains Grafana dashboards and Prometheus rules for KEDA, HPA, VPA, Cluster Autoscaler, PDBs, and Karpenter.

There are already two published Grafana dashboards:

  • Scaled Object - An overview of ScaledObjects, including the total count and their status, with insights into scaling activity and errors.
  • Scaled Job - An overview of ScaledJobs, including the total count and their status, with insights into scaling activity and errors.

There are also Prometheus alerts stored on GitHub that you can import, covering common KEDA issues.

Enabling Metric Collection for Keda

To monitor KEDA, you need to enable metric collection. The KEDA operator, deployed in your Kubernetes cluster, automatically exposes metrics about your scaled jobs and scaled objects; Prometheus then needs to be configured to scrape them.

Using Helm, you should deploy Keda with the following values:

prometheus:
  metricServer:
    enabled: true
    serviceMonitor:
      enabled: true
  operator:
    enabled: true
    serviceMonitor:
      enabled: true
  webhooks:
    enabled: true
    serviceMonitor:
      enabled: true

This configuration enables ServiceMonitors for the Prometheus metrics server, the KEDA operator, and the admission webhooks. The ServiceMonitors allow Prometheus to scrape metrics from the KEDA components via the Prometheus Operator.
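
If your Prometheus instance only selects ServiceMonitors carrying a specific label (the default behavior with kube-prometheus-stack), you may also need to attach that label. A sketch for the operator's ServiceMonitor, assuming your chart version exposes additionalLabels under each serviceMonitor block (verify against your chart's values file):

prometheus:
  operator:
    enabled: true
    serviceMonitor:
      enabled: true
      # Assumption: the label your Prometheus serviceMonitorSelector matches.
      additionalLabels:
        release: kube-prometheus-stack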

Grafana Dashboards

The following sections describe each dashboard.

Scaled Objects

The Grafana dashboard for Scaled Objects provides insights into KEDA’s scaled objects. Its panels include:

  • Summary - Provides an overview of ScaledObjects, including the total count and their status, along with a table of scaled objects, their namespaces, and their types, with links to the Horizontal Pod Autoscaler (HPA) dashboard. This allows you to quickly navigate to the HPA dashboard for more detailed scaling information. The only requirement is to add the Horizontal Pod Autoscaler dashboard, available in the kubernetes-autoscaling-mixin, to your Grafana instance.
  • ScaledObject by object, scaler and metric - Displays scaled objects grouped by their associated scaler and metric, providing a clear view of how each scaled object is configured. It also gives insights into the latency of metric collection for scaled objects, which can impact scaling decisions, and into the value of the metric used for scaling.

[Screenshot: KEDA Scaled Object dashboard (part 1)]

[Screenshot: KEDA Scaled Object dashboard (part 2)]

Scaled Jobs

The dashboard for Scaled Jobs is similar but tailored towards KEDA-scaled jobs. Monitoring panels include:

  • Summary - Provides an overview of ScaledJobs, including the total count and their status, along with a table of scaled jobs, their namespaces, and their types, with links to the Kubernetes Workload dashboard for job insights. This allows you to quickly navigate to the workload dashboard for more scaling insights, such as when jobs are added and their resource consumption. The requirements are to add the Kubernetes Workload dashboard, available in the kubernetes-mixin, to your Grafana instance and to adjust its URL so it’s predictable. The jsonnet library needs the following configuration added to the config.libsonnet file:
{
  _config+:: {
    keda+: {
      enabled: true,
      k8sResourcesWorkloadDashboardUid: 'k8s-resources-workload-kd21',
    },
  }
}

Also, you would need to add the Workload dashboard to your Grafana instance using that specific UID. If you are using kube-prometheus, then an IaC way of doing so would be:

    // This provides stable links for our Keda Scaled Job dashboards.
    kubernetesControlPlane+: {
      mixin+: {
        _config+: {
          grafanaDashboardIDs+: {
            'k8s-resources-workload.json': 'k8s-resources-workload-kd21',
          },
        },
      },
    },
  • ScaledJob by job, scaler and metric - Displays scaled jobs grouped by their associated scaler and metric, providing a clear view of how each scaled job is configured. It also gives insights into the latency of metric collection for scaled jobs, which can impact scaling decisions, and into the value of the metric used for scaling.
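
For reference, here is a minimal sketch of the kind of ScaledJob these panels monitor, using a RabbitMQ trigger that launches one job per queued message. The image, queue name, and connection string are illustrative placeholders:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: orders-worker
  namespace: default
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: worker
            image: ghcr.io/example/orders-worker:latest # placeholder image
        restartPolicy: Never
  pollingInterval: 30 # seconds between trigger checks
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        mode: QueueLength
        value: "1" # one job per queued message
        host: amqp://user:password@rabbitmq.default.svc:5672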

[Screenshot: KEDA Scaled Job dashboard (part 1)]

[Screenshot: KEDA Scaled Job dashboard (part 2)]

Prometheus Alerts

Alerts are trickier to get right for a generic use case; however, they’re still provided by the kubernetes-autoscaling-mixin. They’re also configurable through the config.libsonnet file in the repository, and if you are familiar with Jsonnet, customizing the alerts should be fairly straightforward. The alerts are available on GitHub, and I’ll describe each of them below.

  • Alert name: KedaScaledJobErrors

Alerts when KEDA scaled jobs encounter errors.

  • Alert name: KedaScaledObjectErrors

Alerts when KEDA scaled objects encounter errors.

  • Alert name: KedaScalerLatencyHigh

Alerts when the latency of KEDA scaled object metrics is high. This indicates that the time taken to collect metrics for scaled objects has increased significantly, potentially impacting scaling decisions. The default threshold is set to 5 seconds, but you can adjust it based on your requirements.

  • Alert name: KedaScaledObjectPaused

Alerts when a scaled object is paused. This indicates that the scaled object isn’t processing scaling triggers, which could lead to performance issues or resource underutilization. The default threshold is set to 24 hours, as the pause can be intentional, but you can adjust it based on your requirements.

  • Alert name: KedaScalerDetailErrors

Alerts when KEDA scaler detail errors occur. This indicates issues with parts of a scaler, which could lead to incorrect scaling decisions. For example, the scaler might be unable to fetch Prometheus metrics, or it might be misconfigured.
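
To illustrate what such a rule looks like, here is a hedged sketch of a KedaScaledObjectErrors-style alert as a PrometheusRule. The mixin’s actual expressions differ, and KEDA metric names vary between versions (newer releases add a _total suffix), so treat this as an outline rather than the mixin’s rule:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: keda-alerts
  namespace: monitoring
spec:
  groups:
    - name: keda
      rules:
        - alert: KedaScaledObjectErrors
          # Assumption: the keda_scaled_object_errors counter exposed by the KEDA operator.
          expr: sum by (namespace, scaledObject) (rate(keda_scaled_object_errors[5m])) > 0
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: KEDA scaled object {{ $labels.namespace }}/{{ $labels.scaledObject }} is reporting errors.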

Conclusion

KEDA is a powerful tool for event-driven autoscaling in Kubernetes, and monitoring it effectively is crucial to ensure that your autoscaling policies are working as expected. The kubernetes-autoscaling-mixin provides a set of Prometheus rules and Grafana dashboards that can help you keep an eye on KEDA. The dashboards provide an overview of scaled objects and scaled jobs, including their activity, errors, and scaling metrics. The alerts can help you identify issues with KEDA, such as scaled object errors and scaler latency. If you’re using Kubernetes and have any type of autoscaling in your cluster, I highly recommend checking out the kubernetes-autoscaling-mixin.

Related Posts

Configuring VPA to Use Historical Metrics for Recommendations and Expose Them in Kube-state-metrics

The Vertical Pod Autoscaler (VPA) can manage both your pods’ resource requests but also recommend what the limits and requests for a pod should be. Recently, the kube-state-metrics project removed built-in support for VPA recommendation metrics, which made the VPA require additional configuration to be valuable. This blog post will cover how to configure the VPA to expose the recommendation metrics and how to visualize them in Grafana.

Configuring Kube-prometheus-stack Dashboards and Alerts for K3s Compatibility

The kube-prometheus-stack Helm chart, which deploys the kubernetes-mixin, is designed for standard Kubernetes setups, often pre-configured for specific cloud environments. However, these configurations are not directly compatible with k3s, a lightweight Kubernetes distribution. Since k3s lacks many of the default cloud integrations, issues arise, such as missing metrics, broken graphs, and unavailable endpoints (example issue). This blog post will guide you through adapting the kube-prometheus-stack Helm chart and the kubernetes-mixin to work seamlessly in k3s environments, ensuring functional dashboards and alerts tailored to k3s.

Django Monitoring with Prometheus and Grafana

The Prometheus package for Django provides a great Prometheus integration, but the open source dashboards and alerts that exist are not that great. The to-go Grafana dashboard does not use a large portion of metrics provided by the Django-Prometheus package, alongside this there are no filters for views, methods, jobs and namespaces. This blog post will introduce the Django-mixin - a set of Prometheus rules and Grafana dashboards for Django. The dashboard and alerts will provide insights on applied/unapplied migrations, RED (requests per second, error percentage of the request, latency for each request) metrics, database ops and cache hit rate.
