Blog Posts

Most Popular Blog Tags

Kubernetes Events Monitoring with Loki, Alloy, and Grafana

Kubernetes events offer valuable insights into the activities within your cluster, providing a comprehensive view of each resource’s status. While they’re beneficial for debugging individual resources, they often face challenges due to the absence of aggregation. This can lead to issues such as events being garbage collected, the necessity to view them promptly, difficulties in filtering and searching, and limited accessibility for other systems. The blog post explores configuring Loki with Alloy to efficiently scrape Kubernetes events and visualize them in Grafana.

Karpenter Monitoring with Prometheus and Grafana

With the release of Karpenter v1 we have stable Prometheus metrics, but the Grafana dashboards are not that great and there are no open source alerts. Therefore, I decided to create a monitoring-mixin that provides a set of Prometheus rules and Grafana dashboards for Karpenter. This blog post will introduce the kubernetes-autoscaling-mixin - a set of Prometheus rules and Grafana dashboards for Kubernetes autoscaling, but we will only write about Karpenter monitoring in this blog post.

Configuring Kube-prometheus-stack Dashboards and Alerts for K3s Compatibility

The kube-prometheus-stack Helm chart, which deploys the kubernetes-mixin, is designed for standard Kubernetes setups, often pre-configured for specific cloud environments. However, these configurations are not directly compatible with k3s, a lightweight Kubernetes distribution. Since k3s lacks many of the default cloud integrations, issues arise, such as missing metrics, broken graphs, and unavailable endpoints (example issue). This blog post will guide you through adapting the kube-prometheus-stack Helm chart and the kubernetes-mixin to work seamlessly in k3s environments, ensuring functional dashboards and alerts tailored to k3s.

Kubernetes Cost Tracking Simplified with OpenCost, Prometheus, and Grafana

OpenCost is an open-source tool designed to help you monitor and understand the cost of your cloud infrastructure. As a project under the Cloud Native Computing Foundation (CNCF), OpenCost offers a transparent and powerful solution for cloud cost management. It provides both a user-friendly interface for visualizing cloud costs and Prometheus metrics, enabling you to query and visualize these costs using Grafana. The popular tool KubeCost is built on top of OpenCost, offering an enhanced feature set and user experience. However, KubeCost is not open-source, and its free plan has limitations on data retention and storage. Given these constraints and a preference for consolidating data visualization within Grafana, I opted to use OpenCost. This blog post will introduce the opencost-mixin - a set of Prometheus rules and Grafana dashboards for OpenCost.

Comprehensive Kubernetes Autoscaling Monitoring with Prometheus and Grafana

The kubernetes-mixin is a popular resource for providing excellent dashboards and alerts for monitoring Kubernetes clusters. However, it lacks comprehensive support for autoscaling components such as Pod Disruption Budgets (PDB), Horizontal and Vertical Pod Autoscalers (HPA, VPA), Karpenter, and the Cluster Autoscaler. These essential components are commonly deployed in Kubernetes environments but do not have standardized, open-source monitoring solutions. This blog post aims to solve that by introducing the kubernetes-autoscaling-mixin - a set of Prometheus rules and Grafana dashboards for Kubernetes autoscaling.

Configuring VPA to Use Historical Metrics for Recommendations and Expose Them in Kube-state-metrics

The Vertical Pod Autoscaler (VPA) can manage both your pods’ resource requests but also recommend what the limits and requests for a pod should be. Recently, the kube-state-metrics project removed built-in support for VPA recommendation metrics, which made the VPA require additional configuration to be valuable. This blog post will cover how to configure the VPA to expose the recommendation metrics and how to visualize them in Grafana.

Ingress-Nginx Monitoring with Prometheus and Grafana

Ingress-nginx provides an easy integration with Prometheus for monitoring. However, it can be challenging to get started with monitoring Ingress-nginx and creating dashboards and alerts. Therefore, I’ve created a monitoring mixin for Ingress-nginx which will provide Prometheus alerts and Grafana dashboards focusing on Ingress-nginx.

ArgoCD Monitoring with Prometheus and Grafana

ArgoCD has by default support for notifying when an Application does not have the desired status through triggers. For example, when an application becomes OutOfSync or Unhealthy, a notification is sent to your configured notification service (e.g. Slack). This was my initial setup, but I found it to be flaky, where networking issues between the server and controller for a couple of seconds would send many Slack messages that the Application status is unknown. An application becoming unhealthy would instantly send alerts to Slack. To resolve this I wanted interval based alerts and as usual Prometheus was the solution to this. ArgoCD provides Prometheus metrics out of the box, and alongside the metrics there’s a Grafana dashboard for ArgoCD. The dashboard is good, but the project is lacking any open source alerting. Even more so, it does not have a monitoring mixin for providing dashboards and alerts to be consumed easily.

Showcase: Using Jsonnet & Mixins to Simplify Endpoint Monitoring with Blackbox-exporter

Blackbox-exporter is a Prometheus exporter that probes endpoints and exposes metrics of the probe result. There are multiple guides on how to use the Blackbox-exporter, and we won’t go into that, but rather focus on newer things as Jsonnet as a templating language, the Kubernetes CustomResourceDefinition that comes with the Prometheus-operator and the open source monitoring mixin for the exporter. The goal of this post is to showcase the cool nits of Jsonnet and open source libraries to minimize boilerplate and utilize best practices and shared standards.

Celery Monitoring with Prometheus and Grafana

Celery is a python project used for asynchronous job processing and task scheduling in web applications or distributed systems. It is very commonly used together with Django, Celery as the asynchronous job processor and Django as the web framework. Celery has great documentation on how to use it, deploy it and integrate it with Django. However, monitoring is less covered - this is what this blog post aims to do. There is a great Prometheus exporter for Celery that has dashboards and alerts that come with them.

Shynet