Monitoring Kubernetes InitContainers with Prometheus

Kubernetes InitContainers are a neat way to run arbitrary code before your container starts. They ensure that certain preconditions are met before your app is up and running. For example, they allow you to:

  • run database migrations with Django or Rails before your app starts
  • ensure a microservice or API you depend on is running
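
As a rough illustration of both cases, a Pod with two InitContainers might look something like the sketch below. The Pod name, image names, service URL and migration command are all made up for the example and are not taken from a real deployment; adapt them to your own app.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp                      # hypothetical Pod name
spec:
  initContainers:
    # Blocks until the (hypothetical) API we depend on answers over HTTP.
    - name: wait-for-api
      image: busybox:1.36
      command: ['sh', '-c', 'until wget -q -O /dev/null http://api-service/healthz; do sleep 2; done']
    # Runs Django migrations; assumes the app image ships manage.py.
    - name: migrate
      image: myapp:latest          # hypothetical image
      command: ['python', 'manage.py', 'migrate']
  containers:
    # Only starts once both InitContainers above have exited successfully.
    - name: app
      image: myapp:latest          # hypothetical image
```

InitContainers run one after another in the order listed, and the main container starts only after every one of them has completed.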

Unfortunately, InitContainers can fail, and when that happens you probably want to be notified, because your app will never start. kube-state-metrics exposes plenty of Kubernetes cluster metrics for Prometheus. Combining the two, we can monitor and alert whenever we discover container problems. Recently, a pull request was merged that provides InitContainer data.

The metric kube_pod_init_container_status_last_terminated_reason tells us why a specific InitContainer failed to run, for example because it timed out or ran into an error.

To use the InitContainer metrics, deploy Prometheus and kube-state-metrics. Then add the kube-state-metrics service as a target in your Prometheus scrape_configs so that all the cluster metrics are pulled into Prometheus:

- job_name: 'kube-state-metrics'
  static_configs:
    - targets: ['kube-state-metrics:8080']

kube_pod_init_container_status_last_terminated_reason carries a reason label that can take one of five values:

  • Completed
  • OOMKilled
  • Error
  • ContainerCannotRun
  • DeadlineExceeded

We want to be alerted whenever a series with a reason other than 'Completed' is scraped with a value of 1, because that means an InitContainer has failed to run. Here is an example alerting rule:

  - name: Init container failure
    rules:
      - alert: InitContainersFailed
        expr: kube_pod_init_container_status_last_terminated_reason{reason!="Completed"} == 1
        annotations:
          summary: '{{ $labels.container }} init failed'
          description: '{{ $labels.container }} has not completed init containers with the reason {{ $labels.reason }}'
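
If you want to check the rule before shipping it, a promtool unit test along these lines could work. This is only a sketch: it assumes the rule group above is saved under a top-level `groups:` key in a file called init-container-alerts.yml (a made-up filename), and the namespace, pod and container label values are invented sample data.

```yaml
# init-container-alerts-test.yml (hypothetical filename)
rule_files:
  - init-container-alerts.yml      # assumed to contain the rule group above

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Invented sample series: an InitContainer whose last termination reason was Error.
      - series: 'kube_pod_init_container_status_last_terminated_reason{namespace="default", pod="myapp", container="migrate", reason="Error"}'
        values: '1 1 1'
    alert_rule_test:
      - eval_time: 2m
        alertname: InitContainersFailed
        exp_alerts:
          - exp_labels:
              namespace: default
              pod: myapp
              container: migrate
              reason: Error
            exp_annotations:
              summary: 'migrate init failed'
              description: 'migrate has not completed init containers with the reason Error'
```

Run it with `promtool test rules init-container-alerts-test.yml`. Because the rule has no `for:` clause, the alert is expected to fire as soon as a matching series is evaluated.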

Happy monitoring!
