Blog Posts

Ingress-Nginx Monitoring with Prometheus and Grafana

Ingress-nginx integrates easily with Prometheus for monitoring. However, it can be challenging to get started with monitoring Ingress-nginx and creating dashboards and alerts. Therefore, I’ve created a monitoring mixin for Ingress-nginx that provides Prometheus alerts and Grafana dashboards focused on Ingress-nginx.

ArgoCD Monitoring with Prometheus and Grafana

By default, ArgoCD can notify you through triggers when an Application does not have the desired status. For example, when an application becomes OutOfSync or Unhealthy, a notification is sent to your configured notification service (e.g. Slack). This was my initial setup, but I found it flaky: a few seconds of networking issues between the server and the controller would send many Slack messages saying that the Application status was unknown, and an application becoming unhealthy would alert Slack instantly. To resolve this I wanted interval-based alerts, and as usual Prometheus was the solution. ArgoCD provides Prometheus metrics out of the box, and alongside the metrics there’s a Grafana dashboard for ArgoCD. The dashboard is good, but the project lacks open source alerting, and it has no monitoring mixin that makes dashboards and alerts easy to consume.
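To make the idea of interval-based alerting concrete, here is a minimal Python sketch of the behaviour, similar in spirit to the `for` clause of a Prometheus alerting rule: the alert fires only once the condition has held continuously for a set duration. The `HeldAlert` class, the 300-second hold, and the sample timestamps are hypothetical names and values chosen for illustration; they are not taken from ArgoCD or from the mixin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeldAlert:
    """Hypothetical helper: fires only after the condition has been
    continuously true for at least `hold_seconds`."""

    hold_seconds: float
    pending_since: Optional[float] = None

    def observe(self, unhealthy: bool, now: float) -> bool:
        if not unhealthy:
            # Any healthy sample resets the pending timer,
            # so short blips never reach the firing state.
            self.pending_since = None
            return False
        if self.pending_since is None:
            # First unhealthy sample: start the pending timer.
            self.pending_since = now
        return now - self.pending_since >= self.hold_seconds


# Illustrative samples as (timestamp in seconds, unhealthy?) pairs.
alert = HeldAlert(hold_seconds=300)
samples = [(0, True), (30, False), (60, True), (200, True), (360, True)]
fired = [alert.observe(unhealthy, now) for now, unhealthy in samples]
print(fired)  # [False, False, False, False, True]
```

The blip at 0 seconds is cleared by the healthy sample at 30 seconds, so nothing is sent; only the unhealthy state that starts at 60 seconds and is still present at 360 seconds produces an alert.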

Creating Awesome Alertmanager Templates for Slack

Prometheus, Grafana, and Alertmanager are the default stack when deploying a monitoring system. The Prometheus and Grafana bits are well documented, and there exist tons of open source approaches on how to make the best use of them. Alertmanager, on the other hand, is not highlighted as much, and although its use case may look fairly simple, it can get complex. The templating language has lots of features and capabilities. Alertmanager configuration, templates, and rules make a huge difference, especially when the team has an approach of ‘not staring at dashboards all day’. Detailed Slack alerts can carry tons of information, such as dashboard links, runbook links, and alert descriptions, which go well together with the rest of a ChatOps stack. This post goes through how to make efficient Slack alerts.

Quick, Pretty and Easy Maintenance Page using Cloudflare Workers & Terraform

Maintenance pages are a neat way to inform your users that there are operational changes that require downtime. Cloudflare Workers allows you to execute JavaScript and serve HTML close to your users, and when Cloudflare manages your DNS, it is easy to create a Worker and let it handle all traffic during a maintenance period. This makes a great solution for smaller teams and applications that do not want to invest time in creating and displaying maintenance pages. Terraform is a tool that enables provisioning infrastructure via code, and it integrates well with the Cloudflare APIs.

Shynet