Ingress-nginx provides an easy integration with Prometheus for monitoring. However, it can be challenging to get started with monitoring Ingress-nginx and creating dashboards and alerts. Therefore, I've created a monitoring mixin for Ingress-nginx which will provide Prometheus alerts and Grafana dashboards focusing on Ingress-nginx.
You can find the source code for the alerts and dashboards in github/ingress-nginx-mixin.
There are two dashboards available:
- Ingress-nginx Overview - An overview of Ingress-nginx request metrics, controller status, and SSL certificates.
- Ingress-nginx Request Handling Performance - A detailed view of request metrics, filterable by ingress.
There are also Prometheus alerts, stored on GitHub, that you can import. They fire on request failures and controller failures.
The dashboards and alerts are a work in progress. Feel free to share feedback in the ingress-nginx-mixin repository on what you would like to see or on any issues you experience.
If you want to go directly to the dashboards, you can use the links above. The rest of this blog post will guide you through enabling metrics and describe the various alerts and dashboards.
Enabling Ingress-nginx Metrics
Ingress-nginx provides Prometheus metrics out of the box, and you can enable them by setting the following values in your Helm chart values file.
controller:
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true
The above configuration will enable the metrics and also create a ServiceMonitor for Prometheus Operator to scrape the metrics. Adjust the way you scrape metrics according to your setup if you do not use the Prometheus Operator. The metrics are available on port 10254 and the path /metrics of the Ingress-nginx controller.
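If you do not run the Prometheus Operator, a plain Prometheus scrape job can target the controller pods directly. The following is a minimal sketch and is not taken from the mixin. The job name, the ingress-nginx namespace, and the app.kubernetes.io/name pod label are assumptions based on a default Helm installation, so adjust them to match your cluster.

```yaml
scrape_configs:
  - job_name: ingress-nginx            # hypothetical job name
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - ingress-nginx            # assumed namespace of the controller
    relabel_configs:
      # Keep only pods labelled as the Ingress-nginx controller (assumed label value).
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
        action: keep
        regex: ingress-nginx
      # Keep only the container port that serves metrics (10254).
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        action: keep
        regex: "10254"
```

With the pod role, Prometheus discovers one target per declared container port, so filtering on the port number leaves a single target per controller pod.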
Grafana Dashboards
There are two dashboards. They are split because a single dashboard would contain too many graphs, filters would apply to only a portion of the panels (not all metrics carry the filtered labels, which makes it unclear when a filter applies), and some expensive metrics would put high pressure on your Prometheus backend.
The upcoming sections will describe each dashboard.
Ingress-nginx Overview Dashboard
The Ingress-nginx overview dashboard focuses on providing an overview of the request metrics, controller status and SSL certificates. The following things are core for the dashboard:
- Controller - Provides a section that summarizes requests by controller and the controller configuration status.
- Ingress - Provides a section that displays ingress request volume, request success rates and request duration.
- Certificates - Provides a section that displays SSL certificate expiry date.
Ingress-nginx Request Handling Performance
The Ingress-nginx request handling performance dashboard focuses on providing detailed insight into request metrics. The following things are core for the dashboard:
- Ingress Response Times - Provides a section that displays graphs for total request time and upstream response time.
- Ingress Paths - Provides a section that breaks down request metrics by ingress path. However, metrics for each path are disabled by default for Ingress-nginx due to the high metric cardinality they cause. Therefore, you might only see a single path, which is /.
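To give a feel for the kind of query behind response-time panels, and for one way to reduce the load of expensive histogram queries, here is a sketch of a Prometheus recording rule. It is an illustration only and not the mixin's actual query. The group name, the rule name, and the choice of the 95th percentile are assumptions. It uses the nginx_ingress_controller_request_duration_seconds histogram exposed by the controller.

```yaml
groups:
  - name: ingress-nginx-example-recording   # hypothetical group name
    rules:
      # Pre-compute the p95 request duration per ingress over 5 minutes,
      # so dashboards can read one cheap series instead of all buckets.
      - record: ingress:nginx_ingress_controller_request_duration_seconds:p95
        expr: |
          histogram_quantile(
            0.95,
            sum by (le, ingress) (
              rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])
            )
          )
```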
Prometheus Alerts
Alerts are tricky to get right for a generic use case, but the ingress-nginx-mixin still provides them. They are configurable through the config.libsonnet package in the repository, and if you are familiar with Jsonnet, customizing the alerts should be fairly straightforward. The alerts can be found on GitHub, and I describe them below.
Adjust any of the alerts and add any new ones that you require. Open issues and share feedback in the GitHub repository!
Application Alerts
- Alert name: NginxConfigReloadFailed
Alerts when an Ingress-nginx configuration reload has failed.
- Alert name: NginxHighHttp4xxErrorRate
Alerts when the 4xx responses of an Ingress-nginx ingress exceed 5% of its total requests over the past 5 minutes.
- Alert name: NginxHighHttp5xxErrorRate
Alerts when the 5xx responses of an Ingress-nginx ingress exceed 5% of its total requests over the past 5 minutes.
Note: Remember to adjust the thresholds according to your setup, which can be done in the config.libsonnet file before generating the alerts. You can also mute alerts or lower their severity using the config.
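To make the descriptions above concrete, here is a hand-written sketch of what two such rules could look like in plain Prometheus rule syntax. It is not the output of the mixin, and the generated rules may differ. The group name, the for durations, the severities, and the annotation text are assumptions. The expressions use the nginx_ingress_controller_config_last_reload_successful and nginx_ingress_controller_requests metrics exposed by the controller.

```yaml
groups:
  - name: ingress-nginx-example-alerts       # hypothetical group name
    rules:
      - alert: NginxConfigReloadFailed
        # The gauge is 1 after a successful reload and 0 after a failed one.
        expr: nginx_ingress_controller_config_last_reload_successful == 0
        for: 5m                               # assumed duration
        labels:
          severity: warning                   # assumed severity
        annotations:
          summary: Ingress-nginx configuration reload failed.
      - alert: NginxHighHttp4xxErrorRate
        # Share of 4xx responses per ingress over 5 minutes, in percent.
        # Aggregated by ingress only; add a namespace label if your scrape
        # setup exposes one.
        expr: |
          (
            sum by (ingress) (rate(nginx_ingress_controller_requests{status=~"4.."}[5m]))
            /
            sum by (ingress) (rate(nginx_ingress_controller_requests[5m]))
          ) * 100 > 5
        for: 1m                               # assumed duration
        labels:
          severity: info                      # assumed severity
        annotations:
          summary: "Ingress {{ $labels.ingress }} has a high 4xx error rate."
```

A 5xx variant follows the same shape with the status matcher changed to "5..".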
Summary
The Ingress-nginx mixin provides Prometheus alerts and Grafana dashboards for monitoring Ingress-nginx. The overview dashboard summarizes request metrics, controller status and SSL certificates, while the request handling performance dashboard provides detailed insight into request metrics. The alerts are generic and can be adjusted to your setup. The mixin is a work in progress, and feedback is welcome in the ingress-nginx-mixin repository.