OpenCost is an open-source tool designed to help you monitor and understand the cost of your cloud infrastructure. As a project under the Cloud Native Computing Foundation (CNCF), OpenCost offers a transparent and powerful solution for cloud cost management. It provides both a user-friendly interface for visualizing cloud costs and Prometheus metrics, enabling you to query and visualize these costs using Grafana. The popular tool KubeCost is built on top of OpenCost, offering an enhanced feature set and user experience. However, KubeCost is not open-source, and its free plan has limitations on data retention and storage. Given these constraints and a preference for consolidating data visualization within Grafana, I opted to use OpenCost.
This blog post introduces the opencost-mixin - a set of Prometheus rules and Grafana dashboards for OpenCost. The dashboards provide both an overview of cluster cost and a breakdown of cost by namespace, node, pod and container. In addition to cost visualization, the opencost-mixin includes alerts for budget overruns and cost anomalies. For example, you can set budget alerts that fire when your costs approach predefined thresholds, or detect anomalies such as a sudden 20% increase in cluster expenses. Two dashboards are already published on Grafana:
- OpenCost Overview - An overview of the Kubernetes cluster cost with a breakdown by instance type, resource type (RAM/CPU/Persistent Volume) and namespace.
- OpenCost Namespace - Provides insight into namespace costs with a breakdown by pods/containers/persistent volumes for that namespace.
Prometheus alerts for the opencost-mixin are available on GitHub. They cover cost anomalies (e.g., sudden spikes) and budget alerts (e.g., exceeding thresholds), and can be imported into your setup for proactive cost monitoring.
If you want to go directly to the dashboards, you can use the links above. The rest of this blog post describes setting up OpenCost and the various alerts and dashboards.
Installing OpenCost
First, add the OpenCost Helm chart repository:
helm repo add opencost https://opencost.github.io/opencost-helm-chart
The following Helm values are set:
metrics:
  serviceMonitor:
    enabled: true
prometheus:
  internal:
    enabled: true
    namespaceName: monitoring
    port: 9090
    serviceName: prometheus-k8s
We use the Prometheus-operator and have a Prometheus instance running in the monitoring namespace. We enable the ServiceMonitor and set internal.enabled to true to let OpenCost know that we have an internal Prometheus instance running and that we do not need a Prometheus instance deployed with the OpenCost chart.
Install OpenCost with the following command:
helm install opencost opencost/opencost -f values.yaml
Now you should be able to go to your Prometheus instance and query cost metrics - for example node_total_hourly_cost, which provides the total hourly cost for a node.
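If you would rather check the metric from a script than from the Prometheus UI, the following is a minimal Python sketch that calls the instant-query endpoint of the Prometheus HTTP API (/api/v1/query) and sums the per-node values. The base URL is an assumption derived from the Helm values above (service prometheus-k8s in the monitoring namespace on port 9090), and the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: in-cluster address built from the Helm values above
# (service "prometheus-k8s", namespace "monitoring", port 9090).
PROMETHEUS_URL = "http://prometheus-k8s.monitoring.svc:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def total_from_response(payload: dict) -> float:
    """Sum the sample values of an instant-vector query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    # Each series carries "value": [<unix timestamp>, "<value as a string>"].
    return sum(float(series["value"][1]) for series in payload["data"]["result"])


def cluster_hourly_cost(base_url: str = PROMETHEUS_URL) -> float:
    """Fetch node_total_hourly_cost and return the summed hourly cost of all nodes."""
    url = build_query_url(base_url, "node_total_hourly_cost")
    with urlopen(url, timeout=10) as resp:
        return total_from_response(json.load(resp))
```

Calling cluster_hourly_cost() only works where that address resolves, for example from a pod inside the cluster. When testing from a workstation, pass the URL of a port-forwarded Prometheus instead.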
Grafana Dashboards
OpenCost Overview Dashboard
The OpenCost overview dashboard provides an overview of the costs of your Kubernetes cluster. It is built around the following sections:
- Cluster Summary - Summarizes the costs of the whole cluster. It shows pie chart panels that group the costs by resource/namespace/instance type. It also shows the overall cost variance and the cost variance for each resource - for example the increase/decrease in cost for CPU over time.
- Cloud Resources - Visualizes the instances deployed in the cloud and the persistent volumes, along with the costs associated with each.
- Namespace - Provides a breakdown of costs by namespace. There are also direct links to the namespace dashboard, which provides a more detailed view of each namespace.
OpenCost Namespace Dashboard
The OpenCost namespace dashboard provides a more detailed view of the costs for a specific namespace. The dashboard is split into the following sections:
- Filters - Lets you filter by namespace. The filter is applied to all the panels.
- Summary - An overview of the namespace cost - hourly, daily and monthly costs as well as costs grouped by resource (CPU/RAM/PV).
- Pod - A summary of the 10 most expensive pods, including their current costs and a comparison of cost changes over the last 7 and 30 days.
- Container - A summary of the 10 most expensive containers, including their current costs and a comparison of cost changes over the last 7 and 30 days.
- Persistent Volumes - An overview of which persistent volumes are deployed into the namespace and the cost of each persistent volume.
Alerts
The alerts can be customized using the config.libsonnet file available in the repository. If you're familiar with Jsonnet, modifying and tailoring the alerts to suit your specific requirements should be straightforward. You will need to adjust the alerts to fit your Kubernetes cluster. The alerts can be found on GitHub, and each one is described below.
- Alert name: OpenCostMonthlyBudgetExceeded
  Alerts when the projected monthly cost (current hourly cost multiplied by 730 hours) exceeds the budget threshold. The threshold is currently set to $200 as an example, so you need to configure it for your cluster.
- Alert name: OpenCostAnomalyDetected
  Alerts when the average hourly cost over the last 3 hours exceeds the 7-day average by more than 20%. The threshold of 20% can be adjusted.
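To make the arithmetic behind the two alerts concrete, here is a small Python sketch of the conditions as described above. It is not the mixin's implementation, since the real alerts are Prometheus rules generated from Jsonnet. The function names and sample numbers are invented for illustration. Only the 730-hour month, the $200 example budget and the 20% threshold come from the descriptions above.

```python
from statistics import mean
from typing import Sequence

HOURS_PER_MONTH = 730  # hours used to project an hourly cost onto a month


def projected_monthly_cost(hourly_cost: float) -> float:
    """Project the current hourly cost onto a 730-hour month."""
    return hourly_cost * HOURS_PER_MONTH


def budget_exceeded(hourly_cost: float, monthly_budget: float = 200.0) -> bool:
    """True when the projected monthly cost is above the budget threshold."""
    return projected_monthly_cost(hourly_cost) > monthly_budget


def cost_anomaly(
    recent_hourly: Sequence[float],
    baseline_hourly: Sequence[float],
    threshold: float = 0.20,
) -> bool:
    """True when the recent average exceeds the baseline average by more than `threshold`.

    `recent_hourly` stands in for the last 3 hours of hourly cost samples and
    `baseline_hourly` for the last 7 days (hypothetical inputs for this sketch).
    """
    return mean(recent_hourly) > mean(baseline_hourly) * (1 + threshold)
```

For example, an hourly cost of $0.25 projects to $182.50 a month and stays under the $200 example budget, while $0.50 an hour projects to $365 and would trigger the budget alert. Likewise, a 3-hour average of $0.40 against a 7-day average of $0.30 is an increase of about 33%, which is above the 20% threshold.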
Summary
OpenCost is a great tool for understanding your cloud costs, and the opencost-mixin provides a set of Prometheus rules and Grafana dashboards that can help you visualize and monitor them. The dashboards provide an overview of the cluster costs, as well as a breakdown by namespace, pod, container, and persistent volume. The alerts help you enforce budgets and detect cost anomalies. The dashboards and alerts are a work in progress, so feel free to share feedback in the opencost-mixin repository about what you would like to see or any issues you experience.