Apollo GraphQL and NestJS are gaining traction quickly, but the monitoring approaches are unclear. At the moment (late 2021 / early 2022) there are no default exporters or libraries for Prometheus metrics, and the same goes for Grafana dashboards. This blog post provides both. Note that Apollo Studio does provide metrics and many other features for your graphs. The downside is that you will most likely end up on a paid plan and be locked in to their offering. There is also no way of exporting those metrics to your own Prometheus instance to centralize alerting and dashboards.
This blog post is based on a NestJS implementation that uses dependency injection for the Prometheus metrics, but it should work similarly in other setups.
Creating your Prometheus metrics
We will use three OSS repositories to create our metrics:
- prom-client: The default NodeJS Prometheus library.
- @willsoto/nestjs-prometheus: NestJS Prometheus integration for injecting metrics.
- apollo-metrics: Prometheus counters/histograms for each stage of the Apollo GraphQL request lifecycle.
The sample metrics below are taken from apollo-metrics. Head into that repository, grab all of the metrics and define them in the same way.
import { makeCounterProvider } from '@willsoto/nestjs-prometheus';

// Each provider registers a prom-client Counter that NestJS can inject by name.
export const parsedCounter = makeCounterProvider({
  name: 'graphql_queries_parsed',
  help: 'The amount of GraphQL queries that have been parsed.',
  labelNames: ['operation_name', 'operation'],
});
export const validationStartedCounter = makeCounterProvider({
  name: 'graphql_queries_validation_started',
  help: 'The amount of GraphQL queries that have started validation.',
  labelNames: ['operation_name', 'operation'],
});
export const resolvedCounter = makeCounterProvider({
  name: 'graphql_queries_resolved',
  help: 'The amount of GraphQL queries that have had their operation resolved.',
  labelNames: ['operation_name', 'operation'],
});
export const executionStartedCounter = makeCounterProvider({
  name: 'graphql_queries_execution_started',
  help: 'The amount of GraphQL queries that have started executing.',
  labelNames: ['operation_name', 'operation'],
});
The metrics above are aligned with the Apollo GraphQL request lifecycle. Each request passes through a 9-step lifecycle, and each metric corresponds to one of those steps. Apollo has an in-depth guide to each step and what it means; you can find it here. The help text of each Prometheus metric should also provide enough information to understand the corresponding lifecycle step.
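As a rough orientation, the relationship between lifecycle hooks and counters can be written down as a lookup table. The sketch below only covers the six hooks and metric names that appear in this post, not the full lifecycle. The names `hookToMetric` and `metricForHook` are illustrative and not part of any library.

```typescript
// Apollo plugin hook name -> Prometheus counter name, as used in this post.
// `hookToMetric` and `metricForHook` are hypothetical names for illustration.
const hookToMetric = {
  parsingDidStart: 'graphql_queries_parsed',
  validationDidStart: 'graphql_queries_validation_started',
  didResolveOperation: 'graphql_queries_resolved',
  executionDidStart: 'graphql_queries_execution_started',
  didEncounterErrors: 'graphql_queries_errors',
  willSendResponse: 'graphql_queries_responded',
} as const;

type LifecycleHook = keyof typeof hookToMetric;

// Returns the counter name that is incremented when the given hook fires.
function metricForHook(hook: LifecycleHook): string {
  return hookToMetric[hook];
}

console.log(metricForHook('executionDidStart'));
```

Keeping the mapping in one place can make it easier to spot which lifecycle steps are not yet covered by a metric.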
NestJS Dependency Injection
Now we'll create a NestJS plugin called GraphQLPrometheusMetricsPlugin. It uses the metrics above in a class that implements the ApolloServerPlugin interface. This server plugin will later be registered with the Apollo server, where it increments the metrics for each request.
// Import paths match Apollo Server 3 and the NestJS packages at the time of writing; adjust them to your versions.
import { Injectable } from '@nestjs/common';
import { Plugin } from '@nestjs/graphql';
import { InjectMetric } from '@willsoto/nestjs-prometheus';
import { ApolloServerPlugin, GraphQLRequestListener } from 'apollo-server-plugin-base';
import { Counter } from 'prom-client';

// filterUndefined: helper that drops label values that are not set.
@Injectable()
@Plugin()
export class GraphQLPrometheusMetricsPlugin implements ApolloServerPlugin {
  constructor(
    @InjectMetric('graphql_queries_parsed')
    public parsedCounter: Counter<string>,
    @InjectMetric('graphql_queries_validation_started')
    public validationStartedCounter: Counter<string>,
    @InjectMetric('graphql_queries_resolved')
    public resolvedCounter: Counter<string>,
    @InjectMetric('graphql_queries_execution_started')
    public executionStartedCounter: Counter<string>,
    @InjectMetric('graphql_queries_errors')
    public errorsCounter: Counter<string>,
    @InjectMetric('graphql_queries_responded')
    public respondedCounter: Counter<string>,
  ) {}

  async requestDidStart(): Promise<GraphQLRequestListener<any>> {
    // Capture the counters locally: `this` inside the listener methods does not refer to the plugin instance.
    const parsedCounter = this.parsedCounter;
    const validationStartedCounter = this.validationStartedCounter;
    const resolvedCounter = this.resolvedCounter;
    const executionStartedCounter = this.executionStartedCounter;
    const errorsCounter = this.errorsCounter;
    const respondedCounter = this.respondedCounter;
    return {
      async parsingDidStart(parsingContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: parsingContext.request.operationName || '',
          operation: parsingContext.operation?.operation,
        });
        parsedCounter.inc(labels);
      },
      async validationDidStart(validationContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: validationContext.request.operationName || '',
          operation: validationContext.operation?.operation,
        });
        validationStartedCounter.inc(labels);
      },
      async didResolveOperation(resolveContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: resolveContext.request.operationName || '',
          operation: resolveContext.operation.operation,
        });
        resolvedCounter.inc(labels);
      },
      async executionDidStart(executingContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: executingContext.request.operationName || '',
          operation: executingContext.operation.operation,
        });
        executionStartedCounter.inc(labels);
      },
      async didEncounterErrors(errorContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: errorContext.request.operationName || '',
          operation: errorContext.operation?.operation,
        });
        errorsCounter.inc(labels);
      },
      async willSendResponse(responseContext): Promise<void> {
        const labels = filterUndefined({
          operation_name: responseContext.request.operationName || '',
          operation: responseContext.operation?.operation,
        });
        respondedCounter.inc(labels);
      },
    };
  }
}
As you can see above, each step in the request lifecycle increments the corresponding metric for that step. Each metric has two labels by default:
- operation_name: the name of the GraphQL operation. It is not required, but it is helpful for debugging and logging.
- operation: the GraphQL operation type, i.e. mutation, query or subscription.
Both labels are highly useful for the metrics. The operation is available by default, but the operation name needs to be set by the user on each request to your API. Apollo covers both very well in their documentation. The labels will also be used in the Grafana dashboards.
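The plugin code relies on a filterUndefined helper to build these labels. A minimal sketch of such a helper is shown below, assuming it only needs to drop label values that are undefined (for example, the operation type before the operation has been resolved). The implementation in apollo-metrics may differ in detail, and the sample operation name `GetUser` is made up for illustration.

```typescript
type Labels = Record<string, string>;

// Assumed behavior: keep every label whose value is defined, drop the rest.
function filterUndefined(from: Record<string, string | undefined>): Labels {
  const result: Labels = {};
  for (const key of Object.keys(from)) {
    const value = from[key];
    if (value !== undefined) {
      result[key] = value;
    }
  }
  return result;
}

// A named query: both labels are present.
const named = filterUndefined({ operation_name: 'GetUser', operation: 'query' });

// An anonymous request whose operation type is not known yet:
// operation_name falls back to '' and operation is dropped.
const unresolved = filterUndefined({ operation_name: '', operation: undefined });

console.log(named, unresolved);
```

Requests without an operation name end up with an empty operation_name label, which is one more reason to name every operation sent to the API.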
Now we can create our NestJS module using the metrics and the plugin:
import { Module } from '@nestjs/common';
import { PrometheusModule } from '@willsoto/nestjs-prometheus';
import {
  GraphQLPrometheusMetricsPlugin,
  validationStartedCounter,
  parsedCounter,
  resolvedCounter,
  executionStartedCounter,
  errorsCounter,
  respondedCounter,
} from './prometheus.plugin';

@Module({
  imports: [PrometheusModule.register()],
  providers: [
    GraphQLPrometheusMetricsPlugin,
    validationStartedCounter,
    parsedCounter,
    resolvedCounter,
    executionStartedCounter,
    errorsCounter,
    respondedCounter,
  ],
  exports: [GraphQLPrometheusMetricsPlugin],
})
export class PromModule {}
And lastly, add it to our application:
import { GraphQLPrometheusMetricsPlugin } from './metrics/prometheus.plugin';
...
@Module({
  imports: [
    ...
    PromModule,
    GraphQLGatewayModule.forRootAsync({
      imports: [
        ...
        PromModule,
      ],
      useFactory: async (
        ...
        graphQLPrometheusMetrics: GraphQLPrometheusMetricsPlugin,
      ) => {
        return {
          server: {
            ...
            plugins: [graphQLPrometheusMetrics],
          },
        };
      },
      // The plugin must be listed in inject so NestJS passes it to useFactory.
      inject: [..., GraphQLPrometheusMetricsPlugin],
    }),
  ],
})
...
Now we should have Prometheus metrics at the /metrics endpoint, and you should be able to scrape that endpoint with Prometheus. You can try querying the Prometheus instance with the query graphql_queries_execution_started, and you should see results such as:
graphql_queries_execution_started{container="gateway", endpoint="gateway-http", instance="redacted", job="gateway", namespace="redacted", operation="mutation", operation_name="redacted", pod="redacted", service="gateway"} 10231
Grafana Dashboard
By now we should have our metrics in our Prometheus instance, and we should be able to query them and create a dashboard. I've created a sample dashboard that covers requests and errors summed by the operation and the operation name.
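To give an idea of what such panels query, the sketch below builds PromQL expressions that sum the per-second rate of a counter by the operation and operation_name labels. The 5m rate window, the helper name `ratePerLabels` and the choice of graphql_queries_responded as the request counter are assumptions; the queries in the actual dashboard may differ.

```typescript
// Builds a PromQL expression: per-second rate of a counter, summed by the given labels.
// `ratePerLabels` is a hypothetical helper; the default 5m window is an assumption.
function ratePerLabels(metric: string, labels: string[], window: string = '5m'): string {
  return `sum by (${labels.join(', ')}) (rate(${metric}[${window}]))`;
}

const groupBy = ['operation', 'operation_name'];

// Requests and errors per second, grouped the same way as the dashboard describes.
const requestsQuery = ratePerLabels('graphql_queries_responded', groupBy);
const errorsQuery = ratePerLabels('graphql_queries_errors', groupBy);

console.log(requestsQuery);
console.log(errorsQuery);
```

The resulting strings can be pasted into a Grafana panel or the Prometheus expression browser to check that the labels arrive as expected.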
The dashboard can be found here.
The dashboard should cover the basics. Feel free to share your dashboard if you've created a better one!