With the retirement of ingress-nginx, many users are looking for alternative ingress controllers. Envoy Gateway looked like a promising option, so I decided to give it a try in my personal Kubernetes cluster. I'll describe my experience deploying Envoy Gateway and how I replicated my previous ingress-nginx setup. This blog post covers my personal cluster; the migration would likely be harder in a production environment with more complex requirements.
Deploying Envoy Gateway
I chose to deploy Envoy Gateway using Helm and Envoy Gateway's official Helm chart. The deployment was straightforward, and I was able to get Envoy Gateway up and running quickly. Here are the values I used for the Helm chart:
config:
  envoyGateway:
    extensionApis:
      enableBackend: true
    logging:
      level:
        default: info
deployment:
  envoyGateway:
    resources:
      requests:
        cpu: 20m
        memory: 50Mi
I enable the Backend extension API so I can use the Backend resource for routing traffic to my services. Later in the post, I also use HTTPRouteFilters to block specific paths.
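For reference, here's a minimal sketch of what a Backend resource can look like; the name and endpoint are hypothetical, and a route would reference it from its backendRefs much like a Service:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: external-api
  namespace: apps
spec:
  endpoints:
    # A backend reachable by DNS name rather than via a Kubernetes Service
    - fqdn:
        hostname: api.internal.example.com
        port: 8080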
Deploying Our Gateway
By default, Envoy Gateway creates a separate cloud LoadBalancer and Deployment for each Gateway resource, with individual HTTPRoute objects routing traffic to the appropriate backend services. In my previous ingress-nginx setup, I used a single LoadBalancer for all ingress traffic, mainly to keep costs down, so I wanted to replicate that behavior with Envoy Gateway.
Fortunately, Envoy Gateway supports this pattern out of the box. To share one underlying LoadBalancer and Deployment across multiple Gateway resources, you create a single shared Envoy proxy configuration with gateway merging enabled and have every Gateway reference it through the same GatewayClass. The first step is to create an EnvoyProxy resource:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: default
  namespace: gateway
spec:
  logging:
    level:
      default: info
  mergeGateways: true
  provider:
    kubernetes:
      envoyDeployment:
        patch:
          type: StrategicMerge
          value:
            spec:
              template:
                spec:
                  containers:
                    - name: envoy
                      resources:
                        requests:
                          cpu: 20m
                          memory: 100Mi
                    - name: shutdown-manager
                      resources:
                        requests:
                          cpu: 5m
                          memory: 10Mi
      envoyHpa:
        maxReplicas: 6
        metrics:
          - resource:
              name: cpu
              target:
                averageUtilization: 80
                type: Utilization
            type: Resource
        minReplicas: 3
      envoyPDB:
        maxUnavailable: 1
    type: Kubernetes
Note the mergeGateways: true field - this instructs the Envoy Gateway controller to merge multiple Gateway resources into a single Envoy deployment (and therefore a single LoadBalancer). Everything else in the spec is standard Kubernetes configuration: PDBs, HPAs, and resource requests/limits.
Next we'll need a GatewayClass resource:
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: default
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: default
    namespace: gateway
Finally, we can create multiple Gateway resources that reference the same GatewayClass, and therefore the same EnvoyProxy:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: hodovi-cc
  namespace: apps
spec:
  gatewayClassName: default
  listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      hostname: hodovi.cc
      name: hodovi-cc
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: hodovi-cc-tls
        mode: Terminate
    - allowedRoutes:
        namespaces:
          from: Same
      hostname: www.hodovi.cc
      name: www-hodovi-cc
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: hodovi-cc-tls
        mode: Terminate
Note the gatewayClassName: default, which refers to the GatewayClass we created earlier. The Gateway resource defines two listeners for hodovi.cc and www.hodovi.cc.
We can then add another Gateway resource for other domains, still sharing the same LoadBalancer and Envoy deployment:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: teleport
  namespace: auth
spec:
  gatewayClassName: default
  listeners:
    - allowedRoutes:
        namespaces:
          from: All
      hostname: '*.teleport.honeylogic.io'
      name: wildcard-teleport-honeylogic-io
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: teleport-cluster-tls
        mode: Terminate
    - allowedRoutes:
        namespaces:
          from: All
      hostname: '*.findwork.dev'
      name: wildcard-findwork-dev
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: teleport-cluster-tls
        mode: Terminate
    - allowedRoutes:
        namespaces:
          from: All
      hostname: '*.honeylogic.io'
      name: wildcard-honeylogic-io
      port: 443
      protocol: HTTPS
      tls:
        certificateRefs:
          - group: ""
            kind: Secret
            name: teleport-cluster-tls
        mode: Terminate
The above Gateway resource defines multiple listeners for wildcard domains, all sharing the same LoadBalancer and Envoy deployment.
Lastly, we'll need to create HTTPRoute resources to route traffic to our services. Here's an example:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hodovi-cc
  namespace: apps
spec:
  hostnames:
    - hodovi.cc
    - www.hodovi.cc
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: hodovi-cc
      namespace: apps
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: hodovi-cc
          port: 80
          weight: 1
      filters: []
      matches:
        - path:
            type: PathPrefix
            value: /
Cert-manager Integration
cert-manager supports Gateway API resources out of the box (depending on the version, Gateway API support may need to be enabled explicitly). However, to solve HTTP-01 challenges, we need to create a dedicated Gateway that cert-manager can use for the challenge endpoint. Here's an example:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: letsencrypt
  namespace: gateway
spec:
  gatewayClassName: default
  listeners:
    - allowedRoutes:
        namespaces:
          from: All
      name: http
      port: 80
      protocol: HTTP
Now we can annotate our Gateway resources so that cert-manager provisions certificates for the hostnames they define:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-issuer
The cert-manager.io/cluster-issuer annotation tells cert-manager to use the letsencrypt-issuer ClusterIssuer to obtain certificates for the hostnames defined in the Gateway's listeners. The ClusterIssuer has to already exist in the cluster; migrating from Ingress to Gateway doesn't change the cert-manager configuration itself.
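For completeness, here's a sketch of what such a ClusterIssuer could look like when its HTTP-01 solver routes challenges through the letsencrypt Gateway created above; the email address and account-key secret name are placeholders, and your existing issuer may differ:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-issuer
spec:
  acme:
    email: you@example.com  # placeholder
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-issuer-account-key  # placeholder
    solvers:
      - http01:
          # Solve HTTP-01 challenges by attaching challenge routes
          # to the dedicated letsencrypt Gateway defined earlier
          gatewayHTTPRoute:
            parentRefs:
              - kind: Gateway
                name: letsencrypt
                namespace: gateway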
Metrics
Envoy Gateway exposes metrics in Prometheus format out of the box. If you're using the prometheus-operator, you can scrape them by creating a ServiceMonitor resource. Here's an example:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: envoy-gateway
  namespace: gateway
spec:
  endpoints:
    - interval: 30s
      port: metrics
  namespaceSelector:
    matchNames:
      - gateway
  selector:
    matchLabels:
      app.kubernetes.io/instance: envoy-gateway
      app.kubernetes.io/name: gateway-helm
      control-plane: envoy-gateway
Now you should have Envoy Gateway metrics being scraped by Prometheus. Next, we'll want to scrape the Envoy proxy metrics as well. The Service that Envoy Gateway creates for the proxies doesn't expose the metrics port, so we'll create one ourselves:
apiVersion: v1
kind: Service
metadata:
  name: envoy-default-metrics
  namespace: gateway
  labels:
    # Label used by the ServiceMonitor below to select this Service
    app.kubernetes.io/name: envoy-metrics
spec:
  clusterIP: 10.55.1.35
  clusterIPs:
    - 10.55.1.35
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: metrics
      port: 19001
      protocol: TCP
      targetPort: metrics
    - name: admin
      port: 19000
      protocol: TCP
      targetPort: admin
    - name: xds
      port: 18000
      protocol: TCP
      targetPort: xds
  selector:
    app.kubernetes.io/component: proxy
    app.kubernetes.io/managed-by: envoy-gateway
    app.kubernetes.io/name: envoy
  sessionAffinity: None
  type: ClusterIP
The selector section matches the Envoy proxy pods created by the Envoy Gateway controller, and the app.kubernetes.io/name: envoy-metrics label is what the next ServiceMonitor selects on. Now we can create another ServiceMonitor to scrape the Envoy proxy metrics:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: envoy-default
  namespace: gateway
spec:
  endpoints:
    - interval: 15s
      path: /stats/prometheus
      port: metrics
  namespaceSelector:
    matchNames:
      - gateway
  selector:
    matchLabels:
      app.kubernetes.io/name: envoy-metrics
At this point, both Envoy Gateway and the Envoy proxy should be successfully scraped by Prometheus. To simplify observability, I've created a monitoring-mixin for Envoy and Envoy Gateway, which provides pre-configured Grafana dashboards and Prometheus alerts out of the box.
If you want a deeper walkthrough of the setup, I’ve also written a detailed guide: Monitoring Envoy and Envoy Gateway with Prometheus and Grafana.
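As an example of what you can build on top of these metrics, here's a hypothetical PrometheusRule alerting on a high 5xx rate at the Envoy proxies; the threshold is arbitrary and the actual mixin's rules may differ:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: envoy-proxy-alerts
  namespace: gateway
spec:
  groups:
    - name: envoy-proxy
      rules:
        - alert: EnvoyHighHttp5xxRate
          # Ratio of 5xx responses to all downstream requests over 5 minutes
          expr: |
            sum(rate(envoy_http_downstream_rq_xx{envoy_response_code_class="5"}[5m]))
              /
            sum(rate(envoy_http_downstream_rq_total[5m])) > 0.05
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: More than 5% of requests through Envoy are returning 5xx responses.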
Tracing
I used tracing in my ingress-nginx setup, and Envoy Gateway also supports distributed tracing out of the box. You can configure it by adding the following to the EnvoyProxy resource:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: default
  namespace: gateway
spec:
  logging:
    level:
      default: info
  mergeGateways: true
  provider:
    kubernetes:
      envoyDeployment:
        patch:
          type: StrategicMerge
          value:
            spec:
              template:
                spec:
                  containers:
                    - name: envoy
                      resources:
                        requests:
                          cpu: 20m
                          memory: 100Mi
                    - name: shutdown-manager
                      resources:
                        requests:
                          cpu: 5m
                          memory: 10Mi
      envoyHpa:
        maxReplicas: 6
        metrics:
          - resource:
              name: cpu
              target:
                averageUtilization: 80
                type: Utilization
            type: Resource
        minReplicas: 3
      envoyPDB:
        maxUnavailable: 1
    type: Kubernetes
  # Add the following tracing configuration
  telemetry:
    tracing:
      provider:
        backendRefs:
          - group: ""
            kind: Service
            name: alloy
            namespace: monitoring
            port: 4317
        port: 4317
        type: OpenTelemetry
      samplingRate: 1
The backendRefs section specifies the tracing backend. In this example, I'm using Grafana Alloy (forwarding traces to Tempo), deployed as a Kubernetes Service in the monitoring namespace. The samplingRate is set to 1, meaning 1% of requests will be sampled.
Blocking Specific Paths
I successfully replaced most of the ingress annotations I was using with better application logic and Envoy's saner defaults. However, one use case remained: blocking public access to the Prometheus metrics endpoint and other endpoints such as /admin.
Envoy Gateway allows you to use HTTPRouteFilters to block specific paths. Here's an example of a filter that returns a 403 Forbidden response, which we'll use to block the Prometheus metrics endpoint:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: HTTPRouteFilter
metadata:
  name: hodovi-cc-block-prometheus-metrics
  namespace: apps
spec:
  directResponse:
    body:
      inline: Forbidden
      type: Inline
    contentType: text/plain
    statusCode: 403
Then we'll reference this filter in the HTTPRoute resource:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hodovi-cc
  namespace: apps
spec:
  hostnames:
    - hodovi.cc
    - www.hodovi.cc
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: hodovi-cc
      namespace: apps
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: hodovi-cc
          port: 80
          weight: 1
      filters: []
      matches:
        - path:
            type: PathPrefix
            value: /
    # Block Prometheus metrics endpoint
    - backendRefs:
        - group: ""
          kind: Service
          name: hodovi-cc
          port: 80
          weight: 1
      filters:
        - extensionRef:
            group: gateway.envoyproxy.io
            kind: HTTPRouteFilter
            name: hodovi-cc-block-prometheus-metrics
          type: ExtensionRef
      matches:
        - path:
            type: PathPrefix
            value: /prometheus/metrics
Conclusion
Overall, my experience replacing ingress-nginx with Envoy Gateway has been very positive. The migration was straightforward, and I was able to reproduce my previous setup using a single LoadBalancer shared across multiple Gateway resources. Integration with cert-manager for TLS was seamless, and having metrics and tracing available out of the box was a great improvement.
The only part that required extra effort was monitoring. With ingress-nginx, I relied on the ingress-nginx-mixin for ready-made dashboards and alerts. Envoy didn’t have an equivalent at the time, so I created my own mixin and worked through which metrics and alerts made the most sense. It took some work, but the end result has been worth it.
If you're considering an alternative to ingress-nginx, I highly recommend giving Envoy Gateway a try. And if you want a head start on observability, you can try the mixin here: adinhodovic/envoy-mixin.