Blog Posts


Simple Django User Session Clearing using Celery

Django provides session support out of the box and stores sessions in the django_session database table. However, Django leaves it up to project maintainers to purge sessions, which means that if it isn't done on a regular basis the table grows indefinitely. Luckily, Django ships a simple management command, clearsessions, for exactly this. Running it from a Celery task on a cron schedule resolves any long-term storage issues with a large session table. The solution below requires both Celery for the task and celery-beat for the cron schedule.
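A minimal sketch of what that can look like, assuming a standard Django/Celery project where Celery settings use the CELERY_ prefix; the module and task names are placeholders:

```python
# tasks.py - wrap Django's built-in clearsessions command in a Celery task
from celery import shared_task
from django.core.management import call_command


@shared_task
def clear_expired_sessions():
    # Purges expired rows from the django_session table
    call_command("clearsessions")


# settings.py - run the task nightly via celery-beat
# (assumes Celery reads settings through the CELERY_ namespace)
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "clear-expired-sessions": {
        "task": "myproject.tasks.clear_expired_sessions",
        "schedule": crontab(hour=3, minute=0),  # every day at 03:00
    },
}
```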

Best Practices for a Performant Django Admin

The admin interface that comes with Django is one of the great things about Django. It comes with a ton of features out of the box, and many open source packages extend the base functionality even further. It is well documented and works very well; the only pain point I've found is with large tables containing millions of objects. At that scale, searching, sorting, ordering, counting and other features make the admin load slowly and put pressure on the database. That's when the admin needs optimizing, and there are many small changes that will speed up your admin load times and reduce the additional database load. This blog post describes approaches to the common performance problems I've experienced with a large database.
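As a taste of the kind of changes the post covers, here is a hedged sketch of a ModelAdmin tuned for a big table; the Order model and its fields are placeholders:

```python
# admin.py - common ModelAdmin tweaks for tables with millions of rows
from django.contrib import admin

from .models import Order


@admin.register(Order)
class OrderAdmin(admin.ModelAdmin):
    list_select_related = ("customer",)  # avoid N+1 queries on the changelist
    show_full_result_count = False       # skip the second, expensive COUNT(*) on searches
    list_per_page = 50                   # smaller pages mean cheaper queries
    raw_id_fields = ("customer",)        # don't render a dropdown with millions of options
    ordering = ("-id",)                  # order by an indexed column
```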

Celery Monitoring with Prometheus and Grafana

Celery is a Python project used for asynchronous job processing and task scheduling in web applications and distributed systems. It is very commonly used together with Django: Celery as the asynchronous job processor and Django as the web framework. Celery has great documentation on how to use it, deploy it and integrate it with Django. Monitoring, however, is covered far less, and that is what this blog post aims to address. There is a great Prometheus exporter for Celery that ships with dashboards and alerts.
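A minimal Prometheus scrape config for such an exporter might look like this; the target hostname and port are assumptions and depend on how you deploy the exporter:

```yaml
# prometheus.yml - scrape the Celery exporter, which turns Celery task
# events into Prometheus metrics
scrape_configs:
  - job_name: celery
    static_configs:
      - targets: ["celery-exporter:9808"]  # assumed default port of the exporter
```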

Core Continuous Integration (CI) Steps for Python and Django Applications

Recently I worked on a Django project where I built a Django API using DRF, but I was also responsible for the infrastructure setup. Setting up a baseline Django project was easy thanks to the Django cookiecutter, but the path to a CI/CD setup was not as straightforward. Over time, I've found a large ecosystem of tools that format, lint and test Django/Python projects very well. Getting a good CI suite with GitHub Actions was easy thanks to that ecosystem, but there was no blog post or guide covering all the various tools available. So I thought I'd share all the actions we're using to keep the code standard as high as possible with these tools.
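To give an idea of the shape of such a pipeline, here is a hedged sketch of a GitHub Actions workflow with formatting, linting and testing steps; the specific tools (black, ruff, pytest) are illustrative rather than the post's exact list:

```yaml
# .github/workflows/ci.yml - one job that formats, lints and tests the project
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt black ruff pytest
      - run: black --check .  # formatting
      - run: ruff check .     # linting
      - run: pytest           # tests
```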

Django Monitoring with Prometheus and Grafana

The Prometheus package for Django provides a great Prometheus integration, but the open source dashboards and alerts that exist for it are not that great. The go-to Grafana dashboard does not use a large portion of the metrics provided by the Django-Prometheus package, and there are no filters for views, methods, jobs and namespaces. This blog post introduces the Django-mixin - a set of Prometheus rules and Grafana dashboards for Django. The dashboard and alerts provide insight into applied/unapplied migrations, RED metrics (requests per second, error percentage of requests, latency per request), database ops and cache hit rate.
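For context, wiring up the django-prometheus package itself is only a few lines; a minimal sketch of the settings and URL changes:

```python
# settings.py - enable the django-prometheus app and its middleware
INSTALLED_APPS = [
    # ...
    "django_prometheus",
]

MIDDLEWARE = [
    "django_prometheus.middleware.PrometheusBeforeMiddleware",
    # ... the rest of your middleware ...
    "django_prometheus.middleware.PrometheusAfterMiddleware",
]

# urls.py - expose the /metrics endpoint for Prometheus to scrape
from django.urls import include, path

urlpatterns = [
    # ...
    path("", include("django_prometheus.urls")),
]
```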

Django Error Tracking and Performance Monitoring with Sentry

As good a framework as Django is, its default behaviour of sending an email when an error occurs leaves much to be desired. The database or cache acting flaky? Expect thousands of emails describing that error, and they usually don't provide enough detail to understand the issue. Luckily, Sentry, an error tracking and monitoring platform, provides an out-of-the-box integration for Django and Celery.
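Getting started is a short snippet in settings; a minimal sketch, with a placeholder DSN:

```python
# settings.py - initialize the Sentry SDK with the Django and Celery integrations
import sentry_sdk
from sentry_sdk.integrations.celery import CeleryIntegration
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    integrations=[DjangoIntegration(), CeleryIntegration()],
    traces_sample_rate=0.1,  # sample a fraction of transactions for performance monitoring
)
```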

RabbitMQ Per Queue Monitoring

RabbitMQ has a native built-in Prometheus plugin, but granular metrics are disabled by default. Granular metrics means per-queue/per-vhost metrics - detailed metrics that provide message lag and consumer info on a queue and vhost basis. You could enable granular per-object metrics, but this is not recommended: the plugin becomes much slower on a large cluster, and the label cardinality in your time series database can grow high.

To solve this you can use the unofficial open source RabbitMQ exporter written by kbudde, which lets you enable granular metrics and also disable the metrics that the native Prometheus plugin already provides. This leads to a mixed approach: use the unofficial exporter for the detailed per-queue metrics only, and the native RabbitMQ Prometheus plugin for everything else.
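A sketch of the resulting Prometheus scrape config for the mixed approach; the ports are the exporters' usual defaults (15692 for the native plugin, 9419 for kbudde's exporter) and the hostnames are placeholders:

```yaml
# prometheus.yml - aggregated metrics from the native plugin,
# per-queue/vhost metrics from the unofficial exporter
scrape_configs:
  - job_name: rabbitmq
    static_configs:
      - targets: ["rabbitmq:15692"]
  - job_name: rabbitmq-per-queue
    static_configs:
      - targets: ["rabbitmq-exporter:9419"]
```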

May 07, 2022 4 minutes

IRSA and Workload Identity with Terraform

The go-to practice when pods require permissions to access cloud services on Kubernetes is to use service accounts. The clouds offering managed Kubernetes solutions have different implementations of the same concept: EKS has IRSA and GKE has Workload Identity. The service accounts that your containers use are granted the permissions to impersonate cloud IAM roles (AWS) or service accounts (GCP) so that they can access cloud resources. There are alternatives, such as AWS instance roles, but they are not fine-grained enough for containerized workloads: every container gets access to whatever the node is allowed to access. It might feel a bit more complex if you're coming from a non-Kubernetes background, but preexisting Terraform modules simplify creating the resources that allow Kubernetes service accounts to impersonate and access cloud resources.
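For illustration, here is a hedged sketch of the IRSA side in plain Terraform resources rather than the preexisting modules mentioned above; the OIDC provider, namespace and service account name are placeholders:

```hcl
# IAM role that a Kubernetes service account can assume via the cluster's OIDC provider
data "aws_iam_policy_document" "assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.eks.arn]
    }

    # Only the "my-app" service account in the "default" namespace may assume the role
    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:default:my-app"]
    }
  }
}

resource "aws_iam_role" "my_app" {
  name               = "my-app-irsa"
  assume_role_policy = data.aws_iam_policy_document.assume.json
}
```

The Kubernetes service account then points at the role through the eks.amazonaws.com/role-arn annotation.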

May 06, 2022 7 minutes

Private EKS API Endpoint behind OpenVPN

AWS offers a managed Kubernetes solution called Elastic Kubernetes Service (EKS). When an EKS cluster is spun up, the Kubernetes API is publicly accessible by default. Your company might not approve of this for security reasons and may want to limit Kubernetes API access to private networks only. In that case you can bring up a service such as OpenVPN and route private traffic through it, which lets you reach the Kubernetes API through a private endpoint over the VPN. In this blog post we'll use Terraform to provision the infrastructure required for a private EKS cluster, and we'll use OpenVPN Access Server as our VPN solution.
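The key part on the Terraform side is a couple of flags on the cluster's vpc_config; a minimal sketch with placeholder names:

```hcl
# EKS cluster whose API endpoint is reachable only from inside the VPC
resource "aws_eks_cluster" "this" {
  name     = "private-cluster"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids              = var.private_subnet_ids
    endpoint_public_access  = false # no public Kubernetes API endpoint
    endpoint_private_access = true  # API reachable from the VPC, e.g. via the VPN
  }
}
```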

May 02, 2022 7 minutes

CI/CD for Apollo GraphQL Managed Federation

GraphQL federation is great when you want a single API/gateway for all your queries. The simple go-to approach is schema stitching, where you run a gateway microservice that targets all the other microservices and composes a graph. This works fine initially, but over time you'll want schema checking, auto-polling for graph updates, seamless rollouts (no schema stitching issues when rolling out) and, overall, a process that is well integrated into your continuous integration and continuous delivery pipeline. Basic schema stitching does not provide this; managed federation through Apollo Studio improves the workflow and solves many of the pain points.
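In CI that typically boils down to a schema check on pull requests and a publish on merge with Apollo's Rover CLI; a hedged sketch of the GitHub Actions steps, with placeholder graph ref, subgraph name and routing URL:

```yaml
# Assumes the Rover CLI is installed and APOLLO_KEY is stored as a repository secret
- name: Check subgraph schema against the federated graph
  run: |
    rover subgraph check my-graph@production \
      --name products \
      --schema ./products/schema.graphql
  env:
    APOLLO_KEY: ${{ secrets.APOLLO_KEY }}

- name: Publish subgraph schema on merge
  if: github.ref == 'refs/heads/main'
  run: |
    rover subgraph publish my-graph@production \
      --name products \
      --schema ./products/schema.graphql \
      --routing-url https://products.internal.example.com/graphql
  env:
    APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
```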

Shynet