Migrating Kubernetes PersistentVolumes across Regions and AZs on AWS

Published on October 15, 2020

Persistent volumes in AWS are tied to a single Availability Zone (AZ), so if you create a cluster in an AZ other than the one the volume lives in, you will not be able to use it. You will need to migrate the volume to one of the zones your cluster runs in. Similarly, if a Kubernetes cluster moves across AWS regions, you will need to create a snapshot and copy it to the new region before creating a volume there.

Moving a volume across zones/regions to another Kubernetes cluster requires the following steps:

  • Creating a snapshot of the volume you want to migrate
  • Creating a volume from the snapshot in the zone and region where your new Kubernetes cluster runs
  • Creating a PersistentVolume resource referencing the new volume
  • Creating a PVC that binds to the PersistentVolume

Creating a snapshot and a new volume

We’ll start off by exporting the current PersistentVolume. The exported resource contains all the information we need to create a new volume:

kubectl get pv pvc-my-pv -o yaml > old-pv.yaml
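
If you only know the name of the claim, you can first look up which PersistentVolume it is bound to (my-claim below is a hypothetical PVC name used for illustration):

kubectl get pvc my-claim -o jsonpath='{.spec.volumeName}'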

The old-pv.yaml file should have the following content:

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: aws-ebs-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
  creationTimestamp: "2020-06-08T18:52:37Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: us-west-2
    failure-domain.beta.kubernetes.io/zone: us-west-2a
  name: <pvc-name>
  resourceVersion: "143717702"
  selfLink: /api/v1/persistentvolumes/<pvc-name>
  uid: <uid>
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://us-west-2a/vol-0l15z9d4e044f4ad0
  ...
  ...

We’ll use the aws CLI to create the snapshot and the new volume, passing the spec.awsElasticBlockStore.volumeID from above when creating the snapshot:

aws ec2 create-snapshot --description "Migrating Kubernetes PV" --volume-id "vol-0l15z9d4e044f4ad0"

You should receive a response specifying the SnapshotId and the State:

{
    "Description": "Migrating Kubernetes PV",
    "Encrypted": false,
    "OwnerId": "952221748506",
    "Progress": "",
    "SnapshotId": "snap-05c946a2456775d6e",
    "StartTime": "2020-10-12T18:55:59.000Z",
    "State": "pending",
    "VolumeId": "vol-0l15z9d4e044f4ad0",
    "VolumeSize": 1,
    "Tags": []
}

Check that the snapshot is in a completed state:

aws ec2 describe-snapshots --snapshot-ids "snap-05c946a2456775d6e"
{
    "Snapshots": [
        {
            "Description": "Migrating Kubernetes PV",
            "Encrypted": false,
            "OwnerId": "952221748506",
            "Progress": "100%",
            "SnapshotId": "snap-05c946a2456775d6e",
            "StartTime": "2020-10-12T18:55:59.404Z",
            "State": "completed",
            "VolumeId": "vol-0l15z9d4e044f4ad0",
            "VolumeSize": 1
        }
    ]
}
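
Instead of polling describe-snapshots yourself, you can also rely on the CLI’s built-in waiter, which blocks until the snapshot reaches the completed state (or times out):

aws ec2 wait snapshot-completed --snapshot-ids "snap-05c946a2456775d6e"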

If you need to copy the snapshot across regions, use the copy-snapshot CLI command:

aws ec2 copy-snapshot \
    --region us-east-1 \
    --source-region us-west-2 \
    --source-snapshot-id snap-05c946a2456775d6e \
    --description "This is my copied snapshot."
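
Note that the copy receives a new SnapshotId in the destination region; copy-snapshot returns it, and you can check on its progress there before creating a volume from it (the ID below is a placeholder):

aws ec2 describe-snapshots --region us-east-1 --snapshot-ids "<copied-snapshot-id>"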

Now we’ll create a volume from the snapshot in the availability zone you prefer:

aws ec2 create-volume \
    --snapshot-id snap-05c946a2456775d6e \
    --availability-zone us-west-2b

This should return the following response:

{
    "AvailabilityZone": "us-west-2b",
    "CreateTime": "2020-10-12T19:08:57.000Z",
    "Encrypted": false,
    "Size": 1,
    "SnapshotId": "snap-05c946a2456775d6e",
    "State": "creating",
    "VolumeId": "vol-0a3c1440cfee9b131",
    "Iops": 100,
    "Tags": [],
    "VolumeType": "gp2",
    "MultiAttachEnabled": false
}
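
If you are scripting this, the CLI’s waiter can block until the volume becomes available (the volume ID is the one returned above):

aws ec2 wait volume-available --volume-ids "vol-0a3c1440cfee9b131"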

Now your volume should be available in the new region or availability zone!

Creating a PersistentVolume and a PersistentVolumeClaim

Now we’ll create the PersistentVolume using the old-pv.yaml file we exported earlier. However, we’ll need to remove many of the fields that are generated on creation. The following fields should not be removed:

  • metadata.labels - keep, but change failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region to the new zone and region.
  • spec.nodeAffinity - keep, but adjust the zone and region keys the same way as the labels.
  • metadata.name - dynamically provisioned PersistentVolumes are named after their PVC, but since you are creating this PersistentVolume manually you are free to choose a different name.
  • spec.storageClassName
  • spec.capacity
  • spec.awsElasticBlockStore - adjust the spec.awsElasticBlockStore.volumeID field to the new zone, region and volume ID, otherwise a brand-new volume will be provisioned instead of the migrated one being used.
  • spec.accessModes
  • spec.persistentVolumeReclaimPolicy

The Kubernetes YAML should look similar to the snippet below:

Note: Create the PersistentVolume before the StatefulSet/PersistentVolumeClaim, otherwise a completely new volume will be provisioned.

apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    failure-domain.beta.kubernetes.io/region: us-west-2 # NEW REGION
    failure-domain.beta.kubernetes.io/zone: us-west-2b # NEW ZONE
  name: <my-volume-name>
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://<ZONE>/vol-<ID>
  capacity:
    storage: 150Gi
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - us-west-2b # NEW ZONE
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - us-west-2 # NEW REGION
  persistentVolumeReclaimPolicy: Retain
  storageClassName: <storageClassName>
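
Apply the manifest and check that the new PersistentVolume shows up as Available (new-pv.yaml is just an example filename):

kubectl apply -f new-pv.yaml
kubectl get pv <my-volume-name>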

Now you can create a PersistentVolumeClaim, or a StatefulSet with a volumeClaimTemplate, using the capacity and storageClassName above, and it should bind to the PersistentVolume, as in the sketch below. Just remember that you will need nodes running in the zone and region where the new volume was created. Hopefully this short guide made your volume migration simple and straightforward!
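
For reference, a minimal PersistentVolumeClaim matching the PersistentVolume above could look like this sketch; the claim name is hypothetical, and the storage request and storageClassName must match the values in your PersistentVolume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: <storageClassName> # must match the PersistentVolume
  resources:
    requests:
      storage: 150Gi # must match the PersistentVolume's capacity
  volumeName: <my-volume-name> # optional: pin the claim to the migrated volume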
