Migrating Kubernetes PersistentVolumes across Regions and AZs on AWS

Published on October 15, 2020

Persistent volumes in AWS are tied to a single Availability Zone (AZ), so if you create a cluster in an AZ other than the one the volume lives in, you will not be able to use it. You will need to migrate the volume to one of the zones your cluster runs in. Similarly, if a Kubernetes cluster moves across AWS regions, you will need to create a snapshot and copy it to the new region before creating a volume there.

Moving a volume across zones/regions to another Kubernetes cluster requires the following steps:

  • Creating a snapshot of the volume you want to migrate
  • Creating a volume from the snapshot in the zone and region where your new Kubernetes cluster runs
  • Creating a PersistentVolume resource referencing the new volume
  • Creating a PVC that binds to the PersistentVolume

Creating a snapshot and a new volume

We’ll start off by exporting the current PersistentVolume. The exported resource contains all the information we need to create a new volume:

kubectl get pv pvc-my-pv -o yaml > old-pv.yaml
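
If you only know the name of the claim, you can first look up which PersistentVolume it is bound to (my-claim below is a hypothetical PVC name used for illustration):

kubectl get pvc my-claim -o jsonpath='{.spec.volumeName}'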

The old-pv.yaml file should have the following content:

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: aws-ebs-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
  creationTimestamp: "2020-06-08T18:52:37Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: us-west-2
    failure-domain.beta.kubernetes.io/zone: us-west-2a
  name: <pvc-name>
  resourceVersion: "143717702"
  selfLink: /api/v1/persistentvolumes/<pvc-name>
  uid: <uid>
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://us-west-2a/vol-0l15z9d4e044f4ad0
  ...
  ...

We’ll use the aws CLI to create the snapshot and the new volume, passing the spec.awsElasticBlockStore.volumeID from above when creating the snapshot:

aws ec2 create-snapshot --description "Migrating Kubernetes PV" --volume-id "vol-0l15z9d4e044f4ad0"

You should receive a response specifying the SnapshotId and the State:

{
    "Description": "Migrating Kubernetes PV",
    "Encrypted": false,
    "OwnerId": "952221748506",
    "Progress": "",
    "SnapshotId": "snap-05c946a2456775d6e",
    "StartTime": "2020-10-12T18:55:59.000Z",
    "State": "pending",
    "VolumeId": "vol-0l15z9d4e044f4ad0",
    "VolumeSize": 1,
    "Tags": []
}

Check that the snapshot is in a completed state:

aws ec2 describe-snapshots --snapshot-ids "snap-05c946a2456775d6e"
{
    "Snapshots": [
        {
            "Description": "Migrating Kubernetes PV",
            "Encrypted": false,
            "OwnerId": "952221748506",
            "Progress": "100%",
            "SnapshotId": "snap-05c946a2456775d6e",
            "StartTime": "2020-10-12T18:55:59.404Z",
            "State": "completed",
            "VolumeId": "vol-0l15z9d4e044f4ad0",
            "VolumeSize": 1
        }
    ]
}
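
Instead of polling describe-snapshots yourself, you can also rely on the CLI’s built-in waiter, which blocks until the snapshot reaches the completed state (or times out):

aws ec2 wait snapshot-completed --snapshot-ids "snap-05c946a2456775d6e"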

If you need to copy the snapshot across regions, use the copy-snapshot CLI command:

aws ec2 copy-snapshot \
    --region us-east-1 \
    --source-region us-west-2 \
    --source-snapshot-id snap-05c946a2456775d6e \
    --description "This is my copied snapshot."
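
Note that the copy receives a new SnapshotId in the destination region; copy-snapshot returns it, and you can check on its progress there before creating a volume from it (the ID below is a placeholder):

aws ec2 describe-snapshots --region us-east-1 --snapshot-ids "<copied-snapshot-id>"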

Now we’ll create a volume from the snapshot in the availability zone you prefer:

aws ec2 create-volume \
    --snapshot-id snap-05c946a2456775d6e \
    --availability-zone us-west-2b

This should return the following response:

{
    "AvailabilityZone": "us-west-2b",
    "CreateTime": "2020-10-12T19:08:57.000Z",
    "Encrypted": false,
    "Size": 1,
    "SnapshotId": "snap-05c946a2456775d6e",
    "State": "creating",
    "VolumeId": "vol-0a3c1440cfee9b131",
    "Iops": 100,
    "Tags": [],
    "VolumeType": "gp2",
    "MultiAttachEnabled": false
}
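
If you are scripting this, the CLI’s waiter can block until the volume becomes available (the volume ID is the one returned above):

aws ec2 wait volume-available --volume-ids "vol-0a3c1440cfee9b131"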

Now your volume should be available in the new region or availability zone!

Creating a PersistentVolume and a PersistentVolumeClaim

Now we’ll create the PersistentVolume using the old-pv.yaml file we exported earlier. However, we’ll need to remove many of the fields that are generated on creation. The following fields should not be removed:

  • metadata.labels - keep, but change failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region to the new zone and region.
  • spec.nodeAffinity - keep, but adjust the zone and region keys the same way as the labels.
  • metadata.name - dynamically provisioned PersistentVolumes are named after their PVC, but since you are creating this PersistentVolume manually you are free to choose a different name.
  • spec.storageClassName
  • spec.capacity
  • spec.awsElasticBlockStore - adjust the spec.awsElasticBlockStore.volumeID field to the new zone, region and volume ID, otherwise a brand-new volume will be provisioned instead of the migrated one being used.
  • spec.accessModes
  • spec.persistentVolumeReclaimPolicy

The Kubernetes YAML should look similar to the snippet below:

Note: Create the PersistentVolume before the StatefulSet/PersistentVolumeClaim, otherwise a completely new volume will be provisioned.

apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    failure-domain.beta.kubernetes.io/region: us-west-2 # NEW REGION
    failure-domain.beta.kubernetes.io/zone: us-west-2b # NEW ZONE
  name: <my-volume-name>
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://<ZONE>/vol-<ID>
  capacity:
    storage: 150Gi
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - us-west-2b # NEW ZONE
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - us-west-2 # NEW REGION
  persistentVolumeReclaimPolicy: Retain
  storageClassName: <storageClassName>
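
Apply the manifest and check that the new PersistentVolume shows up as Available (new-pv.yaml is just an example filename):

kubectl apply -f new-pv.yaml
kubectl get pv <my-volume-name>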

Now you can create a PersistentVolumeClaim, or a StatefulSet with a volumeClaimTemplate, using the capacity and storageClassName above, and it should bind to the PersistentVolume, as in the sketch below. Just remember that you will need nodes running in the zone and region where the new volume was created. Hopefully this short guide made your volume migration simple and straightforward!
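
For reference, a minimal PersistentVolumeClaim matching the PersistentVolume above could look like this sketch; the claim name is hypothetical, and the storage request and storageClassName must match the values in your PersistentVolume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: <storageClassName> # must match the PersistentVolume
  resources:
    requests:
      storage: 150Gi # must match the PersistentVolume's capacity
  volumeName: <my-volume-name> # optional: pin the claim to the migrated volume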
