EKS Volume Snapshots

John Bresnahan Mar 18, 2021

As discussed in a previous post, Stardog Cloud relies on VolumeSnapshots in Kubernetes (k8s) to back up user data. This post goes into more technical detail on how to work with VolumeSnapshots in the Elastic Kubernetes Service (EKS).

Kubernetes Components

Here we present the k8s components that are used when working with VolumeSnapshots. We do not go into exhaustive detail; instead we give a brief overview to make the concepts in this post easier to follow. Detailed documentation is linked to each type.

PersistentVolume (PV)

The abstraction that represents physical storage.

StorageClass

The type of storage from which a PV is created. For example, a StorageClass can be backed by an AWS gp2 or io1 volume. It can also be backed by a local disk with specific RAID options or by an NFS partition.

PersistentVolumeClaim (PVC)

A claim to a specific resource on a PV. The PVC requests a specific size, access modes, and other characteristics of the PV. For a pod to mount PV storage, a claim on that PV must first be created. When mounting storage, a k8s pod treats the PVC as though it were the volume, but the PVC itself does not represent any physical storage, only the right to use it.

VolumeSnapshotContent

This is akin to the PV in that it represents an actual snapshot of data on some physical storage. This is typically created from a PVC but can also be pre-provisioned.

VolumeSnapshot

The relationship of a VolumeSnapshot to its VolumeSnapshotContent is akin to that of a PVC to its PV. The VolumeSnapshotContent is the physical storage, while the VolumeSnapshot is the interface to it. The VolumeSnapshot represents the request for a snapshot of a PVC, and it tracks the status of copying the data held by the PVC into the VolumeSnapshotContent. It can also be used as a source for creating a PVC from the contents of its bound VolumeSnapshotContent.

VolumeSnapshotClass

This is similar to the StorageClass in that it describes attributes of the VolumeSnapshotContent. The most important attribute in this case is the driver, ebs.csi.aws.com.

Configuring for Snapshots

When using EKS v1.17, snapshots to EBS must be made using the ebs.csi.aws.com provisioner. To do this, the PVCs from which the snapshots will be created must be associated with a StorageClass defined to use that provisioner. The reason is that this provisioner is the driver that can interact with EC2 EBS services, so the volumes in question must use it.

StorageClass and VolumeSnapshotClass

Stardog Cloud manages this by creating a StorageClass called stardog-home. Any volumes that we wish to snapshot must be created from PVCs associated with this storage class. We define our class with the following:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: stardog-home
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp2
allowVolumeExpansion: true

The important fields to note are the provisioner, the metadata name, and the parameter type. This sets up our StorageClass to be named stardog-home, to be managed by the ebs.csi.aws.com provisioner, and to use the volume type gp2. Descriptions of the other fields can be found elsewhere. The parameters section is used to pass in options specific to the provisioner. In our case we are asking it to use gp2; io1 would be another possible value here.
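
As a rough illustration of how a volume gets tied to this class, a PVC along the following lines could be used. This is a minimal sketch rather than the exact manifest Stardog Cloud applies: the name pvc-1, the stardog namespace, and the 32Gi size are assumptions chosen to match the values that appear in the snapshot examples in this post.

```yaml
# Hypothetical PVC bound to the stardog-home StorageClass.
# Name, namespace, and size are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1
  namespace: stardog
spec:
  storageClassName: stardog-home   # ties the claim to the ebs.csi.aws.com provisioner
  accessModes:
    - ReadWriteOnce                # an EBS volume attaches to one node at a time
  resources:
    requests:
      storage: 32Gi
```

Because the StorageClass sets volumeBindingMode to WaitForFirstConsumer, the EBS volume behind a claim like this is not provisioned until a pod that uses the claim is scheduled.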

Similarly, any snapshot that we make must also use the EBS CSI provisioner. To achieve this we must define a VolumeSnapshotClass as well. We use the following description:


apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete

This basic description gives us a way to create VolumeSnapshotContent using the driver ebs.csi.aws.com. The name of this VolumeSnapshotClass will be csi-aws-vsc.
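
The deletionPolicy field accepts either Delete or Retain. With Delete, removing a VolumeSnapshot also removes its VolumeSnapshotContent and the underlying EBS snapshot. If backups need to outlive the k8s objects that created them, a class using Retain is one option. The following is a sketch only; the class name csi-aws-vsc-retain is a hypothetical name and is not part of the Stardog Cloud configuration described here.

```yaml
# Hypothetical variant of the class above that keeps the EBS snapshot
# (and the VolumeSnapshotContent) when the VolumeSnapshot is deleted.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc-retain   # illustrative name, not from the original setup
driver: ebs.csi.aws.com
deletionPolicy: Retain
```

The trade-off is that retained snapshots must then be cleaned up separately, since k8s will no longer delete them for you.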

Creating A Snapshot

Once we have the above classes defined and a PVC created from the stardog-home StorageClass we can create snapshots of that PVC. The process of creating a snapshot and the associated components is illustrated below:

[Figure: the snapshot creation process and its associated components]

In the same way that pods interface with physical storage via a PVC, so do snapshots. A VolumeSnapshot of a specific class is created. It is given a source PVC, and a VolumeSnapshotContent is created to hold the snapshot data.

To create a snapshot of a PVC named pvc-1 we apply the following to the k8s cluster:

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: new-snapshot
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: pvc-1

This creates a VolumeSnapshot object named new-snapshot and binds it to a dynamically provisioned VolumeSnapshotContent object. The status of the VolumeSnapshot can be inspected as follows:


kubectl -n stardog describe volumesnapshot new-snapshot
Name:         new-snapshot
Namespace:    stardog
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1beta1
Kind:         VolumeSnapshot
Metadata:
  Creation Timestamp:  2021-01-13T18:48:57Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  Generation:        1
  Resource Version:  100143385
  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/namespaces/stardog/volumesnapshots/new-snapshot
  UID:               6707c050-7b89-47c9-a795-152e3582ce86
Spec:
  Source:
    Persistent Volume Claim Name:  pvc-1
  Volume Snapshot Class Name:      csi-aws-vsc
Status:
  Bound Volume Snapshot Content Name:  snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
  Creation Time:                       2021-01-13T18:48:57Z
  Ready To Use:                        false
  Restore Size:                        32Gi
Events:                                <none>

The VolumeSnapshot is initially reported as not ready to use. That means that the VolumeSnapshotContent is still being written. Also note that the created VolumeSnapshotContent is called snapcontent-6707c050-7b89-47c9-a795-152e3582ce86. We can inspect that k8s resource as well:


kubectl describe volumesnapshotcontent snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
Name:         snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1beta1
Kind:         VolumeSnapshotContent
Metadata:
  Creation Timestamp:  2021-01-13T18:48:57Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
  Generation:        1
  Resource Version:  100143641
  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotcontents/snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
  UID:               53c603ff-e235-46ef-b59f-14d431cf4724
Spec:
  Deletion Policy:  Delete
  Driver:           ebs.csi.aws.com
  Source:
    Volume Handle:             vol-008db636e12862813
  Volume Snapshot Class Name:  csi-aws-vsc
  Volume Snapshot Ref:
    API Version:       snapshot.storage.k8s.io/v1beta1
    Kind:              VolumeSnapshot
    Name:              new-snapshot
    Namespace:         stardog
    Resource Version:  100143369
    UID:               6707c050-7b89-47c9-a795-152e3582ce86
Status:
  Creation Time:    1610563737000000000
  Ready To Use:     true
  Restore Size:     34359738368
  Snapshot Handle:  snap-0f1f04f468660a027
Events:             <none>

An interesting thing to note is the Snapshot Handle value snap-0f1f04f468660a027. That value is the reference to the EBS snapshot in the associated AWS account.
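
As noted in the component overview, a VolumeSnapshot can also serve as the source for a new PVC. A restore might look roughly like the following. This is a hedged sketch, not the procedure Stardog Cloud uses: the claim name pvc-1-restored is hypothetical, and the namespace and size are assumptions taken from the describe output above.

```yaml
# Hypothetical PVC that is populated from the new-snapshot VolumeSnapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1-restored       # illustrative name
  namespace: stardog         # must be the namespace that holds the VolumeSnapshot
spec:
  storageClassName: stardog-home
  dataSource:
    name: new-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi          # should be at least the snapshot's Restore Size
```

The claim should only be created once the VolumeSnapshot reports Ready To Use as true; the new volume is then provisioned from the EBS snapshot rather than starting empty.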

Summary

Stardog Cloud uses these k8s techniques to create snapshots. Backed by EBS performance enhancements, this gives us a robust and performant backup solution. Check out Stardog Cloud.
