EKS Volume Snapshots

John Bresnahan Mar 18, 2021

As discussed in a previous post, Stardog Cloud relies on VolumeSnapshots in Kubernetes (k8s) to back up user data. In this post we go into more technical detail on how to work with VolumeSnapshots in the Elastic Kubernetes Service (EKS).

Kubernetes Components

Here we present the k8s components that are used when working with VolumeSnapshots. We do not go into exhaustive detail but rather give a brief overview to make the concepts in this post easier to follow. Detailed documentation is linked to each type.

Persistent Volume (PV)

The abstraction that represents physical storage.

StorageClass

The type of storage from which a PV is created. For example, a StorageClass can be backed by an AWS EBS volume type such as gp2 or io1. It can also be backed by a local disk with specific RAID options or by an NFS partition.

PersistentVolumeClaim (PVC)

A claim to a specific resource on a PV. The PVC requests a specific size, access modes, and other characteristics of the PV. In order for a pod to mount storage from a PV, a claim upon that PV must first be created. When mounting storage in k8s pods, the PVC is treated as though it were the volume, but in fact the PVC does not represent any physical storage; it represents a right to use physical storage.

VolumeSnapshotContent

This is akin to the PV in that it represents an actual snapshot of data on some physical storage. This is typically created from a PVC but can also be pre-provisioned.
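As a rough sketch of the pre-provisioned case, a VolumeSnapshotContent can point at an EBS snapshot that already exists and be paired with a VolumeSnapshot (described next) that refers back to it. The object names and the snapshot ID below are placeholders, not values used by Stardog Cloud, and the stardog namespace is an assumption.

```yaml
# Hypothetical pre-provisioned snapshot; all names and the snapshot ID are placeholders.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotContent
metadata:
  name: imported-snapcontent
spec:
  deletionPolicy: Retain            # keep the EBS snapshot if this object is deleted
  driver: ebs.csi.aws.com
  source:
    snapshotHandle: snap-0123456789abcdef0   # ID of an existing EBS snapshot (placeholder)
  volumeSnapshotRef:                # the VolumeSnapshot allowed to bind to this content
    name: imported-snapshot
    namespace: stardog
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: imported-snapshot
  namespace: stardog
spec:
  source:
    volumeSnapshotContentName: imported-snapcontent
```

The two objects reference each other by name, which is how the binding is established when nothing is provisioned dynamically.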

VolumeSnapshot

The relationship of the VolumeSnapshot to the VolumeSnapshotContent is akin to the relationship of the PVC to the PV. The VolumeSnapshotContent is the physical storage while the VolumeSnapshot is the interface to it. The VolumeSnapshot represents the request for a snapshot of a PVC, and it maintains the status of creating the snapshot from the data held by the PVC onto the VolumeSnapshotContent. It can also be used as a source for creating a PVC from the contents of its bound VolumeSnapshotContent.

VolumeSnapshotClass

This is similar to the StorageClass in that it describes attributes of the VolumeSnapshotContent. The most important attribute in this case is the driver, ebs.csi.aws.com.

Configuring for Snapshots

When using EKS v1.17, snapshots to EBS must be made using the ebs.csi.aws.com provisioner. To do this, the PVCs from which the snapshots will be created must be associated with a StorageClass defined to use that provisioner. The reason is that this provisioner is the driver that can interact with the EC2 EBS services, and thus the volumes in question must use it.

StorageClass and VolumeSnapshotClass

Stardog Cloud manages this by creating a StorageClass called stardog-home. Any volumes that we wish to snapshot must be created from PVCs associated with this storage class. We define our class with the following:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: stardog-home
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp2
allowVolumeExpansion: true

The important fields to note are the provisioner, the metadata name, and the parameter type. This sets our StorageClass up to be named stardog-home, to be managed by the ebs.csi.aws.com provisioner, and to use the volume type gp2. Descriptions of the other fields can be found elsewhere. The parameters section is used to pass in options specific to the provisioner. In our case we are asking it to use gp2. io1 would be another possible value here.
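For comparison, an io1 variant of the same class might look like the sketch below. The class name stardog-home-io1 and the iopsPerGB value are illustrative assumptions, not something Stardog Cloud defines; check the EBS CSI driver documentation for the parameters supported by the driver version in your cluster.

```yaml
# Hypothetical io1 variant of the stardog-home class; name and IOPS value are placeholders.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: stardog-home-io1
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: io1
  iopsPerGB: "10"      # provisioned IOPS per GiB; parameter values are passed as strings
allowVolumeExpansion: true
```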

Similarly, any snapshot that we make must also use the EBS CSI provisioner. To achieve this we must define a VolumeSnapshotClass as well. We use the following description:


apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete

This basic description gives us a way to create VolumeSnapshotContent using the driver ebs.csi.aws.com. The name of this VolumeSnapshotClass will be csi-aws-vsc.

Creating A Snapshot

Once we have the above classes defined and a PVC created from the stardog-home StorageClass we can create snapshots of that PVC. The process of creating a snapshot and the associated components is illustrated below:

[Figure: the process of creating a snapshot and the associated components]

In the same way that pods interface to physical storage via a PVC, so do snapshots. A VolumeSnapshot of a specific class is created. It is given a source PVC, and it creates a VolumeSnapshotContent where the data is stored.
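The source PVC itself is not shown in this post. A minimal sketch of what it and a pod mounting it could look like follows. Only the names pvc-1, stardog, and stardog-home come from this post; the pod name, image, mount path, and the 32Gi request (chosen to match the restore size reported later) are assumptions.

```yaml
# Hypothetical source PVC and consumer pod; see the assumptions noted above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1
  namespace: stardog
spec:
  storageClassName: stardog-home    # must use the ebs.csi.aws.com provisioner
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  namespace: stardog
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-1            # the pod refers to the claim, not to the PV
```

Because the StorageClass uses WaitForFirstConsumer, the EBS volume is only provisioned once a pod that uses the claim is scheduled.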

To create a snapshot of a PVC named pvc-1 we apply the following to the k8s cluster:

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: new-snapshot
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: pvc-1

This creates a VolumeSnapshot object named new-snapshot and binds it to a dynamically provisioned VolumeSnapshotContent object. The status of the VolumeSnapshot can be inspected as follows:


kubectl -n stardog describe volumesnapshot new-snapshot
Name:         new-snapshot
Namespace:    stardog
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1beta1
Kind:         VolumeSnapshot
Metadata:
  Creation Timestamp:  2021-01-13T18:48:57Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  Generation:        1
  Resource Version:  100143385
  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/namespaces/stardog/volumesnapshots/new-snapshot
  UID:               6707c050-7b89-47c9-a795-152e3582ce86
Spec:
  Source:
    Persistent Volume Claim Name:  pvc-1
  Volume Snapshot Class Name:      csi-aws-vsc
Status:
  Bound Volume Snapshot Content Name:  snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
  Creation Time:                       2021-01-13T18:48:57Z
  Ready To Use:                        false
  Restore Size:                        32Gi
Events:                                <none>

The VolumeSnapshot is initially reported as not ready to use. That means that the VolumeSnapshotContent is still being written. Also note that the created VolumeSnapshotContent is called snapcontent-6707c050-7b89-47c9-a795-152e3582ce86. We can inspect that k8s resource as well:


kubectl describe volumesnapshotcontent snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
Name:         snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1beta1
Kind:         VolumeSnapshotContent
Metadata:
  Creation Timestamp:  2021-01-13T18:48:57Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
  Generation:        1
  Resource Version:  100143641
  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotcontents/snapcontent-6707c050-7b89-47c9-a795-152e3582ce86
  UID:               53c603ff-e235-46ef-b59f-14d431cf4724
Spec:
  Deletion Policy:  Delete
  Driver:           ebs.csi.aws.com
  Source:
    Volume Handle:             vol-008db636e12862813
  Volume Snapshot Class Name:  csi-aws-vsc
  Volume Snapshot Ref:
    API Version:       snapshot.storage.k8s.io/v1beta1
    Kind:              VolumeSnapshot
    Name:              new-snapshot
    Namespace:         stardog
    Resource Version:  100143369
    UID:               6707c050-7b89-47c9-a795-152e3582ce86
Status:
  Creation Time:    1610563737000000000
  Ready To Use:     true
  Restore Size:     34359738368
  Snapshot Handle:  snap-0f1f04f468660a027
Events:             <none>

An interesting thing to note is the Snapshot Handle value snap-0f1f04f468660a027. That value is the reference to the EBS snapshot in the associated AWS account.
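As noted earlier, a VolumeSnapshot can also serve as the source for a new PVC. A minimal sketch of such a restore, assuming the snapshot above has become ready to use, is shown here. The PVC name pvc-1-restored and the ReadWriteOnce access mode are assumptions; the requested size should be at least the snapshot's restore size.

```yaml
# Hypothetical restore of new-snapshot into a fresh PVC; the PVC name is a placeholder.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-1-restored
  namespace: stardog                # same namespace as the VolumeSnapshot
spec:
  storageClassName: stardog-home
  dataSource:
    name: new-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi                 # at least the Restore Size reported by the snapshot
```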

Summary

Stardog Cloud uses these k8s techniques to create snapshots. Backed by EBS performance enhancements, this gives us a robust and performant backup solution. Check out Stardog Cloud.
