One of the key differences between Kubernetes pods (meaning container-based apps hosted in Kubernetes) and traditional apps is that pods don't have access to persistent storage by default. Typically, any data that the pods write is stored in a temporary file system, and the data disappears forever as soon as the pod shuts down.
So, what do you do if you want to store data persistently in Kubernetes? The answer is to create a persistent volume (PV) – which, as its name implies, allows pods to write data that is persistent, meaning it is not lost when the pod shuts down.
Keep reading to learn how persistent volumes and persistent volume claims (PVCs) work in Kubernetes and when to use them, along with a step-by-step guide to setting up persistent volumes and best practices for working with them.
What is a Kubernetes persistent volume (PV)?
In Kubernetes, a persistent volume (PV) is an object that provides pods with access to persistent storage resources.
To use a PV, admins must first either provision a storage resource using any type of storage system (such as a local file system or cloud-based storage) that is accessible by the cluster, or set up dynamic provisioning (more on this in a bit), which allows Kubernetes to provision volumes automatically. Then, they can create a persistent volume claim (which we'll also discuss in more detail in a moment) to make persistent volumes accessible to pods.
Persistent volumes are external to pods, and any data the pod writes to the storage will remain intact after the pod shuts down.
The other main type of volume in Kubernetes is called an ephemeral volume. Kubernetes automatically destroys ephemeral volumes when the pods that are using them shut down – hence, ephemeral volumes can't store data persistently or permanently. Persistent volumes can, which is why they're critical for use cases involving data retention.
What is a Kubernetes persistent volume claim?
When you set up a PV, you've configured a storage object. But to allow pods to use the PV, you need to set up a persistent volume claim (PVC).
A persistent volume claim is a request for storage that pods use to access a persistent volume. The PVC can be configured with parameters such as how much storage is requested (this is a minimum: the claim can bind to a larger volume, but not a smaller one) and the access mode (such as read-only or read-write).
Importantly, PVCs don't usually specify which PV to use. Instead, they describe which conditions the PV needs to meet (such as, again, how much storage capacity it requires and which access mode it supports). Kubernetes then automatically searches for a PV that meets the requirements defined in the PVC.
PVC: Static vs. dynamic provisioning
There are two ways to go about creating persistent volumes:
- Static provisioning: Static provisioning means that admins manually provision storage resources by selecting a storage system to host data, defining how much storage should be available in the PV, and configuring access modes.
- Dynamic provisioning: Dynamic provisioning allows Kubernetes to create persistent volumes automatically, in response to PVC requests, based on predefined storage classes. Storage classes tell Kubernetes which storage system to use when setting up persistent volumes and how to configure it. Then, when a new PVC appears, Kubernetes sets up an appropriate PV automatically (assuming a storage class has been configured that is capable of satisfying the conditions defined in the PVC).
Static provisioning is simpler to implement, and it works well when storage conditions are fixed and predictable. For example, if you need to create a storage system of finite size to store log data for a pod, and your pod automatically rotates logs to avoid filling up the storage, static provisioning might be appropriate.
In contrast, dynamic provisioning is more flexible. Creating volumes dynamically using a storage class is useful for situations where you don't know ahead of time exactly which types of storage resources your workloads will require. In addition, dynamic provisioning can assist in optimizing storage costs by provisioning storage only when it's necessary, which helps teams avoid paying for storage they aren't actually using.
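As a rough sketch of the dynamic case, a PVC can name a storage class and leave the rest to Kubernetes. The claim name data-claim, the class name ebs-standard, and the 5Gi size are hypothetical placeholders, and the example assumes a StorageClass with that name already exists in the cluster:

```yaml
# Hypothetical PVC that relies on dynamic provisioning.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim              # placeholder name
spec:
  storageClassName: ebs-standard  # assumed to exist in the cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi              # placeholder size
```

With a claim like this, the provisioner named in the storage class creates a matching PV on demand, so no PV manifest has to be written by hand.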
Lifecycle stages of a persistent volume and claim
To understand fully how PVs and PVCs work, let's walk through the stages of setting up a PV and PVC – a process that you can think of as the persistent volume lifecycle.
1. Provisioning
The persistent volume lifecycle starts with the provisioning of storage resources. Provisioning entails taking storage infrastructure and creating one or more storage volumes based on it. As noted above, provisioning can happen statically or dynamically.
2. Binding
Binding is the process of matching a PVC to a persistent volume. To do this, you create a PVC. As long as a PV exists that satisfies the criteria defined in the PVC, the Kubernetes control plane will automatically discover the PV and bind it to the PVC.
If an appropriate PV is not available at the time the PVC is created, but is added later, Kubernetes will find it and bind it to the PVC. Thus, you don't necessarily need to create PVs before creating PVCs.
3. Using
Once a PV has been bound to a PVC, pods can use the PV by referencing the PVC in the volumes block when the pod is defined. This allows pods to read and (if allowed by the PVC configuration) write to the storage resource. Any data that exists in the volume will remain intact even if the pod shuts down.
4. Reclaiming
Once a user finishes using a volume, they can delete the PVC that they used to access it. At that point, the PV is no longer bound to the PVC, because the PVC no longer exists.
However, whether the PV itself continues to exist depends on which reclaim policy admins configured when they set up the PV. There are three reclaim options:
- Retain: The volume (and data on it) remains intact, but the volume cannot be reused. An admin must manually delete the volume so that its storage resources can be repurposed for other users or workloads.
- Delete: Kubernetes automatically deletes the volume, erasing any data stored on it.
- Recycle: Kubernetes automatically deletes data from the volume. After this, the volume becomes available for reuse by another PVC. This option is considered deprecated in modern versions of Kubernetes but may still be supported.
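The reclaim policy is set on the PV itself through the persistentVolumeReclaimPolicy field. A minimal sketch, assuming a hypothetical hostPath-backed volume named pv-retain-example:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-retain-example       # placeholder name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  # Keep the volume and its data after the bound PVC is deleted.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/pv-retain-example  # placeholder path
```

For dynamically provisioned volumes, the policy is inherited from the storage class's reclaimPolicy field, which defaults to Delete.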
Common use cases for persistent volume claims
Now that we know what PVs and PVCs do and how they work, let's talk about why you might want to use them.
Again, you don't strictly need to create PVs or PVCs to use Kubernetes. If none of your workloads require persistent storage, they can simply store data in the internal file system of containers, or use ephemeral volumes to store data temporarily on external storage resources.
However, there are a variety of use cases where persistent storage is important.
Stateful applications
Stateful applications are ones that retain data across sessions. In other words, when the application restarts after having shut down, the new application session needs to be able to access data generated by the previous one. Databases are classic examples of stateful applications. So are applications that record data about users and need to look the data up when the apps restart.
PVCs enable use cases like these by allowing applications to write data to storage that remains intact between shutdown and restart events.
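For workloads such as databases, one common pattern is a StatefulSet with volumeClaimTemplates, which creates a separate PVC for each replica. The sketch below is illustrative only: the names (db, db-data), the postgres:16 image, the password value, and the 1Gi size are assumptions rather than recommendations.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db               # assumes a headless Service named "db" exists
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: example-only   # placeholder; use a Secret in practice
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: db-data
              mountPath: /var/lib/postgresql/data
  # One PVC per replica is created from this template.
  volumeClaimTemplates:
    - metadata:
        name: db-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

Because the claim outlives any individual pod, a restarted replica reattaches to the same data.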
Shared storage across pods
In some cases, multiple applications may need to access the same data. For example, multiple apps might need to look up information from the same database. While they could share this data using APIs, a simpler approach is to give all associated pods access to the same storage resource. This is possible by having pods share a PVC.
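Sharing a claim across pods that may run on different nodes generally requires an access mode such as ReadWriteMany, which only some storage backends (NFS-style file systems, for example) support. A minimal sketch, assuming a hypothetical storage class named shared-nfs that supports that mode:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data             # placeholder name
spec:
  storageClassName: shared-nfs  # hypothetical class backed by shared storage
  accessModes:
    - ReadWriteMany             # many nodes may mount read-write
  resources:
    requests:
      storage: 2Gi
```

Each pod that needs the data would then reference the same claimName in its volumes section.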
Disaster recovery and backup
PVCs can be useful for disaster recovery and backup purposes by providing access to persistent storage resources that backup software can use to store backups. For an extra layer of protection, data from the storage infrastructure that supports the PVCs can be copied, and then later used to rebuild the PVs and PVCs in the event that the storage infrastructure fails.
CI/CD pipelines
PVCs can be helpful when running CI/CD pipelines on Kubernetes by providing a way to store artifacts generated during the CI/CD process. For example, if you run a compiler inside a pod, you could use a PVC to store the binary files that the compiler generates – which would otherwise be lost when the pod shuts down, if they are not placed in persistent storage.
Logs and metrics storage
Having pods write logs and metrics data to persistent storage is a handy way of ensuring that this data is not lost when the pods shut down. You could also use data collectors or APIs to pull logs and metrics from pods into an external tool. But writing the data to a PVC by default means you won't lose important observability data if a pod stops running before everything has been collected from it.
How to create a PVC and bind it to a persistent volume: Tutorial
To illustrate these steps, here's a tutorial on creating a PVC and binding it to a persistent volume.
1. Provision a persistent volume
The first step is to provision a persistent volume. To provision statically, write a YAML file that describes the volume. For example, you could provision a 1-gigabyte volume based on local storage at /mnt/pv1 and configure it for ReadWriteOnce access, which allows the volume to be mounted in read-write mode by a single node (use ReadWriteOncePod if access must be restricted to a single pod).
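A minimal sketch of such a manifest follows. It assumes a hostPath volume (suitable mainly for single-node test clusters), a hypothetical PV name of pv1, and a hypothetical storage class label, manual, used here only so that a PVC can be matched to this PV rather than to a default dynamic provisioner:

```yaml
# pv.yaml (file name is a placeholder)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1                     # placeholder name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: manual      # assumed label; a PVC must use the same value
  hostPath:
    path: /mnt/pv1
```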
To provision dynamically, you'd write YAML that defines a storage class, rather than a PV. For example:
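The sketch below assumes the legacy in-tree provisioner name kubernetes.io/aws-ebs and a hypothetical class name, ebs-standard. Newer clusters typically use the AWS EBS CSI driver (ebs.csi.aws.com) instead, so check which provisioner your cluster supports:

```yaml
# storageclass.yaml (file name is a placeholder)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-standard            # placeholder name
provisioner: kubernetes.io/aws-ebs  # legacy in-tree name; CSI clusters use ebs.csi.aws.com
parameters:
  type: gp2                     # EBS volume type (assumed for illustration)
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer  # provision only once a pod needs the volume
```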
This sets up dynamic provisioning using aws-ebs as the dynamic provisioner.
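Once you've defined either a PV or a StorageClass using YAML, save it as a file and apply the configuration with kubectl apply -f, followed by the file name (for example, kubectl apply -f pv.yaml, where pv.yaml is a placeholder).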
2. Create a PVC
Next, create a persistent volume claim. Here again, you use YAML to do this. For example:
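A minimal sketch, assuming a hypothetical claim name of pvc1 and assuming the PV from the previous step carries the storage class label manual (the two values must match for binding to succeed):

```yaml
# pvc.yaml (file name is a placeholder)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1                    # placeholder name
spec:
  storageClassName: manual      # must match the PV's storageClassName
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```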
This creates a PVC that matches the 1-gigabyte, ReadWriteOnce PV that we set up in the previous step.
Save the file and apply the PVC with kubectl apply -f, followed by the file name.
3. Verify binding
As explained above, the binding process will happen automatically as long as a PV (or a dynamic PV provisioner) is available that matches the configuration defined in the PVC.
To ensure that binding was successful, run kubectl get pv and kubectl get pvc to check the status of the PV and PVC. Both should report a status of Bound.
4. Use the PVC in a pod
To use the PVC in a pod, write a pod specification that references the PVC in the volumes section. For example:
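The sketch below assumes an nginx container and a claim named pvc1. The pod name, volume name, and claim name are hypothetical and should be adjusted to match your own manifests:

```yaml
# pod.yaml (file name is a placeholder)
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pvc-demo          # placeholder name
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: web-content     # must match the volume name below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: web-content
      persistentVolumeClaim:
        claimName: pvc1         # assumed name of an existing PVC
```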
This pod mounts the PVC at the path /usr/share/nginx/html, where it can read from and write to the persistent volume.
Troubleshooting common Kubernetes PVC issues
To troubleshoot PVC issues in Kubernetes, first confirm that the PVs and PVCs you've created actually exist. As noted above, you can do this using kubectl get pv and kubectl get pvc.
If PVs or PVCs don't exist as expected, review your YAML code for typos and make sure that the underlying storage infrastructure is available.
If you don't see problems here, the next step is to check the events log of the pod that is trying to use the PVC. You can do this by running kubectl describe pod, followed by the pod's name.
There are three common PVC-related errors that you may see inside pod events logs:
- FailedAttachVolume: This typically happens in cases where the underlying storage infrastructure is unavailable. It may also happen due to bugs in the PV provisioner, if you're using a dynamic provisioner.
- FailedMount: This error typically coincides with FailedAttachVolume, and the root causes are the same.
- CrashLoopBackOff: CrashLoopBackOff means a container in the pod repeatedly crashes or exits shortly after starting (sometimes because it keeps failing liveness checks), so Kubernetes waits progressively longer between restart attempts. As a result, the pod never succeeds in using a PVC. Usually, CrashLoopBackOff errors stem from issues with the pod and are unrelated to PV or PVC settings. (For details, check out our guide to CrashLoopBackOff troubleshooting.)
Tips to handle Kubernetes persistent volume claims (PVC)
To get the most from PVCs in Kubernetes, consider the following best practices:
- Use storage limits: You can define a LimitRange that restricts how much storage a PVC is allowed to request. This is useful for preventing PVCs from hogging more storage resources than they need. It can also help prevent Kubernetes disk pressure, which occurs when nodes run short on available storage.
- Back up PVC data: Kubernetes persistent volumes store data persistently, but they don't guarantee data won't be lost in the event that the underlying storage system is corrupted, goes offline, or is otherwise disrupted. For this reason, it's a best practice to back up PVC data to external storage.
- Use dynamic provisioners where possible: While static provisioning is useful for simple PVC scenarios, dynamic provisioning is, on the whole, a more scalable and efficient way to manage storage resources.
- Consider performance requirements: PVCs don't have a dedicated Quality of Service (QoS) field, but a PVC can request a specific storage class, and the parameters defined in that class determine the performance characteristics (like throughput or latency) of the volume. This is useful in situations where you want to use a certain type of storage (like SSD).
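The storage-limit tip above can be sketched with a LimitRange of type PersistentVolumeClaim. The object name, namespace, and size bounds below are assumptions chosen for illustration:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: pvc-storage-limits      # placeholder name
  namespace: default            # applies only to PVCs in this namespace
spec:
  limits:
    - type: PersistentVolumeClaim
      min:
        storage: 1Gi            # smallest request a PVC may make
      max:
        storage: 10Gi           # largest request a PVC may make
```

With this in place, a PVC in the namespace that requests more than 10Gi (or less than 1Gi) is rejected when it is created.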
Handling Kubernetes PVC with groundcover
As a Kubernetes monitoring and observability platform, groundcover doesn't help you create Kubernetes persistent volumes or persistent volume claims. It does, however, provide the insights you need to troubleshoot issues related to persistent volumes and PVCs – like storage that fails to mount or pods that can't use a PVC because they're stuck in a restart loop.
When something goes wrong with a PVC, you can count on groundcover to provide the context necessary to troubleshoot the issue quickly.
Getting more from Kubernetes with PVCs
Managing storage resources in Kubernetes is more complicated than managing storage for traditional applications, which can write directly to local server file systems or network file systems. But with a little help from PVCs, you can provide storage resources to Kubernetes workloads, too, making it possible to run stateful applications with abandon. You just need to learn the steps for provisioning persistent volumes and PVCs, and troubleshooting them when matters go awry.