You know Redis, the open-source in-memory data store, and you know Kubernetes, the open-source container orchestration platform. But what you might not know - and fair enough - is that you can use Redis and Kubernetes together. Traditionally, most Redis deployments have not involved Kubernetes, and vice versa.
That certainly doesn't mean, however, that you can't combine Redis with Kubernetes – or that there are no clear advantages to doing so. By deploying Redis on Kubernetes, you can easily set up a Redis cluster, which provides reliability and performance benefits that you wouldn't get from a conventional, single-node Redis environment.
This article walks through the what, why and how of setting up a Redis cluster on Kubernetes. It explains how Redis works, covers the benefits (and concerns) of running Redis on Kubernetes, and offers tips and best practices for getting the most out of a Kubernetes-based Redis deployment.
What is a Redis cluster?
Let's start with the basics by defining what we mean when we talk about Redis clusters.
A Redis cluster is a set of multiple Redis instances. In other words, with a Redis cluster, you have multiple nodes hosting your Redis data stores. Data is "sharded" between the nodes, meaning that the data is broken into smaller chunks and distributed across the various nodes. Concretely, Redis Cluster maps every key to one of 16,384 hash slots, and each node in the cluster owns a subset of those slots.
That's significant because historically, Redis wasn't designed for clustering or data sharding. Unlike databases such as Apache Cassandra, Redis didn't start out as a distributed database meant to operate in a clustered environment. Redis's big selling point is that it stores data in-memory, which can dramatically boost performance compared to disk-based storage.
So, by creating a Redis cluster, you're piling extra benefits on top of Redis's core functionality. In particular, you get added reliability, because your Redis data store will remain operational even if some nodes fail. You may also get better performance in some cases because load can be distributed between nodes, which helps to avoid the bottlenecks that could result if a single application or request were to monopolize access to a Redis node.
Redis Cluster vs. Redis Sentinel
If you're familiar with Redis, you may know that there's another way to achieve some of the reliability features mentioned above. It's called Redis Sentinel, and it provides automated failover between multiple Redis instances. Sentinel also supports monitoring and notifications to help admins detect availability issues quickly.
However, Sentinel is different from setting up a cluster where you have multiple Redis instances with data sharded between them. Sentinel is essentially a way to run Redis in high-availability mode by taking advantage of automated failover, which is different from distributing data across multiple nodes within a cluster.
In general, Sentinel is useful if you have a relatively small-scale Redis environment and you want a simple means of increasing its availability. But to maximize availability at scale, setting up a Redis cluster is a better approach.
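To make the contrast concrete, a Sentinel deployment is driven by a small configuration file rather than by data sharding. Here's a minimal sketch of a `sentinel.conf`; the primary's name, hostname, and timing values are illustrative, not prescriptive:

```
# sentinel.conf -- illustrative values, adjust for your environment
# Watch a primary called "mymaster" at redis-primary:6379.
# The trailing "2" is the quorum: how many Sentinels must agree
# the primary is unreachable before a failover starts.
sentinel monitor mymaster redis-primary 6379 2

# Consider the primary down after 5 seconds without a valid reply
sentinel down-after-milliseconds mymaster 5000

# Abort a failover attempt that takes longer than 60 seconds
sentinel failover-timeout mymaster 60000

# Resync replicas with the new primary one at a time
sentinel parallel-syncs mymaster 1
```

Note that every Sentinel instance holds a full copy of the data via its monitored replicas; nothing is sharded, which is exactly why Sentinel alone doesn't scale the way a cluster does.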
How Redis clusters work on Kubernetes
Now, let's talk about how Kubernetes figures into Redis clustering. To be clear, you don't need to use Kubernetes to create a Redis cluster. You can set up a Redis cluster using any set of servers, even if they are not managed by Kubernetes.
That said, operating a Redis cluster on top of Kubernetes provides some benefits that you wouldn't get from other approaches:
• You can horizontally scale Redis deployments across multiple nodes in Kubernetes, which improves both performance and availability. Horizontal scaling is harder to do without the help of Kubernetes, whose core purpose includes scaling workloads horizontally.
• You get automated self-healing and rolling updates for your Redis instances, making it easier to keep your environment running.
On top of this, running a Redis cluster on Kubernetes allows you to use Kubernetes-native tooling – like Helm charts and kubectl – to manage your environment. That's an advantage if you already know Kubernetes, and/or you have other workloads running on Kubernetes. By integrating Redis with Kubernetes, you centralize your tooling and reduce your management overhead.
Setting up a Redis cluster on Kubernetes
Actually setting up a Redis cluster on top of Kubernetes is easy enough. Here are the steps.
1. Set up Kubernetes
As a preliminary step, you need to deploy Kubernetes. We won't walk through how to do that here, but suffice it to say that you can run Kubernetes locally using a tool like K3s, or you can deploy it on a set of servers. You can also use a cloud-based managed Kubernetes service, like Amazon EKS, which simplifies the setup process because the control plane software is provided for you.
2. Create a StatefulSet
Next, you need to create a StatefulSet. A StatefulSet is a workload object that makes it possible to deploy Pods (which host Kubernetes workloads) with stable, unique identifiers – which you don't get from a standard Kubernetes Deployment. If you're running a database or data store like Redis, having a unique identifier for each Pod is important because it allows you to keep Pods connected to their specific data even if the Pods fail and are restarted, or if they move to a different node. Without a StatefulSet, it would be much harder to maintain persistent links between Pods and their storage.
So, set up a StatefulSet to manage the nodes in your Redis cluster. Be sure to specify the number of replicas you want for the Pods and the container image to use. Here's an example with the minimum configuration you'd need:
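The sketch below pairs a headless Service (which gives each Pod a stable DNS name) with a StatefulSet. The names, replica count, image tag, and storage size are illustrative; a real cluster also needs its nodes joined together (for example, with `redis-cli --cluster create`) once the Pods are running:

```yaml
# Headless Service: gives each Redis Pod a stable DNS identity
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  clusterIP: None
  selector:
    app: redis-cluster
  ports:
    - name: client
      port: 6379
    - name: gossip
      port: 16379
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster   # ties Pods to the headless Service above
  replicas: 6                  # e.g. three primaries plus three replicas
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: redis:7.2     # illustrative tag; pin your own version
          command:
            - redis-server
            - "--cluster-enabled"
            - "yes"
            - "--appendonly"
            - "yes"
          ports:
            - containerPort: 6379
              name: client
            - containerPort: 16379
              name: gossip
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

Apply it with `kubectl apply -f redis-statefulset.yaml`, and the Pods will come up as `redis-cluster-0`, `redis-cluster-1`, and so on – the stable identities that make StatefulSets the right fit for Redis.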