If you're one of those people who can't really hold a functional conversation in the morning until you've had your coffee, you can probably relate in a certain way to Kubernetes readiness probes.
Why? Because just as being awake and out of bed doesn't necessarily mean you're really ready to start your day, having a container that’s up and running doesn't always mean that the container is actually ready to begin handling traffic. The purpose of readiness probes is to check whether the container is, indeed, responsive – whether it has finished its coffee and is ready to begin seizing the day, so to speak.
That, at least, is the analogy behind readiness probes in Kubernetes. For a longer – and more technical – explanation, keep reading as we unpack everything you need to know about readiness probes, including what they are, which types are available, how to configure them, and how to troubleshoot them when they fail.
What is a Kubernetes readiness probe?
A Kubernetes readiness probe is a type of health check that verifies whether a container is ready to receive traffic. If the readiness probe is successful, Kubernetes knows the container is fully up and running and can begin accepting requests. If it's not, Kubernetes waits a bit and tries again.
Typically, readiness probes run after startup probes. Startup probes check whether a container is running at all, but not whether it's fully functional. Readiness probes fill that gap by testing whether the container will respond to traffic.
Readiness probes are important because they are how Kubernetes confirms that a container is completely functional. Just because a container has successfully started doesn't mean it's actually ready to begin handling workloads. It may still be too busy with startup tasks to function normally – kind of like how when you start your PC, it sometimes takes a minute or two after the desktop appears for the computer to begin behaving normally. But if a container passes a readiness probe, Kubernetes knows that it's ready to begin work.
How readiness probes work in Kubernetes
Readiness probes work in a pretty straightforward way: Kubernetes attempts to connect to the container it wants to probe. If the connection is successful, the probe check passes and Kubernetes deems the container ready to accept traffic.
In most cases, Kubernetes probes the container by opening a TCP connection or sending an HTTP request, although it's also possible to probe by running a command inside the container. We discuss the different probe protocols, and how to configure each type, below.
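Before getting into the specific protocols, here's a minimal sketch of where a readiness probe lives in a Pod manifest. The Pod name, image, port, and /healthz endpoint are placeholders, not values Kubernetes requires:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app                  # hypothetical Pod name
spec:
  containers:
    - name: web                  # hypothetical container name
      image: example/web-app:1.0 # placeholder image
      ports:
        - containerPort: 8080
      readinessProbe:            # probes are defined per container
        httpGet:
          path: /healthz         # assumed health endpoint
          port: 8080
```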
Three types of Kubernetes probes
To understand the concept of the Kubernetes readiness probe fully, you should understand each of the three main types of probes that Kubernetes performs. They include:
- Startup probes: As we mentioned, Kubernetes runs startup probes to check whether a container has started and is running at all. This is the first type of probe that occurs during a container's lifecycle.
- Readiness probes: After a startup probe succeeds, Kubernetes runs a readiness probe to check whether the container can actually accept traffic.
- Liveness probes: Liveness probes check whether a container remains in a healthy state and is free of errors. If a liveness probe fails, Kubernetes restarts the container. Liveness probes begin after the startup probe has succeeded, and they are repeated on an ongoing basis in order to monitor a container's health continuously. (A sample configuration showing all three probe types follows this list.)
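Here's a sketch of how the three probe types can sit side by side on a single container. The image, port, and /healthz endpoint are placeholders, and the thresholds are illustrative rather than recommendations:

```yaml
containers:
  - name: web                     # hypothetical container
    image: example/web-app:1.0    # placeholder image
    startupProbe:                 # runs first; the other probes wait for it to succeed
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30        # allow up to 30 x 10s for the app to start
      periodSeconds: 10
    readinessProbe:               # gates whether the Pod receives traffic
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
    livenessProbe:                # triggers a restart if the container becomes unhealthy
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
```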
When to use readiness probes
Since the purpose of readiness probes is to verify that a container can accept traffic, you should use readiness probes once you've confirmed that a container has successfully started.
You should expect these probes to be repeated if they initially fail, since failure is an indication that the container is not yet ready. Note, though, that Kubernetes doesn't stop checking once a container passes: readiness probes keep running at the configured interval for as long as the container runs, and if one fails later, the Pod is temporarily removed from Service load balancing until it passes again. Liveness probes, meanwhile, track whether the container remains healthy enough to keep running at all.
Readiness probe configuration options
That said, you can modify readiness probe configurations in certain ways if you choose. Here are the main configuration options:
- initialDelaySeconds: How many seconds to wait before running the first readiness probe. By default there is no delay.
- periodSeconds: How often (in seconds) Kubernetes runs the probe. The default interval between probes is 10 seconds.
- timeoutSeconds: How long to wait for a response before deeming a probe to have failed if there is no response. The default timeout value is 1 second. You may want to make it longer if the application inside your container has high latency or slow response times.
- successThreshold: How many consecutive probes the container must pass before it is considered ready. By default this value is 1, so Kubernetes will deem the container ready if it passes a single probe, but you can raise it if you want to require multiple successful probes (which may be useful for containers that respond intermittently at first due to complex startup routines).
- failureThreshold: The number of consecutive failed probes after which Kubernetes marks the container as not ready. The default value for readiness probes is 3.
To configure these options, specify the configuration variable and associated value when writing the YAML code to configure your containers. For example, if you want to set a timeoutSeconds value of 2 and a failureThreshold of 5, you'd write code like the following:
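Here's a minimal sketch of that fragment; the /healthz path and port 8080 are placeholder values for an assumed HTTP health endpoint:

```yaml
readinessProbe:
  httpGet:
    path: /healthz        # assumed health endpoint
    port: 8080
  timeoutSeconds: 2       # wait up to 2 seconds for a response
  failureThreshold: 5     # mark the container not ready after 5 consecutive failures
```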
Note that if you don't define readiness probe configuration options explicitly, Kubernetes will run the probes automatically using the default values.
Disabling readiness probes
If you want to stop a readiness probe from running at all, include the following statement in the YAML for your container:
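One approach is simply to leave the readinessProbe block out of the container spec, since Kubernetes only runs probes you define. If you're overriding a spec that already defines one (for example, via a strategic merge patch or a Helm values override), a sketch of that override might look like this:

```yaml
containers:
  - name: web                 # hypothetical container name
    image: example/web-app:1.0
    readinessProbe: null      # removes the readiness probe defined elsewhere
```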
In general, there are few scenarios where you would not want to perform a readiness probe, since the probes are important for validating that applications are ready to accept traffic. But in rare circumstances – for instance, if you have an application that can't handle TCP or HTTP traffic for some reason – you may choose to disable the probes.
Configuring different types of probes
As we mentioned, you can use different protocols and methods to perform readiness probes in Kubernetes. Here's a look at how to configure the three main approaches: HTTP probes, TCP probes, and command (exec) probes.
HTTP probes
To run probes using HTTP, include an httpGet: segment in your code, such as:
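Here's a sketch rather than a drop-in configuration; the path, port, and host values are placeholders:

```yaml
readinessProbe:
  httpGet:
    path: /healthz            # assumed health endpoint
    port: 8080                # port the container listens on
    host: 10.0.0.12           # optional; omit to use the Pod's own IP
  initialDelaySeconds: 5
  periodSeconds: 10
```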
Note that the httpGet fields define the port to connect to, the request path, and optionally a host (an IP address or hostname). If you don't define a host, Kubernetes defaults to the Pod's internal IP address.
TCP probes
To configure a TCP probe, define tcpSocket: values for your probe using code like the following:
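A sketch with placeholder values; port 5432 here is just an example, and host is omitted so the probe targets the Pod's own IP:

```yaml
readinessProbe:
  tcpSocket:
    port: 5432                # port to attempt the TCP connection on
  initialDelaySeconds: 5
  periodSeconds: 10
```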
Here, you specify the port to connect to and, optionally, a host. The host value defaults to the Pod's IP address if you don't define a different option.
Command probes
As an alternative to probing containers over HTTP or TCP, you can run a command inside the container. If the command exits successfully (with a status code of 0), the probe is considered successful.
For instance, the following probe runs the command cat /some/file.txt to verify readiness:
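A sketch of that probe, using the exec handler:

```yaml
readinessProbe:
  exec:
    command:                  # runs inside the container
      - cat
      - /some/file.txt        # probe succeeds if the command exits with code 0
  periodSeconds: 10
```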
In this case, the container would be considered ready if the binary cat can execute and the file /some/file.txt exists. This approach is different from HTTP or TCP probes because it doesn't actually test whether the container will respond to network requests. But it's useful if you want to ensure that your container can perform specific actions before you consider it ready.
Troubleshooting and fixing probe failures
Sometimes, a readiness probe fails at first because your container simply needs a bit more time to become fully operational. But if a probe fails repeatedly, you'll want to figure out why and fix the issue.
Here's a look at common reasons for failed readiness probes, along with tips on leveraging Kubernetes observability strategies to troubleshoot the issues.
Delayed response
In some cases, your container may be ready but unable to respond to a probe quickly enough – for example, because it's too busy with other tasks to process the probe immediately, or because of congestion on the network.
To fix this probe problem, consider increasing the timeoutSeconds value to give the container longer to respond before the probe is considered unsuccessful. Setting an initialDelaySeconds greater than the default of 0 may also help, especially if the container becomes faster to respond once it has been running for a bit.
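As a sketch, a more forgiving configuration for a slow-to-respond container might look like the following; the endpoint and the exact values are illustrative, not recommendations:

```yaml
readinessProbe:
  httpGet:
    path: /healthz            # assumed health endpoint
    port: 8080
  initialDelaySeconds: 10     # let the container finish its startup work first
  timeoutSeconds: 5           # allow a slower response before counting a failure
```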
Cascading failures
If a container's ability to respond to a probe depends on conditions beyond simply being able to accept a connection, a cascading failure situation may arise. For example, if you configure a probe that runs a command where the container pings another container, the failure of the other container could trigger the failure of a probe on the first container.
The best way to avoid this issue is to keep your probes as simple as possible. Avoid creating dependencies that could complicate probe results.
Application bugs
If your probes fail repeatedly even after making delay and timeout thresholds more generous than the default values, it's likely that you have an application bug. In that case, you'll need to debug the container. Try issuing HTTP requests or opening TCP connections to it directly, rather than via a Kubernetes probe, to check how it responds. Logs and traces may also help you to pinpoint the source of the issue.
Resource constraints
Unexpected probe failures may also stem from a lack of adequate resources in your Kubernetes cluster. If your containers don't have enough CPU and memory to function normally, they may not be able to respond to probes.
To determine whether this is the likely cause of your problem, use a Kubernetes monitoring tool to check the total resource utilization of your cluster. If that seems normal, check the resource availability for the node that hosts the container whose probes are failing. Also validate that you don't have any resource limits in place that are depriving the container of the CPU and memory it needs to operate normally.
Competing or conflicting probes
Attempting to run too many probes at once can trigger strange probe results, too. This most often happens when liveness probes occur at the same time as readiness probes and the container can't respond quickly enough to satisfy both requests. In this case, setting an initialDelaySeconds for the liveness probe may help, because it imposes a delay before liveness checks start, giving your readiness probes more time to complete first.
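For example, here's a sketch of a liveness probe that holds off until readiness checks have had time to finish; the endpoint and delay value are illustrative:

```yaml
livenessProbe:
  httpGet:
    path: /healthz            # assumed health endpoint
    port: 8080
  initialDelaySeconds: 30     # delay liveness checks while readiness probes run
  periodSeconds: 10
```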
Cluster issues
Finally, problematic configurations with your cluster as a whole could cause strange probe behavior. For instance, a lack of sufficient nodes might cause Kubernetes to frequently reschedule a Pod (and, by extension, the containers in the Pod) onto different nodes, in which case the containers could be stopping and restarting so often that they never get past readiness probes. Or, communication problems between the cluster control plane and the kubelet might prevent probes from running as expected.
The best solution here is to use a Kubernetes troubleshooting solution to get deep visibility into what's happening with your cluster and find any strange issues you may be overlooking.
Best practices for using readiness probes in Kubernetes
Now that you know all about how readiness probes work and how to use them, let's go over some best practices for getting the most value out of readiness probes.
Don't settle for default probe configurations
The default probe options that Kubernetes uses aren't necessarily a good fit for every container. Kubernetes doesn't know how long it might take your container to respond to a request under normal conditions or how much time the container needs before it's fully ready, for instance.
For this reason, don't settle for the default probe configuration options unless you've verified that they make sense for your container. In many cases, you'll want to customize options like initial delays or timeout periods.
Configure probes on a container-by-container basis
Instead of using the same probe configuration for each of the containers in a Pod, consider tailoring the options to every one. Once again, every container is unique, and a generic probe configuration rarely makes sense for all of the containers you have running.
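For instance, here's a sketch of two hypothetical containers in the same Pod, each with probe settings tuned to its own behavior; the names, images, and readiness signals are assumptions:

```yaml
containers:
  - name: web                          # fast-starting frontend (hypothetical)
    image: example/web:1.0
    readinessProbe:
      httpGet:
        path: /healthz                 # assumed health endpoint
        port: 8080
      periodSeconds: 5
  - name: worker                       # slower background worker (hypothetical)
    image: example/worker:1.0
    readinessProbe:
      exec:
        command: ["cat", "/tmp/ready"] # assumes the worker creates this file when ready
      initialDelaySeconds: 20
      periodSeconds: 15
```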
Use probe types strategically
TCP probes are the easiest type of probe to configure because they have the fewest required values to define. However, that doesn't mean you should always default to TCP probes.
HTTP probes often make more sense for Web apps because the main purpose of those apps is to accept HTTP traffic – so probing whether they can handle HTTP requests is a good way to check that a container is capable of carrying out its main job. Likewise, command probes can make sense in situations where you need to confirm that a container can perform an action more complex than simply accepting a network connection or traffic.
Create simple endpoints to accommodate probes
If you're using a probe that requires Kubernetes to connect to a certain endpoint, having a simple endpoint available for this purpose makes the probe easier to execute and less likely to result in a false positive or negative. So, when creating your container, consider setting up an endpoint – such as /probe – for this very purpose.
The alternative is to connect to endpoints that support actual application functionality. This is riskier and more complicated because if the endpoint is not designed to handle the type of request issued via a probe – which it's probably not, if handling probes is not the main purpose of the endpoint – it may respond in unexpected ways, causing the probe to fail.
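For example, here's a sketch of a readiness probe pointed at a dedicated endpoint; the /probe path is just the example name used above, and your application has to implement it:

```yaml
readinessProbe:
  httpGet:
    path: /probe              # lightweight endpoint implemented solely for probes
    port: 8080                # placeholder port
  periodSeconds: 10
```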
Troubleshooting readiness probes with groundcover
When a readiness probe – or, for that matter, anything else – fails in your Kubernetes cluster, groundcover delivers the Kubernetes monitoring and troubleshooting capabilities you need to get to the root of the issue quickly.
By comprehensively and continuously tracking everything that happens in your cluster, groundcover makes it easy to determine whether failed probes result from problems with a node or Pod, control plane issues, resource constraints, application bugs or any of the myriad other factors that could cause a container not to respond to readiness probes.
Ready for readiness probes
If you've read this far – or even just skimmed to this point – congrats! You now know how to configure probes, how to troubleshoot probe issues, and which best practices optimize probe effectiveness. In short, you're now ready to put readiness probes into action to help keep your Kubernetes-based apps happy, merry and wise.