
Guide to Kubernetes Resource Quota: Examples & Pros and Cons
Find out how to manage Kubernetes resources with help from resource quotas: one of the main methods for setting a limit on the resources assigned to workloads in a Kubernetes cluster.


Dinner parties can be a real headache when it comes to predicting how much food to serve: On the one hand, if you prepare too little, your guests will be left hungry and disappointed. But if you prepare too much, you’re wasting money.
Managing resources in Kubernetes is another situation where you need to try to make the right call or suffer the consequences: in this case, how much compute, memory, and storage resources your applications will consume. If you assign too few resources, your workloads won’t have sufficient CPU and memory to operate properly. With too many, you’ll allocate more resources than your applications need, causing you to waste money on underutilized infrastructure.
Now, we can’t help you plan your next dinner party. But we can explain how to manage Kubernetes resources with help from resource quotas, one of the main methods for setting a limit on the resources assigned to workloads in a Kubernetes cluster.
What is a Kubernetes resource quota?

In Kubernetes, a resource quota is a limit on the total number of objects, the total resource requests, and the total resource limits permissible within a given Kubernetes namespace.
To fully understand what that means, let’s unpack some of the key terms we just mentioned:
- Namespace: Namespaces are subdivisions within a Kubernetes cluster that host particular pods and containers. You can use namespaces to split a Kubernetes cluster into smaller environments, each reserved for a different set of applications or team of users. Resource quotas apply to an entire namespace.
- Resource requests: A resource request is the amount of a resource (such as CPU or memory) that Kubernetes reserves for a container when it schedules the pod; in effect, it's the minimum the container is guaranteed. Resource quotas allow you to place a limit on how many resources the containers or pods in a particular namespace can request in total.
- Resource consumption: Resource consumption refers to how many resources a container or pod is actually using. Resource types you can manage using resource quotas include CPU, memory, storage, and object counts. Resource consumption levels can exceed resource request levels because resource requests define a minimum, not a maximum; the maximum for a container is set by its resource limit. Resource quotas allow you to cap the sum of the resource limits declared by all pods and containers within a namespace, which in turn bounds how much they can collectively consume.
- Objects: In the context of resource quotas, objects are types of resources that can exist in Kubernetes. Examples include pods, services, secrets, deployments, and replica sets. Using resource quotas, you can define a limit on the total number of objects of a given type that can live within a namespace.
The primary purpose of resource quotas is to restrict how many resources each namespace can consume by capping aggregate resource requests, aggregate resource limits, or object counts. Resource quotas help prevent applications within one namespace from consuming so many resources that they starve other workloads of the resources they need to run properly.
Resource quotas aren’t required. They’re an optional feature that you can enable on a namespace-by-namespace basis. If you don’t set a resource quota for a given namespace, Kubernetes will allow applications in the namespace to consume as many resources as the cluster has available (provided you haven’t configured any limit ranges, which we’ll talk about in just a second).
Resource quotas vs. limit ranges vs. requests
Resource quotas are only one of several ways to manage resource allocations in Kubernetes. The other main methods include:
- Limit ranges: Limit ranges set minimum and maximum resource levels for pods and containers. They’re similar to resource quotas, with the main difference being that limit ranges manage resources for individual pods and containers, whereas resource quotas apply to an entire namespace (and a namespace can contain multiple pods and containers). Also, resource quotas can restrict the total number of objects allowed within a namespace. Limit ranges can’t control object count.
- Requests: As we mentioned, resource requests define the minimum resources to assign to a container or pod. Like limit ranges, resource requests apply to individual pods or containers, not to entire namespaces. Unlike limit ranges and resource quotas, requests identify only the minimum resources to allocate; workloads may consume additional resources beyond those configured via the request, so long as more resources are available (learn more about Kubernetes requests vs. limits).
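To make the contrast with resource quotas concrete, here is a minimal sketch of a limit range and of a pod that sets its own requests and limits. The object names, the nginx image, the my-namespace namespace, and every value shown are assumptions chosen for illustration, not recommendations:

```yaml
# Illustrative only: names, image, and values are assumptions.
# A limit range constrains each container individually.
apiVersion: v1
kind: LimitRange
metadata:
  name: example-limit-range
  namespace: my-namespace
spec:
  limits:
    - type: Container
      min:              # smallest request a container may declare
        cpu: 100m
        memory: 64Mi
      max:              # largest limit a container may declare
        cpu: "1"
        memory: 512Mi
      defaultRequest:   # applied when a container omits requests
        cpu: 200m
        memory: 128Mi
      default:          # applied when a container omits limits
        cpu: 500m
        memory: 256Mi
---
# Requests and limits are set per container in the pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  namespace: my-namespace
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 250m
          memory: 128Mi
        limits:
          cpu: 500m
          memory: 256Mi
```

Neither object says anything about the namespace as a whole, which is the gap that resource quotas fill.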
Why Kubernetes resource quotas matter
Kubernetes resource quotas are important primarily because they offer a convenient way of divvying up CPU, memory, and other resources between multiple workloads or sets of users. By extension, they help to manage overall cluster performance and prevent a single workload or set of workloads from monopolizing all of a cluster’s resources – which could lead to issues like Kubernetes CPU throttling.
A key benefit of resource quotas is that, again, they set limits on resource usage levels on a per-namespace basis. And because a single namespace can host multiple workloads and/or be shared by multiple users, the ability to manage resources at the namespace level is a convenient way to define resource limits for an entire set of workloads or resources through just one configuration.
Without a resource quota feature, you'd have to rely on per-workload requests and limits (optionally constrained by limit ranges) to keep resource usage in check. You could approximate the same outcome that way, but it would be more work: because those settings cap each pod or container individually rather than the namespace total, you'd have to budget resources for each workload and revisit that budget whenever you added a new one. With resource quotas, a single configuration automatically applies to all workloads within a namespace – including any workloads that you add to the namespace after deploying the resource quota.
How do Kubernetes resource quota limits work?
Resource quota limits work as follows: First, you write YAML code that defines a resource quota, such as the following:
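A minimal sketch of such a quota, written to match the limits described for my-namespace. The object name example-quota is an assumption, and 8Gi and 16Gi (binary units) stand in for the 8 and 16 gigabytes mentioned in the text:

```yaml
# Sketch only: the name "example-quota" is an assumption.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: example-quota
  namespace: my-namespace
spec:
  hard:
    pods: "10"             # total pod count
    requests.cpu: "4"      # sum of CPU requests across all pods
    requests.memory: 8Gi   # sum of memory requests across all pods
    limits.cpu: "10"       # sum of CPU limits across all pods
    limits.memory: 16Gi    # sum of memory limits across all pods
```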
This quota defines the following limits for the namespace named my-namespace:
- Total pod count cannot exceed 10.
- Total CPU requests can't exceed 4 cores.
- Total memory requests can't exceed 8 gigabytes.
- Total CPU limits can't exceed 10 cores.
- Total memory limits can't exceed 16 gigabytes.
Note that these values apply to the combined total across all workloads within the namespace, not to any single pod. For instance, the memory limits declared by all pods in the namespace cannot add up to more than 16 gigabytes, so their aggregate memory consumption cannot exceed that amount either.
To apply the resource quota, save the code as a file, then apply it with kubectl apply -f, passing the path to that file.
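Once you've applied a resource quota, Kubernetes automatically enforces it whenever objects are created or updated: any request that would push the namespace over the quota is rejected. Workloads that were already running in the namespace when you deployed the resource quota count toward its usage totals, but they are not terminated if they already exceed it; the quota then governs any workloads that join the namespace afterwards. Also note that once a quota covers CPU or memory, new pods in that namespace must declare the corresponding requests or limits (or inherit defaults from a limit range), or Kubernetes may reject them.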
Components of a Kubernetes resource quota
Quotas can be broken down into several distinct components.
1. Quota scope
Scopes are a way of defining which specific types of resources a quota applies to. For instance, the following resource quota includes a BestEffort scope, which means it will apply only to pods that are configured with the BestEffort quality of service (QoS). Other pods will not be impacted by the quota, even if they’re in the relevant namespace:
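A minimal sketch of a quota with a BestEffort scope. The object name, the namespace, and the pod count of 5 are assumptions for illustration. With this scope, the pod count is the resource the quota tracks:

```yaml
# Sketch only: the name, namespace, and count are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: best-effort-quota
  namespace: my-namespace
spec:
  hard:
    pods: "5"      # at most 5 BestEffort pods in the namespace
  scopes:
    - BestEffort   # match only pods with the BestEffort QoS class
```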
2. Hard vs. soft quotas
In general terms, a resource limit can be either "hard" or "soft." A hard quota is strictly enforced, meaning Kubernetes won't allow requests, limits, or object counts to exceed the quota. A soft quota would simply generate a warning when workloads go above the configured level.
As you'd expect, the hard: descriptor within a resource quota spec defines quotas as "hard." There is no way within a resource quota to define a soft limit that generates notifications in response to exceeded quotas. If you want an early warning, you need to monitor usage yourself: a resource quota's status reports the used and hard values for each resource, and you can compare the two to alert before the hard limit is reached.
3. Namespace
As we've mentioned, resource quotas are enforced at the namespace level. You configure the namespace to which a quota should apply using the namespace: field in the quota's metadata.
Except in the case where a quota scope (see above) restricts which types of pod a resource quota governs, all pods and containers within a namespace will be subject to a resource quota.
Kubernetes resource quota examples
To contextualize further how resource quotas work, let’s take a look at additional examples of common resource quota configurations.
Example 1: Set a quota for compute resources
The following resource quota object example restricts compute resources by constraining CPU requests and limits:
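A minimal sketch of a compute quota along these lines. The object name, the namespace, and the values are assumptions for illustration:

```yaml
# Sketch only: the name, namespace, and values are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-namespace
spec:
  hard:
    requests.cpu: "2"   # sum of CPU requests across the namespace
    limits.cpu: "4"     # sum of CPU limits across the namespace
```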
Example 2: Storage resource quotas in action
This resource quota restricts total storage consumption. It also restricts the total count of persistent volume claims:
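A minimal sketch of a storage quota along these lines. The object name, the namespace, and the values are assumptions for illustration:

```yaml
# Sketch only: the name, namespace, and values are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: my-namespace
spec:
  hard:
    requests.storage: 100Gi        # total storage requested by all PVCs
    persistentvolumeclaims: "10"   # total number of PVCs
```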
Example 3: Object count quotas
This example restricts object counts for several types of objects (such as pods, services, replication controllers, and secrets) within a namespace:
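A minimal sketch of an object count quota along these lines. The object name, the namespace, and the counts are assumptions for illustration:

```yaml
# Sketch only: the name, namespace, and counts are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-count-quota
  namespace: my-namespace
spec:
  hard:
    pods: "20"
    services: "10"
    replicationcontrollers: "5"
    secrets: "15"
```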
Example 4: Extended resource quota
Starting with Kubernetes 1.10, resource quotas can manage extended resources. Extended resources are resources that exist at the node level (meaning they are part of the servers that form a Kubernetes cluster) but are not directly managed as objects by Kubernetes. Examples include GPUs, FPGAs, and other types of custom hardware.
As an example, this extended resource quota object sets a limit on how many GPUs a namespace can use:
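A minimal sketch of an extended resource quota along these lines. It assumes the cluster's nodes advertise GPUs under the resource name nvidia.com/gpu; the object name, the namespace, and the count of 4 are also assumptions. Extended resources can't be overcommitted, so the quota uses only the requests. prefix:

```yaml
# Sketch only: assumes nodes expose GPUs as "nvidia.com/gpu";
# the name, namespace, and count are also assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
  namespace: my-namespace
spec:
  hard:
    requests.nvidia.com/gpu: "4"   # total GPUs requested in the namespace
```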
Types of Kubernetes resource quotas
As you may have gathered if you read the preceding section, there are four main types of resource quotas in Kubernetes:
- Compute resources quotas, which restrict requests and limits for CPU and memory.
- Storage resource quotas, which cap the total storage that persistent volume claims in a namespace can request (expressed in units such as gigabytes). You can also restrict the number of storage-related objects, like persistent volume claims (although this is arguably a type of object count quota rather than a storage quota).
- Object count quotas, which constrain the total objects of a certain type (like pods, services, and persistent volume claims) allowed in a namespace.
- Extended resource quotas, which manage extended resources – meaning resources (like GPUs) that are available from nodes but are not standard Kubernetes objects.

Pros and cons of Kubernetes resource quotas
Resource quotas offer many advantages, as well as several drawbacks to consider. Here’s a look at the main pros and cons.
Pros
The main advantages of resource quotas include:
- Improved resource utilization: Resource quotas help to make the most of the resources available within a cluster. For instance, a namespace that hosts critical production workloads could receive more resources than one used for dev/test.
- Simple limit configurations: Because resource quotas let you manage resources at the namespace level, they are a fast and easy way to apply requests and limits across multiple workloads.
- Granular controls: Features like quota scopes provide a level of granularity that is useful when you need to control resource usage only for certain types of workloads, rather than an entire namespace.
Cons
The drawbacks of resource quotas are mainly:
- Complexity: Resource quotas can be complex to set up and manage, especially when they include granular configuration options like quota scopes.
- Limitations on application performance: If you set quotas too low, applications may not have enough resources to operate properly.
- Limited granularity: While resource quotas include some level of granularity because you can configure them to match pods of a certain type, they don't offer as much control as you'd get from limit ranges, which constrain each pod or container individually.
Best practices for working with resource quotas
To get the most from resource quotas while limiting your risk, consider the following best practices:
- Set up namespaces strategically: Resource quotas are of little value if all of your workloads exist in the same namespace or are scattered at random across namespaces. Avoid this by creating a logical set of namespaces (such as one for each team in your organization) and assigning workloads to them accordingly.
- Start high and scale down: As a rule, it’s better to set generous resource limits initially, then reduce them once you’ve confirmed that a namespace doesn’t need as many resources. This is better than allocating insufficient resources and running into performance issues as a result.
- Monitor and adjust quotas: You won’t know how many resources your workloads actually need unless you monitor them continuously. Based on actual Kubernetes metrics, adjust your quotas as needed.
- Include both resource requests and limits: As a best practice, you should generally set both resource requests and resource limits within resource quotas. If you configure just one or the other, you don’t get the full value from resource quotas because you’re not setting a range of acceptable resource usage.
Managing resource quotas with groundcover
Above, we’ve told you all about how to work with resource quotas. But we haven’t answered the toughest question of all, and that is which resource and object limits you should actually configure when creating a resource quota.

Solving that quandary is where groundcover comes in. By providing comprehensive visibility into how many resources your workloads are actually consuming – and by allowing you to monitor workloads on a per-namespace, per-pod or per-container basis – groundcover clues you into how many resources you should assign. It also lets you know when a lack of available resources threatens workload or cluster performance.
Resource quotas: Making the most of cluster performance
Again, no one’s forcing you to use resource quotas in Kubernetes. You can choose not to set them and simply hope for the best.
But if you want to avoid situations where excessive resource usage by a workload or namespace threatens your cluster’s health, we strongly recommend creating resource quotas for each namespace. They’re not all that hard to set up – and managing them is certainly easier than having to sort through massive performance issues caused by problematic resource usage patterns you could have prevented through a resource quota.