Kubernetes Logging: A Complete Guide to Efficient Management
Learn why Kubernetes logging matters and which techniques make log management in Kubernetes environments efficient.
Part of the beauty of Kubernetes is that you can tell it how you want your containerized applications to operate, and Kubernetes then attempts to manage them automatically according to your specifications.
However, just because Kubernetes tries to do what you want doesn't mean that it always actually does what you request. As with any complex system, a variety of things can go wrong in Kubernetes, and you need to monitor the status of Kubernetes continuously to stay on top of those challenges.
That's why Kubernetes logging plays a central role in running Kubernetes effectively. By understanding which log files are available in Kubernetes, what data they contain and how to analyze that data, you can gain critical visibility into what's happening with your applications and troubleshoot problems effectively when they arise.
With those goals in mind, let's walk through everything K8s admins need to know about logging. This article breaks down the Kubernetes logging architecture, explains which log files are available in a Kubernetes cluster and walks through the process of managing logs using a logging agent.
Types of logs in a Kubernetes cluster
The first step in mastering Kubernetes logging is to understand which log files exist in a Kubernetes cluster.
This is a complex topic because Kubernetes includes many moving parts, and most of those parts generate their own log files. That said, at a high level, Kubernetes logs can be broken down into three main categories:
• System logs: These logs provide information about the state of the core components of Kubernetes itself, such as the API server and scheduler.
• Application logs: These are logs that provide insight into the status and health of applications running inside containers. For example, you can find application-level error messages here.
• Audit logs: Audit logs record actions taken by human users as well as system processes inside Kubernetes. They're valuable for researching changes that took place before a problem occurred. They can also help to identify potential security risks.
By collecting and analyzing all three of these types of logs, you can achieve comprehensive visibility into all major components of your Kubernetes environment – the control plane, the nodes that host applications and the applications themselves.
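As a quick orientation, here's where each of these categories typically surfaces. The Pod, container and namespace names below are placeholders:

```
# Application logs: a container's stdout/stderr, read through the API server
kubectl logs my-pod -c my-container -n my-namespace

# System logs: control plane components, when they run as static Pods
kubectl logs -n kube-system kube-apiserver-<control-plane-node-name>

# Audit logs: written by the API server to the file configured with
# --audit-log-path (and require an audit policy to be enabled)
```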
The Kubernetes logging architecture
One factor that complicates Kubernetes logging is that many components of a Kubernetes cluster are ephemeral. For example, when a container shuts down, any data stored inside it, including its logs, disappears with it.
To solve this challenge, Kubernetes uses a logging architecture – known as cluster-level logging – that decouples log storage and lifecycles from nodes, Pods and containers. Cluster-level logging provides a separate backend to store, analyze and query logs from various sources within Kubernetes.
That said, Kubernetes doesn't provide a native storage solution for hosting log data. You have to implement that on your own, using a third-party logging solution that integrates with Kubernetes.
As for actually generating log data, Kubernetes writes logs from control plane components directly. In addition, the container runtime captures the output that applications write to their stdout and stderr streams and redirects it so that it can be stored as logs. Different container runtimes handle these streams in somewhat different ways (for example, the kubelet integration uses the CRI logging format), but they all make it possible to log data about the status of applications, as long as those applications write their output to stdout and stderr.
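For instance, with the CRI logging format, each line a container writes to stdout or stderr is stored on the node (typically under /var/log/pods) as a timestamp, the stream name, a partial/full marker and the original message. The line below is illustrative:

```
2024-04-15T09:23:41.114961123Z stdout F GET /healthz HTTP/1.1 200 2ms
```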
Kubernetes log structures
Because there are so many different types of logs in Kubernetes, the exact structure of Kubernetes log files varies. But in general, logs typically include the following structural components:
• Timestamp: Timestamps record the time at which each log entry was generated, usually in the format of YYYY-MM-DD HH:MM:SS.microseconds.
• Log level: Log level identifies the severity of the log entries. For example, entries could be categorized at the info, warning or error levels.
• Component: This identifies the component or process that generated the log entry. The component could be the Kubernetes API server, a specific pod or a container.
• Message: The message contains the actual content of the log entry, which may include details about the event or error that occurred.
• Additional fields: Depending on the logging driver and configuration, Kubernetes log files may also include additional fields, such as the Pod name and metadata, the namespace or the container ID.
In most cases, this data is sufficient for understanding not just what happened, but also for investigating the context that caused it to happen – and, by extension, for troubleshooting and remediating problems.
Kubernetes log example
As an example of a Kubernetes log file entry, here's a sample log that records data about a kubelet event.
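The entry below is illustrative rather than verbatim output; the node name, process ID, source file and exact message wording vary by Kubernetes version and by how kubelet logs are collected (here, via journald on the node):

```
Apr 15 09:23:41 node-1 kubelet[2713]: I0415 09:23:41.114961    2713 kuberuntime_container.go:421] "Started container" containerName="my-container" pod="default/my-app-6d4c7f9b8d-x2lkq"
```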
As you can see, this entry tells us that the container named my-container started at a particular time. It also identifies the Kubernetes component (kubelet) associated with the container.
If the container were to fail to become ready, this log entry might be useful because it would tell us that the container at least tried to start. With that information, we could rule out some potential causes of failure, such as Kubernetes not trying to schedule the container at all. If we wanted more context on why the container wasn't achieving the ready state, we could look at the logs from the container itself.
Rotating and archiving Kubernetes logs
In most cases, you don't want your Kubernetes logs to live forever. Keeping outdated logs wastes storage space, and if log files grow too large, it becomes difficult to search and analyze them efficiently.
That's why it's important to rotate and archive Kubernetes logs. Log rotation is the process of periodically starting a fresh log file while older files are compressed, moved to a different storage location or deleted entirely. Archiving means retaining a copy of log files for long-term storage, typically using a lower-cost and more scalable storage solution than the storage integrated into your cluster itself.
There are several ways to rotate logs on Kubernetes nodes, including the logrotate utility. Logrotate is a Linux tool that you can configure to rotate logs according to a set schedule – such as daily, weekly or monthly. Logrotate can also rotate logs automatically once log files reach a specified size, regardless of the schedule.
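For illustration, here's a hypothetical logrotate rule; the path, schedule and size threshold are examples to adapt to your environment, and some managed node images already handle container log rotation for you:

```
# Illustrative rule: rotate container logs daily, or sooner once a file exceeds 100 MB
/var/log/containers/*.log {
    daily
    maxsize 100M
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```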
Alternatively, you can handle rotation and retention through a logging tool designed to support Kubernetes, such as Fluentd or Logstash. These tools can roll over or expire the logs they collect based on factors like log file size, log age or a combination of both.
Using logging agents to achieve Kubernetes logging efficiency
An important aspect of Kubernetes logging that admins should understand is that although Kubernetes provides a cluster-level logging architecture, it doesn't offer a native tool for performing the actual cluster-level logging. Instead, admins have to implement their own method of collecting logs.
There are several common approaches to this challenge:
• Node-level logging: You can install a logging agent on every node in your cluster and use it to collect logs. Although installing logging agents on each node can be tedious, this is a relatively efficient way to gain access to all available logs.
• Sidecar containers: Sidecar containers can host logging agents inside application Pods. The sidecars collect logs and stream or aggregate them to a preconfigured location (see the sketch after this list). This approach is easier to implement than node-level logging because it doesn't require you to install logging agents on each node. The downside is that sidecars increase your cluster's resource overhead, since every Pod carries an extra container.
• Pushing logs from the backend: It's possible to push logs directly from within an application, provided your application includes logic for this purpose. This approach is the most complex to implement because it requires you to configure logging directly within your application, but it's also efficient because it doesn't require you to run standalone logging agents.
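As a sketch of the sidecar pattern, the Pod below runs an application container that writes to a file on a shared emptyDir volume, plus a sidecar that tails that file to its own stdout so the cluster's normal log pipeline can pick it up. The images, paths and commands are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  - name: app
    image: busybox:1.36                 # placeholder application image
    command: ["/bin/sh", "-c"]
    args:
    - while true; do echo "$(date) app log line" >> /var/log/app/app.log; sleep 5; done
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-streamer
    image: busybox:1.36                 # sidecar that exposes the log file as stdout
    command: ["/bin/sh", "-c", "tail -n +1 -F /var/log/app/app.log"]
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}                        # shared, Pod-scoped volume for the log file
```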
Kubernetes log collectors
To add more efficiency to Kubernetes logging and help centralize the process of collecting logs from the various Kubernetes components, you may want to use a tool like Fluentd or Logstash. These tools are log collectors that can be configured to collect log data from multiple independent sources, then push it all to a central location. In general, log collectors are easier to deploy and manage than individual node-based or sidecar-based logging agents that you manage manually.
To deploy a log collector, you create a Deployment that tells Kubernetes how to run it. For example, here's a simple Fluentd Deployment written in YAML.
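The manifest below is a minimal sketch rather than a production-ready configuration; the namespace, image tag, port and volume layout are illustrative, and the fluentd-config ConfigMap it references is assumed to exist in your cluster:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluentd-aggregator
  namespace: logging                     # example namespace
  labels:
    app: fluentd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16-1    # example image tag
        ports:
        - containerPort: 24224           # receives logs forwarded by other agents
        volumeMounts:
        - name: fluentd-config
          mountPath: /fluentd/etc        # fluent.conf is expected here
        - name: log-buffer
          mountPath: /fluentd/log        # volume for aggregating/buffering log data
      volumes:
      - name: fluentd-config
        configMap:
          name: fluentd-config           # assumed to exist
      - name: log-buffer
        emptyDir: {}
```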
As you can see, this configuration tells Kubernetes to run Fluentd as a container. It also configures storage volumes that will serve as a location for aggregating the log data.
The log collector approach improves the efficiency of Kubernetes logging because it allows you to collect and store logs from multiple locations using a centralized log stream. From there, you can easily find, search and analyze any logs that are relevant to you. In addition, you get features like built-in filtering and alerting functionality, which you can leverage to help automate the process of monitoring Kubernetes logs.
Log collectors also help to facilitate a scalable log architecture for Kubernetes. As your Kubernetes cluster grows, so does the volume of logs it generates. As long as you have log collectors configured, they can pick up new log sources and feed them into your log stream to handle the growth.
Get by with a little help from K8s logs
From control plane components to applications to security events and beyond, the Kubernetes logging architecture provides the opportunity to gain comprehensive visibility into what's happening inside your cluster. But the key word there is opportunity. Kubernetes doesn't actually collect or analyze the logs for you; it expects you to do that using one of the logging methods described above.
As you consider which Kubernetes logging strategy to adopt, evaluate factors such as how efficient you need log collection to be, how your logging needs may (or may not) scale over time and how often you need to rotate or archive logs. When you get these things right, Kubernetes logs become a K8s admin's best friend, delivering actionable observability into every facet of the Kubernetes architecture.