Recommendations for setting up a logging mechanism in Kubernetes
In traditional server environments, application logs are written to a file such as /var/log/app.log. However, when working with Kubernetes, you need to collect logs for multiple transient pods (applications) across multiple nodes in the cluster, making this log collection method less than optimal.
Ways to collect logs in Kubernetes
Basic logging using stdout and stderr
The default Kubernetes logging framework captures the standard output (stdout) and standard error output (stderr) from each container on the node to a log file. You can see the logs of a particular container by running the following commands:
$ kubectl logs <pod-name> -c <container-name> -n <namespace>
For a previously failed container:
$ kubectl logs <pod-name> -c <container-name> -n <namespace> --previous
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted along with their logs.
Cluster-level logging using a node logging agent
With a cluster-level logging setup, you can access logs even after the pod is deleted. The logging agent is commonly a container that exposes logs or pushes logs to a backend. Because the logging agent must run on every node, the best practice is to run the agent as a DaemonSet.
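Conceptually, such a node-level agent looks like the following heavily trimmed DaemonSet sketch. The image name, namespace, and labels here are placeholders, not a real manifest; real agents such as Fluent Bit ship complete manifests and Helm charts.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-agent        # hypothetical agent name
  namespace: logging
spec:
  selector:
    matchLabels:
      name: logging-agent
  template:
    metadata:
      labels:
        name: logging-agent
    spec:
      containers:
        - name: agent
          image: example.com/logging-agent:latest   # placeholder image
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log   # node log directory the agent tails
```

Because it is a DaemonSet, the scheduler places exactly one agent pod on each node, and the hostPath mount gives that pod read access to the container log files written on its node.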
Managing logs on different platforms
Google Kubernetes Engine (GKE) cluster
For container and system logs, GKE by default deploys fluent-bit, a per-node logging agent that reads container logs, adds helpful metadata, and then stores them in Cloud Logging. The logging agent checks for container logs in the following sources:
- Standard output and standard error logs from containerized processes
- kubelet and container runtime logs
- Logs for system components, such as VM startup scripts
For events, GKE uses a deployment in the kube-system namespace which automatically collects events and sends them to Logging. For more details, see Managing GKE logs.
Use kubectl get pods -n kube-system to ensure the fluent-bit pods are up and running.
Sample output:
% kubectl get pods -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
event-exporter-gke-857959888b-mc44k   2/2     Running   0          8d
fluentbit-gke-6zdgb                   2/2     Running   0          8d
fluentbit-gke-85mc8                   2/2     Running   0          8d
fluentbit-gke-mbgkx                   2/2     Running   0          8d
Read logs
To view logs on Google Cloud Logs Explorer, see Gcloud Logs Explorer.
To fetch logs through the command line, use the gcloud logging read command.
gcloud logging read 'severity>=DEFAULT AND
resource.type="k8s_container" AND
resource.labels.container_name="<container name>" AND
resource.labels.pod_name="<pod name>" AND
resource.labels.namespace_name="<namespace name>" AND
resource.labels.location="us-west1-a" AND
resource.labels.cluster_name="<cluster name>" AND
timestamp>="2023-04-29T11:32:00Z" AND timestamp<="2023-05-29T12:09:00Z"' \
--format=json --order=asc | grep -i textPayload > ~/gcloudlogging.log
This command fetches the textPayload field from all the logs in the given timestamp range for the container, pod, and cluster mentioned in the command, and populates the gcloudlogging.log file with that information.
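The final grep stage of the command above does nothing Cloud-specific: it simply keeps the lines of the JSON output that mention textPayload. A minimal local sketch of that filtering step, using made-up lines rather than real gcloud output:

```shell
# Only the lines carrying a textPayload field survive the filter.
printf '%s\n' \
  '"textPayload": "app started",' \
  '"severity": "INFO",' \
  '"textPayload": "request served"' \
  | grep -i textPayload
```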
Amazon EKS cluster
The Amazon EKS cluster does not come with any per-node logging agent installed. For logging purposes, you can install an agent such as the fluent-bit DaemonSet to aggregate Kubernetes logs and send them to AWS CloudWatch Logs. See the AWS documentation Set up Fluent Bit as a DaemonSet. Also, verify the IAM permissions before setting up fluent-bit; see Verify prerequisites.
Read logs
To view logs on AWS CloudWatch, see View log data.
To fetch logs through the command line, use the aws logs filter-log-events command. This command needs the arguments --log-group-name and --log-stream-names, which can be obtained from the following commands:
% aws logs describe-log-groups
{
"logGroups": [
{
"logGroupName": "/aws/containerinsights/openebs-demo/application",
"creationTime": 1685007094462,
"metricFilterCount": 0,
"arn": "arn:aws:logs:us-east-1:<accountNumber>:log-group:/aws/containerinsights/openebs-demo/application:*",
"storedBytes": 125735395
},
...
]
}
% aws logs describe-log-streams --log-group-name /aws/containerinsights/openebs-demo/application
{
"logStreams": [
{
"logStreamName": "aerospike-init",
"creationTime": 1685431444031,
"arn": "arn:aws:logs:us-east-1:<accountNumber>:log-group:/aws/containerinsights/openebs-demo/application:log-stream:aerospike-init",
"storedBytes": 0
},
...
]
}
% aws logs filter-log-events \
--start-time `date -d 2023-04-30T12:32:00Z +%s`000 \
--end-time `date -d 2023-05-30T12:34:40Z +%s`000 \
--log-group-name <application log group name> \
--output json --log-stream-names <log stream names> | jq '.events[].message' > ~/awsevents.log
This command fetches the message field from all the logs in the given timestamp range from the log stream mentioned in the command, and uses that information to populate the awsevents.log file.
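Two details of this pipeline are easy to get wrong: --start-time and --end-time expect milliseconds since the epoch (hence the 000 appended to the seconds that date prints), and jq extracts only the message field from the events array. Both steps can be checked locally without touching CloudWatch (GNU date assumed; on macOS use gdate or date -j -u -f instead, and the sample JSON below is made up):

```shell
# Epoch milliseconds for --start-time: seconds from GNU date plus "000".
date -u -d 2023-04-30T12:32:00Z +%s000

# Pull just the message field out of a sample filter-log-events response.
echo '{"events":[{"message":"pod started"},{"message":"volume mounted"}]}' \
  | jq '.events[].message'
```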
On-Premises or Self-Managed cluster
There are several Kubernetes logging stacks that can be implemented in any kind of cluster, including:
- EFK (Elasticsearch, FluentD, and Kibana)
- ELK (Elasticsearch, Logstash, and Kibana)
- PLG (Promtail, Loki, and Grafana)
PLG (Promtail, Loki, and Grafana)
The metadata discovery mechanism in the Loki stack is useful in the Kubernetes ecosystem when cost control and long-term log retention are priorities.
The PLG stack comprises the following components:
- Promtail: Responsible for data ingestion into Loki. Runs on every node of your Kubernetes cluster.
- Loki: The heart of the PLG stack; a data store optimized for logs.
- Grafana: Visualizes logs stored in Loki. We can build individual dashboards in Grafana based on application logs and metrics computed from the logs.
Install the PLG stack with Helm
- Add the Grafana repository to Helm.
% helm repo add grafana https://grafana.github.io/helm-charts
"grafana" has been added to your repositories
% helm repo update
Hang tight while we grab the latest from your chart repositories...
...
Update Complete. ⎈Happy Helming!⎈
Verify the Grafana repo in Helm:
% helm search repo grafana/
NAME                             CHART VERSION   APP VERSION   DESCRIPTION
grafana/enterprise-logs          2.4.3           v1.5.2        Grafana Enterprise Logs
grafana/enterprise-logs-simple   1.2.1           v1.4.0        DEPRECATED Grafana Enterprise Logs (Simple Scal...
grafana/enterprise-metrics       1.9.0           v1.7.0        DEPRECATED Grafana Enterprise Metrics
grafana/fluent-bit               2.5.0           v2.1.0        Uses fluent-bit Loki go plugin for gathering lo...
grafana/grafana                  6.56.5          9.5.2         The leading tool for querying and visualizing t...
grafana/grafana-agent            0.14.0          v0.33.2       Grafana Agent
grafana/grafana-agent-operator   0.2.15          0.32.1        A Helm chart for Grafana Agent Operator
grafana/loki                     5.5.5           2.8.2         Helm chart for Grafana Loki in simple, scalable...
...
- Configure the PLG stack
Download the values file from grafana/loki-stack and configure it for your use case. In the following example, we customize the values file to deploy only Promtail, Loki, and Grafana.
loki:
  enabled: true
  persistence:
    enabled: true
    storageClassName: ssd
    size: 50Gi
  isDefault: true
  url: http://{{(include "loki.serviceName" .)}}:{{ .Values.loki.service.port }}
  readinessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45
  livenessProbe:
    httpGet:
      path: /ready
      port: http-metrics
    initialDelaySeconds: 45
  datasource:
    jsonData: "{}"
    uid: ""

promtail:
  enabled: true
  config:
    logLevel: info
    serverPort: 3101
    clients:
      - url: http://{{ .Release.Name }}:3100/loki/api/v1/push

grafana:
  enabled: true
  sidecar:
    datasources:
      enabled: true
  image:
    tag: 8.3.5
For Loki, we configure persistence to store our logs on a running Kubernetes cluster with a size of 50 GB. The disk itself is provisioned automatically through the available CSI driver.
Depending on your Kubernetes setup or managed Kubernetes vendor, you may have to provide a different storage class. Use kubectl get storageclass to get a list of available storage classes in your cluster.
- Deploy the PLG stack with Helm
% helm install loki grafana/loki-stack -n loki --create-namespace -f ~/loki-stack-values.yml
NAME: loki
LAST DEPLOYED: Thu May 25 19:21:04 2023
NAMESPACE: loki
STATUS: deployed
REVISION: 1
NOTES:
The Loki stack has been deployed to your cluster. Loki can now be added as a datasource in Grafana.
See http://docs.grafana.org/features/datasources/loki/ for more detail.
Verify the Loki pods created by the above installation:
% kubectl -n loki get pod
NAME                           READY   STATUS    RESTARTS   AGE
loki-0                         0/1     Running   0          26s
loki-grafana-7db596b95-4jdrf   1/2     Running   0          26s
loki-promtail-2fhdn            1/1     Running   0          27s
loki-promtail-dh7g2            1/1     Running   0          27s
loki-promtail-hjdm8            1/1     Running   0          27s
- Access Grafana from your local machine.
Find the Grafana password. By default, Grafana is protected with basic authentication. You can get the password (the username is admin) from the loki-grafana secret in the loki namespace with kubectl:
% kubectl get secret loki-grafana -n loki \
-o template \
--template '{{ index .data "admin-password" }}' | base64 -d; echo
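The go-template above prints the base64-encoded admin-password entry of the Secret, and base64 -d turns it back into plain text. The decoding half can be sanity-checked with a made-up value; "cGFzc3dvcmQ=" is just the string password encoded, standing in for the real secret data:

```shell
# Kubernetes stores Secret data base64-encoded; decode a sample value.
echo "cGFzc3dvcmQ=" | base64 -d; echo   # prints "password"
```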
- Port-forward from localhost to Grafana
Knowing the username and password, you can use kubectl port-forward to access Grafana from your local machine through port 8080:
% kubectl get pod -n loki -l app.kubernetes.io/name=grafana
NAME                           READY   STATUS    RESTARTS   AGE
loki-grafana-7db596b95-4jdrf   2/2     Running   0          97s
% kubectl port-forward -n loki loki-grafana-7db596b95-4jdrf 8080:3000
Forwarding from 127.0.0.1:8080 -> 3000
Forwarding from [::1]:8080 -> 3000
- Read logs
To see the Grafana dashboard, access localhost:8080. Use admin as the username, and the password you fetched from the secret. Use LogQL queries to explore the logs on the Grafana dashboard. For more details, see the official LogQL documentation.
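For example, two LogQL queries of the kind you might run in the Explore view; the first returns every log line from one container, and the second keeps only the lines containing the string error. The label values here are hypothetical:

```logql
{namespace="loki", pod="loki-0", container="loki"}
{namespace="loki", container="loki"} |= "error"
```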
Use the logcli command to fetch logs through the command line.
- Port-forward from localhost to the Loki pod
You can use kubectl port-forward to access Loki via logcli from your local machine through port 8080:
% kubectl get pod -n loki -l app=loki
NAME     READY   STATUS    RESTARTS   AGE
loki-0   1/1     Running   0          5d20h
% kubectl port-forward -n loki loki-0 8080:3100
Forwarding from 127.0.0.1:8080 -> 3100
Forwarding from [::1]:8080 -> 3100
For logcli to access Loki, export the Loki address and port number:
export LOKI_ADDR=http://localhost:8080
logcli query '{namespace="<namespace name>", pod="<pod name>", container="<container name>"}' --from "2023-05-29T11:32:00Z" --to "2023-05-30T16:12:00Z" > ~/lokilogs.log
This command fetches all the logs in the given timestamp range using the query in the command, and uses that information to populate the lokilogs.log file.