(Hands-on) Set Up a Kubernetes Infrastructure Dashboard with Grafana + Prometheus


Introduction

Prometheus and Grafana are growing in popularity and, not surprisingly, so is the demand for the skills to manage them. This article is a hands-on, step-by-step tutorial on how to set up a basic infrastructure dashboard for monitoring your Kubernetes pods.

Prerequisites

Building the Infrastructure Dashboard

1.) Log in to Grafana, hover over the “+” button in the left-hand column, and click “Dashboard”.

2.) Let’s name our dashboard by clicking on the save icon in the top right-hand corner of the screen, giving our dashboard a name (K8s Infrastructure), and clicking “Save”.

3.) To add a new panel we click the button in the top right that looks like a graph with a plus sign. Then we click “Add new panel” on the new window that appears.

4.) Create a CPU Panel.

a. Give our panel a title (CPU)

b. Ensure our data source is set as Prometheus

c. Add our Prometheus query

Prometheus Query:

sum(rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[5m]) * 1000) by (container)

Breaking down the query it says:

  • container_cpu_usage_seconds_total{container!="POD",container!=""}
    • Grab the CPU usage metrics for all containers where the container name isn’t blank or upper case POD
  • rate([5m])
    • The container_cpu_usage_seconds_total metric is a counter, so graphing it directly would just show a line that keeps going up. Taking the rate gives the average per-second increase of the counter over the last 5 minutes (5m), which is the number of CPU cores in use.
  • (* 1000)
    • Multiplying by 1000 converts cores to millicores, which makes the graph more readable. In Kubernetes you set requests and limits in thousandths of a CPU, so a value of 100 on the graph is equivalent to 100m of CPU.
  • sum() by (container)
    • The sum section aggregates the numbers by container. For example, if you had 3 Grafana containers running, it would combine their CPU usage in the graph. Feel free to play around with the key you want to aggregate this by.
  • Legend: {{ container }}
    • This tells Grafana to find the value for the key “container” and set it as the label for the graph. If you get rid of this, you’ll notice the labels look a little clunkier.
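
To make the breakdown above concrete, here is a rough Python sketch of the same arithmetic: take the per-second rate of each counter over a 5 minute window, multiply by 1000 to get millicores, and sum by container. The sample readings and function names are made up for illustration, and the real rate() function also handles counter resets and extrapolation, which this sketch ignores.

```python
from collections import defaultdict


def per_second_rate(first, last, window_seconds):
    """Average per-second increase of a counter across the window (no reset handling)."""
    return (last - first) / window_seconds


def millicores_by_container(samples, window_seconds=300):
    """samples: (container, first_value, last_value) tuples in CPU-seconds, one per series."""
    totals = defaultdict(float)
    for container, first, last in samples:
        # Mirrors rate(...[5m]) * 1000, then sum(...) by (container).
        totals[container] += per_second_rate(first, last, window_seconds) * 1000
    return dict(totals)


# Hypothetical counter readings taken 5 minutes (300 s) apart.
samples = [
    ("grafana", 100.0, 130.0),     # 30 CPU-seconds in 300 s = 0.1 cores
    ("grafana", 50.0, 65.0),       # 15 CPU-seconds in 300 s = 0.05 cores
    ("prometheus", 200.0, 260.0),  # 60 CPU-seconds in 300 s = 0.2 cores
]

usage = millicores_by_container(samples)
for container, millicores in sorted(usage.items()):
    print(f"{container}: {millicores:.0f}m")
```

The two grafana series are combined into a single value of roughly 150m, which is what the sum() by (container) step does on the graph.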

d. Set the legend to “{{ container }}” so that the names on the graph look nice

e. I also like to toggle “As Table” & “To the right” to make the dashboard’s legend look cleaner.

f. Click the “Go Back” Arrow in the top left – resize your panel to your liking and save your dashboard. Onto the next panel!

5.) Create a Memory usage panel.

a. Name the panel (Memory)

b. Ensure our data source is set as Prometheus

c. Add our Prometheus query

Prometheus Query:

sum(container_memory_working_set_bytes{container!="POD",container!=""}) by (container)

d. Set the legend to “{{ container }}”

e. Toggle “As Table” & “To the right” under the ‘Legend’ section on the right-hand side, just like we did in step 4.e.

f. Set the ‘Unit’ to ‘bytes(SI)’ for the ‘Left Y’ axis underneath the ‘Axes’ section on the right. Grafana will now show kB, MB, and GB labels on the Y axis instead of out-of-context numbers.

g. Click the “Go Back” Arrow in the top left
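
If you are curious what the SI byte unit is doing to the raw numbers, the Python sketch below scales a byte count by powers of 1000 and attaches a label. The function name and the one-decimal formatting are my own choices for illustration; Grafana's actual rendering may round differently.

```python
def format_bytes_si(value):
    """Scale a byte count by powers of 1000 and attach an SI unit label."""
    units = ["B", "kB", "MB", "GB", "TB"]
    size = float(value)
    for unit in units:
        # Stop once the number is below 1000, or when we run out of units.
        if abs(size) < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


print(format_bytes_si(512))            # 512.0 B
print(format_bytes_si(1_500_000))      # 1.5 MB
print(format_bytes_si(2_000_000_000))  # 2.0 GB
```

A working set of 1500000 bytes is much easier to read as 1.5 MB, which is the whole point of setting the unit.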

6.) Create a network panel.

As you can see the process is fairly repetitive. So for the next step I’ll just give you the Prometheus query and you can make sure the panel looks nice.

This panel will have 2 queries on the same graph – one for the bytes received (1st) and one for the bytes sent (2nd). To add a second query click the ‘+ Query’ button underneath your first query.

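Prometheus Query 1: sum (rate (container_network_receive_bytes_total{kubernetes_io_hostname=~"^$Node$"}[1m]))

Prometheus Query 2: - sum (rate (container_network_transmit_bytes_total{kubernetes_io_hostname=~"^$Node$"}[1m]))

The leading minus sign on the second query plots bytes sent below the zero line, mirroring bytes received. Both queries filter on $Node, a Grafana dashboard variable that must already exist on your dashboard; if you have not defined one, create it in the dashboard settings or remove the kubernetes_io_hostname matcher.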
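
If a panel comes up empty, it can help to run the query against Prometheus directly. The Python sketch below fills in the $Node variable with a plain string replacement and builds a URL for the Prometheus instant query endpoint (/api/v1/query). The base URL http://localhost:9090 and the ".*" stand-in for $Node are assumptions for illustration, and Grafana's own variable interpolation is more involved than this naive replacement.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def expand_variables(query, variables):
    """Naively replace $Name placeholders; a stand-in for Grafana's interpolation."""
    for name, value in variables.items():
        query = query.replace(f"${name}", value)
    return query


def instant_query_url(base_url, query):
    """Build a URL for the Prometheus HTTP API instant query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


RECEIVE = (
    "sum (rate (container_network_receive_bytes_total"
    '{kubernetes_io_hostname=~"^$Node$"}[1m]))'
)

# Assumption: match every node, and Prometheus is reachable on its default port.
expanded = expand_variables(RECEIVE, {"Node": ".*"})
url = instant_query_url("http://localhost:9090", expanded)

# Round-trip check: decoding the URL gives back the expanded query.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

Opening the printed URL in a browser (or with curl) returns the raw result as JSON, so you can tell whether the problem is the query or the panel settings.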

Conclusion

You should now have a basic infrastructure dashboard that tracks CPU, Memory, and Network I/O. You will definitely want to visualize more than just these limited metrics, but hopefully you feel more confident creating additional panels in Grafana with Prometheus metrics.
