An eye on Observability for March 2022

on March 29th 2022

The sun is rearing its sleepy head across much of the northern hemisphere, making news a little quiet this month. But we have a few choice items for you, and with the first in-person conference season in a while on the horizon, there’s plenty more to come!

What’s your query?

PromQL is a fundamental part of turning Prometheus metrics into something meaningful and actionable. However, it’s a query language with a steep learning curve and concepts that are confusing to newcomers.

This is a good opportunity to mention one of our older, but most popular, posts, “an introduction to PromQL”. Once you’re done there and want to dig into more details, here are two posts from this month for just that!

Andy Kuszyk at Form3 digs into one of the more fundamental Prometheus metric types, the histogram. Histograms are useful for summarizing the distribution of metrics data, but Prometheus treats the concept somewhat differently than you might expect, and Andy covers the salient points and tips for using them.
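To give a feel for what “differently” means here, below is a minimal Python sketch (not taken from Andy’s post) of how a quantile can be estimated from Prometheus-style histogram buckets. It assumes cumulative buckets keyed by upper bound, a lower bound of 0 for the first bucket, and linear interpolation inside the matching bucket, which is roughly what PromQL’s `histogram_quantile` does. The function name `estimate_quantile` and the bucket values are made up for illustration.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with the +Inf bucket, mirroring how Prometheus
    exposes `le` buckets. This is an illustrative sketch only.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so nothing to estimate

    rank = q * total
    prev_bound, prev_count = 0.0, 0  # assume the first bucket starts at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile sits in the +Inf bucket, so the best we can
                # report is the highest finite upper bound.
                return prev_bound
            if count == prev_count:
                return prev_bound  # empty bucket, nothing to interpolate
            # Linear interpolation within the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical request-duration buckets, in seconds.
example_buckets = [(0.1, 20), (0.5, 70), (1.0, 95), (math.inf, 100)]
print(estimate_quantile(0.9, example_buckets))
```

With these made-up buckets the 90th percentile works out to roughly 0.9 seconds. The point to notice is that the result is an estimate bounded by the bucket layout, so the accuracy depends heavily on how you choose bucket boundaries.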

Another commonly used metric type is the summary, which is useful for measuring latencies. Again, it’s useful, but easy to misunderstand. If you want to understand it better, read Fabian Stäber’s detailed post on the Grafana Labs blog.

Long reads

Two long reads released in the past month look at the wider Observability landscape. 

First is a report from the CNCF Observability Technical Advisory Group (TAG) that highlights a topic this newsletter has mentioned a few times. We have more tools than ever, but are they helping us understand our systems?

Second, Alex Boten has a full book in progress with Packt Publishing. There’s not too much detail right now, but keep an eye on the book’s website for updates.

Metrics are everywhere

At home I have a Raspberry Pi hosting Nextcloud, Homebridge, and other servers. Occasionally, while digging around the admin interfaces, I’ve seen API endpoints for Prometheus metrics and wondered, “what could I do with these?”. This fantastic post from Amon Reich shows I wasn’t the only one, and in it he details how he built dashboards based on data emitted from a variety of smart home devices in senior housing. These dashboards help identify potential risks and health issues, and are a fantastic example of what else you can use metrics to track.
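If you are wondering the same thing about your own devices, here is a small Python sketch of reading the Prometheus text exposition format that such endpoints return. It is not from Amon’s post: the metric names and sample text are invented, and the parser only handles simple sample lines (no escaping edge cases, no exemplars), so treat it as a starting point rather than a full implementation.

```python
import re

# name, optional {labels}, value, optional integer timestamp
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simple sketch cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical output from a smart home device's metrics endpoint.
EXAMPLE_TEXT = """\
# HELP room_temperature_celsius Hypothetical sensor reading.
# TYPE room_temperature_celsius gauge
room_temperature_celsius{room="kitchen"} 21.5
room_temperature_celsius{room="bedroom"} 19.0
door_open_events_total 7
"""

for name, labels, value in parse_metrics(EXAMPLE_TEXT):
    print(name, labels, value)
```

In practice you would let Prometheus scrape the endpoint and build dashboards on top of it, but parsing a response by hand like this is a quick way to see what a device actually exposes.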
