Making a business case for Observability
Observability is quickly shifting from a practice that people close to engineering knew was useful to one that everyone recognizes as useful for both technical and business reasons.
A handful of posts from April highlighted this, none more than a post from The Pragmatic Engineer by Gergely Orosz. The post digs into what caused Atlassian’s recent week-long outage, which hit their SLA promises hard, at over 15% below their target. I’m sure Atlassian has an observability solution, but as this newsletter has mentioned a few times before, if it doesn’t help you remediate issues, is it worth having?
Turning to an industry that certainly generates a lot of data, but whose tech stacks you rarely hear much about, this post from Stella Udovicic digs deep into how observability helps gaming companies balance the plethora of infrastructure and services they need.
Next, something from our friends at Logz.io, discussing one of our favorite topics: who in a company should “own” observability? They argue that a centralized team is needed. What do you think?
Finally, one from our own blog that sums up all these discussions nicely, as our CEO, Martin Mao, predicts that observability data will hit a crucial tipping point sometime this year.
OpenTelemetry roars ahead
If Python is more your thing, then here’s a great post from Thomas Zach.
If OpenTelemetry is completely new to you and you want to get up to speed, start with this post from Jeffrey Lean, and then dig in really deep with this new comprehensive guide from New Relic.
Metrics data underpins so much of what we cover in this newsletter and all the content we link to, so it’s important to understand how to make the most of that data before you add more on top.
Prometheus is one of the most common metrics formats. Nearly a year after publication, our introduction to PromQL post continues to be popular, and it gives a great overview of metric types and how to make them useful to you.
If you want to learn more about the metric types specifically, then this post from Timescale is recommended reading.
These last two posts didn’t quite fit anywhere else, but I wanted to include them.
When thinking of observability, we often think of generating data from HTTP APIs and with SDKs in application code. But another increasingly popular developer tool is GraphQL (we use it at Chronosphere!), and this fantastic post from Ankit Anand covers how to report meaningful metrics from GraphQL APIs.
Wow! Manning has always been a respected publisher of technical books, and they have a handful of books covering Observability-related topics in progress right now. So if you’re looking for long reads, take a look.