Kubernetes is ideal for microservices architectures, but it also brings complexity and adds costs. Here are some tips to keep costs in line.
Eric is Chronosphere’s Director of Technical Marketing and Evangelism. He’s renowned in the development community as a speaker, lecturer, author and baseball expert. His current role allows him to help the world understand the challenges it is facing with cloud native observability. He brings a unique perspective to the stage, with a professional life dedicated to sharing his deep expertise in open source technologies and organizations, and he is a CNCF Ambassador.
On: Apr 30, 2024
Kubernetes (or K8s) has emerged as the de facto standard for container orchestration in microservices architectures. Widespread adoption is poised to continue, too, given the Gartner prediction that 90% of global organizations will be running containerized applications in production by 2026.
Kubernetes, which helps teams run distributed systems resiliently, helps calm the chaos of cloud native environments. It can also be Exhibit A in illustrating how costs can quickly shoot up because of the additional complexity involved in moving to cloud native. That’s why it’s important to tie any Kubernetes cost optimization journey (the process of ensuring your Kubernetes infrastructure is tuned to both increase efficiency and contain costs) to your observability goals.
Kubernetes is different from traditional computing approaches. It’s dynamic, containerized, and always changing. The biggest challenge is that resources are predominantly shared and constantly shifting, meaning that traditional models of cost allocation are broken. Instead of being able to apply tagging schemes to VMs and then align those tags to teams, services, or applications, teams now consume resources on a shared cluster, which becomes a black box when it comes to cost allocation.
To allocate Kubernetes costs successfully, you must break down cluster usage by namespace, service, and labels. Even once you understand utilization at this level, there are additional considerations, such as “should I allocate costs based on utilization of memory, CPU, or some combination of the two?” The answer may not be universal, and can vary based on the workload.
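One way to picture the CPU-versus-memory question is a blended allocation, where each namespace is charged a weighted mix of its CPU share and its memory share. The following is a minimal sketch, not a prescribed method: the `NamespaceUsage` type, the `allocate_cost` function, the `cpu_weight` parameter, the namespace names, and all figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NamespaceUsage:
    namespace: str
    cpu_core_hours: float    # CPU consumed over the billing period (illustrative)
    memory_gib_hours: float  # memory consumed over the billing period (illustrative)


def allocate_cost(usages, cluster_cost, cpu_weight=0.5):
    """Split cluster_cost across namespaces by a blended CPU/memory share.

    cpu_weight=1.0 allocates purely on CPU, 0.0 purely on memory.
    """
    if not 0.0 <= cpu_weight <= 1.0:
        raise ValueError("cpu_weight must be between 0 and 1")
    total_cpu = sum(u.cpu_core_hours for u in usages)
    total_mem = sum(u.memory_gib_hours for u in usages)
    if total_cpu <= 0 or total_mem <= 0:
        raise ValueError("total CPU and memory usage must be positive")

    allocation = {}
    for u in usages:
        cpu_share = u.cpu_core_hours / total_cpu
        mem_share = u.memory_gib_hours / total_mem
        blended = cpu_weight * cpu_share + (1.0 - cpu_weight) * mem_share
        allocation[u.namespace] = round(cluster_cost * blended, 2)
    return allocation


# Hypothetical usage figures for three namespaces on one shared cluster.
usages = [
    NamespaceUsage("checkout", cpu_core_hours=600, memory_gib_hours=1000),
    NamespaceUsage("search", cpu_core_hours=300, memory_gib_hours=2000),
    NamespaceUsage("batch", cpu_core_hours=100, memory_gib_hours=1000),
]
print(allocate_cost(usages, cluster_cost=10_000, cpu_weight=0.5))
```

Changing `cpu_weight` shifts cost toward CPU-heavy or memory-heavy namespaces, which is why the right weighting can differ from one workload to the next.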
With Kubernetes environments, organizations face the “continuous risk of cost escalation due to lack of controls,” according to the EMA report Observability: Challenges, Priorities, Adoption Patterns, and Solutions.
In the report, Torsten Volk, EMA’s Managing Research Director, Hybrid Cloud, Software-Defined Infrastructure, and Machine Learning, writes that “misconfiguring a Kubernetes cluster to log and poll telemetry data at a detailed level and in very short intervals can lead to negative surprises when the monthly observability bill arrives. Combine this with a lack of data retention policies; detailed monitoring for Amazon EC2, RDS, and other public cloud services; a large number of custom metrics, and the detailed logging of serverless invocations, and you may receive a bill that you will have to explain to your boss.”
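The effect of polling interval on telemetry volume is simple arithmetic, sketched below. The function name, the 100,000 active series, and the intervals are hypothetical values for illustration, and the sketch counts data points only; it does not model any vendor's pricing.

```python
SECONDS_PER_30_DAYS = 30 * 24 * 60 * 60  # 2,592,000 seconds


def samples_per_month(active_series, scrape_interval_seconds):
    """Data points produced in a 30-day month for a given scrape interval."""
    if scrape_interval_seconds <= 0:
        raise ValueError("scrape interval must be positive")
    scrapes = SECONDS_PER_30_DAYS // scrape_interval_seconds
    return active_series * scrapes


# Compare the same (hypothetical) 100,000 series at three polling intervals.
for interval in (60, 15, 1):
    total = samples_per_month(active_series=100_000, scrape_interval_seconds=interval)
    print(f"{interval:>2}s interval: {total:,} samples/month")
```

Under these assumptions, moving from a 60-second to a 15-second interval quadruples the number of data points collected, before any new metrics are added.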
Organizations must understand and align a number of variables to contain costs related to any Kubernetes distribution. Because of the way teams provision containers and services, it can be easy for costs to spiral out of control. For example, an oversized container image can get propagated widely, and quickly cause costs to spike. In addition, developers have the ability to “reserve” cluster capacity that they may or may not actually need. Typically these reservations are made based on a best guess, and then rarely revisited. It’s not uncommon to see clusters that are fully provisioned, but not very highly utilized, because of these reservations. These are just a few of the reasons that Kubernetes cost optimization is a completely different ball game.
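The gap between reserved and used capacity can be checked with a small comparison like the one below. This is a sketch under stated assumptions: the `find_overprovisioned` function, the 50% threshold, the workload names, and the CPU figures are all hypothetical, and in practice the requested and used values would come from your own cluster metrics.

```python
def find_overprovisioned(workloads, threshold=0.5):
    """Return (name, utilization) pairs for workloads using less than
    `threshold` of their requested CPU, lowest utilization first.

    `workloads` is an iterable of (name, requested_cores, used_cores).
    """
    flagged = []
    for name, requested_cores, used_cores in workloads:
        if requested_cores <= 0:
            continue  # no reservation to compare against
        utilization = used_cores / requested_cores
        if utilization < threshold:
            flagged.append((name, round(utilization, 2)))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical reservations and average usage, in CPU cores.
workloads = [
    ("checkout-api", 4.0, 3.2),
    ("search-indexer", 8.0, 1.2),
    ("report-batch", 2.0, 0.4),
]
for name, utilization in find_overprovisioned(workloads):
    print(f"{name}: using {utilization:.0%} of requested CPU")
```

Revisiting the flagged reservations periodically is one way to keep a best-guess number from becoming a permanent cost.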
The following factors illustrate the intricacies of Kubernetes observability and the importance of strategic management. When not carefully assessed and managed, each has the potential to spike costs:
Variable | Traditional vs. K8s Monitoring | Business Impact |
---|---|---|
Users | IT only vs. More teams: DevOps to security engineers | Organizations experience data demand surges from more K8s users tracking and analyzing the system, spiking costs. |
Data resolution | Low-res metrics vs. High-res metrics / +1,000x resolution | Teams want the precise, real-time analytics that come from monitoring K8s clusters with high-resolution metrics, leading to higher storage needs and costs. |
Entities | Finite number of servers vs. Hundreds of pods and containers | Every container, pod, and service in a K8s cluster becomes an entity needing its own set of metrics and logs, causing monitoring entities to increase a hundredfold, and because each entity emits its own set of metrics, collected data volume escalates. |
Release frequency | Planned only vs. Continuous deployment / +20x | With continuous deployment, each release is likely to introduce new metrics or alter existing ones, so observability footprints grow. |
Tool | Single tool monitoring vs. Tool multiplicity / +10x | Because K8s environments typically use different tools for monitoring different aspects, complexity and licensing costs rise. |
Data volume | Manageable vs. Growing exponentially / +100x | The distributed nature of K8s leads to a substantial increase in generated data that needs to be managed. |
Collected data | Discrete vs. Significant overlap / 50-100% | Different teams and tools collecting the same data causes redundancy and adds unnecessary cost. |
Data transfer | Occasionally vs. Often / +500x | K8s environments regularly move data across systems in multi-cloud and hybrid worlds, increasing network usage and data egress fees. |
Infrastructure | Consistent vs. Increasing / +5x | Infrastructure as well as storage and power must scale to support growing observability, boosting costs. |
Management | As expected vs. Complexity increases overhead / +5x | K8s requires more resource management: integrating tools, managing data from many sources, and navigating complex pricing models. |
Volk’s EMA report confirms these challenges, explaining “The pitfalls of overpaying stem from a lack of granularity since cloud native apps temporarily utilize infrastructure, yet observability platforms may charge for longer durations. The complexity of various factors, such as data ingestions, query complexity, and multi-cloud integrations, contribute to unpredictable monthly bills.
“Furthermore, the risk of cost escalation is prevalent due to potential misconfigurations, like overly detailed logging in Kubernetes clusters or insufficient data retention policies,” he adds.
Cost transparency leads to Kubernetes cost optimization. Cost visibility also prevents surprises and finger pointing while allowing teams to more effectively perform budget trend analyses across teams running independent microservices. It also enables organizations to set quotas for teams based on observability spend.
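Once spend is visible per team, a quota check becomes a simple comparison. The sketch below assumes per-team spend figures are already available; the `quota_report` function, the team names, and the dollar amounts are hypothetical and used only for illustration.

```python
def quota_report(spend_by_team, quota_by_team):
    """Return {team: fraction of quota used}; values above 1.0 are over quota.

    Teams with a quota but no recorded spend are reported as 0.0.
    """
    report = {}
    for team, quota in quota_by_team.items():
        if quota <= 0:
            raise ValueError(f"quota for {team!r} must be positive")
        spent = spend_by_team.get(team, 0.0)
        report[team] = round(spent / quota, 2)
    return report


# Hypothetical monthly observability spend and quotas, in dollars.
spend = {"payments": 1200.0, "search": 450.0}
quotas = {"payments": 1000.0, "search": 900.0, "platform": 500.0}

report = quota_report(spend, quotas)
over_quota = [team for team, used in report.items() if used > 1.0]
print(report)
print("Over quota:", over_quota)
```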
Download the full EMA report Observability: Challenges, Priorities, Adoption Patterns, and Solutions to learn more about controlling Kubernetes costs and other ways observability will make your cloud native journey a success.