Container monitoring is the process of collecting metrics on microservices-based applications that run on a container platform. Containers are designed to spin up and shut down quickly, which makes it essential to know when something goes wrong: downtime is costly, and outages damage customer trust.
Containers are an essential part of any cloud native architecture, which makes it paramount to have software that can effectively monitor container health and optimize resources to ensure high infrastructure availability.
Let’s take a look at the components of container monitoring, how to select the right software, and current offerings.
Containers provide IT teams with a more agile, scalable, portable, and resilient infrastructure. Container monitoring tools are necessary because they let engineers resolve issues more proactively, get detailed visualizations, access performance metrics, and track changes. Because engineers get all of this data in near-real time, these tools can meaningfully reduce mean time to repair (MTTR).
Engineers must also be aware of the limitations of containers: complexity and shifting performance baselines. While containers spin up quickly, they can increase infrastructure sprawl, which means greater environmental complexity. It can also be hard to define baseline performance because containerized infrastructure constantly changes.
Container monitoring must be specifically suited for the technology; legacy monitoring platforms, designed for virtualized environments, are inadequate and do not scale well with container environments. Cloud native architectures don’t rely on dedicated hardware like virtualized infrastructure, which changes monitoring requirements and processes.
A container monitoring platform uses logs, tracing, notifications and analytics to gather data.
What does container monitoring data help users do? The software uses these methods to capture data on memory utilization, CPU usage, CPU limits, and memory limits, to name a few.
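As a rough sketch of the kind of data involved, the snippet below computes memory utilization from usage and limit samples. The metric names mirror common cAdvisor/Prometheus conventions, and the values are invented for illustration:

```python
# Sketch: compute container memory utilization from usage and limit samples.
# Metric names follow cAdvisor's conventions (container_memory_usage_bytes,
# container_spec_memory_limit_bytes); the sample values here are made up.
samples = {
    "container_memory_usage_bytes": 412_000_000,
    "container_spec_memory_limit_bytes": 512_000_000,
    "container_cpu_usage_seconds_total": 341.7,  # cumulative CPU seconds
}

def memory_utilization(samples: dict) -> float:
    """Return memory usage as a fraction of the configured limit."""
    usage = samples["container_memory_usage_bytes"]
    limit = samples["container_spec_memory_limit_bytes"]
    return usage / limit

print(f"memory utilization: {memory_utilization(samples):.1%}")
# → memory utilization: 80.5%
```

In practice a monitoring platform scrapes these samples continuously and evaluates them against limits, rather than reading a static dictionary.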
Distributed tracing is an essential part of container monitoring. Tracing helps engineers understand containerized application performance and behavior. It also provides a way to identify bottlenecks and latency problems, see how changes affect the overall system, and learn which fixes work best in specific situations. It is especially effective at revealing the path a request takes through a collection of microservices when one service calls another.
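To make the bottleneck-finding idea concrete, here is a minimal sketch: given the spans of a single trace, pick the one contributing the most latency. The services, operations, and timings are invented; real spans would come from a tracer such as Jaeger or OpenTelemetry:

```python
# Invented spans for one request trace; durations are illustrative only.
trace = [
    {"service": "api-gateway", "operation": "GET /checkout", "duration_ms": 480},
    {"service": "cart",        "operation": "load_cart",     "duration_ms": 35},
    {"service": "payments",    "operation": "authorize",     "duration_ms": 390},
    {"service": "inventory",   "operation": "reserve",       "duration_ms": 40},
]

def slowest_span(spans):
    """Return the span contributing the most latency."""
    return max(spans, key=lambda s: s["duration_ms"])

bottleneck = slowest_span(trace[1:])  # skip the root span
print(f"bottleneck: {bottleneck['service']}/{bottleneck['operation']}")
# → bottleneck: payments/authorize
```

Real tracing systems do far more (parent/child relationships, clock skew, sampling), but the core payoff is the same: the trace shows where the time went.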
More comprehensive container monitoring offerings account for all stack layers. They can also produce text-based error data such as “container restart” or “could not connect to database” for quicker incident resolution. Detailed container monitoring means users can learn which types of incidents affect container performance and how shared computing resources connect with each other.
Much like application monitoring, container monitoring requires multiple layers throughout the technology stack to collect metrics about the container and any supporting infrastructure. Engineers should make sure their container monitoring software can track the cluster manager, cluster nodes, the daemon, the container, and the original microservice to get a full picture of container health.
For effective monitoring, engineers must create a connection across the microservices running in containers. Instead of using service-to-service communication for multiple independent services, engineers can implement a service mesh to manage communication across microservices. Doing so allows users to standardize communication among microservices, control traffic, streamline the distributed architecture and get visibility of end-to-end communication.
In the container monitoring software selection process, it’s important to identify which functions are essential, nice to have or unnecessary. Tools often include these features:
Beyond specific features and functions, there are also user experience questions to ask about the software:
The right container monitoring software should make it easy for engineers to create alarms and automate actions when the system reaches certain resource usage thresholds.
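One common pattern for such alarms is to fire only when a reading stays above a threshold for several consecutive samples, similar in spirit to the `for` duration in Prometheus alerting rules. The sketch below uses invented thresholds and readings:

```python
# Minimal sketch of threshold-based alerting: fire when CPU utilization stays
# above a threshold for several consecutive samples. Values are illustrative.
THRESHOLD = 0.80   # 80% CPU utilization
FOR_SAMPLES = 3    # consecutive samples required before firing

def evaluate(samples, threshold=THRESHOLD, for_samples=FOR_SAMPLES):
    """Return True if the last `for_samples` readings all exceed threshold."""
    recent = samples[-for_samples:]
    return len(recent) == for_samples and all(v > threshold for v in recent)

readings = [0.42, 0.55, 0.83, 0.87, 0.91]
if evaluate(readings):
    print("alarm: sustained high CPU, trigger scale-out action")
```

Requiring sustained breaches rather than a single spike is what keeps alarms like this from paging on transient load.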
When it comes to container management and monitoring, the industry offers a host of open source and managed open source offerings: Prometheus, Kubernetes, Jaeger, Linkerd, Fluentd, and cAdvisor are a few examples.
Chronosphere’s offering is built for cloud native architectures and Kubernetes, helping engineering teams that collect container data at scale. The platform can monitor all standard Kubernetes cluster data, such as pod and node metrics, via standard ingestion protocols such as Prometheus.
Container monitoring software generates a lot of data. When combined with cloud native environment metrics, this creates a data overload that outpaces infrastructure growth. That makes it important to have tools that help refine which data is useful, route it to the people who need it most, and surface it on the correct dashboards.
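A common way to tame that growth is to aggregate away a high-cardinality label (such as pod name) before long-term storage, collapsing many per-pod series into a few per-service ones. The label sets below are invented for illustration:

```python
# Sketch: reduce metric cardinality by dropping a label and summing values.
from collections import defaultdict

# Invented series: (label set, value) pairs as a scrape might produce them.
series = [
    ({"service": "cart", "pod": "cart-6f9d-abc12"}, 120.0),
    ({"service": "cart", "pod": "cart-6f9d-def34"}, 80.0),
    ({"service": "payments", "pod": "payments-5c-xyz9"}, 45.0),
]

def aggregate_without(series, drop_label):
    """Sum series values after removing a label, collapsing cardinality."""
    out = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != drop_label))
        out[key] += value
    return dict(out)

rollup = aggregate_without(series, "pod")
# Three per-pod series collapse into two per-service series.
```

This is the same trade-off a control plane makes at scale: keep the signal that dashboards and alerts actually use, and shed the cardinality that nothing reads.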
The Control Plane can help users fine-tune which container metrics and traces the system ingests. With the Metrics Usage Analyzer, users regain control over which container observability data is being used and, more importantly, can see when data is not used. Users decide which data is important after ingestion with the Control Plane, so their organization avoids excessive costs across its container and services infrastructure.
To see how Chronosphere can help you monitor your container environments, contact us for a demo today.
Cloud native applications and microservices are essential for any modern business to run effectively. But tech executives can’t just buy off-the-shelf solutions or lift-and-shift legacy infrastructure into cloud native architectures without planning. Engineers need the right tools, teams, and skills, yet it can be tough to know which tools to purchase, how to implement them, and how to calculate the return on investment (ROI).
The Gartner report, “A CTO’s Guide to Navigating the Cloud-native Container Ecosystem,” states that containers and Kubernetes have emerged as prominent platform technologies for building cloud native apps and modernizing legacy workloads. By 2027, more than 90% of global organizations will be running containerized applications in production, a significant increase from fewer than 40% in 2021.
The authors also write that “enterprises face challenges in accurately measuring the ROI of their cloud native investments and in creating the right organizational structure for it to flourish.”
Here’s what enterprises should know about containers and Kubernetes, their main use cases, and how they help run cloud native architecture.
Containers are packages of application code bundled together. Kubernetes is a platform that helps manage containers.
These technologies are commonly used for microservices, application portability, and reducing the risk of vendor lock-in. They also enable DevOps workflows and legacy application modernization. Any company that decides to go cloud native or upgrade its infrastructure must use both containers and Kubernetes.
Most container images are based on open source software. A container image is the static file that holds all the executable code to create a container within a computing system.
The report highlights that, compared with open source software, where container support is already commonplace, container support for commercial off-the-shelf (COTS) applications has grown much more slowly and varies greatly by vendor.
“While some COTS ISVs [independent software vendors] strategically provide strong support for Kubernetes, such as IBM, many COTS ISVs haven’t supported yet — especially in Windows-based or enterprise business applications. You should review container support strategy and roadmaps of their strategic COTS ISVs,” the authors write.
Still, Gartner notes an increasing number of vendors are developing container support and “more ISVs are enabling deeper integrations with containers/Kubernetes than just providing container images.”
The report highlights that AWS Marketplace for Containers has 524 container-related entries as of February 2022, up 64% from 320 in February 2020.
The industry trends emerging around Kubernetes and containers include VM convergence, stateful application support, edge computing, serverless convergence, and application workflow automation.
The combination of open source and COTS applications for Kubernetes and containers provides organizations several deployment options:
Containers offer multiple benefits for organizations that specifically run cloud native architectures. They provide agile application development and deployment, environmental consistency, and immutability.
As Kubernetes runs on top of the container software, it offers flexibility and choice.
“Kubernetes is supported by a huge ecosystem of cloud providers, ISVs and IHVs [independent hardware vendors]. This API and cross-platform consistency, open-source innovation and industry support offers a great degree of flexibility for CTOs,” the Gartner authors state.
Platform complexity is something CTOs and IT managers must acknowledge, especially since containers and Kubernetes aren’t optimal for every use case. These technologies work best with dynamic, scalable environments, and they add complexity when engineers attempt to use them to manage static COTS applications.
A big part of successful container and Kubernetes implementation is making sure that there are the proper teams and skill sets in place to manage and run the technology. Organizations should invest in a variety of core and secondary roles that cover security, platform operations, reliability engineering, as well as build and release engineering. These designations can help ensure a Kubernetes deployment is secure, reliable and consistently developed for the organization at scale.
The development team should be tasked with coding, application design, implementation and testing, as well as source code management.
A platform engineering team oversees platform selection, installation, configuration, and administration. The members should also be able to maintain base images, integrate the DevOps pipeline, enable self-service capabilities for developers, automate container provisioning, and provide capacity planning and workload isolation.
Reliability engineers work on security, monitoring and performance aspects. They should focus on application resiliency; be able to debug and document production issues; and be responsible for incident management and response.
Lastly, the build and release engineering team chooses the CI/CD deployment pipeline, develops templates for new services, educates development teams, and creates dashboards to measure efficiency and productivity.
“Ensuring ROI by building a thorough business case is important to validate that you aren’t investing in containers and Kubernetes purely because it is a shiny new technology. Organizations need to take a realistic view of the costs incurred and potential benefits,” the report authors write.
The benefits of containers that organizations can measure include developer productivity, an agile CI/CD environment, infrastructure efficiency gains, and reduced operational overhead.
Potential costs that could cut into benefits are container-as-a-service/platform-as-a-service (CaaS/PaaS) license fees; additional software licenses for security, automation, and monitoring tools; infrastructure investment costs; hiring new staff to run such deployments; as well as professional implementation services to get everything online and running smoothly.
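A back-of-the-envelope calculation using the benefit and cost categories above can ground the business case. All figures below are placeholders, not benchmarks:

```python
# Illustrative ROI sketch: net annual benefit divided by total annual cost.
# Every dollar figure here is invented for the example.
annual_benefits = {
    "developer_productivity": 250_000,
    "infrastructure_efficiency": 120_000,
    "reduced_ops_overhead": 90_000,
}
annual_costs = {
    "caas_paas_licenses": 110_000,
    "security_and_monitoring_tools": 60_000,
    "new_staff_and_services": 180_000,
}

def roi(benefits, costs):
    """Return net benefit divided by total cost."""
    total_cost = sum(costs.values())
    return (sum(benefits.values()) - total_cost) / total_cost

print(f"estimated ROI: {roi(annual_benefits, annual_costs):.0%}")
# → estimated ROI: 31%
```

The point of the exercise is less the final percentage than forcing every benefit and cost category into the same ledger before committing to the platform.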
The Gartner authors recommend that technology leaders should:
Containers and Kubernetes provide a clear technological foundation for organizations and IT leaders that want to run cloud native architectures and bring legacy applications into the 21st century. For a successful deployment and a healthy ROI, though, organizations should make sure the right applications and people are in place.
Interested in how containers and Kubernetes are essential for cloud native applications? Contact us for a demo today.
The Cloud Native Computing Foundation (CNCF) has many different projects (with over 143k contributors at the time of writing), and each project falls under one of the following categories (check out the CNCF landscape to see the projects under each category):
The problem with these categories, however, is that each one contains many sub-categories, making it challenging to locate a project in a specific engineering domain or to see how various projects overlap and interact across categories. This is why the CNCF created Technical Advisory Groups (TAGs), formerly known as Special Interest Groups (SIGs), to provide technical guidance and expertise for projects in a specific domain, in particular security, app delivery, storage, network, runtime, contributor strategy, and observability.
Each TAG has an active member base that meets on a regular basis to discuss new and existing projects, address any challenges or concerns from the community, establish best practices, and provide resources and expertise around the current state and future of the domain.
Chronosphere is a silver member of the CNCF, and has been involved in the TAG Observability for the past year. This blog provides an overview of the TAG’s overall charter, as well as a recap of what was accomplished this year.
The TAG Observability’s mission statement is to “focus on topics pertaining to the observation of cloud native workloads. Additionally, it produces supporting material and best practices for end-users and provides guidance and coordination for CNCF projects working within the TAG’s scope.” You can learn more about the TAG’s scope in its GitHub repository.
The TAG meets on the first Tuesday of every month, and is led by co-chairs Matt Young, Alolita Sharma, and Richard Hartmann. Some CNCF projects tied to the TAG include Cortex, OpenMetrics, Prometheus, Thanos, Fluentd, Jaeger, OpenTelemetry, OpenTracing, Chaos Mesh, and Litmus.
The TAG Observability has had a busy and productive year. In addition to helping perform due diligence for the incubation of OpenTelemetry, Cortex, Thanos, and OpenMetrics into the CNCF, the group published a whitepaper, established new working groups, and presented at KubeCon Europe and North America. See the below recaps for more information.
The TAG published a whitepaper earlier this year with the aim of helping community members quickly get started with different approaches to observability in a cloud native world.
The paper has sections for metrics, logs, traces, and profiles, as well as an explanation around how these observability signals are correlated and should be handled. It lists different methods that companies have used when tackling common observability issues or challenges, presents tools that fall under the observability scope and where they should fit into your observability stack, and finally, shares some of the commonly known gaps in the observability market.
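One of the correlation ideas the paper describes, joining log lines to trace spans through a shared trace ID, can be sketched in a few lines. The records below are invented for illustration:

```python
# Sketch of signal correlation: log lines and trace spans share a trace ID,
# so the logs behind a slow trace can be looked up directly. Data is invented.
logs = [
    {"trace_id": "a1b2", "level": "ERROR", "msg": "could not connect to database"},
    {"trace_id": "c3d4", "level": "INFO",  "msg": "request served"},
]
spans = [
    {"trace_id": "a1b2", "service": "orders", "duration_ms": 1940},
    {"trace_id": "c3d4", "service": "orders", "duration_ms": 22},
]

def logs_for_trace(trace_id, logs):
    """Return all log lines emitted within a given trace."""
    return [line for line in logs if line["trace_id"] == trace_id]

slow = max(spans, key=lambda s: s["duration_ms"])
related = logs_for_trace(slow["trace_id"], logs)
# The slow trace's logs point straight at the database error.
```

This is why propagating a trace ID into every log line pays off: each signal on its own says something went wrong, but the join says where and why.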
Note: The whitepaper is still a work in progress. If you’re interested in participating or providing feedback, join the #tag-observability channel on the CNCF’s Slack.
TAGs can have various Working Groups (WGs) focused on addressing specific problems or areas to help improve or progress the TAG’s overall mission. This year, several WGs were created with a goal of creating more resources for new and existing members of the TAG, as well as to help grow membership and participation. The WGs are broken out as the following initiatives:
TAG leaders Bartlomiej Płotka, Richard Hartmann, and Simone Ferlin gave an update on the TAG at this year’s KubeCon Europe. A few highlights include:
Co-chairs Matt Young and Alolita Sharma gave a session focused on the TAG at this year’s KubeCon North America. In addition to providing an update on the TAG – both on what’s been accomplished in 2021 and on what to expect in the upcoming years – they also discussed:
See the full session recordings for KubeCon Europe and KubeCon North America for more insight into what was discussed.
If you’re interested in learning more about or joining the TAG, feel free to join the #tag-observability channel on the CNCF’s Slack or subscribe to the mailing list. The group also holds monthly meetings, listed on the CNCF’s events calendar, which are open to any and all interested participants.