by Jess Lulka
Many people today are asking “What is OpenTelemetry?” OpenTelemetry (OTel) is an observability framework and toolkit for users to create, process, and export telemetry such as traces, metrics, and logs. Its main purpose is to provide a singular standard for code instrumentation and telemetry data transport to an observability tool backend.
As a Cloud Native Computing Foundation (CNCF) project, OpenTelemetry is vendor- and tool-agnostic. Developers can use it with a variety of observability backends, including open source tools such as Jaeger and Prometheus in addition to observability vendors. However, it is not a storage backend or a frontend UI for visualization and querying.
The OpenTelemetry Collector helps developers monitor microservice application health, capture requests between microservices, trace resource use to specific user groups, create tiered requests for priority resources, and process and transform telemetry before exporting.
To understand OpenTelemetry more fully, let’s take a look at its components, benefits, and tradeoffs.
OpenTelemetry gives organizations control over the telemetry pipeline by processing telemetry before it is sent on to an observability vendor. Open source instrumentation libraries for popular languages and frameworks also mean freedom from vendor lock-in when migrating between platforms.
Before OpenTelemetry, each observability vendor had its own libraries. This made it tough for companies that relied on multiple vendors to collect and export data types, as they couldn’t easily jump between different libraries or various telemetry types.
This meant telemetry was often completely siloed, and a business’s competitive advantage depended on which products it chose for telemetry collection. It also affected how well you could import and work with data, depending on the vendor’s library and feature set.
OpenTelemetry removes the need to have multiple instrumentation libraries and streamlines data collection across your observability backend. This makes it easier to get and share telemetry data. Engineers not only get data faster; they also need to maintain only one telemetry collector – one based on open source code – instead of proprietary, vendor-specific code.
Like the old Java tagline “write once, run anywhere,” OpenTelemetry allows you to “instrument once and export anywhere.”
As a toolkit and a framework, OpenTelemetry includes:
A cross-language specification that defines the telemetry data model
APIs and SDKs for instrumenting applications in popular languages
The OpenTelemetry Collector, which receives, processes, and exports telemetry
The OpenTelemetry Protocol (OTLP) for transporting that data
Semantic conventions for naming telemetry consistently
All of these components come together to provide a robust foundation for data collection and porting into observability software.
OpenTelemetry brings three main benefits to organizations: A large open source community, increased interoperability, and backend tool flexibility.
Large open source community: A wide variety of users and contributors means that development teams can use OTel in house and can get support through online research and user communities, instead of solely relying on vendor support or wrangling with customer service.
Compatibility across different tools and services: Vendor- and ecosystem-agnostic telemetry collection reduces the required number of tools and lets developers monitor their ecosystem of third-party services without setting up complex integrations that require additional monitoring and operation.
Flexibility to change backend observability tools: Because OpenTelemetry is open source, it gives developers the flexibility to add a telemetry collector without investment in proprietary software or having to set up specific infrastructure for it to run (which adds a point of failure and consumes resources). It also provides a way for organizations to avoid vendor lock-in and make future migrations more efficient.
While OpenTelemetry brings a lot of benefits to observability telemetry collection, there are still things organizations should know during the evaluation process.
OpenTelemetry is currently most useful to backend developers looking to understand a system’s inner workings; support for frontend, client, and browser data is still experimental and in its early stages.
Some languages are better supported than others, so check what compatibility your developers need before going all in on OpenTelemetry adoption. The specification is stable and complete, but SDK maturity still varies from language to language.
Generally, OpenTelemetry is not a direct replacement for Prometheus. While both tools work with open source telemetry collection, they offer different feature sets. It’s not a direct, feature-to-feature comparison.
Each project can ingest the other’s data, so they often coexist within environments.
OpenTelemetry is still a fairly new project, but developers are already seeing its impact within the observability space as adoption accelerates. In 451 Research’s “Voice of the Enterprise (VotE): DevOps, Organizational Dynamics” 2021 survey, responding organizations had already adopted observability (47%) or were in the discovery stage of adopting it (21%).
Going forward, OpenTelemetry’s standardized processes and frameworks should provide greater flexibility to change backend tools as organizations streamline observability, which boosts the project’s progress and helps teams avoid vendor lock-in.
Once organizations judge it mature enough to use as a de facto standard, OpenTelemetry lets them spend less time instrumenting for data collection and more time leveraging the data, without adding application performance overhead.
As a cloud native, open source platform, Chronosphere provides both community support and product support for OpenTelemetry. Within our product, Chronosphere ingests OpenTelemetry metrics without a server-side component and is currently integrating the same support for traces. Recent contributions to the OpenTelemetry project include a Jaeger Remote Sampling extension and several bug fixes.
Chronosphere believes the future of foundational observability is open source. We are committed to providing a fully open source compatible observability solution from a commercial SaaS vendor.
Part of this mission is growing and improving the open source ecosystem. In 2022, Chronosphere and PromLabs donated PromLens to the Prometheus organization to make it easier to build PromQL queries and expand cloud native standard adoption.
But our commitment to open source didn’t stop there. Let’s take a look at a few highlights of what we’ve been up to lately:
First, congratulations to all the Perses contributors on the exciting public Alpha release announced today at PromCon EU 2023.
What is Perses and why is it so exciting? The project is:
Chronosphere has worked closely with Amadeus and others to reach this milestone. In fact, more than 50% of the non-bot-based commits were from Chronosphere employees.
The work done here will be used to power Chronosphere dashboards, and we look forward to seeing this exciting project flourish.
Previously, customers who wanted to send trace data to Chronosphere needed to set up an OpenTelemetry Collector, set up a Chronosphere Collector, send data from OpenTelemetry to the Chronosphere Collector, and then from the Chronosphere Collector to the Chronosphere platform.
Customers now can send trace and metric data directly from the OpenTelemetry Collector to Chronosphere, eliminating the dependency on Chronosphere software client-side.
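As a rough sketch of what this looks like (the endpoint and header name below are placeholders, not Chronosphere’s documented values), sending data directly means pointing the Collector’s standard OTLP exporter at your tenant and adding it to the relevant pipelines:
exporters:
  otlp:
    # Placeholder tenant endpoint and auth header; use the values from your Chronosphere account
    endpoint: <your-company>.chronosphere.io:443
    headers:
      API-Token: <your-api-token>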
Chronosphere recently made contributions that enhance the Jaeger Remote Sampling extension of the OpenTelemetry Collector. This extension lets the Collector serve sampling strategies following Jaeger’s remote sampling configuration API. The changes add support for:
This enhancement is available in v0.83.0 and higher of the OpenTelemetry Collector.
These contributions to both Perses and OpenTelemetry help establish a more robust foundation for open source, cloud native observability and continue to build out Chronosphere’s functionality to provide an open source SaaS tool.
Remember the days when we were suspicious about buying clothing online? Shoes, maybe, but a sweater – no way! What if it didn’t fit or the pictures were misleading? These purchases were deemed too risky thanks to clunky return processes and policies designed for in-person shopping. Today, no one even blinks before buying clothes online knowing that the return process is seamless with pre-addressed bags and USPS pickups right at your front door. While we still value the ability to “try before you buy,” overall our options have increased thanks to the wide world of e-commerce.
Now think about experiences you’ve had trying out a new observability vendor – it’s basically the polar opposite. Backing out of a wrong choice is not as straightforward as returning a sweater: it’s common to become locked in financially by a contract and technologically by proprietary instrumentation libraries that make leaving little more than a dream.
To be frank, the entire migration process between observability solutions isn’t straightforward and is highly dependent on your engineering teams’ efforts, system expertise, and enablement support. There is no magic “easy” button here and I am wary of anyone promising a way to “future-proof” anything in technology — the pace of progress and innovation is fast and ever-changing. Standardizing on instrumentation formats, common metadata, processing rules or templatizing dashboards and monitors company-wide can ease the burden of a migration — but these efforts often come too late and increase the scope of a migration project.
What is achievable today is the ability to easily try observability solutions before you buy, thanks to vendor-neutral open source instrumentation standards and increased interoperability between telemetry formats. OpenTelemetry’s savvy decision to support a wide range of data formats means that you can pipe all your signals and configure export destinations through one component, the Collector, instead of a multitude of signal- or format-specific agents. Piping your existing telemetry through the OpenTelemetry Collector enables you to quickly ship data to multiple solutions, easing the first step of piloting new observability platforms today.
Nobody has ever said: “I love migrating observability vendors.”
The fact of the matter is that no one really wants to embark on migrating between observability tools. It’s a thorny project that combines high visibility with high criticality, and it’s an exciting way for your SRE team to discover unowned services.
Migrations are often driven by the need to reduce the financial cost, meaning your migration deadline hinges on the end date of your original contract. This leaves your engineering group tasked with exploring options, selecting a new solution, and completing the migration of data, dashboards, and monitors on a fixed timeline.
How did you wind up with an ill-fitting tool in the first place? Common reasons include:
Getting from “let’s start a pilot” to actually being able to query data and fully evaluate a solution’s features was an exercise in frustration when proprietary instrumentation was a truly closed system. It meant finding a few teams with time to add new instrumentation, getting the ops team to figure out how to configure, deploy, and operate some vendor-specific agent and components, and then hoping engineers were intentional about spending time to learn, explore, and use the solution and provide helpful feedback to inform a go/no-go decision. If that wasn’t exhausting enough, you’d repeat this cycle for as many vendors and solutions as you were willing to explore, all while navigating mysterious and obtuse pricing models to figure out whether this would actually be cheaper than the previous solution.
The more time you have to spend on instrumentation, configuration, and setup, the less time there is to learn your way around a new product, test out common workflows, and give informed, quality feedback. Without engineers actually getting to use the solutions for real-world investigations, you risk ending up with an ill-fitting tool and could wind up in this same position two to three years down the road, after the immediate financial pain is soothed.
What’s maddening is that telemetry is your organization’s data. It’s information about the performance of key features, customer experience, service health, infrastructure, and third-party dependencies. Not being in full control of how or where this data is sent is a major challenge when it comes to piloting a new observability platform.
Addressing this pain has brought together popular open source observability projects and all of the major observability and monitoring vendors, who are driving to create industry standards and interoperability through OpenTelemetry. A great example is the recent work to merge the Elastic Common Schema (ECS) into the OpenTelemetry Semantic Conventions, creating a single open schema that brings more consistent signals across metrics, logs, and traces while also adding key security domain fields. Another is the joint effort between Prometheus and OpenTelemetry that enables each project to ingest or translate metrics between their different formats.
This era of collaboration and co-opetition is bringing OpenTelemetry’s stated goal of enabling effective observability through high-quality, ubiquitous, and portable telemetry closer with every release.
The OpenTelemetry Collector is a vendor-agnostic way to receive, process and export telemetry data. Not just traces, but also metrics and now logs! This would all start to feel like yet another way to get locked-in if the Collector only spoke OpenTelemetry Protocol (OTLP), but the power comes from the choice to integrate with other data formats, like ones you’re likely already running. Options include Prometheus, FluentBit, StatsD, Zipkin, and yes even vendor-specific receivers.
In addition to exporting telemetry in a wide variety of formats, another powerful feature of the Collector is the ability to process data as it passes through. From configuring sampling to deriving metrics from spans or enriching data with business-specific attributes, there are endless possibilities with processors.
It is this combination that makes the OpenTelemetry Collector so compelling — the ability to receive and send a wide variety of telemetry formats. It removes the immediate need for re-instrumentation, exposes many ways to interact with and enhance your telemetry on your terms, and enables you to configure when and where that data is sent and stored. In a world with portable telemetry, vendors can no longer rely on locking you into their ecosystem and leaving you to suffer from sweeping pricing changes and unpredictable bills.
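As a rough illustration of how this fits together, here is a minimal sketch of a Collector configuration that receives several formats and fans telemetry out to more than one destination. The receiver, processor, and exporter names are standard Collector components; the endpoints and scrape target are placeholders to replace with your own:
receivers:
  # Accept OTLP over gRPC and HTTP from OpenTelemetry SDKs
  otlp:
    protocols:
      grpc:
      http:
  # Scrape an existing Prometheus metrics endpoint (placeholder target)
  prometheus:
    config:
      scrape_configs:
        - job_name: 'app'
          static_configs:
            - targets: ['localhost:9091']
  # Accept spans from existing Zipkin instrumentation
  zipkin:
processors:
  # Batch telemetry before export to reduce outbound requests
  batch:
exporters:
  # Placeholder destinations; swap in the backends you are evaluating
  otlphttp:
    endpoint: https://otel.backend-one.example.com
  prometheusremotewrite:
    endpoint: https://metrics.backend-two.example.com/api/v1/write
service:
  pipelines:
    traces:
      receivers: [otlp, zipkin]
      processors: [batch]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [otlphttp, prometheusremotewrite]
Piloting an additional backend is then a matter of appending another exporter to the relevant pipeline; the applications emitting the telemetry never change.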
The first wave of early adopters has been finding success with the Collector, contributing back upstream to ensure it is production ready.
Earlier this year at the SCaLE conference, software engineer Paak Quansah presented PlutoTV’s journey deploying the Collector as part of a larger effort to standardize their approach to observability. Their setup will sound familiar to most — PlutoTV was contending with multiple instrumentation formats and pipelines and was looking to consolidate tooling and use a single vendor. What’s different is their experience while evaluating multiple observability vendors. Instead of the pain described above, installing the Collector enabled PlutoTV to send telemetry to multiple vendors without requiring any additional instrumentation or changes at the application level. The Collector only needed a configuration update to know where to send the data!
Another success story came from Adobe, which has many different engineering teams with different processes, libraries, formats, and a mix of observability vendors and open source solutions. It was quite a challenge for their SRE teams to manage, so they adopted the Collector. In the words of the Adobe team:
At Open Source Summit, Shubhanshu Surana, an SRE at Adobe who is focused on observability, presented OTel Collector: The Swiss Army Knife of Observability. During his presentation he said,
With the introduction of the OpenTelemetry Collector, as well as the OTLP format, it became super easy for us; we are able to send their data to multiple vendors and multiple toolings with just a few changes on our side.
With these enterprises pioneering use of the Collector at scale, we can soon leave the days of painful observability evaluations in the past.
There has never been a better time to explore your options for observability tooling. So is trying out alternative platforms with the Collector as easy as trying and returning a sweater bought online? Almost. Thanks to the efforts of countless open source contributors and vendor collaboration unlocking the power to route all sorts of telemetry through the Collector, pilot pain is significantly lessened.
The evaluation phase is just the beginning and the partner you choose will influence how successful and smooth the rest of the migration goes. If open source observability is new to your organization, finding a vendor with deep expertise that can advise and support your SRE team, especially as they get things set up, helps get the data flowing.
A transparent pricing model lets engineering leaders, finance, and executives rest easy knowing there will be no unpleasant billing surprises. Stellar customer support means getting questions answered quickly and accurately. A variety of training and enablement formats means more engineers can use the platform to effectively investigate and understand system behavior.
If that doesn’t sound like your current experience with an observability vendor, consider Chronosphere, where we will partner with you at every step of your migration journey.
OpenTelemetry is commonly used across the software industry. But ask any engineer about OpenTelemetry, and the extent of their knowledge roughly equates to “uh, I use it so I don’t have to be vendor locked”. In our latest ELI5 blog post, we uplevel your knowledge of OpenTelemetry’s utility as a universal collector for observability data.
OpenTelemetry is a standard approach to collecting observability data. As part of the Cloud Native Computing Foundation (CNCF), OpenTelemetry provides APIs and SDKs in more than a dozen languages to help you instrument, generate, collect, and export telemetry data: metrics, traces, and logs.
But what exactly does all this mean and why is it important? Taking a page out of the “explain it like I’m 5” playbook, let me break it down into simpler terms.
To illustrate OpenTelemetry’s importance (and to stretch the ELI5 analogy even further), let’s talk about a game popular with children of all ages: Minecraft. In order to properly play Minecraft, you need some hardware building blocks: a laptop, monitor, mouse, keyboard, and headset. And for all of this peripheral hardware to work as it should, each device needs to communicate its respective inputs and outputs. Unfortunately, each device has a different connector design: the monitor takes HDMI; the headset uses an audio jack; and the mouse and keyboard are USB-B.
All these differentiated inputs mean lugging around cords, adapters, and dongles just to play the damn game. What a pain. Not only that, some vendors (like Apple) made proprietary interfaces (like Lightning) such that you can’t switch platforms without changing hardware all over again. The result is similar to fitting round pegs into Minecraft’s block-shaped holes.
Today, there’s a much better solution: USB-C. As a universal connector with a standardized input, you can connect hardware peripherals — monitor, headset, mouse, and keyboard — such that they can all talk to each other while playing Minecraft. And if one day you decide to use a different computer, such as a PC, they are also USB-C compatible.
What USB-C is to laptops, OpenTelemetry is to your telemetry data. And just as USB-C is vendor-agnostic (remember, the U in USB stands for universal), OpenTelemetry is a universal telemetry data format compatible with any vendor. In this way, the hardware peripherals I just mentioned – mouse, keyboard, and monitor – can be seen as metaphors for different tech stacks like Google Cloud Compute Engine (GCE), Kubernetes, or any number of microservices.
What’s more, if I instrument all my technical services with an OpenTelemetry agent, I can have my telemetry data from GCE sent to any location or vendor that supports it. This is similar to how any USB-C-compatible device can be used on any computer that has a USB-C port, not just one specific computer.
Without a universal collector format like OpenTelemetry, I’d have to rely heavily on vendor tools to standardize those differentiated telemetry data types. Remember the nightmare it was connecting mice, keyboards, and monitors — and how USB-C fixed all that? In that same way, cloud native observability solutions like Chronosphere are built to be compatible with open source standards like OpenTelemetry and Prometheus.
Engineering teams need telemetry data to know about, triage, and understand the issues that occur across their systems and services. And as organizations adopt cloud native infrastructure, it’s not uncommon for systems and services to be used only for fractions of a second at a time. This means you need to set yourself up not only to get the right data, but to get it at the right intervals and the right time. Just as USB-C made it easier to adapt to different hardware types, OpenTelemetry has made it easier to instrument telemetry data sources and use services — services like Chronosphere.
Chronosphere is a SaaS cloud monitoring tool that helps teams rapidly navigate the three phases of observability. We’re focused on giving DevOps teams, SREs, and central observability teams all the tools they need to know about problems sooner, the context to triage them faster, and the insights to get to the root cause more efficiently. One of the major things that makes us different is that we were built from the ground up to support the scale and speed of cloud native systems. We also embrace open source standards, so developers can use the languages and formats they’re already familiar with, and it prevents lock-in.
What is the difference between OpenTracing and OpenTelemetry? Both are open source projects that provide standard APIs for instrumenting applications and collecting telemetry data. The main difference between the two is their focus: OpenTelemetry provides a vendor-neutral, single set of APIs for capturing metrics and traces, while OpenTracing provides a vendor-neutral API for distributed tracing only.
OpenTelemetry also provides more built-in support for metrics, including features such as automatic metrics collection and correlation, while OpenTracing focuses primarily on a consistent API for tracing and has no built-in support for metrics. The two projects are complementary and can be used together to provide a comprehensive observability solution.
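For a sense of what the vendor-neutral API looks like in code, here is a minimal sketch of manual tracing with the OpenTelemetry JavaScript API (the tracer name and attribute are arbitrary examples, not from any particular project):
const { trace } = require("@opentelemetry/api");
// The tracer name is typically your library or service name
const tracer = trace.getTracer("checkout-service");
function processOrder(orderId) {
  // startActiveSpan creates a span and makes it current for the duration of the callback
  return tracer.startActiveSpan("process-order", (span) => {
    span.setAttribute("order.id", orderId);
    // ... do the actual work here ...
    span.end();
    return true;
  });
}
Whichever SDK and exporter you configure determines where these spans end up; the instrumentation itself does not change.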
To learn more about OpenTelemetry, and how it can support your projects, get in touch with the team at Chronosphere for a demo.
Chronosphere has its eye on the future of observability. We are constantly talking to companies and industry experts about the observability challenges ahead so we can be sure Chronosphere is part of the solution. Sharing is caring, so we’ve started a video series talking about hot observability topics and we’re summarizing those discussions in some quick-read blogs.
We kick things off with the spotlight on Chronosphere co-founder and CEO, Martin Mao, who shares his insights with Chronosphere Technical Writer, Chris Ward, about why 2022 will be a big year for OpenTelemetry adoption.
Chris: OpenTelemetry has steadily gained adoption over the past couple of years. We’ve seen organizations increasingly reject an unstandardized approach to observability and stick to open standards. What do you predict for the rest of 2022 in terms of this further shift to open standards, open protocols, and, most specifically, OpenTelemetry?
Martin: As you rightly pointed out, there has been a shift toward open standards in the observability space over the past few years. We see that on the metrics side with Prometheus. With logs slightly less so, but there’s a lot of standardization in the collection agents through Fluent Bit and Fluentd. Here are a few reasons why OpenTelemetry in particular is quickly becoming a standard:
These are a few reasons why OpenTelemetry adoption has been speeding up a lot. As we talk to different companies, I think this year we’ll see OpenTelemetry gaining critical mass because:
This is great for end users because it means you don’t need to care about which vendor or open source solution you use to store the data. You do the instrumentation once, and that should future-proof you, depending on what other decisions you make down the track.
Chris: Why do you think OpenTelemetry adoption will see gains specifically this year?
Martin: It’s not just adoption of the standard, but we are seeing a critical mass of clients and programming languages.
Distributed tracing is harder than metrics and Prometheus because you need the whole end-to-end request flow brought in. We’ve been seeing companies attempt to do this for many years now, but we’re actually seeing a higher rate of success in at least the instrumentation side of things this year. It’s a culmination of years of things that are happening in the industry to get to this point.
Chris: Regarding the three data points you mentioned earlier, are there any other emerging open standards worth keeping an eye on into 2023?
Martin: Prometheus would be one for the metrics standard protocol. We’re seeing that most business software has native Prometheus integration, and, as people go cloud native, that seems to be the protocol of choice. Prometheus is quickly becoming the standard for the metric data type.
Watch the video below to listen in on the full discussion about OpenTelemetry adoption. In the coming weeks, you can expect more videos on topics ranging from high cardinality and the three phases of observability to the future of PromQL with Julius Volz. Stay tuned to this space for more, and make sure to subscribe to the Chronosphere YouTube channel so you don’t miss any future videos.
A lot of what you read around the topic of Observability mentions the benefits and potential of analyzing data, but little about how you collect it. This process is called “instrumentation” and broadly involves collecting events in infrastructure and code that include metrics, logs, and traces. There are of course dozens of methods, frameworks, and tools to help you collect the events that are important to you, and this post begins a series looking at some of those. This post focuses on introductory concepts, setting up the dependencies needed, and generating some basic metrics. Later posts will take these concepts further.
Different vendors and open source projects created their own ways to represent the event data they collect. While this remains true, there are increasing efforts to create portable standards that everyone can use and build their own features on top of while retaining interoperability. The key project is OpenTelemetry from the Cloud Native Computing Foundation (CNCF). This blog series will use the OpenTelemetry specification and SDKs, but collect and export a variety of the formats it handles.
The example for this post is an ExpressJS application that serves API endpoints and exports Prometheus-compatible metrics. The tutorial starts by adding basic instrumentation and sending metrics to a Prometheus backend, then adds more, and adds the Chronosphere collector. You can find the full and final code on GitHub.
ExpressJS provides a lot of boilerplate for creating a JavaScript application that serves HTTP endpoints, so it is a great starting point. Add it to a new project by following the install steps.
Create an app.js file and create the basic skeleton for the application:
const express = require("express");
const PORT = process.env.PORT || "3000";
const app = express();
app.get("/", (req, res) => {
res.send("Hello World");
});
app.listen(parseInt(PORT, 10), () => {
console.log(`Listening for requests on http://localhost:${PORT}`);
});
Running this now with node app.js starts a server on port 3000. If you visit localhost:3000 you see the message “Hello World” in the web browser.
This step uses the tutorial from the OpenTelemetry site as a basis with some changes, and builds upon it in later steps.
Install the dependencies the project needs: the Prometheus exporter and the base metrics SDK.
npm install --save @opentelemetry/sdk-metrics-base
npm install --save @opentelemetry/exporter-prometheus
Create a new monitoring.js file to handle the metrics functions and add the dependencies:
const { PrometheusExporter } = require('@opentelemetry/exporter-prometheus');
const { MeterProvider } = require('@opentelemetry/sdk-metrics-base');
Create an instance of a MeterProvider that uses the Prometheus exporter. To prevent conflicts with ports, the exporter uses a different port. Typically Prometheus runs on port 9090, but as the Prometheus server runs on the same machine for this example, use port 9091 instead.
const meter = new MeterProvider({
exporter: new PrometheusExporter({port: 9091}),
}).getMeter('prometheus');
Create the metric to manually track, which in this case is a counter of the number of visits to a page.
const requestCount = meter.createCounter("requests", {
description: "Count all incoming requests",
monotonic: true,
labelKeys: ["metricOrigin"],
});
Create a Map of the values based on the route (which in this case, is only one) and create an exportable function that increments the count each time a route is requested.
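A minimal sketch of that function in monitoring.js, written against the same older metrics SDK API used above (where a counter is bound to a label set before incrementing):
const boundInstruments = new Map();
module.exports.countAllRequests = () => {
  return (req, res, next) => {
    if (!boundInstruments.has(req.path)) {
      // Bind the counter to a label set for this route the first time it appears
      const labels = { route: req.path };
      boundInstruments.set(req.path, requestCount.bind(labels));
    }
    // Increment the bound counter on every request, then continue the middleware chain
    boundInstruments.get(req.path).add(1);
    next();
  };
};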
In the app.js file, require the countAllRequests function and, with Express’s .use middleware function, call it on every request.
const { countAllRequests } = require("./monitoring");
…
app.use(countAllRequests());
At this point you can start Express and check that the application is emitting metrics. Run the command below and refresh localhost:3000 a couple of times.
node app.js
Open localhost:9091/metrics and you should see a list of the metrics emitted so far.
Install Prometheus and create a configuration file with the following content:
global:
scrape_interval: 15s
# Scraping Prometheus itself
scrape_configs:
- job_name: 'prometheus'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
# Not needed when running with Kubernetes
- job_name: 'express'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9091']
Start Prometheus:
prometheus --config.file=prom-conf.yml
Start Express and refresh localhost:3000 a couple of times.
node app.js
Open the Prometheus UI at localhost:9090, enter requests_total into the search bar, and you should see results.
So far, so good, but Prometheus is more useful when also monitoring the underlying infrastructure running an application, so the next step is to run Express and Prometheus on Kubernetes.
The Express application needs a custom image. Create a Dockerfile and add the following:
FROM node
WORKDIR /opt/ot-express
# install deps
COPY package.json /opt/ot-express
RUN npm install
# Setup workdir
COPY . /opt/ot-express
# run
EXPOSE 3000
CMD ["npm", "start"]
Build the image with:
docker build -t ot-express .
Download the Kubernetes definition file from the GitHub repo for this post.
A lot of the configuration is necessary to give Prometheus permission to scrape Kubernetes endpoints; the configuration more specific to this example is the following:
apiVersion: apps/v1
kind: Deployment
metadata:
name: ot-express
spec:
replicas: 1
selector:
matchLabels:
app: ot-express
template:
metadata:
labels:
app: ot-express
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9091"
spec:
containers:
- name: ot-express
image: ot-express
imagePullPolicy: Never
ports:
- name: express-app
containerPort: 3000
- name: express-metrics
containerPort: 9091
---
apiVersion: v1
kind: Service
metadata:
name: ot-express
labels:
app: ot-express
spec:
ports:
- name: express-app
port: 3000
targetPort: express-app
- name: express-metrics
port: 9091
targetPort: express-metrics
selector:
app: ot-express
type: NodePort
This deployment uses annotations to tell Prometheus to scrape metrics from applications in the deployment, and exposes the application and metrics ports it uses.
Update the Prometheus configuration to include scraping metrics from Kubernetes-discovered endpoints. This means you can remove the previous Express job.
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'prometheus'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
- job_name: 'kubernetes-service-endpoints'
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
Create a ConfigMap of the Prometheus configuration:
kubectl create configmap prometheus-config --from-file=prom-conf.yml
Send the Kubernetes declaration to the server with:
kubectl apply -f k8s-local.yml
Find the exposed URL and port for the Express service, open it, and refresh the page a few times. Then find the exposed URL and port for the Prometheus UI, enter requests_total into the search bar, and you should see results.
The demo application works and sends metrics when run on the host machine, Docker, or Kubernetes. But it’s not complex, and it doesn’t send many useful metrics. While still not production-level complex, this example application from the ExpressJS website adds multiple routes, API-key middleware, and error handling.
Adding in the other code the demo application needs, update app.js to the following:
const express = require("express");
const { countAllRequests } = require("./monitoring");
const PORT = process.env.PORT || "3000";
const app = express();
app.use(countAllRequests());
function error(status, msg) {
var err = new Error(msg);
err.status = status;
return err;
}
app.use('/api', function(req, res, next){
var key = req.query['api-key'];
if (!key) return next(error(400, 'api key required'));
if (apiKeys.indexOf(key) === -1) return next(error(401, 'invalid api key'))
req.key = key;
next();
});
var apiKeys = ['foo', 'bar', 'baz'];
var repos = [
{ name: 'express', url: 'https://github.com/expressjs/express' },
{ name: 'stylus', url: 'https://github.com/learnboost/stylus' },
{ name: 'cluster', url: 'https://github.com/learnboost/cluster' }
];
var users = [
{ name: 'tobi' }
, { name: 'loki' }
, { name: 'jane' }
];
var userRepos = {
tobi: [repos[0], repos[1]]
, loki: [repos[1]]
, jane: [repos[2]]
};
app.get('/api/users', function(req, res, next){
res.send(users);
});
app.get('/api/repos', function(req, res, next){
res.send(repos);
});
app.get('/api/user/:name/repos', function(req, res, next){
var name = req.params.name;
var user = userRepos[name];
if (user) res.send(user);
else next();
});
app.use(function(err, req, res, next){
res.status(err.status || 500);
res.send({ error: err.message });
});
app.use(function(req, res){
res.status(404);
res.send({ error: "Sorry, can't find that" })
});
app.listen(parseInt(PORT, 10), () => {
console.log(`Listening for requests on http://localhost:${PORT}`);
});
There are a lot of different routes to try (read the comments in the original code), but here are a couple (open them more than once): localhost:3000/api/users?api-key=foo and localhost:3000/api/user/tobi/repos?api-key=foo.
Start the application with Docker as above, and everything works the same, but with more metrics scraped by Prometheus.
If you’re interested in scraping more Express-related metrics, you can try the express-prom-bundle package. If you do, you need to change the port in the Prometheus configuration and in the Docker and Kubernetes declarations to the Express port, i.e. “3000”. You also no longer need the monitoring.js file or the countAllRequests function. Read the package’s documentation for more ways to customize the metrics it generates.
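As a rough sketch of what that swap might look like (the option names come from the express-prom-bundle documentation; check them against the version you install), the middleware replaces the manual counter entirely:
const express = require("express");
const promBundle = require("express-prom-bundle");
const app = express();
// Serves default and per-route HTTP metrics at /metrics on the app's own port (3000)
app.use(promBundle({ includeMethod: true, includePath: true }));
app.get("/", (req, res) => {
  res.send("Hello World");
});
app.listen(3000);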
Chronosphere is a drop-in, scalable backend for Prometheus; book a live demo to see more.
If you’re already a customer, you can download the collector configuration file that determines how Chronosphere collects your metrics data, and add the domain of your instance and API key as base64-encoded values to the Kubernetes Secret declaration:
apiVersion: v1
data:
address: {SUB_DOMAIN}
api-token: {API_TOKEN}
kind: Secret
metadata:
labels:
app: chronocollector
name: chronosphere-secret
namespace: default
type: Opaque
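Kubernetes expects the values in a Secret’s data block to be base64-encoded; assuming a Unix shell, you can generate them like this before pasting them into the declaration:
echo -n "<your sub-domain>" | base64
echo -n "<your API token>" | base64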
Follow the same steps for starting the application and Prometheus:
kubectl create configmap prometheus-config --from-file=prom-conf.yml
kubectl apply -f k8s-local.yml
Apply the Chronosphere collector definition:
kubectl apply -f chronocollector.yaml
Again, refresh the application page a few times, and take a look in a dashboard or the metrics profiler in Chronosphere and you should see the Express metric.
This post showed you how to set up a JavaScript application to collect OpenTelemetry data using the Prometheus exporter and send basic metrics. Future posts will dig into the metrics and how to apply them to an application in more detail.