Troubleshoot microservices issues faster with Differential Diagnosis (DDx)

Chronosphere is excited to introduce Differential Diagnosis (DDx). Fast troubleshooting is crucial in complex microservices environments. DDx makes it easier for developers to resolve issues without being slowed down by complexity or lengthy investigations.

Scott Kelly | Senior Product Marketer | Chronosphere

Scott Kelly is a Sr. Product Marketing Manager at Chronosphere. Previously, he worked at VMware on the Tanzu Observability (Wavefront) team and led partner go-to-market strategies for VMware’s Tanzu portfolio with AWS and Microsoft Azure. Prior to VMware, Scott spent three years in product marketing at Dynatrace. Outside of work, Scott enjoys CrossFit, tackling home improvement projects, and spending time with his family in Naples, FL.

What is DDx?

Chronosphere is excited to introduce Differential Diagnosis (DDx), which helps developers quickly find the source of service slowdowns or failures. DDx offers a simple, query-free workflow that requires no prior system knowledge, making it easy for developers at any skill level to use. In complex microservices and containerized environments, fast troubleshooting is crucial. DDx makes it easier for developers to resolve issues without being slowed down by complexity or lengthy investigations. Today, DDx is available as part of Chronosphere’s Distributed Tracing solution.

The challenges of troubleshooting in complex microservices environments

For developers, troubleshooting is part of the daily grind. The process usually begins with an alert, sometimes in the middle of the night. An on-call developer then starts to form a hypothesis about what might have caused the alert to fire. Perhaps a deployment with a breaking change went out, or something is wrong with an individual cluster or region. They gather evidence and compare it to system behavior to prove or disprove their hypothesis. The goal is to either confirm the hypothesis so they can make a change to fix the issue, or disprove it so that they can proceed to the next most likely scenario.

However, in complex microservices environments, forming, testing, and validating hypotheses can be daunting, regardless of experience level. At 3 AM, it feels even more overwhelming.

Today’s systems are too complex for humans to fully understand them. Investigations are often not straightforward and drag on or loop back to earlier steps. And most distributed tracing tools are complicated. They have steep learning curves that create barriers to practical use by the entire team.

System complexity

Microservices allow developers to focus on individual services, promoting faster development cycles without requiring a deep understanding of the entire system. Well-documented APIs guide these interactions. However, during troubleshooting, this specialization becomes a hurdle. Developers must understand the complex dependencies and interactions between services they don’t manage. This can drastically slow down the process of identifying the root cause of performance issues.

Time-consuming investigations

Microservices are inherently complex, and that complexity is compounded when performance issues arise. Troubleshooting these issues often involves manually sifting through large amounts of traces, metrics, and logs. This investigative process can take hours, or even days, for developers at all levels of expertise. The longer this process takes, the more it delays resolution, frustrating both developers and customers.

Tool complexity

Most distributed tracing tools are complex, often requiring specialized query languages. Many developers struggle to use them effectively, leading to frequent escalations and the involvement of multiple team members to troubleshoot. This not only slows down the process but also hampers overall team productivity.

Differential Diagnosis (DDx): Democratizing and accelerating the troubleshooting process

DDx transforms the troubleshooting process, enabling developers to resolve service performance issues faster. It simplifies hypothesis formation and testing, and it directs attention to likely trouble spots by highlighting them. With a user-friendly interface, DDx helps developers of all skill levels navigate insights, test hypotheses, and take action.

How DDx improves developer efficiency and team productivity

Abstracts away system complexity

DDx eliminates the need for developers to have deep system-wide knowledge. It automatically analyzes spans and span dimensions associated with the service, endpoint, or workflow in question, so developers can focus on resolving issues without needing to understand every detail of the system.

Faster troubleshooting and issue resolution

DDx ranks and highlights the most probable sources of issues, allowing developers to quickly test their hypotheses and pivot if needed. This rapid analysis leads to faster root cause identification and, ultimately, quicker problem resolution.
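To make the idea of ranking probable sources more concrete, here is a minimal Python sketch. It is not Chronosphere's implementation, which this post does not describe. It only illustrates one simple way such a ranking could work: compare how often each span dimension value appears among failing spans versus healthy ones, then sort by the gap. The span fields (`version`, `region`, `error`) and the function name `rank_suspects` are hypothetical.

```python
from collections import Counter


def rank_suspects(spans, dimensions):
    """Rank (dimension, value) pairs by over-representation in error spans."""
    errors = [s for s in spans if s["error"]]
    healthy = [s for s in spans if not s["error"]]
    if not errors or not healthy:
        return []  # nothing to compare against

    scored = []
    for dim in dimensions:
        err_counts = Counter(s.get(dim) for s in errors)
        ok_counts = Counter(s.get(dim) for s in healthy)
        for value in set(err_counts) | set(ok_counts):
            # Share of failing spans with this value minus share of healthy spans.
            err_share = err_counts[value] / len(errors)
            ok_share = ok_counts[value] / len(healthy)
            scored.append((err_share - ok_share, dim, value))

    # Largest positive gap first: most over-represented among failures.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Hypothetical sample spans: every failure comes from version v2.
spans = [
    {"version": "v2", "region": "us-east", "error": True},
    {"version": "v2", "region": "eu-west", "error": True},
    {"version": "v2", "region": "us-east", "error": True},
    {"version": "v1", "region": "us-east", "error": False},
    {"version": "v1", "region": "eu-west", "error": False},
    {"version": "v1", "region": "us-east", "error": False},
    {"version": "v1", "region": "eu-west", "error": False},
    {"version": "v2", "region": "eu-west", "error": False},
]

ranked = rank_suspects(spans, ["version", "region"])
for gap, dim, value in ranked:
    print(f"{dim}={value}: {gap:+.2f}")
```

In this sample data, `version=v2` ranks first because it appears in every failing span but in only one of the five healthy spans. That points to the "deployment with a breaking change" hypothesis rather than a regional problem.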

Built for developers of all skill levels

One of the greatest strengths of DDx is its accessibility. Developers don’t need tool or query expertise to investigate performance issues. The combination of its easy-to-use UI and automatic insights removes bottlenecks caused by skill gaps. This allows all developers to troubleshoot efficiently without waiting for assistance from their senior counterparts – increasing overall team productivity. 

Key features of DDx:

  • Automated Insights from Trace Data: DDx analyzes trace data to identify the most likely source of service slowdowns or failures, helping developers zero in on problem resolution faster.
  • One-Click Analysis: When suspicious trends or patterns are discovered, developers can dig deeper with a single click. DDx automatically compares relevant dimensions (e.g., errors, latencies) and highlights differences, enabling developers to spot issues with ease.
  • Intuitive and Flexible UI: The user-friendly interface makes it simple for developers of any experience level to investigate problems, reducing the reliance on senior team members.
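
The dimension comparison described under One-Click Analysis can also be illustrated for latency. The sketch below is an assumption-laden illustration, not how DDx works internally. It groups spans by one dimension and compares each group's median duration with the median of all other spans. The field names (`region`, `duration_ms`) and the function name `latency_gaps` are hypothetical.

```python
from collections import defaultdict
from statistics import median


def latency_gaps(spans, dimension):
    """Median latency of each dimension value minus the median of all other spans."""
    groups = defaultdict(list)
    for span in spans:
        groups[span[dimension]].append(span["duration_ms"])

    gaps = {}
    for value, durations in groups.items():
        # Durations of every span that does NOT have this dimension value.
        rest = [d for v, ds in groups.items() if v != value for d in ds]
        if rest:  # skip if there is no comparison group
            gaps[value] = median(durations) - median(rest)
    return gaps


# Hypothetical sample spans: eu-west is noticeably slower.
spans = [
    {"region": "us-east", "duration_ms": 100},
    {"region": "us-east", "duration_ms": 110},
    {"region": "us-east", "duration_ms": 120},
    {"region": "eu-west", "duration_ms": 400},
    {"region": "eu-west", "duration_ms": 420},
]

gaps = latency_gaps(spans, "region")
for value, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"region={value}: {gap:+.0f} ms vs. the rest")
```

A large positive gap for one value, here `eu-west`, is the kind of difference a developer would want highlighted before digging into individual traces.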

Conclusion

In modern microservices environments, the ability to troubleshoot quickly and efficiently is critical. DDx ensures that developers at all levels can identify and resolve performance issues without being bogged down by complexity, skill gaps, or time-consuming investigations.

With DDx, developers no longer need to be experts in query languages or system architecture to fix issues. The streamlined, insight-driven approach enables faster troubleshooting, higher productivity, and ultimately, a better user experience.

Frequently Asked Questions

1. What is being launched today?

Differential Diagnosis (DDx) is a new trace analysis feature that helps developers quickly pinpoint the likely source of service slowness or failure. Any developer, regardless of experience or knowledge of the system, can use DDx – no need to rely on the most senior developers with historical knowledge of the system.

2. Is DDx included in all Chronosphere plans?

  • Today, DDx is available as part of Chronosphere’s Distributed Tracing solution.

3. I am a Chronosphere Distributed Tracing customer. How do I enable DDx?

  • DDx is automatically enabled – no setup is required. See our docs for more information about using DDx. 
  • It’s very easy to initiate a DDx analysis within Chronosphere. You can start a DDx analysis from the DDx tab, Statistics tab, or a single span on the Trace Details page.

4. As a new customer, how easy is it to get started?

  • Just send your distributed traces to Chronosphere! DDx requires no additional setup to get started. To learn more about Chronosphere Distributed Tracing, contact us.

See Chronosphere Differential Diagnosis (DDx) in action!
