Why is running observability in-house so expensive? It comes down to people. Generally, the most important and complex piece of an observability stack is the time series database (TSDB). While several popular TSDBs, such as Prometheus, are available as open source, they lack functionality that is desirable for a critical service running at high scale.
The same gaps extend to long-term storage options for Prometheus, such as Thanos and Cortex. It falls upon teams to either implement the additional capabilities themselves, or forgo them and deal with the management headaches that arise as a result. These feature gaps and headaches may not seem like a big issue individually, but they can be a real drag on productivity, and easily turn into “death by a thousand cuts” for an observability team that owns the system’s operation and other day-to-day tasks.
TSDBs carry all of the same considerations as any other database in a production environment, and administrative operations can be difficult to do effectively given the huge volume of data that ends up being stored in many production TSDBs. Because observability is a critical function for businesses, your TSDB must be at least as available as the production systems it enables the organization to observe. Once again, achieving this level of resiliency falls upon the team, and if they are not experts in the implemented technology, it can be a time-consuming task that is frequently marked by painful mistakes and downtime along the way.
Functional limitations
Open source software has low initial costs, but it comes with functional limitations in areas like data management, UI/UX, and customer support. Without the right internal resources, organizations will either have to hire talent to build out these capabilities or pay for software that includes them.
Overall, a vendor can provide the following for managed observability:
- Scale constraints automatically and help protect users from the effects of disruptive workloads.
- Ensure high availability, and oversee backups and disaster recovery.
- Monitor the monitoring system and confirm it runs as intended.
- Manage security measures and audit logs.
- Include additional high-level management features that are not present in open-source, such as detailed visibility into the data that is being sent/stored.
- Provide support and training to end-users to ensure they have the visibility they need into their applications and systems.
Beyond overall system management, observability vendors can provide:
Data management and format support: Managing data ingestion and retention can be quite complex for open-source TSDBs. Functions like downsampling of historical data can prove to be quite a headache for teams that are not experienced with the TSDB in question or with how to proactively monitor all of its background operations. Some open-source options may lack such features entirely, which makes storing and querying historical data significantly more expensive, and limits teams’ ability to monitor long-term trends. You should also consider what data format support your teams need, as open-source TSDBs may support only one format or have limited data compatibility. In contrast, vendor-managed solutions commonly support multiple data formats, and any challenges with managing ingestion and storage of data are handled for you.
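To make the idea of downsampling concrete, here is a minimal sketch in Python. It is not how any particular TSDB implements the feature. The function name `downsample`, the sample data, and the choice of averaging into fixed 60-second buckets are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) tuples sorted by time.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each sample to the start of the bucket it falls into.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw samples collected every 15 seconds (timestamps in seconds).
raw = [(0, 1.0), (15, 3.0), (30, 5.0), (45, 7.0), (60, 10.0), (75, 20.0)]

# Six raw samples collapse into two one-minute averages.
one_minute = downsample(raw, 60)
print(one_minute)
```

Storing one averaged point per minute instead of four raw points cuts the stored volume for that series by roughly four times, which is why the absence of downsampling makes historical data more expensive to keep and query. A production system also has to schedule this work in the background and watch for failures, which is the operational burden described above.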
UI/UX: Most open-source observability offerings and TSDBs have minimal UI support for management. Instead, your team will spend their time defining things like alerts via code only, without tools to help ensure they will behave as intended. This approach does encourage configuration-as-code, but it’s a pretty bad user experience, and the friction that individual users experience in day-to-day tasks can add up very quickly. Code-only workflows also require a level of system knowledge that your team might not have or must take extra time to learn.
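As a rough illustration of what defining alerts via code can involve, the sketch below models a threshold alert in Python and checks its behavior against synthetic data. Everything here is hypothetical: the `ThresholdAlert` class, the `fires` function, and the sample values are not taken from any real alerting system, and are only meant to show the kind of checking a team ends up writing by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdAlert:
    """Hypothetical alert definition kept in version control."""
    name: str
    threshold: float
    for_samples: int  # consecutive breaching samples required before firing


def fires(alert, values):
    """Return True if values contain enough consecutive samples above the threshold."""
    streak = 0
    for value in values:
        # Extend the streak on a breach, reset it otherwise.
        streak = streak + 1 if value > alert.threshold else 0
        if streak >= alert.for_samples:
            return True
    return False


high_latency = ThresholdAlert(name="HighLatency", threshold=0.5, for_samples=3)

# Synthetic latency series (seconds) used to check the rule before deploying it.
brief_spike = [0.2, 0.9, 0.9, 0.3, 0.9]  # never three breaches in a row
sustained = [0.2, 0.9, 0.9, 0.9, 0.3]    # three breaches in a row

print(fires(high_latency, brief_spike), fires(high_latency, sustained))
```

Even for a rule this simple, someone has to decide what inputs to test, write the test data, and keep it in sync with the rule. That is the day-to-day friction a visual editor with previews is meant to remove.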
Observability vendors can provide a better UX with proprietary dashboards that are less code-based and more visual. These managed offerings provide interactive query-building capabilities that simplify dashboard editing. Commercial observability programs have interfaces specifically designed to display alerts and events. Plus, user access management is streamlined, making it easier to get an overview at a glance.