As with any open source tool, organizations should have best practices in place to keep Prometheus running smoothly. For Prometheus specifically, make sure to:
Select the best standard exporter
Prometheus scrapes default metrics from exporters, so development teams should research which exporter best fits their foundational data collection and reporting needs, because the choice plays a large role in data quality and overall monitoring strategy. Use feature overviews, recent updates, user reviews, and security alerts to pick the most suitable exporter. For custom metrics, you’ll have to instrument your code manually to generate the business metrics you want to collect.
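To make manual instrumentation concrete, here is a minimal sketch of a custom business counter rendered in the Prometheus text exposition format. It uses only the Python standard library so it runs anywhere. In practice you would most likely reach for an official client library instead. The `Counter` class, the `orders_total` metric, and the `payment_method` label are all hypothetical names chosen for illustration, and the sketch skips details such as label-value escaping.

```python
import threading


class Counter:
    """Hypothetical, stdlib-only stand-in for a client-library counter."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = {}  # tuple of label values -> running total
        self._lock = threading.Lock()

    def inc(self, amount=1.0, **labels):
        """Increase the counter for one combination of label values."""
        if amount < 0:
            raise ValueError("counters can only go up")
        key = tuple(labels[name] for name in self.label_names)
        with self._lock:
            self._values[key] = self._values.get(key, 0.0) + amount

    def render(self):
        """Render the counter in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                pairs = ",".join(
                    f'{n}="{v}"' for n, v in zip(self.label_names, key)
                )
                lines.append(f"{self.name}{{{pairs}}} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical business metric: orders placed, split by payment method.
orders_total = Counter("orders_total", "Orders placed.", ["payment_method"])
orders_total.inc(payment_method="card")
orders_total.inc(payment_method="card")
orders_total.inc(payment_method="invoice")
print(orders_total.render())
```

The rendered text is what a scrape of the application's metrics endpoint would return. Serving it over HTTP is left out to keep the sketch short.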
Label carefully to avoid confusion and extra storage
Use the exporter documentation to ensure that collected data carries all the necessary context, and strive for consistent labeling across monitoring targets. Every unique combination of label values creates a separate time series in Prometheus, and each series consumes memory and storage, so keep only the labels you actually use to oversee your cloud native environment.
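A quick back-of-the-envelope calculation shows why label choices matter. The sketch below estimates the worst-case number of time series for a single metric by multiplying the number of distinct values per label. The label names and counts are made-up assumptions, and real series counts are usually lower because not every combination occurs.

```python
from math import prod


def worst_case_series(label_value_counts):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical label sets for one HTTP request metric.
base = {"method": 4, "status": 5, "instance": 20}
with_user = {**base, "user_id": 10_000}  # a high-cardinality label

print(worst_case_series(base))       # 400
print(worst_case_series(with_user))  # 4000000
```

Adding one unbounded label such as a user ID multiplies the series count, which is why identifiers like that are usually better kept in logs or traces than in metric labels.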
Set actionable alerts to reduce troubleshooting time
Monitoring strategies require planning and documentation so that developers and engineers know what’s happening when they get an alert. Before implementing Prometheus, determine which events and services are critical to monitor, what thresholds should trigger an alert, and what relevant information each alert should include.
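The planning questions above (what to watch, at what threshold, and with what context) can be sketched as a small data structure. This is an illustration in plain Python, not Prometheus rule syntax. The `AlertRule` fields loosely mirror the expression, `for` duration, and annotations of a Prometheus alerting rule, and every name and number here is a hypothetical assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AlertRule:
    name: str
    threshold: float     # value above which the condition is breached
    hold_seconds: float  # how long the breach must last before alerting
    annotations: dict = field(default_factory=dict)  # context for responders


def should_fire(rule, samples):
    """Return True if the most recent samples have stayed above the threshold
    for at least rule.hold_seconds.

    samples: list of (timestamp_seconds, value) pairs, oldest first.
    """
    breach_start = None
    for ts, value in samples:
        if value > rule.threshold:
            if breach_start is None:
                breach_start = ts
        else:
            breach_start = None  # condition cleared, reset the timer
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= rule.hold_seconds


# Hypothetical rule: checkout error ratio above 5% for five minutes.
rule = AlertRule(
    name="CheckoutErrorRateHigh",
    threshold=0.05,
    hold_seconds=300,
    annotations={
        "summary": "Checkout error ratio above 5% for 5 minutes",
        "runbook": "link to the team's runbook goes here",
    },
)

sustained = [(0, 0.01), (60, 0.08), (120, 0.09), (360, 0.07)]
spike = [(0, 0.08), (60, 0.01), (120, 0.09)]
print(should_fire(rule, sustained))  # True
print(should_fire(rule, spike))      # False
```

Requiring the breach to persist filters out brief spikes, and the annotations carry the context an on-call engineer needs to start troubleshooting right away.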
Know when you need to scale
Prometheus is an easy way to start open source monitoring for cloud native and Kubernetes environments, but it requires technical and developer resources to manage over time and has scalability limitations. It can also get complex in large-scale enterprise environments that require data duplication or have spread-out infrastructure. Be sure to have indicators in place that tell you when it is time to implement Prometheus’ federation feature, use functional sharding, or work with a vendor.
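One possible indicator is a simple growth projection: given the current number of active series and an observed growth rate, estimate how many periods remain before a capacity limit is reached. The sketch below assumes steady compound growth, which real workloads rarely follow exactly. The function name and all numbers are hypothetical.

```python
import math


def periods_until_limit(current, growth_rate, limit):
    """Estimate periods until `current` reaches `limit` at compound growth.

    growth_rate is per period (0.10 means 10% growth per period).
    Returns 0 if the limit is already reached and None if there is no growth.
    """
    if current >= limit:
        return 0
    if growth_rate <= 0:
        return None
    return math.ceil(math.log(limit / current) / math.log(1 + growth_rate))


# Hypothetical numbers: 1M active series, growing 10% per month,
# with a self-imposed limit of 2M series per Prometheus server.
print(periods_until_limit(1_000_000, 0.10, 2_000_000))  # 8
```

A short runway from a projection like this is a cue to start planning for federation, sharding, or a managed service before the limit is actually hit.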
Before Chronosphere, Abnormal Security had trouble with metrics availability and scalability; their 10-12 million active metrics were on track to hit 50 million. Anticipating this rapid growth and needing a service that could handle that many metrics, they reached out to Chronosphere to help manage their Prometheus instances. In doing so, they saw increased stability and reliability along with reduced costs.
Interested in more about the connection between Prometheus and cloud native? Learn five reasons why the two work best together.