December is a time for basking in the glow of a menorah, hanging boughs of holly, and pondering New Year’s resolutions as you sip spiced apple cider and unwind from the chaos of 2022. Unless, that is, you’ve been handed the pager and are on-call during the holiday season.
Balancing holiday cheer and on-call rotations for one is tricky, but take it from me: two pagers under one roof is madness! Before my retirement from SRE, both my partner and I were on-call for mission-critical infrastructure and software. Sometimes our rotations synced, and sometimes we’d spend a whole month with one of us or the other on primary or secondary.
One winter, the siren call of holiday lights and ice skating beckoned. This was especially momentous for us because he had been unlucky enough to be scheduled as primary on-call over Christmas for the previous two years! But for one holiday season, we were able to enjoy a night free from thinking about computers and the spectacular ways in which they can fail.
It shouldn’t be the norm to depend on your teammates’ goodwill and open schedules to partake in holiday cheer. In fact, if you’re reading this at the beginning of December, there’s still time to invest in smooth holiday on-call operations!
Ultimately, it is the end-of-year changes in routine for people, processes, and technology that create the unique stressors of holiday on-call.
People-wise, having most of the company offline means a smaller pool of engineers and support staff is available to troubleshoot any issues that arise. When you’re flying solo, what could have been a quick Slack message saying, “Oh yeah that error? We just ignore it” turns into an anxiety-inducing investigation. Odds are your own routine is a bit different too, and you may face spotty internet connectivity while out on holiday excursions.
Process-wise, many organizations institute a deployment freeze, meaning no new code, configuration, or infrastructure changes are made during the specified window. It also means that any deploy that does have to go out likely involves new and unfamiliar steps, such as extra reviewers and special approval from leadership.
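To make the freeze rules concrete, here is a minimal sketch of how a deploy gate might encode them. Everything in it is an assumption for illustration: the `FreezeWindow` and `deploy_allowed` names, the example dates, and the two-reviewer threshold are hypothetical, and your organization’s actual policy and tooling will differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FreezeWindow:
    """Inclusive date range during which routine deploys are blocked (hypothetical)."""
    start: date
    end: date

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end


def deploy_allowed(day, window, reviewers, leadership_approved, required_reviewers=2):
    """Return True if a deploy may proceed on `day`.

    Outside the freeze, deploys follow the normal process and are allowed here.
    Inside the freeze, this sketch assumes a deploy needs extra distinct
    reviewers *and* explicit leadership approval.
    """
    if not window.contains(day):
        return True
    enough_reviewers = len(set(reviewers)) >= required_reviewers
    return enough_reviewers and leadership_approved


# Example dates are made up for illustration.
holiday_freeze = FreezeWindow(start=date(2022, 12, 19), end=date(2023, 1, 2))

if __name__ == "__main__":
    emergency_fix = deploy_allowed(
        date(2022, 12, 25), holiday_freeze, ["alex", "sam"], leadership_approved=True
    )
    print(f"Emergency deploy on Dec 25 allowed: {emergency_fix}")
```

Writing the exception path down like this, even informally, means the on-call engineer doesn’t have to discover the extra steps for the first time in the middle of an incident.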
Technology-wise, deploy freezes reduce the number of available answers when investigating issues, but bring about their own challenges when thawing out after the holiday season. One trick is to ensure your systems are ready to handle atypical traffic patterns at this time of year. For example, you’ll have different requirements to handle peak loads for an e-commerce site versus a reduced load for business tooling, such as business chat or document sharing infrastructure.
Directives meant to ease the difficulties for your holiday on-call teams actually end up placing them in uncharted waters when something does happen. Reduced access to help in troubleshooting, changed remediation processes, and modified infrastructure behavior all help ratchet up their stress levels.
Many organizations have great intentions when they implement steps to ensure a smooth, painless, and stress-free holiday on-call experience for their teams. The problem is that most of these steps add to stress levels, which is the exact opposite of the desired outcome.
I’ve compiled a list of my holiday on-call wishes. It is not a complete list of what every organization should do, but one to cherry-pick from when managers and department heads consider the sacrifices their on-call teams are making. Holiday on-call should be seen as more than engineers just doing their jobs, and the sacrifice of a holiday break deserves acknowledgment.
Here is my wish list for engineering leaders and managers to consider when implementing holiday on-call schedules:
Engineers can also take steps to help their leaders and managers improve the on-call experience for all. The following things have worked very well for me in Christmases past:
No matter what checklist you follow, what steps you take, or how happy the on-call team is with the final schedule, holiday on-call still sucks. I hope that these insights, ideas, and tips do make your organization’s holiday on-call experience just a little less stressful every year.
From the entire crew at Chronosphere, we are wishing you a zero-downtime, eventless, silent-pager, reliable, and stress-free holiday season!