An alert escalation can be triggered when the primary support engineer does not respond to or acknowledge an alert within the escalation policy time limit. Keeping managers and stakeholders informed during an incident can help improve confidence in the support team. Once an escalation policy has been established, alert escalations can be automated to ensure consistency.
What is Alert Escalation?
An alert escalation can happen for several reasons. When the primary support engineer does not respond to or acknowledge an alert within the escalation policy time limit, an escalation can be triggered to send reminder notifications to the primary support engineer, to escalate to one or more secondary engineers, or to escalate to a manager.
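The time-limit behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the `EscalationStep` class, the `current_target` function, the recipient names, and the wait times are all hypothetical, chosen only to show how an unacknowledged alert moves down the line.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationStep:
    target: str        # who is notified at this step (hypothetical names)
    wait_minutes: int  # how long to wait for an acknowledgement before moving on


def current_target(steps: List[EscalationStep], minutes_unacknowledged: int) -> Optional[str]:
    """Return who should be notified for an alert that has gone
    unacknowledged for the given number of minutes.

    Returns None once every step's time limit has passed.
    """
    elapsed = 0
    for step in steps:
        elapsed += step.wait_minutes
        if minutes_unacknowledged < elapsed:
            return step.target
    return None


# Assumed example policy: primary for 5 minutes, then a secondary
# engineer for 10 minutes, then a manager for 15 minutes.
policy = [
    EscalationStep("primary support engineer", 5),
    EscalationStep("secondary engineer", 10),
    EscalationStep("manager", 15),
]

print(current_target(policy, 3))   # still within the primary's time limit
print(current_target(policy, 7))   # primary's limit has passed
print(current_target(policy, 20))  # secondary's limit has passed
```

In this sketch each step's time limit is counted from the end of the previous step, so the manager is reached after 15 unacknowledged minutes. A real policy might also repeat reminder notifications within a step before escalating.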
An escalation may also be needed when the primary support engineer is deeply involved in solving an existing incident; in this case the engineer can immediately escalate to the next person in line. There may be other types of escalations as well. For example, a manager may need to be made aware of an incident, or an additional resource with specific skills may be needed. Keeping stakeholders informed of the incident’s status may also be necessary.
All escalations involve communication, and each use case may require a specific kind of communication, such as an email to stakeholders or a text message to a team member. Status updates may also be posted in a team chat tool such as Microsoft Teams or Slack. In some cases, an alert should be directed immediately to a specialist with a specific skillset.
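Directing an alert to a specialist based on its content can be sketched as a small list of rules checked in order. The alert fields (`service`, `severity`), the recipient names, and the `route` function below are assumptions made for illustration only, not a description of any particular tool.

```python
from typing import Callable, Dict, List, Tuple

Alert = Dict[str, str]
# A rule pairs a condition on the alert with the recipient to notify.
Rule = Tuple[Callable[[Alert], bool], str]

# Hypothetical rules, evaluated top to bottom; the first match wins.
RULES: List[Rule] = [
    (lambda a: a.get("service") == "database", "database specialist"),
    (lambda a: a.get("severity") == "critical", "primary support engineer and manager"),
]

DEFAULT_RECIPIENT = "primary support engineer"


def route(alert: Alert, rules: List[Rule] = RULES, default: str = DEFAULT_RECIPIENT) -> str:
    """Return the recipient of the first rule that matches the alert,
    or the default recipient if no rule matches."""
    for matches, recipient in rules:
        if matches(alert):
            return recipient
    return default


print(route({"service": "database", "severity": "critical"}))
print(route({"service": "web", "severity": "low"}))
```

Because the first matching rule wins, the order of the rules matters: here a critical database alert goes to the database specialist rather than to the manager.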
Why do you need an escalation policy?
Monitoring systems and ticketing systems lack the flexibility to address all the above scenarios. While certain ticketing systems may provide some features, these often involve writing code or engaging specialized resources to develop solutions. Without an escalation policy, critical alerts can be missed, resulting in downtime. Automated alert escalations can also help to resolve issues more quickly, by engaging other resources to help.
Without an automated alert escalation policy, manual escalations result in uneven service levels and the potential for avoidable outages. Keeping managers and stakeholders informed during an incident can help improve confidence in the support team, and stakeholders can help keep customers informed. Status pages can also help to keep stakeholders updated. Once an escalation policy has been established, alert escalations can be automated to ensure consistency in meeting service-level goals.
How does AlertOps help manage alert escalations?
AlertOps provides many standard alert escalations out of the box. When alerting the primary on-call engineer, AlertOps offers automatic escalations using all available communication channels, such as phone, SMS, email, and mobile push notifications to Android or iPhone, as well as chat communication to Microsoft Teams or Slack. When the primary support engineer does not respond within the policy time limit, AlertOps Escalation Policies will automatically escalate to the next person in line and continue until the alert is assigned. When a support team member is receiving alert notifications, they can escalate to the next person in line, thereby short-circuiting the escalation flow and saving time.
A user can add anyone to an alert at any time, or AlertOps Workflows can automatically notify managers and stakeholders over any communication channel at specified times. AlertOps also includes a Status Page, which can be used to keep subscribed users updated on an incident’s status. AlertOps rules can direct alerts to any person or team based on data in the alert. AlertOps escalation policies and Workflows can help ensure that service-level goals are being met. AlertOps postmortem reports document incident closure and can be provided to customers as evidence that the incident has been investigated and that improvements may be implemented to prevent recurrence or improve response.