Incident severity is a classification system that categorizes incidents by their impact on users and the business, the urgency of the required response, and the scope of the problem. A common severity scale has four levels: SEV1/Critical (complete service outage, all users affected), SEV2/Major (significant degradation, many users affected), SEV3/Minor (partial degradation, some users affected), and SEV4/Low (cosmetic issues, workaround available).
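As a rough illustration of the four-level scale, the sketch below maps a few incident attributes to a severity. The function name `classify_incident`, its parameters, and the 25% cutoff separating "many" from "some" users are assumptions made for this example; real teams set their own criteria.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Four-level scale; a lower number means a more severe incident."""
    SEV1 = 1  # Critical: complete service outage, all users affected
    SEV2 = 2  # Major: significant degradation, many users affected
    SEV3 = 3  # Minor: partial degradation, some users affected
    SEV4 = 4  # Low: cosmetic issues, workaround available


# Hypothetical cutoff between "some users" and "many users".
MANY_USERS_FRACTION = 0.25


def classify_incident(full_outage: bool, cosmetic_only: bool,
                      affected_fraction: float) -> Severity:
    """Return a severity for an incident (illustrative rules only)."""
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("affected_fraction must be between 0 and 1")
    if full_outage:
        return Severity.SEV1
    if cosmetic_only:
        return Severity.SEV4
    if affected_fraction >= MANY_USERS_FRACTION:
        return Severity.SEV2
    return Severity.SEV3
```

For example, `classify_incident(False, False, 0.4)` returns `Severity.SEV2` under these assumed rules. Using `IntEnum` keeps the levels comparable, so `Severity.SEV1 < Severity.SEV3` holds and can be used to sort incidents by severity.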
Severity levels drive the incident response process: who gets paged, through which channels, what response time is expected, whether external communication is needed, and what level of postmortem is required. A SEV1 might trigger immediate phone calls to the on-call team, leadership notification, and a status page update, while a SEV4 might simply create a ticket for the next business day.
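One way to make this mapping explicit is a small lookup table from severity to response policy. The sketch below is illustrative only: the `ResponsePolicy` fields, the channel names, and the acknowledgement times are assumptions, and the SEV2 and SEV3 rows are invented to fill in the scale between the two cases described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ResponsePolicy:
    channels: Tuple[str, ...]    # how responders are notified
    ack_minutes: Optional[int]   # expected acknowledgement time; None = no paging
    notify_leadership: bool
    update_status_page: bool
    postmortem: str              # level of postmortem required


# Hypothetical policy table. SEV1 and SEV4 follow the examples in the text;
# SEV2 and SEV3 are assumed values for illustration.
POLICIES: Dict[str, ResponsePolicy] = {
    "SEV1": ResponsePolicy(("phone",), 5, True, True, "full"),
    "SEV2": ResponsePolicy(("push", "chat"), 15, False, True, "full"),
    "SEV3": ResponsePolicy(("chat",), 60, False, False, "short"),
    "SEV4": ResponsePolicy(("ticket",), None, False, False, "none"),
}


def response_for(severity: str) -> ResponsePolicy:
    """Look up the response policy for a severity label such as 'SEV1'."""
    try:
        return POLICIES[severity.upper()]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None
```

Keeping the policy in one table means that changing who gets paged for a given level is a single edit, and an unknown label fails loudly instead of silently falling back to a default.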
Well-defined severity levels help teams avoid both under-response (treating a critical outage as a minor issue) and over-response (waking people up for cosmetic bugs). They also provide a consistent framework for SLA reporting and trend analysis. Configuring alert severity in monitoring tools like Hyperping helps ensure that the right people are notified through the right channels for each level of issue.