MTTD, or Mean Time to Detect, measures the average elapsed time between when a problem actually begins and when it is detected by monitoring systems or human observers. It is a critical metric because you cannot respond to a problem you don't know about: MTTD represents "wasted" downtime during which users are impacted but no one is working on a fix.
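As a rough illustration of the definition above, MTTD can be computed as the mean of (detection time minus start time) over a set of incidents. The incident timestamps and the function name `mean_time_to_detect` below are made up for the example and do not come from any particular monitoring tool.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (when the problem began, when it was detected).
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 2)),     # 2 min to detect
    (datetime(2024, 1, 12, 14, 30), datetime(2024, 1, 12, 14, 37)), # 7 min to detect
    (datetime(2024, 2, 3, 8, 15), datetime(2024, 2, 3, 8, 21)),     # 6 min to detect
]


def mean_time_to_detect(records):
    """Return the average of (detected_at - started_at) across incidents."""
    if not records:
        raise ValueError("at least one incident is required")
    total = sum(
        (detected - started for started, detected in records),
        timedelta(),  # start value so that timedeltas can be summed
    )
    return total / len(records)


mttd = mean_time_to_detect(incidents)
print(mttd)  # 0:05:00
```

In practice the hard part is the start time: it usually has to be reconstructed after the fact from logs or metrics, since by definition nobody observed it when it happened.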
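MTTD is often overlooked in favor of MTTR, but it is the first component of total downtime: Total Downtime = MTTD + MTTA + time to fix, where MTTA is Mean Time to Acknowledge. Ignoring acknowledgment time, a service with a 1-minute MTTD and a 29-minute fix time has the same 30 minutes of total downtime (and the same MTTR, when MTTR is measured from the start of the incident) as one with a 15-minute MTTD and a 15-minute fix time. In the first scenario, however, the team knows about the problem almost immediately and can keep users informed while working on the fix, which provides a much better user experience.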
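The two scenarios can be worked through with a few lines of arithmetic. This is a minimal sketch that assumes acknowledgment time (MTTA) is zero, as the example in the text implicitly does; the function names are illustrative only.

```python
def total_downtime(mttd_min, mtta_min, fix_min):
    """Total Downtime = MTTD + MTTA + time to fix (all in minutes)."""
    return mttd_min + mtta_min + fix_min


def undetected_share(mttd_min, total_min):
    """Fraction of the outage during which nobody knew about the problem."""
    return mttd_min / total_min


# Assumption: MTTA is 0 in both scenarios.
fast = total_downtime(mttd_min=1, mtta_min=0, fix_min=29)
slow = total_downtime(mttd_min=15, mtta_min=0, fix_min=15)

print(fast, slow)                  # 30 30
print(undetected_share(1, fast))   # about 0.033 (3% of the outage)
print(undetected_share(15, slow))  # 0.5 (half of the outage)
```

Both outages last 30 minutes, but in the second one users are affected for half of that time before anyone starts responding.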
Reducing MTTD requires comprehensive monitoring coverage (monitoring all critical endpoints and services), frequent check intervals (every 30 seconds rather than every 5 minutes), multi-region monitoring (to detect region-specific issues), and smart alerting that minimizes false negatives. Hyperping runs monitoring checks from 15+ global locations at intervals as low as 30 seconds to minimize MTTD.
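To see why the check interval matters, here is a back-of-the-envelope model of detection delay. It is a simplification under stated assumptions, not a description of how Hyperping or any other product works: the failure starts at a uniformly random moment between two checks, it persists until detected, every check of a failing service fails, and alert delivery time is ignored. The `confirmations` parameter is a hypothetical setting for how many consecutive failed checks are required before alerting.

```python
def detection_delay_seconds(interval_s, confirmations=1):
    """Estimate average and worst-case detection delay for a polling check.

    Assumes the failure begins at a uniformly random point between two
    checks and persists; alert delivery time is not modeled.
    """
    if interval_s <= 0 or confirmations < 1:
        raise ValueError("interval_s must be > 0 and confirmations >= 1")
    # Each extra required confirmation adds one full interval of waiting.
    extra = (confirmations - 1) * interval_s
    return {
        "average": interval_s / 2 + extra,  # mean wait until the next check
        "worst": interval_s + extra,        # failure began right after a check
    }


print(detection_delay_seconds(30))   # {'average': 15.0, 'worst': 30}
print(detection_delay_seconds(300))  # {'average': 150.0, 'worst': 300}
print(detection_delay_seconds(30, confirmations=2))  # {'average': 45.0, 'worst': 60}
```

Under this model, moving from 5-minute to 30-second checks cuts the average detection delay from 150 seconds to 15 seconds. Requiring a second confirming check reduces false alarms but adds a full interval to the delay.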