Monitoring debt leads to alert fatigue and increased operational risk. Is it time you performed an alert audit and got your monitoring in order?
This scenario will be familiar to any engineer who has been asked to carry a pager and be on-call: the signal-to-noise ratio is badly off, and a frustratingly large number of alerts demand attention.
This is what we refer to as “monitoring debt”. It’s the hole you dig when your technical system changes but the way you monitor it does not. The longer this drift continues, the less relevant your alert thresholds become.