Delayed/Unavailable SMS OTP delivery US1 data center
Incident Report for SecureAuth Service
Postmortem

Incident Description

At approximately 0920 on 5/10/2019, internal monitoring detected delayed or failed SMS delivery in our US1 data center. Investigation initially indicated an issue with one of our servers, which was removed from the load balancer pool and rebooted. After the server was added back into the load balancer pool, the delayed SMS delivery persisted. At approximately 0934 the issue self-resolved and normal operation was restored.

Root Cause

After extensive investigation with our network providers, it was determined that an issue with some of the data center equipment was causing inbound requests to queue for 10-15 seconds and then be delivered to the server within a few hundred milliseconds, which exceeded the server’s capacity to respond.
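
To illustrate why the released backlog overwhelmed the server, the following is a rough sketch of the arithmetic rather than a measurement from the incident. It assumes a steady inbound request rate and a release window of 0.3 seconds, which is only one reading of "a few hundred milliseconds"; the function name is hypothetical. Under those assumptions, the burst rate relative to normal depends only on the ratio of the queueing time to the release time.

```python
def burst_amplification(queue_seconds: float, release_seconds: float) -> float:
    """Ratio of the burst arrival rate to the steady arrival rate.

    Assumes requests arrive at a constant rate while queued and are all
    released evenly over release_seconds.
    """
    if queue_seconds <= 0 or release_seconds <= 0:
        raise ValueError("durations must be positive")
    return queue_seconds / release_seconds


# Assumed release window: 0.3 s (the report says only "a few hundred ms").
ASSUMED_RELEASE_SECONDS = 0.3

for queued in (10.0, 15.0):  # queueing range stated in the report
    factor = burst_amplification(queued, ASSUMED_RELEASE_SECONDS)
    print(f"{queued:.0f} s queued -> roughly {factor:.0f}x the normal request rate")
```

With these assumed figures, the server would briefly see tens of times its normal request rate, which is consistent with the capacity problem described above.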

Corrective Actions

We have aggressively engaged with the network provider to resolve the issue. In addition, we are adding traffic gap monitors to our internal monitoring to detect a recurrence of this or a similar issue more rapidly. We are also updating our Level 1 response playbooks to move the check of traffic patterns and volumes earlier in the sequence of diagnostic steps.
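
As a minimal sketch of what a traffic gap check could look like, the example below scans request arrival times for silences longer than a threshold. It is not the monitor we deployed: the function name, the 5-second threshold, and the sample timestamps are all assumptions chosen for illustration, and a production monitor would also need to account for naturally quiet periods.

```python
from typing import List, Tuple


def find_traffic_gaps(
    arrivals: List[float], max_gap_seconds: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where no request arrived for longer
    than max_gap_seconds. Arrival times are in seconds."""
    ordered = sorted(arrivals)
    gaps = []
    # Compare each arrival with the one immediately before it.
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > max_gap_seconds:
            gaps.append((prev, curr))
    return gaps


# Hypothetical arrival times: steady traffic, a 12 s silence, then a burst.
sample = [0.0, 1.0, 2.0, 14.0, 14.1, 14.2, 14.3, 15.0]
for start, end in find_traffic_gaps(sample):
    print(f"No inbound requests between t={start}s and t={end}s")
```

A gap followed immediately by a cluster of closely spaced arrivals, as in the sample data, matches the queue-then-burst pattern described in the root cause.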

Posted May 10, 2019 - 15:41 PDT

Resolved
This incident has been resolved.
Posted May 09, 2019 - 10:14 PDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted May 09, 2019 - 06:57 PDT
Investigating
We are currently investigating this issue.
Posted May 09, 2019 - 06:36 PDT
This incident affected: SecureAuth Cloud Services (SMS Service - US1).