SMS Delivery issues
Incident Report for SecureAuth Service
Postmortem

Incident Description

On 2017-07-19, from 9:29 PM until 10:31 PM PDT, SecureAuth customers experienced intermittent delivery of SMS OTP messages. SecureAuth cloud monitoring alerted the engineering team to SMS service failures, and an investigation was initiated. It was determined that communication failures were occurring with the primary SMS and telephony provider's API. The provider, Nexmo, acknowledged intermittent connectivity issues due to a failure at one of their US data centers. The SecureAuth Engineering team monitored SMS and telephony services after the initial communication issue was resolved but determined that the provider's services were still intermittently delayed; therefore, all SMS traffic was routed to our secondary provider. Fail-over to the secondary provider, which stabilized delivery of SMS messages, was completed at 10:31 PM PDT.

Root Cause

After a thorough analysis and review of logs we have determined the following issue was the primary contributor to the failure:

Nexmo experienced a network hardware failure in one of their US data centers which initially caused intermittent communication failures to their API infrastructure and intermittent delivery delays of SMS and Telephony messages. See full description here: https://www.nexmostatus.com/incidents/gk9nmqv859z6 .

Corrective Actions

The following measures are being taken to prevent an incident of this type from happening in the future:

The official RCA from Nexmo was received on 7/28 and reviewed. SecureAuth Engineering met with Nexmo Support and Operations teams on 8/1 and 8/2 to discuss outstanding questions and to review short and long term remediation plans.

- The affected hardware issue was resolved immediately and fail-over monitoring was reviewed and improved.
- Over the next 30-90 days additional improvements are planned around failure detection and fail-over automation efficacy.
- SecureAuth and Nexmo will convene regular cadence calls in support of our strategic partnership.
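
For readers unfamiliar with fail-over automation of the kind referred to above, the general idea can be sketched roughly as follows. This is a hypothetical illustration only, not SecureAuth's or Nexmo's actual implementation: the class name FailoverSender, the max_failures threshold, and the two sender callables are all assumptions made for the example.

```python
from typing import Callable

# A sender is any callable that takes (phone_number, text) and raises on failure.
Sender = Callable[[str, str], None]


class FailoverSender:
    """Hypothetical sketch: send via a primary provider and switch all traffic
    to a secondary provider after repeated consecutive primary failures."""

    def __init__(self, primary: Sender, secondary: Sender, max_failures: int = 3):
        self.primary = primary
        self.secondary = secondary
        self.max_failures = max_failures  # assumed threshold, not from the report
        self.consecutive_failures = 0
        self.using_secondary = False

    def send(self, number: str, text: str) -> str:
        """Deliver one message and return which provider handled it."""
        if not self.using_secondary:
            try:
                self.primary(number, text)
                self.consecutive_failures = 0  # a success resets the counter
                return "primary"
            except Exception:
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.max_failures:
                    # Stop trying the primary for subsequent messages.
                    self.using_secondary = True
        # Either the primary just failed or traffic is already failed over.
        self.secondary(number, text)
        return "secondary"
```

In this sketch a single failed message is retried on the secondary provider right away, while the threshold decides when to stop contacting the primary altogether. A production system would also need a way to fail back once the primary recovers, which is omitted here.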

Posted Aug 02, 2017 - 14:59 PDT

Resolved
SecureAuth's Primary SMS provider has resolved the issue. SMS services are stable and functioning normally. We will update this incident with a root cause analysis as soon as possible.
Posted Jul 19, 2017 - 23:49 PDT
Monitoring
SecureAuth's primary SMS provider is experiencing a partial outage. We have moved services over to our secondary provider and SMS services are functioning normally. We are monitoring services and will update this incident again with more information as we receive it. We apologize for any inconvenience.
Posted Jul 19, 2017 - 22:42 PDT
Investigating
We are currently investigating a report of SMS delivery issues. We will update this incident with more information as it becomes available.
Posted Jul 19, 2017 - 21:50 PDT