Impacted Customers: All on US2 Cloud Datacenter
Impacted Services: Push, Link2Accept, SMS, TTS, IP threat and geo, Certificate.
Incident Date: January 26th, 2020
Incident Description:
Starting at approximately 2:22AM UTC on 01/26/2020 the SecureAuth Services hosted at US2 Cloud Datacenter were substantially degraded. Operations staff received alerts at 2:24AM UTC and began remediation procedures, but access rights issues limited their ability to take corrective actions. At 4:31AM UTC DNS Failover was initiated. Operations staff noticed that DNS Failover was marginally effective due to DNS TTL not being respected or relayed to client servers for some impacted customers. At 5:58AM UTC Load balancer failover was initiated routing all traffic to the US1 cloud datacenter. This resulted in rerouting the balance of customers that were not impacted by the DNS failover.
Root Cause:
Impaired Network / Performance at US2 Cloud Datacenter Hypervisor infrastructure, currently under research by provider, and RCA is pending. This caused CPU and Network anomalies on hosted services, and process queue increase which impacted the services ability to respond to requests.
Corrective Actions:
Failover processes will be implemented more aggressively, bypassing DNS failover. Access rights limitations will be resolved for 24x7 operations staff.
Additional monitoring has been implemented, to detect infrastructure service degradation based on current findings. After RCA from provider is received, additional remediation may be taken.