A routine database automatic scaling operation failed around 9:30 AM ET on 4/19/2023. This caused the database cluster to become unresponsive and fail to accept connections. While data integrity was unaffected, data availability was compromised. The SecureAuth team worked to restore service and forced the scaling operation to finish. Once the scaling operation finished, the database cluster once again was accepting connections and service was restored around 12:30 PM ET on 4/19/2023.
It appears that the automatic scaling operation failed to complete due to an issue with the database platform in general. One cluster node was marked unhealthy by the scaling algorithm incorrectly, and this prevented the rest of the cluster nodes from scaling properly. The database cluster began rejecting connections soon thereafter.
Typically such automatic scaling operations are seamless and without incident. However, given the conditions for auto-scale failure on 4/19/2023, we acknowledge there is still risk with such activity. The SecureAuth team is following up with a database platform vendor for more information regarding the scaling operation failure. In the meantime, any auto-scale operations for this database cluster have been suspended, and the SecureAuth team will monitor capacity and do any scaling activity for this database cluster during planned maintenance windows. Finally, the SecureAuth team is confirming vendor best practices are being followed for any/all connections to this database cluster and will make adjustments as needed.
Note: times are Eastern time zone.