Microsoft Azure down
Starting today at 03:58 UTC, Microsoft Azure started experiencing issues with customers accessing databases in the West Europe region. New connections to databases result in errors or timeouts. Existing connections remain available, but once terminated they cannot be re-established. As we host our databases in this region, 282 resources were impacted: the Mapiq application and Mapiq Essentials environments. Without any connections available, the data for our applications cannot be accessed and users will not be able to log in.
Azure is aware of the issue - the issue tracking link is available here: https://app.azure.com/h/3TBL-PD8/ba49f8 - but they have not yet been able to determine a root cause. Services seem to be restoring slowly, but at the moment there is no concrete indication that everything is 100% back to normal.
Additionally, Azure started experiencing issues at 04:19 UTC with app services in the West Europe region. Autoscaling and service management were probably impacted. Our applications are hosted in this region, but the impact is currently not fully known. Our current assumption is that, once our databases started coming back online, these issues slowed our ability to scale out our applications to handle the increased load from returning users. Issue tracking can be found here: https://app.azure.com/h/3TFH-PZ0/8375e5
Azure is currently investigating and trying to resolve the issues. Should more information become available, we will be sure to inform you.
Current Status: The SQL team has identified a configuration change on the metadata drop operation which has caused the overall issue. We currently have several parallel workstreams to revert the change to mitigate the issue. A number of customers should already see recovery and the number of recovered instances will increase progressively until full recovery. The next update will be provided in 60 minutes, or as events warrant.
Current Status: We have applied several parallel workstreams to revert the change to mitigate the issue and we are continuing to monitor progress. The majority of impacted customers should now see service recovery. We are continuing to monitor the final recovery phases to ensure full mitigation. A further update will be provided in 2 hours, or as events warrant.