On October 20, 2025, between 07:13 UTC and 22:25 UTC, we experienced a disruption affecting multiple services due to the widespread outage in the AWS us-east-1 region, where our US deployment is hosted.
From 07:13 UTC to 09:21 UTC, pipeline activities were unavailable due to outages in several dependent AWS services, including DynamoDB, SQS, SNS, and Glue. Between 07:30 UTC and 08:28:48 UTC, we were unable to send SNS notifications for completed activities. Beginning at 09:21 UTC, new activities could be initiated; however, recovery was delayed as EC2 instance provisioning was throttled, limiting our ability to restore capacity promptly. Full recovery of all pipeline activities was achieved by 18:40 UTC.
Throughout the day, we also observed extraction failures for certain source connections and load failures for certain destination connections, due to those third parties' own use of AWS infrastructure. For more information, we recommend reviewing the status pages for these third parties.
Between 08:20 UTC and 17:35 UTC, webhook (event stream) endpoints were unable to receive events. This was caused by insufficient EC2 capacity and networking issues between our load balancer and EC2 targets. Recovery began at 17:35 UTC, with intermittent connectivity issues persisting until 22:25 UTC, at which point all services and capacity were fully restored. During this period, some webhook calls received an HTTP 502 response. Depending on how the sender is configured, these calls may have been retried until the events were processed successfully.
During the outage, the webhook endpoints returned HTTP 502 responses. Any messages that received this response should be retried.
Any SNS notifications for activities completed between 07:30 UTC and 08:28:48 UTC were not sent.
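For senders that do not retry automatically, the following is a minimal sketch of how a webhook call that received a 502 could be retried with exponential backoff. The function name `send_with_retry`, its parameters, and the decision to also retry on 503 and 504 are illustrative assumptions, not part of our API; `send` stands in for whatever code posts the event and returns the HTTP status code.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it returns a 2xx status or attempts run out.

    send is a hypothetical callable that posts one event and returns the
    HTTP status code. Returns True on success, False otherwise.
    """
    for attempt in range(max_attempts):
        status = send()
        if 200 <= status < 300:
            return True
        # Assumption: only gateway-style errors are treated as transient.
        if status not in (502, 503, 504):
            return False
        # Back off 1x, 2x, 4x ... base_delay before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

The `sleep` parameter is injectable only so the backoff can be exercised without waiting; in normal use the default `time.sleep` applies.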