Resolved -
Most pipelines and dbt schedules have recovered. Etleap Support has reached out directly to customers affected by the remaining issues, and we are working to resolve them.
For private deployments, Etleap Support has reached out if any remedial steps are required.
Oct 20, 14:43 PDT
Update -
We have addressed capacity and networking issues for our streaming ingest endpoints.
Oct 20, 13:36 PDT
Update -
We are seeing connectivity issues affecting our streaming endpoints. We are investigating the root cause.
Oct 20, 13:13 PDT
Update -
Event stream ingestion has fully recovered. We are continuing to monitor pipeline recovery.
Oct 20, 12:43 PDT
Update -
We were able to provision extra capacity for our event streaming endpoints, and we are seeing a reduction in error rates. We are continuing to monitor the situation.
Oct 20, 11:17 PDT
Update -
We are seeing increased connectivity issues with our event streaming endpoints, caused by networking problems.
Oct 20, 10:55 PDT
Update -
We are seeing pipeline and dbt schedule latencies recovering. We are continuing to monitor the overall recovery.
Oct 20, 10:46 PDT
Update -
We are continuing to monitor for any further issues.
Oct 20, 10:42 PDT
Update -
We were able to recover some capacity for our Event Stream endpoints. They are currently able to receive data, but request latencies are still high. We are working to provision extra capacity.
Oct 20, 10:41 PDT
Update -
The AWS outage in US East (N. Virginia) is still ongoing. AWS has identified the root cause as an internal networking issue and has throttled requests for new EC2 instances. This may cause outages in the following Etleap components:
- Pipelines - may become latent, as EMR may fail to scale up when demand increases.
- dbt Schedules - may become latent, as EMR may fail to scale up when demand increases.
- Event Streams - may fail to read from source, as the autoscaling group that serves these connections may fail to provision new instances.
Our AI and UI components are currently fully operational.
Oct 20, 09:35 PDT
Update -
We were experiencing increased errors in both the UI and API in the US-hosted environment due to a credentials error caused by the ongoing AWS outage. We have implemented a fix and are seeing a decreased error rate in both the API and UI.
Oct 20, 06:41 PDT
Update -
We are continuing to monitor for any further issues.
Oct 20, 06:31 PDT
Update -
We are continuing to monitor for any further issues.
Oct 20, 06:22 PDT
Update -
We are continuing to monitor for any further issues.
Oct 20, 06:15 PDT
Update -
VPCs deployed in US East (N. Virginia) are also affected.
Oct 20, 04:29 PDT
Monitoring -
AWS outage: Operational issues – Multiple services (N. Virginia)
Outage first reported Mon 20 October 07:11am UTC (12:11am PDT)
Recovery began Mon 20 October 09:27am UTC (02:27am PDT)
Oct 20, 03:15 PDT