Resolved [13 June 2025] Power / Chiller Event

KH-LukeM

Super Moderator
Staff member
We're currently experiencing a PDU failure, which affects only a single rack of dedicated servers in our GA location.

We're working on restoring power to that rack and will be going over each server in turn to ensure there are no lasting effects.
 
Do you have any updates? I'm surprised that there's no PDU redundancy in place—this kind of failure shouldn't bring down an entire rack. Our production services are affected, and we're concerned about the lack of electrical failover.
 
Howdy,

This appears to be a larger issue than just a PDU failure. All necessary parties are engaged in restoring services as quickly as possible, but we are reliant on the facilities engineers at this stage.
 
Hi Daniel,
Thanks for the update. Since this appears to be a larger infrastructure issue, could you please give us a better idea of the expected timeline for resolution? Our production systems are impacted, and we need to plan accordingly. Even a rough ETA would help. Also, any clarification on the nature of the issue beyond 'PDU failure' would be appreciated.
 
I would anticipate service restoration within the next ~60 minutes, as it is being actively worked on.

The situation is too fluid, and I don't have enough factual information to give a detailed answer, other than that we have lost all power to a single rack due to what appears to be a chiller failure.
 
Services are back up on our end. Thank you for the work involved in resolving this. That said, the downtime was considerable — we’d appreciate any follow-up on steps being taken to prevent similar incidents in the future.
 
This will be thoroughly reviewed. Some of the extended downtime was due to incorrect information being passed along, but we'll be getting to the bottom of everything after we ensure all customers' services are fully operational.
 
Why does it seem that some servers are back up and others are not?

We have now been down for over 2.5 hours.

We need an ETA when our sites will be back up.

We had nine successful years with the infrastructure in Michigan. We moved to the Atlanta data center hoping for continued good performance, and then this happened within a month.

This has caused huge concern.
 
If your servers are still offline, please respond to your existing support ticket or open a new one so we can check further. All core issues have been addressed, so what remains is down to any individual issues that may be present within a given server.
 
Obviously, I replied to the support ticket before posting here, but I’ve not had any response yet.
 
At this time, all known issues have been resolved and all impacted systems have been checked by hand. If any issues remain that we may have missed, please reach out to our support team and we will get you taken care of ASAP.
 