Major Outage : Power Failure

Service Outage

At around 3am, BlueSquare 1 experienced cooling issues which resulted in the overheating of several servers in the facility.

Unfortunately, in an apparent knock-on effect, this tripped a power bar in one of our racks and blew its fuse, causing half of the PDU (power distribution unit) to stop supplying power to some servers and equipment.

Engineers are onsite and are working to bring all servers back online as soon as possible. Some servers have blown PSUs (power supply units) as a result, which is delaying matters.

A full outage report will be sent as soon as possible.

Posted: Wednesday 16 Jan 2008 at 8:17:05 am GMT

7 Updates to “Major Outage : Power Failure”

  1. XILO Says:

    Most equipment is now back online.

    Charlie and VPS-1 are both in disk checking mode before booting so we are monitoring those.

  2. XILO Says:

    VPS-1 is now back up and all VEs have cleared their respective checks.

    Charlie is still failing to load due to disk check problems - we are working as quickly as possible to get this server online.

  3. XILO Says:

    Charlie is still completing a disk check - we’re sorry about the time this is taking. Unfortunately, it appears to be in a bad state, which is taking longer to repair.

    Delta has also needed an emergency reboot because errors were appearing in the logs. To prevent data loss, we have had to start a quick check on this disk.

  4. XILO Says:

    Delta has cleared a disk check and will be back up in minutes.

  5. XILO Says:

    Charlie has also cleared the locked blocks in the filesystem allocation table. We’re just waiting for the check to complete, followed by a reboot.

  6. XILO Says:

    Charlie has now been restored.

    We don’t anticipate any further outages and we apologise for the service outage this morning.

    We have raised the issue with BlueSquare and await a detailed report.

  7. XILO Says:

    We’ve had a further outage this evening due to another blown fuse on another PDU bar.

    We’ll be conducting emergency maintenance this evening and changing all the PDU bars in the affected racks to prevent a repeat.
