We just observed a small outage that affected some of our customers in BlueSquare 2.
Upon contacting the BSQ NOC, we received the following statement:
At approx. 12:40pm today we experienced an issue with one of our four UPS systems whilst carrying out routine testing, which resulted in this one UPS system powering off.
This caused a short brownout to approx. 30% of customers' equipment located in BlueSquare 2 & 3, as the unit powered off in an unusual way. The remaining three UPS systems took the load, and continue to provide power to the sites.
Full UPS protection is in place and we are running as normal. Our UPS vendors are on site investigating with senior management.
We aim to provide a further update within the next 3 hours.
We apologise for any inconvenience caused.
Our monitoring shows no servers offline at this time, but if you are having problems connecting to your server, please raise a ticket via my.xilo.net.
Posted: Wednesday 30 Apr 2008 at 1:51:41 pm BST
May 1st, 2008 at 10:14 am BST
We have received the following update and reply from BlueSquare management.
–
Please find below a further status update on the power situation for BlueSquare 2 & 3.
The parallel UPS system based in BlueSquare 2, which also feeds BlueSquare 3, consists of 4 separate UPS systems that are paralleled together and share a common bus bar system. UPS units 1, 2 & 3 have been running since the building went live, nearly 1 year ago. Unit 4 was added to the system yesterday to increase the overall system capacity, ready for the load on the first floor of BlueSquare 3 to go live.
During our maintenance window today, which was performed by the UPS manufacturer, the new 4th UPS system was given a bypass check and a battery test. These completed successfully. Approximately 15 minutes later, with all 4 systems running normally, the existing UPS unit 2 failed. Unfortunately it failed in a critical way, which meant a bypass message was sent to the remaining UPS systems, putting them onto raw mains. The critical load transferred without any problem to the remaining three systems, which were running on raw mains.
However, the failure of UPS 2, which was caused by a faulty component, almost immediately created an overload situation on the mains incomers, which meant the mains incoming supplies to each UPS were tripped off. The UPS units detected the loss of raw mains, and immediately came out of bypass mode and reverted to battery backup. During this sequence of events customers would have noticed a very short (sub-second) loss of power, which may have resulted in some customers' equipment rebooting.
The incomer mains circuit breakers were then reset back to the on position for the 3 remaining UPS units, and service was back to normal conditions.
Initial investigation by the manufacturer's on-site UPS engineers, and a subsequent visit by senior engineers who travelled to site, showed that a critical internal component had failed, causing this knock-on effect.
We have since been continually liaising with senior management from the UPS manufacturer's UK division, as well as the designers at the factory in Switzerland, to find the cause of this component failure. An engineer is being flown over from Switzerland to arrive at BlueSquare within the next 48 hours to continue on-site investigation work on UPS unit 2.
We have conducted a health check of the remaining three UPS systems and we are happy that these are all working normally and are at no risk of developing any foreseeable faults.
We have N+1 redundancy on the UPS systems, and full generator backup if required.
We will issue a further update once the engineers from the factory arrive on site.
Again, we apologise for any inconvenience caused today.
–
Many thanks to all customers for their understanding and patience.
May 13th, 2008 at 2:47 am BST
We have now received the final report from the engineers who travelled from Switzerland to visit the failed UPS system.
The summary of their report is below (please excuse the Swiss/Italian English):
Situation of the failed UPS:
Input Rectifier fuses: OK
Input By-pass fuses: OK
Battery fuses: OK
Input positive Booster fuses: NOTOK
Input negative Booster fuses: NOTOK
Output positive Booster fuses: OK
Output negative Booster fuses: OK
Inverter L1 T1 pos.: NOTOK
Inverter L1 T2 pos.: NOTOK
Inverter L1 T1 neg.: NOTOK
Inverter L1 T2 neg.: NOTOK
Inverter L2 T1 pos.: NOTOK
Inverter L2 T2 pos.: NOTOK
Inverter L2 T1 neg.: NOTOK
Inverter L2 T2 neg.: OK
Inverter L3 T1 pos.: NOTOK
Inverter L3 T2 pos.: NOTOK
Inverter L3 T1 neg.: NOTOK
Inverter L3 T2 neg.: NOTOK
Battery Charger pos.: NOTOK
Battery Charger neg.: OK
Input Battery fuses pos.: NOTOK
Input Battery fuses neg.: NOTOK
The unit failed because of a short circuit between the positive booster and the positive rectifier, caused by insulation damage on the booster battery inductor cables near the positive capacitors.
The input breaker on the distribution cabinet, in case of a short circuit on By-pass, opened before the rupture of the input fuses.
As happened on site: the short circuit on unit Nr. 2 was supplied from the other UPS units, and the load was transferred to By-pass, which continued to supply the short circuit. As a consequence, the input breakers of all the units tripped and the units lost the load.
Corrective Actions:
On units now in production, the power cable (Inductor Booster) holders are protected, and in the same way the isolation sheet on the “sandwich” is now protected.
Once the critical assembly points had been identified, NEWAVE SA immediately undertook corrective action, but this unit left the factory before the enhancement was made.
All new units have the protective enhancement, and a NEWAVE SA engineer will support an engineer from UPS Ltd in once again minutely checking all the cables of the other units at BlueSquare.
——
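For anyone wishing to cross-check the fuse table in the report above, the following is a minimal Python sketch that tallies the reported states. The dictionary is our own transcription of the report rows (True for OK, False for NOTOK), and the helper names `tally` and `blown` are hypothetical, chosen only for this illustration; none of this comes from the manufacturer.

```python
from collections import Counter

# Fuse states transcribed from the report (True = OK, False = NOTOK).
# The key names are our own shorthand for the report rows.
FUSES = {
    "Input Rectifier": True,
    "Input By-pass": True,
    "Battery": True,
    "Input positive Booster": False,
    "Input negative Booster": False,
    "Output positive Booster": True,
    "Output negative Booster": True,
    "Inverter L1 T1 pos": False,
    "Inverter L1 T2 pos": False,
    "Inverter L1 T1 neg": False,
    "Inverter L1 T2 neg": False,
    "Inverter L2 T1 pos": False,
    "Inverter L2 T2 pos": False,
    "Inverter L2 T1 neg": False,
    "Inverter L2 T2 neg": True,
    "Inverter L3 T1 pos": False,
    "Inverter L3 T2 pos": False,
    "Inverter L3 T1 neg": False,
    "Inverter L3 T2 neg": False,
    "Battery Charger pos": False,
    "Battery Charger neg": True,
    "Input Battery pos": False,
    "Input Battery neg": False,
}


def tally(fuses):
    """Count how many fuses are reported OK and how many NOTOK."""
    return Counter("OK" if ok else "NOTOK" for ok in fuses.values())


def blown(fuses, prefix=""):
    """List the fuses reported NOTOK whose name starts with the given prefix."""
    return [name for name, ok in fuses.items() if not ok and name.startswith(prefix)]


if __name__ == "__main__":
    counts = tally(FUSES)
    print(counts["OK"], "OK,", counts["NOTOK"], "NOTOK")
    print(len(blown(FUSES, "Inverter")), "of 12 inverter fuses blown")
```

On the figures as reported, this gives 7 fuses OK and 16 NOTOK, with 11 of the 12 inverter fuses blown.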
We will issue a further post in advance of any works to re-check the cabling of the remaining on-site units. We expect this work to be completed within the next two weeks.