[OIT-Colocation] Urgent Update -- RE: Power outage in the OIT Data Center - update and problem resolved

Ken Cooper cooperk2 at uci.edu
Fri Sep 11 23:31:40 PDT 2015


Unfortunately, we were unable to transfer our Data Center load back to UPS power because an additional faulty part was discovered within the UPS.  Parts have been ordered and are scheduled to be delivered tomorrow morning (Saturday 9/12/15) by 9am.

We are still on Utility power at this time and all activities have been completed for tonight.   Data center staff and the UPS technicians will be onsite by 12 noon tomorrow to continue with the repairs and transferring of the data center load back to UPS power.

Thank you,

OIT Data Center Manager,

Ken Cooper


From: Ken Cooper
Sent: Friday, September 11, 2015 2:56 PM
To: 'oit-staff at uci.edu' <oit-staff at uci.edu>; 'school-it-directors at uci.edu' <school-it-directors at uci.edu>; 'oit-colo at uci.edu' <oit-colo at uci.edu>
Cc: Brian Buckler <bbuckler at uci.edu>; Kian Colestock <kcolestock at uci.edu>; Cheryl Ann Watt <cheryl.watt at uci.edu>; Steven J. Weaver <sjweaver at uci.edu>; Carol Jackson <cjackson at uci.edu>; Lyle P. Wiedeman <wiedeman at uci.edu>; Hector Segui <hsegui at uci.edu>
Subject: Urgent Update -- RE: Power outage in the OIT Data Center - update and problem resolved

An OIT Data Center Emergency Change will take place this evening, Friday, September 11, starting at 8pm to test, validate and transfer the OIT Data Center load back to UPS power.

As you are all aware, portions of the OIT Data Center have been on utility (unprotected) power since we experienced the catastrophic failure of our main UPS on Wednesday morning. (FYI: this was not the UPS that was recently installed on August 19th.)  During the UPS failure, we lost power to a large portion of the OIT Data Center that had not been set up with redundant power feeds.

The UPS repairs are scheduled to be completed this evening. However, before the UPS technicians can complete the testing and validation, we will need to supply power to the UPS.  Although it is highly unlikely, there is a chance that applying power to the UPS could cause the main breaker to trip, which would cause us to temporarily lose utility power to the OIT Data Center.

In that event, the only equipment / services that would be impacted are those that were impacted when we lost power on Wednesday morning (September 9th).

Once the UPS has power, the UPS technician will then be able to test and verify the UPS prior to putting our Data Center load back on UPS. After all the testing and verification has been successfully completed, we will then transition our load from utility power back to UPS power.  All of this should take no more than an hour to complete.  If for some reason the UPS does not pass the testing and validation, we will not transfer the load, but continue to run on utility and reschedule the transfer for a later date.

I am not in my office; I am set up in the Data Center and can be reached at 949-824-2842.


Thank You,

Kenneth Cooper, CDCDP

Data Center Manager
Office Of Information Technology
University Of California, Irvine
Office: (949) 824-3704
Email: cooperk2 at uci.edu



From: Ken Cooper
Sent: Wednesday, September 09, 2015 5:25 PM
To: oit-service-status at uci.edu; oit-staff at uci.edu; school-it-directors at uci.edu
Cc: Brian Buckler <bbuckler at uci.edu>; Kian Colestock <kcolestock at uci.edu>; Cheryl Ann Watt <cheryl.watt at uci.edu>; Steven J. Weaver <sjweaver at uci.edu>
Subject: RE: Power outage in the OIT Data Center - update and problem resolved

As of 5pm all systems in the OIT DC are up and running on normal utility power.  As previously noted, the outage was caused by a catastrophic failure of our main UPS unit.  Replacement parts have been ordered and will arrive tomorrow; however, due to the nature of the failure it will take approximately two days to complete all repairs and validation testing of the UPS unit.  During this time we will continue to operate on normal utility power (unprotected power), and the systems listed below could be impacted in the event of a power outage to the OIT Data Center.  Should we have a power outage in the OIT Data Center, we will automatically transfer over to generator power; however, there will be a brief loss of power during that transfer phase because the UPS is offline.

Thank you,

OIT Data Center Manager,
Ken Cooper

From: oit-staff-bounces at department-lists.uci.edu On Behalf Of Steven J. Weaver
Sent: Wednesday, September 09, 2015 9:00 AM
To: Cheryl Ann Watt <cheryl.watt at uci.edu>; oit-service-status at uci.edu; oit-staff at uci.edu; school-it-directors at uci.edu
Subject: Re: [OIT-Staff] Power outage in the OIT Data Center - update and problem resolved

At this time we have been able to determine that the Main UPS failed at approximately 4:30am.  Most of the equipment in the Data Center is supported by this UPS.  Power has been restored; however, all of the equipment supported by the main UPS is currently on utility power. We are aware of the following impacts:


*         All HPC equipment was down

*         COLO Network Rack was down (affecting all COLO connectivity in the OIT Data Center)

*         Engineering Servers were down (Dan Melzer)

*         All Green Planet equipment was down

*         DMRNET Router was down (External Facing Webpages)

*         ZOTPortal was down

*         HVAC systems in the Data Center were down




Thank you,

OIT Data Center Manager,
Ken Cooper

949-824-2222

From: oit-staff-bounces at department-lists.uci.edu On Behalf Of Cheryl Ann Watt
Sent: Wednesday, September 09, 2015 7:49 AM
To: oit-service-status at uci.edu; oit-staff at uci.edu; school-it-directors at uci.edu
Subject: Re: [OIT-Staff] Power outage in the OIT Data Center - update and problem resolved

As of 7:30am all power is fully restored and systems are up.

The root cause of the generator breaker problem is being investigated.


From: Cheryl Ann Watt
Sent: Wednesday, September 09, 2015 6:40 AM
To: oit-service-status at uci.edu; oit-staff at uci.edu; school-it-directors at uci.edu
Subject: Power outage in the OIT Data Center
Importance: High

At about 4:30 AM we lost power in parts of the Data Center.

Facilities was able to restore power to the air conditioning device.
As of 6:15, most systems are coming up.

Servers that we know were down: Green Planet, the Library, Engineering.

Please refer to the OIT Service Alerts on our Home page for updates - http://www.oit.uci.edu/

Thank you.
OIT Help Desk
(949) 824-2222
