The impact of an unexpected power outage of about 40 minutes on Chemistry's IT.
Summary
- Some things went fairly well during this power outage; others less so.
- Cluster head nodes and compute nodes & drives seem to be in good shape.
Things we know failed:
248 Baker
Topic or event | Action taken | Action required | Notes |
---|---|---|---|
ChemIT UPS in cluster rack died. Would not turn on. | Moved Widom HeadNode to Loring UPS; moved C4 to Loring UPS; removed ChemIT UPS and plugged it in to charge, just in case. | ||
ChemIT UPS for Windows servers – limited battery power; Hyper-V machines were not able to shut down gracefully. | Even after 24 hours of charging, Synology shows it with only 672 seconds of battery life, and probably not even that. | Needs battery or replacement | |
Mathematica license server did not start after restart. | Restarted manually. (common issue) | ||
NMR Web server didn't start up right. | Needed to be kicked a bit to start. | This Gateway needs to go. | |
NMR router lost its configuration. | Reprogrammed passwords and forwarding for SSH & RDP. | Needs external administration access configured. | |
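
To put the Windows-server UPS runtime from the table above in context, here is a minimal arithmetic sketch comparing the 672 seconds Synology reported with the roughly 40-minute outage. The function name `ups_coverage` is made up for illustration, and the calculation takes the reported runtime at face value, which the table suggests is optimistic.

```python
def ups_coverage(runtime_s: float, outage_min: float) -> tuple[float, float]:
    """Return (runtime in minutes, fraction of the outage the battery covers)."""
    runtime_min = runtime_s / 60
    # Cap at 1.0: a battery that outlasts the outage covers all of it.
    return runtime_min, min(1.0, runtime_min / outage_min)


# Figures from this report: 672 s reported runtime, ~40 min outage.
runtime_min, covered = ups_coverage(runtime_s=672, outage_min=40)
print(f"{runtime_min:.1f} min of battery, {covered:.0%} of a 40-minute outage")
```

Even at face value, the battery would have carried the servers through only about a quarter of this outage, so a graceful shutdown has to start early in the runtime window.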
Other
Topic or event | Action taken | Action required | Notes |
---|---|---|---|
Lee - Steven Lee's SGI had several problems (boot, date, etc.). | Lulu wrestled with resetting the (hardware) time. | Advise getting a UPS? If so, how to automate shutdown if no monitor is powered? | INC000001652417 |
Marohn - B19 PSB's AS-CHM-Maro-03 RAID-1 had a drive fail. | Oliver tested the drive (OK), wiped it, and re-added it to the RAID. | | INC000001652216 |
Fors - Fors instrument came up and started working. Were both the computer and the instrument previously rebooted? | Roger asked group about reboot history of instrument. | Awaiting group's response (Dillon) on reboot history of instrument | INC000001645245 |
Petersen - His group's UPS died. | Roger got him a quote. | | INC000001653506 |
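
On the open question above about automating shutdown with no monitor powered: one possible approach, sketched here under stated assumptions, is a small headless script that reads UPS status and decides when to shut down. It assumes the UPS is monitored with Network UPS Tools (NUT), whose `upsc` command prints `key: value` lines including `ups.status` (flags such as `OL`, `OB`, `LB`) and `battery.runtime` (seconds). Whether NUT can run on the SGI in question has not been checked, and the 300-second threshold and both function names are hypothetical.

```python
def parse_upsc(text: str) -> dict[str, str]:
    """Parse 'key: value' lines, the format printed by NUT's `upsc`."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            status[key.strip()] = value.strip()
    return status


def should_shut_down(status: dict[str, str], min_runtime_s: int = 300) -> bool:
    """Decide whether to begin a graceful shutdown (threshold is an assumption)."""
    flags = status.get("ups.status", "").split()
    on_battery = "OB" in flags
    low_battery = "LB" in flags
    # A missing runtime is treated as 0 so the script errs toward shutting down.
    runtime_s = float(status.get("battery.runtime", "0"))
    return on_battery and (low_battery or runtime_s < min_runtime_s)


# Sample text in upsc's output format (values invented for illustration).
on_battery_sample = "ups.status: OB\nbattery.runtime: 240\n"
on_line_sample = "ups.status: OL\nbattery.runtime: 672\n"
print(should_shut_down(parse_upsc(on_battery_sample)))
print(should_shut_down(parse_upsc(on_line_sample)))
```

Run from a scheduled job or a daemon, a check like this needs no display or logged-in user. NUT's own `upsmon` daemon applies a similar policy, so a custom script may turn out to be unnecessary.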