Michal, I don't know how authentic it is, meaning the incident description.
Experience shows that if a short circuit occurred at the lowest consumption level (e.g. the power supply of some appliance) and the protective devices tripped at a higher level (here apparently as much as two levels up), then probably either:
a.) the protection selectivity was calculated incorrectly
or
b.) a slow short circuit occurred
and the main protective device for the whole branch could not withstand it and went down.
My guess would be incorrectly calculated or incorrectly operated protection selectivity, which is quite often neglected in data center operations.
If one branch went DOWN and the other stayed UP, the power supply system as a whole is still in a normal operating state.
However, if customers have only single-PSU equipment, this could have been a problem for them and something went DOWN. It is up to the customer whether or not they run dual-PSU equipment.
But customers who have dual-PSU equipment in CeColo also reported that it shut down yesterday. That may indicate that both power branches went DOWN, which would be a real problem, or that those customers have both power supplies connected to a single power branch (there are plenty of those too).
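To make the reasoning above concrete, here is a minimal sketch of which devices lose power when a given branch fails. The inventory, the device names and the `devices_down` helper are all made up for illustration and say nothing about the real CeColo layout.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    psu_feeds: list[str]  # branch each power supply is plugged into, e.g. ["A", "B"]


def devices_down(devices, failed_feeds):
    """Return names of devices whose every power supply sits on a failed branch."""
    failed = set(failed_feeds)
    return [d.name for d in devices if set(d.psu_feeds) <= failed]


# Hypothetical inventory, for illustration only
devices = [
    Device("single-psu-server", ["A"]),
    Device("dual-psu-correct", ["A", "B"]),
    Device("dual-psu-miswired", ["A", "A"]),  # both supplies on the same branch
]

down_a = devices_down(devices, ["A"])
down_ab = devices_down(devices, ["A", "B"])
```

With only branch A down, the single-PSU device and the dual-PSU device with both supplies on A go down, while the correctly wired dual-PSU device stays up. It goes down only if both branches fail.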
Let's not (meaning all of us) jump to premature conclusions, and let's wait for T-Mobile's final communication about yesterday's outage.
RadekM
The following appeared on one of the forums. I don't know whether it is authentic; I cannot confirm it on the basis of any information that is outside the NDA.
Management summary:
UPS A1.1 AC load outage by tripped output breaker. Source of the incident is probably a short circuit
in the distribution, which caused superior power-breaker to disconnect the load, but the
investigation is still pending. UPS and breaker loads were within the normal limits at the moment
of this incident and no maintenance was being done.
Redundant AC power feeds B and DC feeds A+B were not affected and remained operational,
therefore there was no availability impact.
Timeline of the incident:
12:27 alarm raised in our monitoring system, helpdesk operators started investigating
12:36 problem isolated to UPS output breaker QF2
12:42 service restoration complete
12:50+ customer’s breakers one-by-one ON based on our monitoring data and customer
feedback, possible short-circuit not present anymore
equipment connected to UPS A1.1AC back operational