Partial Ceph outage in London: reduced performance and possible total outage due to network loss
Incident Report for Heficed
Resolved
The Ceph cluster has recovered, and steps have been taken to mitigate this problem in the future.
Posted Mar 04, 2024 - 15:21 UTC
Monitoring
Ceph should be back up and operational with no data lost. Filesystem corruption may have occurred due to the extended performance degradation, and performance will remain reduced while the cluster recovers.
Posted Mar 04, 2024 - 14:33 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 04, 2024 - 10:39 UTC
This incident affected: London, United Kingdom (LON - Cloud Servers).