Realtime measurement timeouts
Incident Report for ROOCKiE
Postmortem

Today, 17.8.2020, around 6 am, a bug in a component of our time series database triggered a chain reaction of issues in the ROOCKiE data processing modules. Users' device data was not processed in real time, and the data displayed in the ROOCKiE app was misleading throughout the day.

After we identified the root cause in the database component configuration, we applied a fix immediately. We then had to restart all affected services so that they could start processing the delayed data of all ROOCKiE Home devices. It took 30 minutes for the system to recover completely.

We’re looking into ways to make our time series database configuration more reliable by:

  • improving fail-over behavior of the affected database component
  • creating dedicated test scenarios

We’ll also continue to invest in our monitoring and alerting solutions so that we can fix this kind of issue as quickly as possible.

Posted Aug 17, 2020 - 23:19 CEST

Resolved
This incident has been resolved.
Posted Aug 17, 2020 - 22:52 CEST
Monitoring
Processing of delayed data has finished. All systems are operational again. We continue to monitor all results.
Posted Aug 17, 2020 - 22:39 CEST
Update
We are continuing to work on a fix for this issue.
Posted Aug 17, 2020 - 22:30 CEST
Identified
The issue has been identified and a fix is being implemented. Today's data is starting to trickle in and appear in the app interfaces.
Posted Aug 17, 2020 - 22:25 CEST
Update
We are continuing to investigate this issue.
Posted Aug 17, 2020 - 19:30 CEST
Investigating
ROOCKiE is investigating issues with realtime updates for device measurements. Users may be encountering connectivity issues and erroneous data in the Circle of Consumption.
Posted Aug 17, 2020 - 17:23 CEST
This incident affected: Home (API, App).