Vexera unresponsive

Resolved

April 6th, 2022 at 19:05

Vexera was unresponsive for about 10 minutes (18:44-18:53 UTC). The outage was caused by the deployment of a faulty update.

The update was briefly tested on our beta instance before the deployment, and no issues were found (we only ran very simple tests/commands). Edit: see the end of this update.
The issue was therefore only identified once the update was deployed to our production instance, when alarms about a high error rate fired.

We immediately initiated a rollback, but an issue in our orchestrator prevented it from automatically receiving the rollback instruction, which delayed Vexera's recovery.

Please accept our apologies for this outage 😦

Edit: After investigation, it looks like we had almost no chance of catching the bug during testing.
The bug occurs only if the user sending the message doesn't have any configuration option set in our database (e.g. if you have never used +userlocale or similar commands). However, our own accounts have user config set on both our dev and beta instances…
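
To illustrate the kind of check that would cover this case, here is a minimal sketch in Python. It is not Vexera's actual code: the function `resolve_locale`, the `DEFAULT_LOCALE` value, and the dict-shaped user config are all hypothetical, and we assume that a user with no stored options is represented by `None`. The point is only that a test suite should include a user with no config at all, not just users who have some options set.

```python
from typing import Optional

# Hypothetical default, used when a user has never set a locale.
DEFAULT_LOCALE = "en"


def resolve_locale(user_config: Optional[dict]) -> str:
    """Return the user's locale, falling back to the default.

    `user_config` is None when the user has no configuration stored
    at all (assumed representation of the case that caused the outage).
    """
    if user_config is None:
        return DEFAULT_LOCALE
    return user_config.get("locale", DEFAULT_LOCALE)


def test_resolve_locale() -> None:
    # User who has explicitly set a locale.
    assert resolve_locale({"locale": "fr"}) == "fr"
    # User with a config entry but no locale option.
    assert resolve_locale({}) == DEFAULT_LOCALE
    # User with no configuration at all: the case our test accounts never hit.
    assert resolve_locale(None) == DEFAULT_LOCALE


if __name__ == "__main__":
    test_resolve_locale()
    print("all checks passed")
```

Running the tests against a fresh account with no stored options, in addition to the team's own accounts, would exercise the same path.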

Monitoring

April 6th, 2022 at 18:53

We rolled back an update and are currently monitoring the result.

Investigating

April 6th, 2022 at 18:45

Our monitoring system fired an alert regarding an elevated error rate. We are investigating!
