Intermittent Access Issue on 28th Nov 2025 (06:43–06:48 UTC)

Incident Report for Sardine AI

Postmortem

Post-Incident Report: Service Degradation (November 28 & 29, 2025)

(All timestamps are in UTC)

Summary

  • First Incident (Nov 28): Elevated error rates and intermittent access were observed for approximately 5 minutes (06:43–06:48 UTC).
  • Second Incident (Nov 29): Elevated error rates were observed for approximately 70 minutes (18:10–19:20 UTC), including a period of severe service degradation lasting roughly 22 minutes.

Root Cause

The service degradation was caused by an unexpected and significant spike in traffic volume. The surge in requests temporarily exceeded both our forecasted capacity and our auto-scaling capability, causing congestion in our application layer.

Impact

Symptoms

During these windows, customers using the Dashboard and Core Risk APIs experienced increased latency and 502/503/504 errors.

Resolution and Next Steps

Short-Term Solution

Our engineering teams intervened to stabilize the platform during the events.

Long-Term Solution

To ensure our systems remain resilient against future traffic spikes of this magnitude, we are currently provisioning additional infrastructure and permanently increasing our system capacity.

Posted Dec 04, 2025 - 17:21 UTC

Resolved

Dear Customer,

We experienced a brief period of intermittent access on 28th Nov 2025, between 06:43 UTC and 06:48 UTC.

Our team quickly identified the root cause and has implemented system reinforcements to prevent recurrence. All services have been fully restored and are operating as expected.

Should you notice any further anomalies, please report them to us immediately.

Posted Nov 28, 2025 - 06:30 UTC