Yesterday we were investigating an anomaly in our number of completed operations and, of course, checked our AWS metrics. We use Elastic Beanstalk to run a key component of our system, and its service metrics are really puzzling us, in particular the gap in the target response time (see image below).
We were still successfully processing at least some operations in that period – you can see a dip in the request count, and that’s roughly reflective of the number of successful operations, so it’s not like the service completely stopped responding. I would have expected to see a spike (or something else interesting), rather than the data simply being missing.
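To pin down exactly which periods are affected, something like the following could compare the two series once the datapoints have been exported from CloudWatch. This is only a minimal sketch: the function name, the dict-keyed-by-timestamp input shape, and the sample values are assumptions made for illustration, not our real data or any AWS API.

```python
from datetime import datetime, timedelta


def find_unexplained_gaps(request_counts, response_times, start, end, period):
    """Return the period start times in [start, end) where at least one
    request was counted but no response-time datapoint exists.

    Both inputs are hypothetical dicts mapping a period's start time to
    the metric value for that period.
    """
    gaps = []
    t = start
    while t < end:
        had_requests = request_counts.get(t, 0) > 0
        has_latency = t in response_times
        if had_requests and not has_latency:
            gaps.append(t)
        t += period
    return gaps


if __name__ == "__main__":
    # Illustrative values only: three 5-minute periods, with the middle
    # one missing its response-time datapoint despite some requests.
    period = timedelta(minutes=5)
    start = datetime(2024, 1, 1, 12, 0)
    request_counts = {start: 100, start + period: 20, start + 2 * period: 90}
    response_times = {start: 0.20, start + 2 * period: 0.25}
    for gap in find_unexplained_gaps(
        request_counts, response_times, start, start + 3 * period, period
    ):
        print("requests recorded but no response time at", gap.isoformat())
```

Any period this flags is one where requests were counted but the response-time metric is absent, which is exactly the combination I can't explain.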
I would be very grateful if, as well as giving a cause for the odd metric, you could suggest how we might dig deeper into the issue.