I have a standard AWS architecture that captures data from my source and streams it into a Firehose endpoint. Firehose delivers the data to S3, where a PUT event triggers a Lambda that cleans the Firehose payload, adds metadata, etc., and then prepares the final data structure that’s ready for consumption.
[source] > [Firehose] > [s3-raw] > [lambda] > [s3-stage]
I want to build monitors on this pipeline that notify me of any mismatch between the number of S3 files published and the number of Lambda invocations. The goal is to be notified if any S3 file missed the processing step that moves it from the raw store into stage.
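To make the goal concrete, here is a rough sketch of the comparison I have in mind. It assumes I can somehow get a per-period count of files written to s3-raw and a per-period count of Lambda invocations. The function name `find_mismatches`, the dict shapes, and the sample numbers are all made up for illustration; where the two counts should actually come from is the open part.

```python
from typing import Dict, List, Tuple


def find_mismatches(
    files_published: Dict[str, int],
    lambda_invocations: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (period, published, invoked) for every period where the counts differ."""
    mismatches = []
    # Check every period seen on either side; a period missing from one side counts as 0.
    for period in sorted(set(files_published) | set(lambda_invocations)):
        published = files_published.get(period, 0)
        invoked = lambda_invocations.get(period, 0)
        if published != invoked:
            mismatches.append((period, published, invoked))
    return mismatches


# Hypothetical 5-minute buckets, not real data.
published = {"12:00": 4, "12:05": 6, "12:10": 3}
invoked = {"12:00": 4, "12:05": 5, "12:10": 3}
mismatches = find_mismatches(published, invoked)
```

Any non-empty result would be what I want to be notified about. I realize a strict per-period equality check may be too naive if an invocation lands in a later bucket than the file that triggered it, so treat this only as a statement of intent.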
What would be the right S3 and Lambda metrics to capture here?