I’m new to Grafana, and quickly realizing I may not have a great grasp on how alerts and/or Prometheus data are evaluated, so forgive my ignorance. I’ve tried to set up an alert, but it never fires. Other alerts do fire though.
I have a value from Prometheus, vastai_machine_gpu_idle. This value is usually either 8 or 0. I’m trying to set up an alert that fires when that value changes.
My query (A) is delta(vastai_machine_gpu_idle[$__rate_interval]), and I have two expressions which I don’t understand very well:
B – Reduce: Input: A, Function: Last, Mode: Strict
C – Threshold: Input: B, Is outside range: 0 to 0
Intuitively it seems like this should work. My assumption is that alert conditions are evaluated every 15 seconds (in my case), so delta(vastai_machine_gpu_idle[$__rate_interval]) returns the difference between what vastai_machine_gpu_idle was 15 seconds ago and what it is now; in some cases this will be 8, in others it will be -8, and 0 when nothing has changed. The part I’m less sure about is the expressions.
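To make my mental model concrete, here is a rough Python sketch of what I assume happens at each evaluation. It is not how Grafana or Prometheus are actually implemented: simple_delta ignores the extrapolation that the real delta() does, the sample data is made up, and the 15-second window is my assumption about what $__rate_interval resolves to. The function names are mine, not Grafana's.

```python
def simple_delta(samples, now, window):
    """Last minus first sample value in [now - window, now].

    Simplification: real PromQL delta() also extrapolates to the window
    edges, which this sketch ignores.
    """
    in_window = [value for ts, value in samples if now - window <= ts <= now]
    if len(in_window) < 2:
        return None  # not enough samples to compute a difference
    return in_window[-1] - in_window[0]


def reduce_last(values):
    """B: keep only the most recent value of the series."""
    return values[-1] if values else None


def is_outside_range(value, low, high):
    """C: true when the value lies outside [low, high]."""
    return value is not None and (value < low or value > high)


# Hypothetical samples (timestamp in seconds, gauge value): 8 -> 0 at t=30.
samples = [(0, 8), (15, 8), (30, 0), (45, 0)]
window = 15  # assumed to equal the evaluation interval


def evaluate(now):
    """One alert evaluation: query A, then Reduce (B), then Threshold (C)."""
    series_a = [simple_delta(samples, t, window) for t in range(15, now + 1, 15)]
    return is_outside_range(reduce_last(series_a), 0, 0)


for now in (15, 30, 45):
    print(now, evaluate(now))
```

In this simplified model the condition is only true for the single evaluation at t=30, when the change falls inside the window, and it is false again at t=45.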
Any ideas why this may not be firing or any advice on how I could learn?