I would like to detect drops or spikes in network I/O for a specific Kubernetes Pod that should be treated as anomalies. I have a Grafana dashboard for this.
I ended up with this Z-score formula for the metric container_network_receive_bytes_total:
(
avg_over_time(container_network_receive_bytes_total{pod=~"xxx-.*"}[$__rate_interval])
- avg_over_time(container_network_receive_bytes_total{pod=~"xxx-.*"}[1d])
)
/ stddev_over_time(container_network_receive_bytes_total{pod=~"xxx-.*"}[1d])
But what I would actually like to calculate the Z-score on is the rate of this metric:
rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[120s])
But when I use rate(...) in the Z-score formula, I end up with this error: parse error: ranges only allowed for vector selectors
How do I build such a metric based on the Z-score of the rate of the network I/O metric?
The error "ranges only allowed for vector selectors" occurs because a plain range such as [1d] is being applied to the result of rate(...). A plain range may only follow a vector selector (a metric name with optional label matchers), not an arbitrary expression.
In Prometheus, functions like avg_over_time and stddev_over_time expect a range vector. However, rate() returns an instant vector, so its result cannot be passed to them directly. To turn it into a range vector, use a subquery of the form [<range>:<resolution>], which re-evaluates the inner expression at every resolution step across the range.
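As a minimal illustration of the difference, the two expressions below contrast the form that the parser rejects with the subquery form. The 1m resolution is an assumed example value, not something taken from the question; leaving it empty ([1d:]) falls back to the default evaluation interval.

```promql
# Rejected: a plain range [1d] applied to an expression rather than a selector
# -> parse error: ranges only allowed for vector selectors
avg_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[120s])[1d])

# Accepted: the subquery [1d:1m] evaluates rate(...) every minute over the last day
avg_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[120s])[1d:1m])
```

Subqueries require Prometheus 2.7 or later.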
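(
rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])
- avg_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m])
)
/ stddev_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m])
Here rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m]) computes the per-second rate of bytes received, averaged over the past 2 minutes.
avg_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m]) calculates the average of that rate over the past day, sampled once per minute.
stddev_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m]) computes the standard deviation of the rate over the same day.
The resolution after the colon must be much smaller than the range. A step as large as the range itself (for example [1d:1d]) would leave only one or two samples in the subquery, so the standard deviation would be zero or meaningless and the division would fail to give a usable Z-score. The 1m used here is one reasonable choice; adjust it to your scrape interval.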
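To surface only the anomalous points on the Grafana panel, one option is to filter the Z-score by its magnitude. The sketch below assumes a threshold of 3 and a 1m subquery resolution; both are illustrative values that you would tune for your own traffic, and neither comes from the original question.

```promql
# Keep only series whose rate deviates more than 3 standard deviations
# (threshold of 3 is an assumption) from the 1-day mean, in either direction.
abs(
  (
    rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])
    - avg_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m])
  )
  / stddev_over_time(rate(container_network_receive_bytes_total{pod=~"xxx-.*"}[2m])[1d:1m])
) > 3
```

Because abs() discards the sign, this catches both spikes and drops. Remove abs() and compare with > 3 or < -3 if you need to tell them apart.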
Refer to this PromLabs blog post for more information on the rate() function.
Refer to these similar SO questions, Q1 and Q2, to understand range vectors better.
Refer to this blog post by MetricFire to learn more about the rate() function.