Prometheus seems to assume that infinite memory is available for satisfying any given query, so expensive queries can OOM the entire Prometheus server.
Is there any way to configure it to limit the memory available to queries, individually or collectively, so the queries will fail instead of crashing the entire Prometheus server instance?
Or, better, limit queries’ working memory, so that the query still executes but runs slower, e.g. by spilling interim results to tempfiles on disk, as a relational database does with on-disk tempfiles, tape sorts, etc.
It’s difficult to accept that a single query can DoS the entire Prometheus server, causing dropped metrics and so on, and that this is considered normal, so I must be doing something wrong.