I have to create a Splunk search and alert that triggers a ticket-creation event (handled by a separate script in the alert's trigger actions). The criteria are:
- We get logs for each host every 15 minutes.
- We check whether the database is running for each host, and keep a count of the logs where it is not running.
- If the database is down for 45 minutes (that is, 3 counts for a particular host), the alert should trigger.
- Once an alert has triggered for a host, no new alerts should be triggered for that host for 48 hours.
- Any other host that goes down after the first one should still be able to create alerts, but hosts that have already triggered within the last 48 hours should not.
- If a host's database is down again after the 48 hours have passed, a new ticket should be created.
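To make the intended behaviour concrete, here is a rough Python sketch of the per-host logic described in the list above. It is not Splunk code, only an illustration of the rules. The function name `hosts_to_alert`, the two dictionaries, and the constants are all hypothetical, and it assumes the "not running" counts for the last 45 minutes have already been computed per host.

```python
from datetime import datetime, timedelta

DOWN_THRESHOLD = 3                  # 3 "not running" logs at 15-minute intervals = 45 minutes
SUPPRESS_FOR = timedelta(hours=48)  # per-host quiet period after a ticket is created


def hosts_to_alert(down_counts, last_alerted, now):
    """Return the hosts that should raise a ticket at time `now`.

    down_counts:  {host: number of "not running" logs in the last 45 minutes}
    last_alerted: {host: datetime of that host's last ticket}; updated in place
    """
    to_alert = []
    for host, count in sorted(down_counts.items()):
        if count < DOWN_THRESHOLD:
            continue  # not down for the full 45 minutes
        previous = last_alerted.get(host)
        if previous is not None and now - previous < SUPPRESS_FOR:
            continue  # this host already raised a ticket within the last 48 hours
        last_alerted[host] = now  # remember when this host last alerted
        to_alert.append(host)
    return to_alert
```

The important point is that the 48-hour suppression is tracked per host: a ticket for one host must not block a different host from alerting, and a host becomes eligible again once its own 48 hours have passed.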
So far I have this search to get the hosts where the database was down for the last 45 minutes:
index=abc db.status="0" earliest=-45m@m | stats count by host | where count >= 3
I can't figure out how to filter out hosts that I have already sent an alert for in the past 48 hours. How should I modify the search or the alert to handle this?
Would the alert's throttle feature also suppress alerts for new hosts that fail after the first host's alert?