Sending exactly one notification n seconds after the event

I have a system that checks the status of a number of other systems: is HTTP up, does the host respond to ping, that sort of thing.

I check all of these systems every 5 minutes. If something is broken, I send out an e-mail with a notice. If the problem is cleared I send out an “all clear, problem resolved” notice too.

I want to send out a notice 30 minutes after a problem is detected if the problem is still ongoing, and then again at failure +1h, +2h, +4h, etc. I do not want to flood the user with a message every time the “problem still ongoing” event occurs, which is every 5 minutes.

I could store a “last notification sent at” value somewhere and just work with that, but I would rather solve it in the code itself. Am I missing something very obvious, or is this the only feasible way?

You could first of all store the status of all systems connected with a given recipient, e.g.

[ {
    'recipient': '[email protected]',
    'situation': [
        { 'system': 'alnitak',
          'failed': [
              { 'subsystem': 'http',
                'failures': [
                    { 'code': '80701',
                      'text': 'SYN unacknowledged',
                      'timestamp': 1375189804,
                      'alertno': 1,
                      'nextalert': 1375193404 },
                    ...
                ]
              },
              ...
          ]
        },
        ...
    ]
  },
  ...
]

Every five minutes you save each recipient’s extant block, and generate a new one.
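Comparing two such blocks is easier if each one is first flattened into a lookup table. The helper below is only a sketch: the name flatten and the choice of (system, subsystem, code) as the identity of an issue are my assumptions, not something the structure above dictates.

```python
def flatten(block):
    """Flatten one recipient's status block.

    Returns a dict mapping (system, subsystem, code) -> failure dict.
    The key choice is an assumption: it treats two failures as "the same
    issue" when system, subsystem and error code all match.
    """
    flat = {}
    for system in block["situation"]:
        for sub in system["failed"]:
            for failure in sub["failures"]:
                key = (system["system"], sub["subsystem"], failure["code"])
                flat[key] = failure
    return flat


# Example block with the same shape as the structure shown above.
example = {
    "recipient": "someone",  # placeholder value
    "situation": [
        {"system": "alnitak",
         "failed": [
             {"subsystem": "http",
              "failures": [
                  {"code": "80701",
                   "text": "SYN unacknowledged",
                   "timestamp": 1375189804,
                   "alertno": 1,
                   "nextalert": 1375193404},
              ]},
         ]},
    ],
}
flat = flatten(example)
```

With the blocks in this form, "present in old", "present in new" and "present in both" become plain dictionary key tests.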

You now need a function that will walk two such blocks and divide all items into three blocks with the same structure as the “status block”:

present in new only: add new issue to “NEW ISSUES” block.
present in old only: add old issue to “RESOLVED” block.
present in both: check nextalert and alertno. If nextalert has not expired, do nothing. If it has expired, add the item to the “UNRESOLVED” block, increment alertno, and use its new value to decide how much to add to the current timestamp to get the new value for nextalert.

The important thing is the “do nothing” in the above loop. This ensures that if nothing changes, no alert will be sent even if there are ongoing issues.
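A minimal sketch of that comparison follows. It assumes each block has already been reduced to a dict keyed by (system, subsystem, code); the names compare_blocks and next_delay are hypothetical, and the delay schedule is my guess at one that reproduces the +30m, +1h, +2h, +4h pattern from the question.

```python
BASE_DELAY = 30 * 60  # assumed first reminder delay, in seconds


def next_delay(alertno):
    """Assumed schedule: 30 min, 30 min, 1 h, 2 h, ...

    Added cumulatively, this puts reminders at failure +30m, +1h, +2h, +4h.
    """
    return BASE_DELAY * 2 ** max(0, alertno - 2)


def compare_blocks(old, new, now):
    """Split issues into (new_issues, resolved, unresolved).

    old and new map (system, subsystem, code) -> issue dict. The entries
    of `new` are updated in place, so `new` can be saved afterwards and
    used as `old` on the next run.
    """
    new_issues, resolved, unresolved = {}, {}, {}

    for key, issue in new.items():
        if key not in old:
            # Present in new only: first alert, schedule the first reminder.
            issue["alertno"] = 1
            issue["nextalert"] = now + next_delay(1)
            new_issues[key] = issue
            continue

        # Present in both: carry the alert bookkeeping over from the old block.
        prev = old[key]
        issue["timestamp"] = prev["timestamp"]
        issue["alertno"] = prev["alertno"]
        issue["nextalert"] = prev["nextalert"]

        if now >= prev["nextalert"]:
            issue["alertno"] += 1
            issue["nextalert"] = now + next_delay(issue["alertno"])
            unresolved[key] = issue
        # Otherwise: do nothing, so no reminder goes out on this run.

    for key, issue in old.items():
        if key not in new:
            # Present in old only: the problem has cleared.
            resolved[key] = issue

    return new_issues, resolved, unresolved
```

Called every five minutes, this returns three empty dicts on most runs, which is what keeps the recipient from being flooded.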

This architecture allows little tweaks. For example: if there are no new and no resolved issues (this is important, because both might require timely decisions, such as informing a customer that a service is back online), you can do an additional walk of the status blocks and round the “next alert” time in the new block up to the next multiple of, say, 15 minutes, before calling the function that builds the unresolved block.

This has the effect of postponing alerts so that they go out together at 15-minute boundaries. In this example:

10:05 alert 1 goes out, and is scheduled for 15 minutes from now: 10:20
10:10 alert 2 goes out, and is scheduled for 15 minutes from now: 10:25
10:15 nothing happens
10:20 alert 1 is sent ("Still unresolved") and rescheduled for 10:50
10:25 alert 2 is sent ("Still unresolved") and rescheduled for 10:55

with the tweak, both alerts would still go out immediately the first time (at 10:05 and 10:10), but the two reminders would then be sent together at 10:30, so three emails are sent instead of four. The alert 1 reminder is delayed by 10 minutes in this example.
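The rounding itself is a one-liner. In this sketch the names round_up_to_slot and batch_alerts are hypothetical, and the issues are assumed to sit in a dict whose values carry a nextalert field in seconds.

```python
SLOT = 15 * 60  # batch reminders on 15-minute boundaries (assumed value)


def round_up_to_slot(ts, slot=SLOT):
    """Round a timestamp up to the next multiple of `slot` seconds.

    A timestamp already on a boundary is returned unchanged.
    """
    return -(-ts // slot) * slot


def batch_alerts(block, slot=SLOT):
    """Postpone every nextalert in the block to the next slot boundary."""
    for issue in block.values():
        issue["nextalert"] = round_up_to_slot(issue["nextalert"], slot)


# The example above, in seconds since midnight: 10:20 and 10:25.
block = {
    "alert 1": {"nextalert": 10 * 3600 + 20 * 60},
    "alert 2": {"nextalert": 10 * 3600 + 25 * 60},
}
batch_alerts(block)
# Both reminders are now due at 10:30.
```

Rounding up rather than to the nearest boundary means a reminder is only ever delayed, never sent early.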

(Other time-tweaking tricks can be played.) Note: if you adjust the nextalert value this way, remember that a difference in nextalert alone must not count as a difference when comparing the old and new status blocks.

Now that we have the three (possibly empty) blocks, a third function can compose an email by walking them and writing out text, or just return if all three blocks are empty:

Dear Mr. Serni,
The following issues are now marked SOLVED:

    System: albutain
    Reason: fail to respond to PING
    Alert : 27 Jul 2013, 20:27:33 UTC

The following issues are NEW and require your attention:

    ...

The following issues are still unresolved:

    ...
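A sketch of such a composing function, under the assumption that each of the three blocks is a dict keyed by (system, subsystem, code) with text and timestamp fields. The names compose_email and format_issue are hypothetical, and actually sending the message is left out.

```python
from datetime import datetime, timezone


def format_issue(key, issue):
    """Render one issue as the three indented lines used in the email."""
    system, _subsystem, _code = key
    when = datetime.fromtimestamp(issue["timestamp"], tz=timezone.utc)
    return "\n".join([
        f"    System: {system}",
        f"    Reason: {issue['text']}",
        f"    Alert : {when.strftime('%d %b %Y, %H:%M:%S')} UTC",
    ])


def compose_email(name, new_issues, resolved, unresolved):
    """Return the email body, or None when all three blocks are empty."""
    sections = [
        ("The following issues are now marked SOLVED:", resolved),
        ("The following issues are NEW and require your attention:", new_issues),
        ("The following issues are still unresolved:", unresolved),
    ]
    if not any(block for _heading, block in sections):
        return None  # nothing to report, so nothing is sent

    parts = [f"Dear {name},"]
    for heading, block in sections:
        if not block:
            continue  # skip empty sections entirely
        parts.append(heading)
        parts.append("")
        for key, issue in block.items():
            parts.append(format_issue(key, issue))
            parts.append("")
    return "\n".join(parts)
```

Returning None for three empty blocks lets the caller decide whether to stay silent or to send a heartbeat message instead.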

Possibly, if all three blocks are “{}” for a given period, you may want to send an email all the same, just as a reminder that the checking system is alive and well. A subject line might suffice. This periodic email could also be used to report unresolved issues, which would otherwise only be reported when an issue is resolved or a new one is created, and would “starve” the rest of the time.

I think the most straightforward way of achieving what you want is to store the state of the monitored systems, as well as whether a notification was sent, each time your process runs. Depending on how many systems and metrics you are monitoring, this could be as simple as writing a few entries to a flat text file.
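For a handful of systems, a flat-file version could look roughly like this. It is only a sketch: the JSON format, the function names load_state and save_state, and the field names in the usage note are all assumptions.

```python
import json
import os


def load_state(path):
    """Read the saved state, or return an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_state(path, state):
    """Write the state to a temporary file, then swap it into place.

    os.replace means a crash during the write cannot leave a half-written
    state file behind.
    """
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp, path)
```

Each run would call load_state, update an entry such as state["alnitak/http"] = {"failed_since": ..., "last_notified": ...}, decide whether a notification is due, and then call save_state.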


Filed under: softwareengineering - @ 20:12

Tags: time
