Weekly Report for week ending 8 June 2012

11 Jun 2012

Wrote some test code to group similar events together and then decide
whether or not to alert on the group. Simply grouping events that occur
within a couple of measurement periods of one another appears to work well
in the general case for the data I have, and increasing the tolerance for
tests to the same target or of the same metric extends this quite nicely to
also catch escalating problems. Using what I learned from that, I started
to put together a better system to process and merge events online and
write them to a database for ease of searching.
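
The grouping idea above can be sketched roughly as follows. This is only
an illustration, not the actual test code: the Event fields, the 60 second
measurement period and both tolerance values are assumptions made for the
example.

```python
from dataclasses import dataclass, field

# All constants below are assumed values, chosen only for illustration.
PERIOD = 60                 # length of one measurement period, in seconds
BASE_GAP = 2 * PERIOD       # "a couple of measurement periods"
RELATED_GAP = 5 * PERIOD    # wider tolerance for the same target or metric


@dataclass
class Event:
    ts: int        # time the event was detected, in seconds
    source: str    # hypothetical field: where the test ran from
    target: str    # hypothetical field: what was being tested
    metric: str    # hypothetical field: e.g. "latency" or "loss"


@dataclass
class Group:
    events: list = field(default_factory=list)

    @property
    def last_ts(self):
        # Timestamp of the most recent event in the group.
        return max(e.ts for e in self.events)


def tolerance(group, event):
    # Allow a longer gap if the new event shares a target or a metric
    # with any event already in the group.
    related = any(
        e.target == event.target or e.metric == event.metric
        for e in group.events
    )
    return RELATED_GAP if related else BASE_GAP


def group_events(events):
    groups = []
    for event in sorted(events, key=lambda e: e.ts):
        # Try the newest groups first; start a new group if none is close.
        for group in reversed(groups):
            if event.ts - group.last_ts <= tolerance(group, event):
                group.events.append(event)
                break
        else:
            groups.append(Group([event]))
    return groups
```

With these assumed values, an unrelated event arriving 250 seconds after
the last one starts a new group, while an event to the same target joins
the existing group, which is how a slowly escalating problem stays
together.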

Also started to look at how different levels of reporting could work, based
on the number of events in a group, the number of different locations
involved, reported severity levels, etc. Most single events are unlikely to
be reported in any sort of urgent fashion, but will still be recorded for
historical analysis. Filtering these out means that the more serious
events can be reported in a more time-critical way, to more important
people.
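
One possible way to turn those criteria into reporting levels is sketched
below. The level names, the 1 to 5 severity scale and every threshold are
assumptions for the sake of the example; nothing here has been settled.

```python
from dataclasses import dataclass


@dataclass
class Event:
    location: str   # hypothetical field: where the event was observed
    severity: int   # assumed scale: 1 (minor) to 5 (critical)


def report_level(events):
    """Pick a reporting level for one group of events.

    Returns "record" (store for historical analysis only), "notify"
    (routine report) or "urgent" (time-critical report). The names and
    thresholds are illustrative assumptions.
    """
    if not events:
        raise ValueError("a group must contain at least one event")

    n_events = len(events)
    n_locations = len({e.location for e in events})
    max_severity = max(e.severity for e in events)

    # Most single events are only recorded, unless they are critical.
    if n_events == 1 and max_severity < 5:
        return "record"
    # Many locations or a high severity makes the group time-critical.
    if n_locations >= 3 or max_severity >= 4:
        return "urgent"
    return "notify"
```

For example, a single severity 2 event would only be recorded, while a
group spanning three locations would be reported urgently.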