Brendon Jones's blog
Spent most of the week trying to integrate Joel's event marker graphs with
the interactive AMP graphs that he also wrote. Took me a while to get my
head around the levels of indirection, but I've pretty much tracked down
the correct locations in the code to be able to update events on the fly
in the same way that the data is presented. Will be linking this in with
the output of various anomaly detection algorithms to better visualise the
quality of the detected events.
Also made the event detection on multiple AMP time series a bit more
robust after it failed to continue running over the weekend. If for some
reason erg is unreachable or the data returned doesn't make sense, the
detection should now carry on running. Looking forward to getting some useful
events and being able to quickly investigate the data behind them!
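A minimal sketch of that sort of defensive polling, assuming a hypothetical fetch function that raises when the server can't be reached and a hypothetical detect callback (neither name comes from the real code). The sanity check is only a guess at what "doesn't make sense" might cover.

```python
import math


def is_sane(values):
    """Reject empty batches and non-finite or negative measurements."""
    return bool(values) and all(
        isinstance(v, (int, float)) and math.isfinite(v) and v >= 0
        for v in values
    )


def poll_once(fetch, detect, series):
    """Fetch and process one batch for a series.

    Returns False rather than raising when the fetch fails or the data
    looks wrong, so the caller's loop keeps going to the next round.
    """
    try:
        values = fetch(series)
    except (OSError, ValueError) as err:  # unreachable server, bad response
        print(f"{series}: fetch failed ({err}), skipping this round")
        return False
    if not is_sane(values):
        print(f"{series}: data doesn't make sense, skipping this round")
        return False
    detect(series, values)
    return True
```

The main loop would call poll_once for each series on every interval and simply try again next time after a False.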
Downloaded and installed a perfSONAR livecd to try to get a better feel
for what it actually does and how it does it. It gets a bit unhappy not
having real Internet access, but it seems to do enough that I can see what
is running and what the code looks like. Finally found some useful
documentation about the interactions between different parts of the
system, in the form of presentations from a workshop a couple of years
ago. If we speak the right bits of XML to register with a lookup service
and respond to queries we should be able to export data to perfSONAR ok.
Wrote up a small bit of documentation about using clusterGL on the display
wall, based on information from Paul and Andreas. Had a look around other
scripts that I found to display video, images, presentations, etc and
might have a look at getting them working and documented too if I get a
Fixed a couple of small problems with multiple input time series for event
detection where data wasn't being assigned to the proper series, and set
it running over the weekend watching RTT to destinations from Waikato.
Wrote a short script to regularly report AMP data to stdout (anticipating
the sort of data that Nathan might produce) and updated my main anomaly
detection program to operate with multiple simultaneous time series
inputs. It will process a few days of historical data for each time series
before adjusting to real time and continuing to process data as it
arrives. Wrote a small event filter program to try to get a feel for what
might be required there - need to get an idea of how many events, and of
what severity, should be required to generate actual alerts of some level. Also
need to determine how best to aggregate events over time, as different
detectors may take different lengths of time to trigger for the same
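One way the filtering and aggregation might look, as a rough sketch only. The Event fields, the 300 second window and both thresholds are assumptions made up for illustration, not values from the real filter.

```python
from collections import namedtuple

# Hypothetical event record; the real detectors may report different fields.
Event = namedtuple("Event", "timestamp series detector severity")


def group_events(events, window=300):
    """Group events on the same series that fall within `window` seconds
    of the first event in the group, so slow and fast detectors reporting
    the same underlying change end up together."""
    groups = []
    open_groups = {}  # series -> most recent group for that series
    for ev in sorted(events, key=lambda e: e.timestamp):
        group = open_groups.get(ev.series)
        if group is not None and ev.timestamp - group[0].timestamp <= window:
            group.append(ev)
        else:
            group = [ev]
            groups.append(group)
            open_groups[ev.series] = group
    return groups


def should_alert(group, min_detectors=2, min_severity=50):
    """Alert if enough distinct detectors agree, or any one event is severe."""
    detectors = {ev.detector for ev in group}
    worst = max(ev.severity for ev in group)
    return len(detectors) >= min_detectors or worst >= min_severity
```

Anchoring the window at the first event of each group keeps a long run of events from chaining together indefinitely, though a sliding window would be an equally reasonable choice.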
Spent Friday at a stall for open day trying to encourage students to study
computer science. Seemed to be fewer students than in previous years, but
many of the ones we did see were quite enthusiastic and seemed to be
leaning strongly towards doing computer science. Thanks to everyone who
helped with getting BSOD working on the display wall, carrying stuff around,
setting up equipment and evangelising to students. I've started to
document what I've learnt about the display wall so hopefully it's a bit
easier next time around.
Tried to make more distinctions between different types of observed time
series data so that the most appropriate combination of detectors and
detector parameters can be applied (a time series measuring loss needs to
be treated quite differently to a time series measuring RTT for example).
With better targeted detectors I continued to tweak their parameters, and
am generally happy that my idea of an event and the system's idea match.
Now I think it is time to throw even more data at it and get more opinions
on the quality of my opinions/results.
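The per-type selection could be as simple as a lookup table. This is only a sketch: the detector names and numbers are placeholders invented for illustration, not the tuned values.

```python
# Hypothetical profiles: names and numbers are placeholders, not tuned values.
DETECTOR_PROFILES = {
    # RTT in milliseconds: look for level shifts of a few ms or more.
    "rtt": {"detectors": ["plateau", "mode"], "min_shift": 5.0},
    # Loss as a fraction of packets: much smaller changes matter.
    "loss": {"detectors": ["plateau"], "min_shift": 0.05},
}


def profile_for(metric):
    """Return the detector profile for a metric type, failing loudly on
    anything unknown rather than silently using the wrong parameters."""
    try:
        return DETECTOR_PROFILES[metric]
    except KeyError:
        raise ValueError(f"no detector profile for metric {metric!r}") from None
```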
Also spent some more time looking into the operation of perfSONAR and how
it might be possible to interoperate with AMP. Most communication seems to
be done using SOAP/XML, which is unfortunately exceedingly verbose and
annoying to read. It does look like we could register our results as an
available resource and share them with any of the existing perfSONAR
Updated most of my event detectors to estimate a severity level for
anything they detect to better visualise results. I can investigate less
severe events to see if they should trigger anything or could possibly be
Tried exploring the parameter space (threshold levels, sensitivity, etc)
of the existing plateau detectors to try to find any sweet spots with high
accuracy and a low false positive rate. Best results tended to be achieved with
very high values (simply having a high barrier to being an anomaly), but
still included data that I don't agree with. Spent some time reading other
papers to try to find ways to improve the situation, while still keeping
it simple and understandable by a human.
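A sketch of what that sort of sweep can look like. The detector here is a deliberately simple stand-in (deviation from a trailing mean), not one of the real plateau detectors, and the hand-labelled indices are assumed to exist already.

```python
from itertools import product


def threshold_detector(values, threshold, window):
    """Stand-in detector: flag index i when values[i] deviates from the
    mean of the previous `window` values by more than `threshold`."""
    flagged = set()
    for i in range(window, len(values)):
        baseline = sum(values[i - window:i]) / window
        if abs(values[i] - baseline) > threshold:
            flagged.add(i)
    return flagged


def score(flagged, labelled, n):
    """Return (accuracy, false positive rate) against labelled indices."""
    tp = len(flagged & labelled)
    fp = len(flagged - labelled)
    fn = len(labelled - flagged)
    tn = n - tp - fp - fn
    accuracy = (tp + tn) / n
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return accuracy, fpr


def grid_search(values, labelled, thresholds, windows):
    """Try every (threshold, window) pair and rank the results."""
    results = []
    for threshold, window in product(thresholds, windows):
        flagged = threshold_detector(values, threshold, window)
        results.append((threshold, window) + score(flagged, labelled, len(values)))
    # Highest accuracy first, ties broken by lowest false positive rate.
    return sorted(results, key=lambda r: (-r[2], r[3]))
```

Because anomalies are rare, a threshold high enough to flag nothing at all still scores well on accuracy, which fits with very high values looking deceptively good.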
Started getting up to speed with the anomaly detection framework that
Shane has been putting together. Implemented a couple of very simple
plateau/mode based detectors to familiarise myself better with the system
and started tweaking parameters in existing detectors to see what sort of
effects they have. Wrote a few supporting scripts and wrappers to allow me
to easily fetch new data, test across all fetched data and view graphs of
all the results.
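For a flavour of what a very simple plateau detector can look like, here is a sketch written from scratch rather than taken from the framework. It assumes a level shift means a run of consecutive values that all sit well away from the median of the current plateau; the parameter names and defaults are made up.

```python
from statistics import median


def plateau_changes(values, recent=3, min_shift=5.0):
    """Return (index, shift) pairs where the series moves to a new level.

    A change is reported at index i when the `recent` values starting at
    i all differ from the median of the current plateau by at least
    `min_shift`. A single spike never satisfies this, so it is ignored.
    """
    events = []
    start = 0  # first index of the current plateau
    for i in range(recent, len(values) - recent + 1):
        if i - start < recent:
            continue  # not enough history on the new plateau yet
        baseline = median(values[start:i])
        window = values[i:i + recent]
        if all(abs(v - baseline) >= min_shift for v in window):
            events.append((i, median(window) - baseline))
            start = i  # everything from here on is the new plateau
    return events
```

Using the median and requiring the whole window to move keeps it easy to reason about, at the cost of needing `recent` samples before anything is reported.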
Caught up with Sam Russell and Steve Cotter from REANNZ on Tuesday which
was good. Sounds like they are really keen to get more measurement across
Attended a talk at Lightwire by Jamie about UFB. It was based on the one
Shane Hobson gave at NZNOG but with updated information about how some
parts of it will actually work. Still seems to be a few gaps and
Short week this week, which I spent mostly tidying up recent changes to
AMP and double checking that my interactions between tests weren't leaking
anything. The combined dns/icmp test now properly matches icmp results
with the right address, regardless of the ordering or the number of tests
running. This means we'll now be able to better resolve and test to
targets that pull dns tricks rather than using fixed addresses. Also
improved general error logging to better report where data is failing to
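The matching idea, sketched in Python rather than the C that AMP is written in. The data shapes and names are assumptions for illustration: each reply is keyed by the address it came from instead of by its position in the list, so ordering and the number of tests in flight stop mattering.

```python
def match_results(resolved, icmp_replies):
    """Pair ICMP results with the names that resolved to each address.

    resolved:     {name: [address, ...]} from the DNS stage
    icmp_replies: iterable of (address, rtt_ms) pairs, in any order
    Returns {name: {address: rtt_ms or None}}; None marks no reply.
    """
    rtt_by_address = {}
    for address, rtt in icmp_replies:
        rtt_by_address[address] = rtt
    return {
        name: {addr: rtt_by_address.get(addr) for addr in addresses}
        for name, addresses in resolved.items()
    }
```

An address that never answers shows up as None rather than shifting every later result along by one.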
Tracked down the cause of the results corruption in the combined dns/icmp
test to the way I was calling one test from within another. Tests normally
try to tidy up after themselves, but in this case that isn't required
because the tidying up happens later on. Because it was happening anyway,
memory got clobbered. Just need to deal with matching icmp results to the right dns
responses now and then this can be tested out on the Waikato amplet.
Updated all the amplets with a new version of AMP containing fixes to the
http2 test as well as a few other small fixes to byte ordering and test
stability. Had to update the server too due to the byte ordering fixes.
Wrote a small wrapper around the power management board in the emulation
network to allow easy access to controlling the outlets.
Fixed the build problems I was having with AMP in Lenny. I now have
working packages with the fixed HTTP test that can be pushed out to the
Worked on updating the COMP312 assignment from last year to provide a
little bit more initial direction and pointers to some documentation. Also
spent an afternoon helping out during the 312 lab time.
Spent some time working on MSI proposals fleshing out some of the sections
about AMP, general proofreading and reference hunting.
Fixed the http2 test in AMP to properly share the DNS cache between
simultaneous connections which means it no longer performs unnecessary
lookups for the same name. The sharing interface in libcurl actually works
Tried to build new amplet packages including the recent changes, but ran
into some problems with libraries when building in my Lenny buildroot.
Autoconf/make is meant to build a particular binary with an extra library
that the rest don't need, but this doesn't make it through to the Makefile
Jamie put together new RJ45-DB9 serial connectors for the emulation
network, so I created some sensible minicom configs for all the machines.
They should now be just as easy to use as the old system with the Cyclades
terminal server was. Also set up udev on my Linux image to force a
consistent order of the network interfaces that matches the way they are