Several projects have produced network maps of the global Internet, but New Zealand has not been well covered by them because their observation points were overseas, limiting their ability to find links between ISPs within New Zealand. A map produced from observation points within New Zealand is more likely to be complete, and could offer a level of detail that is genuinely useful to New Zealanders.
Internet maps of New Zealand have a variety of potential uses including:
- communicating the nature of the Internet to the public,
- investigating the structure of the infrastructure, particularly with respect to resiliency and peering,
- mapping the deployment of IPv6 in New Zealand over time and
- mapping the impact of UFB and RBI.
This project is to set up the infrastructure required and then to map the New Zealand Internet infrastructure in as much detail as possible. The infrastructure should be such that repeated maps can be produced at regular intervals.
There will be two phases: data collection and map production.
The data collection phase will use the scamper software running on AMP servers. Data collection requires a set of addresses to define the New Zealand Internet; these can be obtained from APNIC whois data. We have used this data previously, checked it against round-trip times, and found very close agreement.
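As a sketch of how the NZ address set might be derived, the following parses APNIC's published delegated-stats file (the pipe-separated `delegated-apnic-latest` format) and extracts IPv4 blocks recorded as allocated or assigned to NZ. The power-of-two block-size assumption is a simplification and would want checking against the real file:

```python
def nz_ipv4_prefixes(lines):
    """Parse APNIC delegated-stats lines and yield IPv4 prefixes
    recorded for NZ. Line format (pipe-separated):
    registry|cc|type|start|value|date|status"""
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.strip().split("|")
        if len(fields) < 7:
            continue
        registry, cc, afi, start, value, _date, status = fields[:7]
        if cc != "NZ" or afi != "ipv4" or status not in ("allocated", "assigned"):
            continue
        # 'value' is the number of addresses; assumes power-of-two
        # sized blocks, which is not guaranteed for every entry
        count = int(value)
        prefix_len = 32 - (count.bit_length() - 1)
        yield f"{start}/{prefix_len}"

sample = [
    "apnic|NZ|ipv4|130.217.0.0|65536|19890101|allocated",
    "apnic|AU|ipv4|1.0.0.0|256|20110811|assigned",
]
```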
The second phase, map production, will consist of using existing graph layout software to plot the collected dataset. An evaluation of possible software will be performed first using test data. The scamper output will have to be transformed into the required input format for the selected graph layout package. Custom software will be written to perform this transformation.
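The transformation is straightforward in outline. A minimal sketch, assuming the scamper output has already been reduced to per-destination hop lists, might emit a deduplicated DOT edge list (suitable for Graphviz and similar packages) like this:

```python
def paths_to_dot(paths):
    """Convert traceroute paths (lists of hop addresses) into a DOT
    graph description, deduplicating edges so routes that share
    links are drawn once."""
    edges = set()
    for path in paths:
        hops = [h for h in path if h is not None]  # drop unresponsive hops
        edges.update(zip(hops, hops[1:]))
    lines = ["graph nz {"]
    for a, b in sorted(edges):
        lines.append(f'  "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)

paths = [
    ["10.0.0.1", "10.0.1.1", "10.0.2.1"],
    ["10.0.0.1", "10.0.1.1", "10.0.3.1"],
]
```

Note that simply dropping unresponsive hops, as above, can create false adjacencies; a real transformation would need to decide how to represent them.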
Brendon had a larger set of alias data, so I had a go at using it. This took a lot of effort that I still can't fully explain, and at one point produced a graph in which a single node connected to over a third of all nodes on the graph. We couldn't entirely rule that out as a real possibility, so it required a lot of investigation. Eventually I reached the point where I could no longer reproduce the bug, so I assume I fixed it.
Anyway, that led to making the program select aliases more intelligently rather than picking them entirely arbitrarily. It now picks an arbitrary address from whichever network contains more of the node's addresses than any other, choosing arbitrarily in the event of a tie, with a special case giving REANNZ priority over other networks.
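A minimal sketch of that selection logic, with hypothetical data structures (`network_of` maps addresses to network labels, and the REANNZ label is an assumption):

```python
from collections import Counter

REANNZ = "reannz"  # hypothetical network label given priority

def pick_representative(aliases, network_of):
    """Pick one address to represent an alias set: prefer any REANNZ
    address, otherwise an address from whichever network holds the
    most of the set's addresses (ties broken arbitrarily)."""
    nets = Counter(network_of[a] for a in aliases if a in network_of)
    if any(network_of.get(a) == REANNZ for a in aliases):
        target = REANNZ
    elif nets:
        target = nets.most_common(1)[0][0]
    else:
        return aliases[0]  # no network information at all
    return next(a for a in aliases if network_of.get(a) == target)
```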
This required a reasonably major change to how the program works, which on the plus side should make fixing last week's memory leak much easier, though I haven't done that yet (I actually removed all of the memory-leak fixes to get this working, and will put them back next week).
Then I produced graphs with the data and asked Shane what he thought of using the second graph to index the first. He suggested sizing the nodes on the network graph by the size of the networks, so I wrote a Python program to resize the nodes. However, Gephi appears to rescale the node sizes when it produces the images, so whatever you do you end up with significant differences in node size.
I spent the week fixing memory leaks in the code. All are now gone except one, which is small and not going to cause significant problems.
I also produced graphs using Brendon's alias data, which, disappointingly, looked much the same as the original graphs, although there are several hundred fewer nodes (in the ungrouped version). The address used for each node is chosen arbitrarily (whichever appears first in Brendon's list). In the case of the original issue that tipped us off that we needed alias detection - Waikato appearing to connect directly to things it reaches through KAREN - it chose a Waikato address; however, looking over the alias data, the only alternatives were a second Waikato address or a Fijian address.
Generated a full dataset for Chris using all the available AMP monitors.
Looking at the network maps he produced showed up some unusual
topology that I didn't expect to see and didn't agree with, so I
looked further into what was going on with the traceroute data. It turns
out that quite a few devices in the path respond on multiple addresses, so
I spent some time investigating de-aliasing using scamper. Using
the radargun method across the address set came up with 11000 pairs of
addresses representing 500 devices. I'm not entirely convinced of the
accuracy yet so I will run it a few more times and compare results, but it
has been correct for the addresses I have manually confirmed as being aliases.
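Collapsing pairwise alias findings into per-device sets is essentially a union-find problem; a sketch, independent of scamper's actual output format:

```python
def merge_alias_pairs(pairs):
    """Union-find over alias pairs: collapse pairwise alias findings
    (e.g. from a radargun-style dealiasing run) into per-device
    address sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)  # union the two sets

    devices = {}
    for addr in parent:
        devices.setdefault(find(addr), set()).add(addr)
    return list(devices.values())

pairs = [("10.0.0.1", "10.0.0.2"),
         ("10.0.0.2", "192.168.1.1"),
         ("172.16.0.1", "172.16.0.2")]
```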
I think I've solved my problem of direct loops being created by the smtp
state machine and the new machine is at least as good as the previous ones
on my initial test trace. Will need to test it on some more traces now.
Having trouble visualising the machine though, as dot won't create a graph
using the layout algorithm that I want.
I tried Graphviz again and managed to produce a large black-and-white graph with a slightly smaller data set, but it doesn't cope with the full-sized data.
I modified Brendon's preprocessing program so that it checks the ASN list from Team Cymru before performing a lookup, letting me set the RTT cutoff much lower without having to do a huge number of lookups.
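For reference, Team Cymru's bulk IP-to-ASN service accepts a begin/end-delimited address list over whois (whois.cymru.com, TCP port 43). A sketch of building the query and parsing the reply; the field layout is assumed from the verbose output format and worth verifying against the live service:

```python
def build_bulk_query(addresses):
    """Build a bulk query for Team Cymru's IP-to-ASN whois service."""
    return "begin\nverbose\n" + "\n".join(addresses) + "\nend\n"

def parse_bulk_reply(reply):
    """Parse pipe-separated reply lines: AS | IP | BGP Prefix | CC | ..."""
    results = {}
    for line in reply.splitlines():
        if line.startswith(("Bulk mode", "AS ")):
            continue  # skip banner and column-header lines
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 4:
            asn, ip, prefix, cc = fields[:4]
            results[ip] = {"asn": asn, "prefix": prefix, "cc": cc}
    return results
```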
Brendon made a list of aliases, so I made the program read those in and choose an arbitrary address from the aliases for each node, though the results seem shaky, at least with this first data set.
Brendon also got data from many more sources, so I produced graphs from those, using the preprocessing to eliminate the international nodes, but without the aliasing.
I'm away all next week. Merry Christmas.
Kept looking at generating better data for Chris to use in his maps. Wrote
some simple code to try to remove destinations that are obviously
international using latency measurements and hostnames. I err on the side
of caution, though, so there are still a few included that are quite likely international.
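The heuristic is roughly as follows; the cutoff and TLD list here are illustrative assumptions, not the actual values used:

```python
NZ_RTT_CUTOFF_MS = 35.0  # assumed cutoff; even trans-Tasman RTTs exceed this
FOREIGN_TLDS = (".au", ".us", ".uk", ".fj")  # illustrative, not exhaustive

def probably_international(hostname, min_rtt_ms):
    """Heuristic filter: flag a destination as international if its
    hostname ends in a foreign ccTLD or its best-case RTT from the
    NZ monitor exceeds the cutoff. Errs toward keeping destinations
    when neither signal is available."""
    if hostname and hostname.lower().rstrip(".").endswith(FOREIGN_TLDS):
        return True
    return min_rtt_ms is not None and min_rtt_ms > NZ_RTT_CUTOFF_MS
```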
Got William set up with live access to JSON weathermap data so that he can
start testing his new weathermap in a more realistic fashion.
Some different visualisations of my smtp state machine graphs show some
nodes with a huge number of links, all at various stages of the protocol
that it probably shouldn't be able to transition between. These tend to
link to another node with a large number of transitions that also links
back, so tight loops are formed with many transitions. Direct loops should
be prevented from forming (they are being checked for at least), but it's
possible that check is missing a case or the loops are being formed as
side effects of merging other states. Need to keep wading through the
step by step merging output to see how this is coming about.
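The direct-loop check amounts to asking, before merging two states, whether the merged state would both reach and be reached by some other state. A sketch, assuming transitions are kept as a successor-set map:

```python
def creates_direct_loop(transitions, a, b):
    """Return True if merging states a and b would create a direct
    two-node loop: the combined state both reaches and is reached
    by some other state.
    transitions: dict mapping state -> set of successor states."""
    merged_out = (transitions.get(a, set()) | transitions.get(b, set())) - {a, b}
    for other in merged_out:
        if transitions.get(other, set()) & {a, b}:
            return True
    return False
```

A check like this can still miss loops that only appear as a side effect of a later merge, which matches the behaviour described above.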
I tidied up my code a lot and combined the eight or so different versions of the program into one that takes command-line options.
I added a few more options for the data output. I tried removing all leaf nodes so it would display only core nodes, but this did little to clarify the data. I tried eliminating an arbitrary number of the first nodes in each hop, but the graph would break up before there was any significant improvement in clarity. I also allowed it to remove any hop that never had an RTT under 35 ms, in order to remove the foreign nodes that had crept in, though this led to a lot of false positives.
I also managed to get Gephi to produce an ASN graph, though my solution was pretty clumsy: I saved the graph in Gephi, edited the file, and reopened it.
I also made a graph with LGL. However, Gephi is coping with the size of the graphs, and gives a lot more control over how the data is represented.
Collected some new traceroute datasets for Chris containing fewer
unresponsive hops. Still quite a significant number in the middle of
the path don't reply and it seems they probably won't due to the way they
are configured. May need some heuristics to try to merge unresponsive hops
that are probably the same device.
Saw an NZNOG post about web page loading times ("user experience") and
thought it might be interesting to do a bit more work with the web
download speed data we've been collecting on a couple of AMP monitors.
Wrote a script to convert the data from the http2 test into the HTTP
Archive (HAR) format so I could use their tools to generate waterfall
graphs similar to what firebug produces. This seems to be working fine
except that data from the Waikato monitor is showing sub-millisecond
connection times to websites hosted in the US. Spent some time looking
inside libcurl to see if the problem could be there, but it turns out that
it is the Fortinet on the edge of the university network acting as a web
proxy that is breaking my results.
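The HAR conversion itself only needs a thin wrapper. A sketch of the minimal HAR 1.2 structure, assuming per-object timing records from the http2 test; field values beyond the timings are placeholders, and the record field names here are assumptions:

```python
def to_har(fetches):
    """Wrap per-object fetch timings in a minimal HTTP Archive (HAR)
    1.2 structure so standard HAR viewers can draw waterfall graphs.
    Each fetch: dict with url, status, start (ISO 8601 timestamp),
    and timing fields in milliseconds (-1 means not measured)."""
    entries = []
    for f in fetches:
        timings = {
            "dns": f.get("dns", -1),
            "connect": f.get("connect", -1),
            "send": f.get("send", 0),
            "wait": f.get("wait", 0),
            "receive": f.get("receive", 0),
        }
        entries.append({
            "startedDateTime": f["start"],
            "time": sum(v for v in timings.values() if v > 0),
            "request": {"method": "GET", "url": f["url"],
                        "httpVersion": "HTTP/1.1", "cookies": [],
                        "headers": [], "queryString": [],
                        "headersSize": -1, "bodySize": -1},
            "response": {"status": f["status"], "statusText": "",
                         "httpVersion": "HTTP/1.1", "cookies": [],
                         "headers": [], "content": {"size": 0, "mimeType": ""},
                         "redirectURL": "", "headersSize": -1, "bodySize": -1},
            "cache": {},
            "timings": timings,
        })
    return {"log": {"version": "1.2",
                    "creator": {"name": "amp-http2-to-har", "version": "0.1"},
                    "entries": entries}}
```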
In trying to generate a nice state machine graph to use to illustrate some
points I decided that the merged tree looked really bad - it had large
numbers of transitions between the same pairs of nodes. Generated a lot of
graphs showing the distributions of object sizes across those transitions
to confirm that they were all distinct (in general they were).
I added ASN data to my program, read from files from Team Cymru's website, and used a formula Perry gave me for converting ASNs to colours, so graphs are now coloured by ASN.
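Perry's actual formula isn't reproduced here; as a stand-in, any stable mapping from ASN to colour works, for example hashing the ASN to a hue:

```python
import colorsys
import hashlib

def asn_colour(asn):
    """Map an ASN to a stable RGB hex colour. A stand-in for the
    actual formula: hash the ASN to a hue, with fixed saturation
    and value so colours stay readable against each other."""
    h = int(hashlib.md5(str(asn).encode()).hexdigest(), 16) % 360
    r, g, b = colorsys.hsv_to_rgb(h / 360.0, 0.65, 0.9)
    return "#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255))
```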
Assuming every node with an unknown address was a new node resulted in 80% of the nodes in the graph being unknown, since the same routes are taken many times. So I made it not display unknown nodes.
Brendon got some new data that retries an address more times before giving up, which reduced the number of unknown nodes, but around 60% are still unknown.
Gephi can group nodes into one larger node according to an attribute, so I played around with that, but it leads to awkward-looking results, which I haven't been able to work out how to overcome.
After seeing the graphs Chris made with a small number of destinations and
a single testing site, I found a few more sources of addresses that are
used within New Zealand (though not necessarily assigned to NZ
organisations) and started generating a new test data set. The new dataset
will have data for 28,000 /24s from four sources once collection
completes. Will need to have a look and see if this resolution is sufficient.
Also put together a bunch of sample data and databases for William to use
with the new weathermap. It mimics the KAREN set up, so hopefully he can
use that to write scripts to deal with getting the data into the right format.
Spent some time picking up the smtp state machine documentation again,
trying to get some images of example trees to show the merging process.
Spent some time helping settle in summer students Chris and William. I'll
be involved in their projects, supplying data etc. As part of that I
collected some sample traceroute data from a few monitors to a few
destinations to get an idea of what I can get from a warts file, and to
give Chris some data to start building small maps with. Played around a
bit with scamper to see how the doubletree implementation works and if it
would be useful to use when collecting the data (and any other ways I can
use to minimise the impact of collection). Also put together some sample
code to demonstrate how to operate on the traceroute data.
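A sketch of what such sample code might look like, assuming the warts files have first been converted to one-JSON-object-per-line form with scamper's sc_warts2json; the field names used here ("type", "dst", "hops", "probe_ttl", "addr") should be checked against the installed scamper version:

```python
import json

def paths_from_json_lines(lines):
    """Extract per-destination hop sequences from traceroute records,
    one JSON object per line. Unresponsive hops appear as None so
    hop distances are preserved."""
    paths = {}
    for line in lines:
        rec = json.loads(line)
        if rec.get("type") != "trace":
            continue  # skip non-traceroute records
        hops = {}
        for hop in rec.get("hops", []):
            hops.setdefault(hop["probe_ttl"], hop["addr"])
        if hops:
            path = [hops.get(ttl) for ttl in range(1, max(hops) + 1)]
            paths[rec["dst"]] = path
    return paths
```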
Investigated information published by APNIC about addresses
assigned/allocated for NZ use to get a good initial pool of addresses to
test to in order to generate larger topology graphs. Will have to see if
there are any other avenues that will list addresses used within New
Zealand (even if they are assigned elsewhere), thinking some of the public
looking glasses may be useful here.
Short week back due to travel. Most of the week was spent catching up on
emails and things that needed to be done while I was away (weathermap
updates, etc). Spent some time looking at the way scamper was packaged for
AMP machines and checking that it would work fine for an upcoming topology
collection project, as well as investigating NZ IPv6 documentation
(community best practices) to see if there were any clues that might make
it easier to perform measurement to addresses that are actually in use.