Stuff
Linux Congestion Control looks interesting.
Now that I understand how things work a bit better in ns-2 and have a better idea of what I am supporting and the functionality I want to expose, I want to redesign a bit of the code in the ns-2 agent.
It makes sense for a TCP NSC agent to be derived from Agent/TCP. Therefore I propose Agent/TCP/NSC. As usual I’ll make class names derived from this one that just load the correct stack. So Agent/TCP/NSC/Linux24 and so on. This will just be to make it easy to integrate into existing scripts more than anything else.
For other protocols I should create NSC agents with the correct names. So, there should be an Agent/SCTP/NSC. At the moment I don’t really support SCTP, so I should look into how I’ll add that to the current interface.
One problem I have is that I have my “compatibility” mode, where I don’t specify IP addresses and the like, and my “low level” mode, where you have to add interfaces and specify IP addresses and so on. I want to still be able to use the low level mode now and again. Maybe I’ll call this Agent/NSC and leave it at that. Time to deprecate Agent/BSD.
Generally with ns-2 TCP agents you can do something like this:
set tcp [new Agent/TCP]
... do something ...
puts "Round trip time: [$tcp set rtt_]"
Now, for several reasons using set rtt_ has not really worked up until now with the NSC stacks. I was just attempting to bind a variable in C++ to Tcl like the ns-2 agents do, but this is not powerful enough; I really need to call a function for each stack.
This can be done if I override the OTcl set operator. I believe I know how to do this in OTcl. It could also be done in C++ using the C interface to OTcl, but that is a heck of a lot messier.
So the question then becomes: where do I add the OTcl code that I need? I’ll have to look into it.
The code to do what I want will be something like below (prolly still needs some more work, that is just from the top of my head):
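Class Agent/NSC
# Reads of rtt_ are answered by the stack; everything else falls through to the normal set.
Agent/NSC instproc set {args} {
    if {[llength $args] == 1 && [lindex $args 0] == "rtt_"} {
        return [$self get_rtt]
    }
    eval $self next $args
}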
The simulation framework has come some way in all the days I haven’t bothered to blog about it. Here’s a summary of features I can think of off the top of my head:
Backend:
* Asynchronous server that should be able to handle any number of clients (within reason, it is a simple select-based server using the Python async-socket framework)
* Text-based protocol that uses Python’s pickle support to serialise Python objects. Protocol is human-readable and extendable.
* Full control of clients connected: clients can be told to stop or start simulating.
* Ability to delete certain data points or stop a specific simulation or specific client from simulating.
* Validation of simulations by comparing results from a previous run and using a specific script to do the comparison. This has already come in handy when checking if each machine was getting correct results.
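A text protocol built on pickle might frame its messages roughly like this. This is a minimal sketch, not the framework's actual wire format: the `encode_message` and `decode_message` names, the `COMMAND length` header line and the `RESULT` command are all assumptions made for illustration. It uses pickle protocol 0, which the Python documentation describes as the original human-readable protocol.

```python
import pickle

def encode_message(command, payload):
    """Frame one message as a 'COMMAND <length>' header line plus a pickled body."""
    body = pickle.dumps(payload, protocol=0)  # protocol 0 is the text-style pickle format
    header = ("%s %d\n" % (command, len(body))).encode("ascii")
    return header + body

def decode_message(data):
    """Split a framed message back into (command, payload)."""
    # Only the first newline ends the header; the pickled body has newlines of its own.
    header, _, rest = data.partition(b"\n")
    command, length = header.decode("ascii").split(" ")
    body = rest[:int(length)]
    return command, pickle.loads(body)

# Hypothetical result report from a client (made-up values).
wire = encode_message("RESULT", {"sim": "sim-23", "goodput": 1.5})
```

Unpickling can execute arbitrary code, so a scheme like this only suits machines that trust each other.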
Client side:
* Client runs simulations in a similar fashion to the predecessor script, but this time it forks and uses getrusage() to figure out time spent simulating.
* Client requests simulations from server and reports results back over the connection.
* Forking to simulate means that when a stop command is received from the server the simulator can be killed instantly.
* May be informed of a repository update; in this case an “svn update” is performed and the client re-connects. The repository version is exchanged in connection establishment. The control script can also be used to inform the clients to update. This makes sure that each client is as up to date as the server is, meaning we don’t run random old versions of a simulation.
* Testing of a single simulation locally to check for errors that will occur when run as part of the framework.
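The fork-and-measure idea from the list above can be sketched as follows. This is an illustration under assumptions rather than the client's real code: `run_simulation` and `children_cpu_seconds` are hypothetical names, and a tiny Python one-liner stands in for the simulator. The CPU time comes from `getrusage()` on the children of the process, sampled before and after the child runs.

```python
import os
import resource
import sys

def children_cpu_seconds():
    """User plus system CPU time used so far by children that have been waited for."""
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_utime + usage.ru_stime

def run_simulation(argv):
    """Fork, exec argv in the child, and return (wait_status, cpu_seconds)."""
    before = children_cpu_seconds()
    pid = os.fork()
    if pid == 0:
        # Child: replace this process with the simulator.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: the child's usage is counted once it has been waited for.
    _, status = os.waitpid(pid, 0)
    return status, children_cpu_seconds() - before

# Stand-in for a real simulator command line.
status, cpu = run_simulation([sys.executable, "-c", "sum(range(200000))"])
```

Because the simulator lives in its own process, a stop command can be handled by sending that pid a signal with `os.kill` instead of waiting for the run to finish.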
Control side:
* Separate control script that allows full control of all possibilities in the protocol. This script can be used by any frontend to control the simulation server.
Web pages side:
* Bad-looking but functional webpages that expose most of the functionality of the control script. Uses the control script to get simulation data as well as some manual processing of the directory hierarchy.
* Viewing of data with the ability to delete specific points.
* Saving of validation information.
Need to move on to a newer version of lwip. By graphing sim-19 I was able to see the weird results of lwip, and by doing some further testing I found that the old version (0.7.2+patches) entered a sort of “stop-and-wait”.
Using the CVS version fixes this, so I’ll probably move on to that for all my simulations with lwip in the near future. Should really have done this some time ago.
Validation stuff works now, though it is rather simple.
gnub was compiled with -O2, which is bad for FreeBSD. Setting it to -O fixed it.
Simple permissions added to webpages. The webpages are still rather heavyweight though.
Still haven’t added any stuff into the validation part of my new framework. Looks like I’ve found a need for this.
Because I am getting different results simulating on different systems!
I really need to figure out why that is. But I might as well make a validation system so I can detect this in the future.
Simple simulations still seem to have the same results; it’s just one more complex simulation that is stuffing up at the moment.
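A validation check for this kind of problem could be as simple as comparing a fresh run against saved output line by line. The sketch below assumes results are plain text with one record per line; the `validate` name and the sample data are made up for illustration, and a real comparison script might need a tolerance for floating-point noise.

```python
def validate(reference_text, candidate_text):
    """Return (line_number, reference_line, candidate_line) for every line that differs."""
    reference = reference_text.splitlines()
    candidate = candidate_text.splitlines()
    mismatches = []
    for i in range(max(len(reference), len(candidate))):
        # A missing line on either side counts as a mismatch against None.
        ref = reference[i] if i < len(reference) else None
        got = candidate[i] if i < len(candidate) else None
        if ref != got:
            mismatches.append((i + 1, ref, got))
    return mismatches

# Made-up sample output from two machines.
saved = "1.0 1460\n2.0 2920\n"
fresh = "1.0 1460\n2.0 4380\n"
problems = validate(saved, fresh)
```

An empty list means the two runs agree; anything else points at the first place the machines diverge.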
Possible stuff to work on:
* Continue making tomacs paper better. Tony should be reading over this sometime, so this is in part in his hands.
* Chase up more emulation stuff for pam.
* Work on simulation framework.
* Update FreeBSD (and Linux soon?)
* Start on Linux 2.6
* Clean up NSC <-> ns-2 interface (stats gathering needs work) (other protocol stuff needs work)
* Fix parser for SCTP
* Update literature review stuff with atm-tn
Looked at the ATM-TN code. It seems quite far removed from the original bsd-lite code. At the very least, it is C++ in classes and so on. The basic structure looks BSD-like.
Probably not worth talking about too much for now, but I should look at it a bit more when I look to do literature review stuff.
Webpage frontend, of course.
* Generates graphs on said webpages. Not sure where we should specify how the graph is generated. Maybe each sim should have its own graph script, and if that doesn’t exist a webpage to generate a graph is shown.
* Allows viewing of data on webpages. Deleting of data. Saving data. Saving data as validation data. Delete validation data.
* Start simulation runs on webpages. Validate against previous data. Should be able to set only some simulations to be run or simulation order.
* Need some way for the clients to know of a new simulation that was started from the webpage frontend.
Results
* Need a consistent format for results to be printed out in. CSV sounds like the best idea so far; it is standard and easy enough.
* Need a way for a simulation to produce just a single result (one line in the CSV) or multiple results. Think a simulation that records sequence number over time vs. a simulation that records goodput over 200 seconds of sim time given a certain random loss.
* Reporting stats into CSV should be as simple as possible. Perhaps have a family of standard functions in Tcl and Python that can be used.
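One possible shape for the Python half of those standard reporting functions is sketched below. The `report_results` name, the column names and the sample values are all invented for illustration; the point is only that one helper can cover both the single-result and the many-results case.

```python
import csv
import io

def report_results(out, fieldnames, rows):
    """Write a header line followed by one CSV line per result row."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# A simulation with a single result: one data line in the CSV.
single = io.StringIO()
report_results(single, ["loss_rate", "goodput_kbps"],
               [{"loss_rate": 0.01, "goodput_kbps": 812.5}])

# A simulation that records a series: one data line per sample.
series = io.StringIO()
report_results(series, ["time_s", "seq"],
               [{"time_s": 0.1, "seq": 1460}, {"time_s": 0.2, "seq": 2920}])
```

In a real script `out` would be an open file or `sys.stdout` rather than a `StringIO`.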
Other stuff
* When a client asks for a simulation, it should report its repository version. If this is old, it will need to update to be in sync with the server.
* How do we notify clients of a new simulation run? We want the clients to be idle until we start a simulation run, in which case they all start simulating.
* Remove work/ dir limitation of only one client per directory structure. Need to figure out some other place to run the sim.
* Still want to be able to run stuff from the command-line. This is important for testing.
So I ran some of the last simulations of sim-23. They ran on voodoo. The times are reported below:
OpenBSD3 16m18.399s
lwip 10m8.064s
Linux24 108m19.546s
FreeBSD5 10m39.963s
Linux took 108 minutes! I knew it wasn’t the quickest, but that is extraordinarily slow. Should really profile it and figure out what is up. Something strange is going on.
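To put the gap in numbers, the times above can be converted to seconds and compared against the fastest stack. This is just arithmetic on the figures already listed; the `to_seconds` helper is a name made up for the example.

```python
import re

def to_seconds(stamp):
    """Convert a time-style string such as '16m18.399s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

# Times exactly as reported above.
timings = {
    "OpenBSD3": "16m18.399s",
    "lwip": "10m8.064s",
    "Linux24": "108m19.546s",
    "FreeBSD5": "10m39.963s",
}

seconds = {stack: to_seconds(t) for stack, t in timings.items()}
fastest = min(seconds.values())

# Print each stack with its run time and its slowdown relative to the fastest.
for stack, secs in sorted(seconds.items(), key=lambda item: item[1]):
    print("%-9s %8.1f s  %5.1fx" % (stack, secs, secs / fastest))
```

Linux24 comes out a little under 11 times slower than lwip, the quickest of the four.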
Tried fixing up lwip so it understood different MTUs. I’m not sure that it worked. It might have. Will need to test it sometime by running sim-12-bsd or similar and looking at a packet trace.
Fix memory leak in Linux. When allocating inodes during an accept() call, we malloc’ed some memory which was never freed. So inode_put() now frees the memory.
This makes a big difference. 500MB -> 80MB or so when the simulation was long.
Also, __get_free_pages was chewing a lot of memory when I instantiated a lot of stacks. I modified it so it only allocated smaller amounts of memory. This helps alleviate the problem.
If I compile libfreebsd5.so with -O2 (optimise more) then the __start_set and __stop_set symbols don’t get created as they normally do. Using -O is fine.
Using liblinux24.so on gnub with -O2 -pg seems to result in a segfault. Needs more investigation.
So, performance on gnub and carceri doesn’t really match up; I get different trends in the graphs.
Need to sort this one out. Check compile options, run again, profile and so on.
Otherwise I just need to bloody finish this performance section then go about making the paper as good as it can be.
So my distributed simulation stuff works rather well. The simulation framework is getting a little messy now because it has been hacked at whenever I wanted to add a feature. It’s now a bit different to its original design.
It could easily be cleaned up; it is only 427 lines of Python code. Not bad, considering it features:
* Simulation framework to run simulations: normally in random order, but any simulation or set of simulations can be specified
* Distributed simulation, server is set to manage a set of simulations, clients connect, run a simulation, then report back statistics.
* Basic validation system.
* Creation of dummy simulation scripts to get a new simulation scenario up and running.
* Support for running X-many runs of each simulation number to get the mean of every simulation (generally the get_parameters script will set up different random seeds for each run)
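The random ordering and the mean over several seeded runs can be sketched in a few lines. This is not the 427-line framework, just an illustration: `run_all` and `fake_simulate` are hypothetical names, and using the run index as the seed is an assumption standing in for whatever get_parameters really does.

```python
import random
import statistics

def run_all(sim_ids, runs, simulate, rng=None):
    """Run every simulation `runs` times in random order and return the mean per simulation."""
    rng = rng or random.Random()
    order = list(sim_ids)
    rng.shuffle(order)  # normal mode: simulations are run in random order
    means = {}
    for sim_id in order:
        # One run per seed; here the seed is simply the run index (an assumption).
        results = [simulate(sim_id, seed) for seed in range(runs)]
        means[sim_id] = statistics.mean(results)
    return means

def fake_simulate(sim_id, seed):
    """Stand-in for launching a real simulation; returns a made-up statistic."""
    return sim_id * 10 + seed

means = run_all([1, 2], runs=3, simulate=fake_simulate)
```

Passing an explicit list of ids covers the case where only a chosen set of simulations should run.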
I could almost touch it up and make it available for download. There could even be a publication in it if a bit more work was done maybe. It is getting quite useful…
Just coded up a distributed simulation system using RPC with Python. Python 2.3+ comes with an XML-RPC module, which I used. It was very easy: including the time spent learning the library and using RPC for the first time ever, I got my system going in about 1.5 hours.
Right now I’m loving how high-level Python is. Bugger doing any of this in C++. I imagine similar facilities exist in other so-called “scripting” languages. But anyway, xmlrpclib in Python rocks.
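For a flavour of how little code this takes, here is a minimal sketch of a server handing out simulations and a client reporting back over XML-RPC. It is written for Python 3, where the xmlrpclib functionality lives in `xmlrpc.client` and `xmlrpc.server`; the `SimulationServer` class, its method names and the fake result are assumptions for illustration, not the system described above.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

class SimulationServer:
    """Hands out simulation names and collects reported results."""

    def __init__(self, pending):
        self.pending = list(pending)
        self.results = {}

    def next_simulation(self):
        # XML-RPC has no None by default, so an empty string means "no work left".
        return self.pending.pop(0) if self.pending else ""

    def report(self, name, value):
        self.results[name] = value
        return True

def demo():
    """Serve on a free local port, run one client loop, and return the collected results."""
    manager = SimulationServer(["sim-1", "sim-2"])
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(manager)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = ServerProxy("http://127.0.0.1:%d" % server.server_address[1])
        while True:
            name = client.next_simulation()
            if not name:
                break
            client.report(name, len(name))  # stand-in for a real simulation result
    finally:
        server.shutdown()
        server.server_close()
    return manager.results
```

Calling `demo()` runs both ends in one process; in a real setup the client loop would live on each simulating machine.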