KVM the hard way

The goal of this article is to document the process of setting up a basic KVM instance the hard way. That is, without using GUI tools, magic scripts, VNC’d installers, etc.

Pretty much all of the KVM guides I’ve seen assume you’re sitting at the physical box you’re deploying the VMs on and have access to the local display (i.e. they use GUI tools). Or they assume you just want a magic script to do it all for you. Or they assume you want to sit there and click through the debian-installer over a VNC session.

None of those options sound particularly fun or interesting. Below is a step-by-step guide to getting a KVM instance running the hard way. Most of the information can actually be found by pulling apart Hardy’s ubuntu-vm-builder script. I’m using Debian Lenny as Etch doesn’t have some of the necessary tools, for example kpartx for mapping partitions on a loopback device. Also we assume a 64-bit host and guest, though it should be fairly obvious how to use other numbers of bits. I decided to document the process mainly as a reference, but also as an education in the underlying process.

Let’s begin. First, we need a disk image to work with. Create a 5GB raw image using qemu-img:

$ qemu-img create -f raw lenny-base.raw 5G

Now, create a loopback device for it:

# losetup /dev/loop0 lenny-base.raw

Partition the loopback device using fdisk. I use a single primary partition in this article.

# fdisk /dev/loop0

Now we need to create device-mapper entries for each of the partitions on the loopback device:

# kpartx -a /dev/loop0

If you made a single partition on /dev/loop0, there will now be a device-mapper block device at /dev/mapper/loop0p1, which you can go ahead and make a filesystem on and bootstrap to whatever flavour of Debian/Ubuntu you’d like. You’ll also want to remember the UUID of the root filesystem for later. For example,

# mke2fs -j /dev/mapper/loop0p1
# vol_id --uuid /dev/mapper/loop0p1 > target.uuid
# mkdir /mnt/target
# mount /dev/mapper/loop0p1 /mnt/target
# debootstrap lenny /mnt/target http://ftp.nz.debian.org/debian

At this point, the bootstrapped filesystem will need some manual setting up. In particular, you’ll need to:

  • Set up /etc/hostname
  • Set up /etc/hosts
  • Set up /etc/fstab (you’ll want to use the UUID you saved before)
  • Enable serial getty in /etc/inittab (important, we’ll use this for initial login)
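As a rough sketch of those four steps, something like the following would do. The function name configure_target, the argument order and the contents of each file are my own assumptions rather than anything the tools require. The getty line is the usual Debian serial console entry for ttyS0, and the fstab assumes the ext3 root filesystem created above.

```shell
# Sketch: write the minimal config files into a bootstrapped target.
# configure_target and its arguments are illustrative names only.
configure_target() {
    target=$1   # mount point of the target, e.g. /mnt/target
    name=$2     # hostname to give the guest (pick your own)
    uuid=$3     # root filesystem UUID, e.g. "$(cat target.uuid)"

    mkdir -p "$target/etc"
    echo "$name" > "$target/etc/hostname"

    cat > "$target/etc/hosts" <<EOF
127.0.0.1 localhost
127.0.1.1 $name
EOF

    # Refer to the root filesystem by UUID so the device name doesn't matter.
    cat > "$target/etc/fstab" <<EOF
proc /proc proc defaults 0 0
UUID=$uuid / ext3 defaults,errors=remount-ro 0 1
EOF

    # Serial getty on ttyS0; only append it if no T0 entry is active yet.
    touch "$target/etc/inittab"
    if ! grep -q '^T0:' "$target/etc/inittab"; then
        echo 'T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100' >> "$target/etc/inittab"
    fi
}

# Example (as root, with the target mounted):
#   configure_target /mnt/target lenny-vm "$(cat target.uuid)"
```

Debian’s stock /etc/inittab normally ships with this T0 line commented out, so uncommenting it by hand works just as well.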

Copy the following into the target’s /etc/kernel-img.conf:

do_symlinks = yes
relative_links = yes
do_bootfloppy = no
do_initrd = yes
link_in_boot = no
postinst_hook = update-grub
postrm_hook = update-grub
do_bootloader = no

Now we need to install a kernel and set up the boot-loader.

# chroot /mnt/target
target # apt-get install linux-image-amd64 grub
target # mkdir -p /boot/grub
target # cp /usr/lib/grub/x86_64-pc/* /boot/grub

Exit from the target’s chroot. Now that we’re back on the host, we need to install grub into the MBR of the disk image. You’ll need the UUID of the root filesystem that you saved earlier. This step requires the host’s /dev to be bind-mounted into the target’s filesystem:

# mount --bind /dev /mnt/target/dev
# echo "(hd0) lenny-base.raw" >> device.map
# grub --device-map=device.map
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
# echo "(hd0) UUID=UUID of target root fs goes here" >> /mnt/target/boot/grub/device.map
# chroot /mnt/target
target # update-grub
target # exit

update-grub will have written a basic menu.lst, but because it’s using the host’s /dev it will be pointing to the wrong place. Edit the target’s /boot/grub/menu.lst to use the UUID of the filesystem and not use the loop0 device. So, open up the target’s /boot/grub/menu.lst, search for the line:

# kopt_2_6 root=/dev/mapper/loop0p1 ro

and replace /dev/mapper/loop0p1 with

UUID=the uuid of the root filesystem

so it will look something like (make sure the # is still at the start of the line):

# kopt_2_6 root=UUID=81af6388-cca5-4bf2-99dc-a47c81c00445 ro

Also, replace the line groot=(loop0p1) with groot=(hd0,0). Again, it’s “commented out”. Now we need to update grub again.

# chroot /mnt/target
target # update-grub
target # exit
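The two menu.lst edits described above can also be scripted with sed, run from the host before that final update-grub. This is only a sketch: fix_menu_lst is a made-up helper name, and it assumes the lines contain exactly the strings quoted above (root=/dev/mapper/loop0p1 and groot=(loop0p1)). It leaves the leading # alone.

```shell
# Sketch: rewrite the two commented-out defaults in the target's menu.lst.
# fix_menu_lst is an illustrative name, not part of grub.
fix_menu_lst() {
    menu=$1   # e.g. /mnt/target/boot/grub/menu.lst
    uuid=$2   # root filesystem UUID, e.g. "$(cat target.uuid)"
    tmp=$menu.tmp

    # First expression: point the kernel at the filesystem UUID.
    # Second expression: point grub at the first partition of the first disk.
    sed -e "s|root=/dev/mapper/loop0p1|root=UUID=$uuid|" \
        -e 's|groot=(loop0p1)|groot=(hd0,0)|' \
        "$menu" > "$tmp" && mv "$tmp" "$menu"
}

# Example (as root, on the host):
#   fix_menu_lst /mnt/target/boot/grub/menu.lst "$(cat target.uuid)"
```

It’s worth opening the file afterwards to check that both lines really changed, since update-grub may have written them slightly differently on your system.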

Almost done… At this point, check that you’ve enabled the serial getty in the target’s /etc/inittab file, as we’re going to boot the VM and use the serial console for the initial login. Of course, you could install SSH in the chroot, set up networking and forget about the serial console, but it’s interesting nonetheless. The other alternative is to use VNC to connect to the console. That’s the easy way out, though you do get the boot messages, so if something goes wrong you’ll get to see why.

Unmount everything:

# umount /mnt/target/dev
# umount /mnt/target
# kpartx -d /dev/loop0
# losetup -d /dev/loop0

Now boot your shiny new VM:

# kvm -nographic -serial pty -drive file=lenny-base.raw,if=virtio,index=0,boot=on -daemonize

If all goes well you’ll see something like:

char device redirected to /dev/pts/10

Use minicom to connect to that pseudo-terminal and log in to your new VM. Done! Sure, it would have been easier if you’d just used a script or a graphical tool, but we got there in the end.
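If you’d rather not copy the pty path by hand, a small helper can pull it out of that message. This is a sketch under a couple of assumptions: pty_from_output is a name I made up, and it expects the message in exactly the form shown above.

```shell
# Sketch: extract the pty path from kvm's "char device redirected" message.
# pty_from_output is an illustrative helper name; it reads the message on stdin.
pty_from_output() {
    sed -n 's|^char device redirected to \(/dev/pts/[0-9][0-9]*\).*|\1|p'
}

# Example:
#   pty=$(echo "char device redirected to /dev/pts/10" | pty_from_output)
#   minicom -D "$pty"     # assumes a minicom new enough to have -D
#   screen "$pty"         # alternative if screen is installed
```

Any terminal program that can open an arbitrary device should work here, so use whichever you have to hand.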

At this point the VM isn’t overly useful without networking, but there’s plenty of documentation in the qemu man pages about the options to enable a virtual NIC. There are also options for changing the amount of RAM the guest is allocated, the number of CPUs, virtual disks, etc. Go RTFM.
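To give a flavour of what those options look like, here is a sketch that only prints a command line rather than running it. The helper name kvm_cmdline and its defaults are invented for illustration. -m, -smp and -net are standard qemu options, but whether your kvm build supports the virtio NIC model and user-mode networking is something to check against its man page.

```shell
# Sketch: print a kvm command line with RAM, CPU count and a user-mode NIC.
# kvm_cmdline and its defaults are illustrative; nothing is executed.
kvm_cmdline() {
    image=$1
    mem=${2:-512}    # guest RAM in megabytes
    cpus=${3:-1}     # number of virtual CPUs
    echo "kvm -nographic -serial pty" \
         "-m $mem -smp $cpus" \
         "-net nic,model=virtio -net user" \
         "-drive file=$image,if=virtio,index=0,boot=on -daemonize"
}

# Example:
#   kvm_cmdline lenny-base.raw 1024 2
```

User-mode networking is the simplest thing to try first as it needs no bridge or tap set-up on the host. The guest will still need a DHCP client and a configured interface to make use of it.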

Another option from here is to create a libvirt XML description of your VM and use virsh to manage it. This makes networking and management a bit easier, but isn’t necessary.

Enjoy!

August 28th, 2008

A great time to be a Nine Inch Nails fan

A few days ago I received my copy of “Ghosts” in the mail. I paid US$10 about a month ago and got the FLACs that day and have been listening to them ever since. Having the physical CDs now is a nice bonus - in fact, I haven’t even bothered to take them out of the shrink-wrap plastic :) You might remember that “Ghosts” was released under a Creative Commons license and the MP3 versions were available for free (legitimately) via BitTorrent. Paying a bit extra got you the physical CDs (or Blu-ray or vinyl depending on how much you wanted to spend). “Ghosts” was released independently of any major label and with no advertising, yet the album was a huge success for Nine Inch Nails.

So, I was having a kick around musicbrainz and noticed a NIN release I hadn’t heard of - “The Slip”. I went to nin.com and this is what I found:

Click HERE to get the new full-length nine inch nails record: the slip (thank you for your continued and loyal support over the years - this one’s on me)

A couple of clicks later and I’m downloading FLACs of a brand new NIN album via BitTorrent for free and contemplating whether I want to spend 1.2GB of my monthly data cap on the 24-bit/96kHz WAV files :) So, if you’re a NIN fan, join the swarm!

It’s a great time to be a Nine Inch Nails fan.

May 9th, 2008

libradiotap-1.0.0

Changes since radiotap-0.2:

  • Name changed to libradiotap
  • Now uses the GNU build system, creates both shared and static libraries
  • Removed ieee80211_radiotap.h and folded the appropriate definitions into libradiotap.h
  • Removed the IEEE80211_ prefix from the radiotap field constants
  • radiotap_has_field() is now a static inline function
  • Fixes for big-endian, tested on a PPC host

Download libradiotap-1.0.0

November 3rd, 2007

Outdoorsy stuff…

On the weekend Emily and I decided that we needed to do more outdoorsy stuff, so we went and found a nice little walk on Mt. Pirongia, just out of Hamilton. We did the “Mangakara Nature Walk”, which is a loop through some native bush and crosses the Mangakara stream. It took us about an hour which included a stop of about twenty minutes for a picnic by the stream - how quaint. The walk itself was very easy and the track was well-formed the entire way. The scenery was stunning - apparently that part of the forest is pretty much pristine native bush as it has never been cleared.

The plan is to make our way through the various walks and then move on to something a little more challenging. Eventually I’d like to take Emily on the Tongariro Northern Crossing which was part of a three-day hike I did when I was in high-school.

Photos on flickr as I can’t seem to get this flickr plugin working :(

October 29th, 2007

Mac OS X Leopard - Built-in SSH agent

Leopard now comes with a built-in SSH agent. The really nice thing about it is that it integrates with your user’s Keychain. So, the first time you try to unlock your SSH key a dialog will appear asking you for its password along with an option to save that password in your Keychain.

On Tiger I was using SSHKeychain to achieve this, but it had a nasty bug where it would randomly start to consume 100% of a CPU. This chewed through my MacBook Pro’s battery, which was a pain. If you’ve been using a third-party SSH agent and want to switch to the built-in agent, make sure to check that you’re not manually setting the SSH_AUTH_SOCK environment variable, which is something I had to do to get SSHKeychain working.

If launchd is managing your SSH agent, the variable should look something like:

kenshin:~ scottr$ echo $SSH_AUTH_SOCK
/tmp/launch-fTiPvL/Listeners

Otherwise, check your various profile settings, and check to make sure your third-party agent isn’t set as a launch item. You’ll have to log out for this to take effect. Once launchd is managing your SSH_AUTH_SOCK, logging into OS X will unlock your keychain and allow the ssh-agent to unlock your SSH keys without having to enter another password.
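A quick way to check this from a terminal is sketched below. The helper name agent_socket_is_managed is made up, and the pattern is only a guess based on the shape of the example path above, so treat a mismatch as a hint to go looking rather than proof of a problem.

```shell
# Sketch: does a socket path look like the built-in agent's example above?
# agent_socket_is_managed is an illustrative name; the pattern is an assumption
# based on the /tmp/launch-XXXXXX/Listeners form shown earlier.
agent_socket_is_managed() {
    case $1 in
        /tmp/launch-*/Listeners) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   agent_socket_is_managed "$SSH_AUTH_SOCK" || echo "SSH_AUTH_SOCK looks hand-set"
#   grep -n SSH_AUTH_SOCK ~/.profile ~/.bash_profile ~/.bashrc 2>/dev/null
```

The grep in the example simply lists any profile lines that still set the variable by hand. Which files exist will depend on your shell set-up.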

October 28th, 2007

Mac OS X Leopard “Easter Egg”

I installed Leopard last night and as I was browsing my local network I noticed something kinda funny… apparently this is what a Windows PC looks like:

October 27th, 2007

Updated Radiotap Decoder for C

An updated version of my Radiotap decoder for C is available. It fixes several bugs, cleans up the interface and most importantly now implements correct byte-swapping on big-endian hosts.

Get radiotap-0.2.tar.gz.

October 24th, 2007

Error: Timed-out thinking up post title.

I’m sure that no-one has noticed that I’ve been fairly silent on the blogging front for a while. I took a three month break from the Ph.D to do some work for Cambridge Silicon Radio. The experience working on a real-world project was great and the project itself was both interesting and challenging.

I am however looking forward to getting back into the Ph.D work. I’ve still got a week or so before the Ph.D kicks back in so at the moment I’m doing some driver work for RuralLink - specifically getting MadWiFi working better on the CPE/AP devices.

I spent a week or so before the CSR work started looking into performance improvements for MadWiFi. After spending quite a bit of time with oprofile I found a couple of areas in the driver which were causing a large number of PCI transactions to take place unnecessarily. Now, on a laptop or desktop platform this didn’t really make much difference. On an already resource-starved platform such as the Soekris 4526 however, this was resulting in some pretty significant overhead. A couple of patches to MadWiFi later (a couple merged upstream already, one that’s a bit more of a hack specific to our needs) and we’re seeing some much nicer throughput numbers. Off the top of my head, we went from being able to bridge about 9-10 Mbit/s of traffic over wireless through the wired ethernet to about 15 Mbit/s.

The other neat hack we did was to create a transparent wireless bridge by hacking the ad-hoc demo mode to use 4-address 802.11 frames. This could already be done in other modes, but we really like ad-hoc demo due to its utter simplicity - no associations, no beacons, nothing - just passing frames.

Right now we’re working on implementing our own rate control algorithm. We seem to run into far too many problems on our networks with rate control and Perry came up with a neat idea - as is his wont - so we’re running with it. At the same time we’re looking at using it as a chance to collect large amounts of performance data to give us some deeper knowledge as to what’s going on on our networks. Hopefully lots more info on that soon.

At some point in the (very) near future I need to start thinking about the Ph.D again - I’m starting to think that I should be putting more of a measurement focus into it, but I need to nail down a few ideas first. And maybe play a bit of Guitar Hero as well :P

August 9th, 2007

Thoughts on libtrace wireless API and radiotap

With libtrace 3.0 we included an API for extracting wireless metadata from packets. So for example, you can call trace_get_wireless_signal_strength_dbm() on a libtrace packet and get its absolute signal strength. This is done by decoding the Radiotap monitoring header if present. This is all fine for physical layer attributes, such as signal and noise levels, but the abstraction starts to get fuzzy when it comes to link-layer specific stuff. For example, libtrace 3.0 was released with trace_get_wireless_fcs(), which extracted the 802.11 FCS from the Radiotap header (even though this was a non-standard field and has since been removed). The problem is, trace_get_wireless_* shouldn’t be specific to certain MAC layer protocols. What if a CRC-16 is used instead of a CRC-32? trace_get_wireless_fcs() has since been removed, but the point also applies to some of the other functions.

So some of the functions as they exist now extract physical layer attributes, and others extract MAC layer attributes. Since libtrace was released, the Radiotap standard has been updated to include a couple of extra fields, such as the number of retries a packet had. Should a new accessor be added to libtrace to extract this? I’d say no, even though it’s a very interesting piece of data. Keep the trace_get_wireless_* functions as generic ways to get physical layer attributes of wireless frames. Let the user decode the Radiotap header in full if they want the 802.11 specific stuff.

Turns out I’m one of those users, so I’ve created a stand-alone Radiotap decoder in C which can extract all the Radiotap fields. If a new Radiotap field is added that describes an interesting physical layer attribute, then maybe an accessor can be added to libtrace for it, but for MAC layer specific fields a stand-alone Radiotap decoder should be used. This should hopefully keep libtrace as generic as possible.

Download version 0.1 of my C Radiotap decoder if you’re interested. Maybe I’ll get around to uploading it to Google Code Hosting at some point in the future.

April 25th, 2007

Characterising errors in wireless links, continued.

After spending most of last week manually validating the packet matcher I’ve decided that it’s at a pretty good state. I’ve settled (for now at least) on calculating the Hamming distance over 128 bits from various parts of the TCP, IPv4 and 802.11 MAC headers. Any frames that are transmitted and received within a certain time period and have a Hamming distance of 10 or less are marked as matches. This appears to work “quite well”, but I’ll need to come up with some hard numbers if I’m going to make any arguments based on it.
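The matching rule itself is simple enough to sketch in a few lines of shell. This is not the actual matcher: hamming_hex and is_match are names invented here, the 128-bit keys are assumed to already be extracted as 32-character hex strings, and the time-window check is left out entirely.

```shell
# Sketch: Hamming distance between two equal-length hex strings.
# hamming_hex and is_match are illustrative names, not the real matcher.
hamming_hex() {
    a=$1 b=$2 dist=0
    [ "${#a}" -eq "${#b}" ] || return 1
    while [ -n "$a" ]; do
        ra=${a#?} rb=${b#?}                      # everything after the first digit
        x=$(( 0x${a%"$ra"} ^ 0x${b%"$rb"} ))     # XOR the leading hex digits
        while [ "$x" -ne 0 ]; do                 # count the differing bits
            dist=$(( dist + (x & 1) ))
            x=$(( x >> 1 ))
        done
        a=$ra b=$rb
    done
    echo "$dist"
}

# Two keys match if they differ in at most 10 bits.
is_match() {
    d=$(hamming_hex "$1" "$2") || return 1
    [ "$d" -le 10 ]
}

# Example:
#   is_match "$sent_key" "$received_key" && echo "probably the same frame"
```

Working a hex digit at a time is slow but keeps everything inside plain shell arithmetic. Anything processing real traces would want to do this in C.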

After convincing myself that the packet matcher was in a good state, I started to look at packet errors. On Monday, I took several traces. Each trace consisted of a 10 minute bi-directional TCP iperf between two nodes which were placed about 4 meters apart in the lab. One acted as an AP and the other acted as a station. I took a separate trace for each of the 802.11b rates: 1, 2, 5.5 and 11 Mbit/s. I did this so I could look at how different encodings affect error patterns within the packet. I’ll take more traces later with 802.11g and 802.11a rates.

Interestingly the error rates were very asymmetric. From the station to the AP, each trace showed a Packet Error Rate (PER) of > 10%. However, from the AP to the station the PER was ~1%. This suggests a hidden terminal which is surprising given that the nodes are only approximately 4 meters apart. Also, the PER decreased as the rate increased, which was surprising.

Over the different rates, the error patterns within packets were very distinct. In the 1 and 2 Mbit/s traces (DBPSK and DQPSK), errors within packets were highly clustered, however in the 5.5 and 11 Mbit/s traces (CCK) the errors were more “spread out”.

Another interesting point is that although the PER decreased as the bitrate increased, the rate of errors within each packet increased. So, at higher rates we had fewer packets with errors, but the packets were more corrupted.

My next objective is to come up with some statistical models of these bit errors so that they can be applied in simulation. I’d also like to get some “proper” traces from real-world links to work on. Hopefully I’ll get some graphs up to illustrate this data within the next few days.

April 25th, 2007

About

I’m a Ph.D student at the University of Waikato, interested in wireless MAC protocols.
