Monthly Archives: October 2008

How I dealt with KVM host identification

Recently I started researching how a KVM guest could tell who its host is. I dug into the KVM code and found that in r77 there’s new functionality to pass -uuid on the command line on the host and access it from the qemu monitor. KVM-78 should have BIOS modifications such that ‘dmidecode’ on the guest will return this UUID as well. As noted in that last link, libvirt doesn’t support that yet. Although it’s a trivial change, I can’t really wait for all this stuff, nor do I want to take the time to package it.
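
Once all of that lands, the flow should look something like this. This is an untested sketch: -uuid is in r77, and the dmidecode half assumes the KVM-78 BIOS changes:

```shell
# on the host: start the guest with a UUID baked in (kvm/qemu r77 or later)
UUID=$(uuidgen)
kvm -uuid "$UUID" -m 768 -hda root.qcow2 &

# on the guest: with the KVM-78 BIOS changes in place, the same UUID
# should come back out of SMBIOS
sudo dmidecode -s system-uuid
```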

The existing vmport code in qemu, best I can tell, is used mostly for some vmmouse stuff. I can’t find anywhere that anyone actually uses it. I don’t want to write code to start poking around memory holes, so I threw out the idea of prodding around there.

In the end, I used my infrastructure, again. kvm/libvirt hosts report their guests to iclassify based on hostname:

#!/usr/bin/ruby -w
# This is an iclassify script for updating a libvirt (KVM) host with the list of guests
# Until KVM guests can access their UUID, this will rely on hostname being unique
# 2008-10-28 -- Bryan McLellan 

def update_guests
  local_guests = Array.new
  server_guests = Array.new

  # Get the list of current guest hostnames
  %x[virsh list --all 2>/dev/null].each { |line|

    # ignore header lines, match lines that start with a number ID, grab hostname from second column
    if row = line.match(/^\s+\d+\s+(\S+)/)
      local_guests << row[1]
    end
  }

  unless local_guests.empty?
    # update the array, and try to give it a break if it's the same
    server_guests = attrib?("guest-hostname")
    replace_attrib("guest-hostname", local_guests) unless server_guests == local_guests
  end

end 

if tag?("kvm")
  update_guests
end

Then I could modify my ruby script that displays virt server status, originally written when I was using just vmware servers, to also show kvm/libvirt hosts:

client = IClassify::Client.new(config[:server], config[:user], config[:passwd])

vmwarehosts = client.search('(tag:vmware-server OR tag:kvm) AND NOT tag:workstation', [])

vmwarehosts.sort_by { |node| node.description }.each do |node|

  puts node.description
  # note: plain gsub, not gsub!, since gsub! returns nil when nothing matches
  puts "\tmemory free/total: " + node.attrib?('memoryfree').gsub(/[^0-9.]/,"") + "/" + node.attrib?('memorysize') \
    + "\tswap free/total: " + node.attrib?('swapfree').gsub(/[^0-9.]/,"") + "/" + node.attrib?('swapsize')
  print "\tguests: "

  # remember that node.description and friends aren't always clean
  guests = client.search("vmwarehost:#{node.attrib?('hostname')}", [])
  guests.sort_by { |node| node.description }.each do |guest|
    print "#{guest.attrib?('hostname')} "

  end
  if node.attrib?("guest-hostname")
    node.attrib?("guest-hostname").each { |guest| print "#{guest} " }
  end
  puts "\n"
end

And I guess that works for me for now. Later it’ll be nice to use the host’s and guests’ real UUIDs as the iclassify uuid. That’s a project for another day.

Mounting KVM qcow2 qemu disk images

qcow2 images are not flat files, see qemu-img(1). KVM ships with qemu-nbd, which lets you use the NBD protocol to share the disk image on the network.

First, for nbd partition support you need to be running kernel 2.6.26 (commit, changelog) or greater. For Ubuntu users, that means it’s time to upgrade to intrepid ibex. Load the nbd module with:

sudo modprobe nbd max_part=8

If you leave off the max_part attribute, partitions are not supported; you’ll be able to access the disk, but you won’t have device nodes for any of the partitions. Running

sudo qemu-nbd root.qcow2

will bind to all interfaces (0.0.0.0) and share the disk on the default port (1024). It’s important to note that the nbd kernel module produces /dev/nbd0, while the nbd-client man page recommends /dev/nb0 in its examples. The error message isn’t so clear, see lp:290076.

# nbd-client localhost 1024 /dev/nb0
Error: Can not open NBD: No such file or directory

This can all be reduced to a single step using the ‘--connect’ option of qemu-nbd, like this:

sudo qemu-nbd --connect=/dev/nbd0 root.qcow2

At which point you can view the disk partitions:

sudo fdisk /dev/nbd0

or mount a disk, such as

sudo mount /dev/nbd0p1 /mnt
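
When you’re done, unmount and detach before handing the image back to a guest. A sketch using qemu-nbd’s disconnect option:

```shell
sudo umount /mnt
sudo qemu-nbd --disconnect /dev/nbd0
# optionally unload the module
sudo rmmod nbd
```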

KVM: Who’s your daddy?

I guess I never blogged about the VMWare solution to this. I wanted a node to detect if it was a VMWare guest, and if it was, to register the name of its host as an iClassify attribute. Originally the way to know which VMWare server host a guest was on was to put it in the guest’s hostname when you created the node, like ‘vm03-somecrap’. That just sucked. With the information in iclassify, it was easy to write a ruby script to give me a list of hosts and their guests. And that worked alright. I’d run a script on the host against each guest, something like:

#!/bin/bash
VMWARECMD=/usr/bin/vmware-cmd

# vmware config files returned by 'vmware-cmd -l' often have spaces; by default bash's for loop treats spaces as a field separator
IFS=$'\n'

if [ "$1" == "" ]; then
  echo "$0: requires vmware host hostname as an argument"
  exit 1
fi

for config in `$VMWARECMD -l` ; do
  $VMWARECMD "$config" setguestinfo host $1
done

Which one could then read on the guest, if the vmware-guest tools were installed, with:

/usr/sbin/vmware-guestd --cmd 'info-get guestinfo.host'

Finding the syntax for vmware-guestd took forever. I don’t know why I didn’t blog that.

I want to do something similar for KVM. I use libvirt, so the first step was looking at the configuration file options for libvirt and the command line options for qemu to see if anything nice matched up. I played a bit with the serial pipe option, which you can use even though it’s not listed on the libvirt page (I saw it in the source), but I couldn’t get it to do much. I had thought about having a daemon return the hostname or such to the FIFO whenever queried with a \n or something.

I figured I could find something to pass to the kernel cmdline in an append option, but I really don’t want to have to template the libvirt template files just to stuff a hostname in there.

I really want something I can pass to SMBIOS that I can pull with dmidecode on the guest. I started digging into the LinuxBIOS/coreboot code that comes with the ubuntu kvm source package and in bios/rombios32.c, I found a function called uuid_probe that sets the uuid variable. I just posted to the coreboot mailing list asking about it. #qemu on irc.freenode.net just got me this when I asked about it:

15:29 < aliguori> in linux bios?  it's stuff that shouldn't be there

Googling for 0x564d5868 revealed it’s indicative of the VMWare backdoor stuff. This led me to what appears to be some emulation of this, called VMPort, in QEMU. I have no idea yet what it does, if it ever worked, and thus whether I can interact with it.

Running multiple nrpe binaries on debian/ubuntu

Silly init scripts. I have two nrpe binaries running on the nagios server: one for it to check the normal things like CPU, disk, etc., and one for an external nagios server to connect to and make sure that nagios is running. For basic security reasons, these use different nrpe config files with a separate set of plugins.

But the init script has to be modified so they don’t kill each other. I made a link from ‘/usr/sbin/nrpe-ext’ to ‘/usr/sbin/nrpe’ then copied ‘/etc/init.d/nagios-nrpe-server’ to ‘/etc/init.d/nagios-nrpe-server-ext’
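
In commands, that part is roughly this (the nrpe-ext.cfg copy is an assumption; seed the external config however you like):

```shell
sudo ln -s /usr/sbin/nrpe /usr/sbin/nrpe-ext
sudo cp /etc/init.d/nagios-nrpe-server /etc/init.d/nagios-nrpe-server-ext
sudo cp /etc/nagios/nrpe.cfg /etc/nagios/nrpe-ext.cfg
sudo update-rc.d nagios-nrpe-server-ext defaults
```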

Make the following modification to both so they don’t kill each other:

52c52
< 	start-stop-daemon --stop --quiet --oknodo --exec $DAEMON
---
> 	start-stop-daemon --stop --quiet --oknodo --name $NAME
57c57
< 	start-stop-daemon --stop --signal HUP --quiet --exec $DAEMON
---
> 	start-stop-daemon --stop --signal HUP --quiet --name $NAME

Then make these changes to nagios-nrpe-server-ext so it starts a unique process:

5c5
< # Provides:          nagios-nrpe-server
---
> # Provides:          nagios-nrpe-server-ext
17,20c17,20
< DAEMON=/usr/sbin/nrpe
< NAME=nagios-nrpe
< DESC=nagios-nrpe
< CONFIG=/etc/nagios/nrpe.cfg
---
> DAEMON=/usr/sbin/nrpe-ext
> NAME=nrpe-ext
> DESC=nagios-nrpe-ext
> CONFIG=/etc/nagios/nrpe-ext.cfg

And update nagios-nrpe-server so it works with our new invocation of start-stop-daemon:

18c18
< NAME=nagios-nrpe
---
> NAME=nrpe

KVM Virtio network performance

I’ve switched my production infrastructure from VMWare server to KVM and libvirt recently. I’ve been working on moving from ubuntu-vm-builder to python-vm-builder (now just vm-builder). Nick Barcet made a tree while we were talking about the lack of a bridging option that adds a ‘--net-virtio’ option. So I started using virtio on a new libvirt guest for networking.

On the guest, lspci will show this device when using virtio:

00:03.0 Ethernet controller: Qumranet, Inc. Unknown device 1000

From the host, simple tests (‘ping -c 100 guestname’) aren’t all that different and are pretty statistically useless.

with virtio:

100 packets transmitted, 100 received, 0% packet loss, time 99012ms
rtt min/avg/max/mdev = 0.113/0.387/0.821/0.065 ms

without virtio:

100 packets transmitted, 100 received, 0% packet loss, time 99007ms
rtt min/avg/max/mdev = 0.143/0.200/0.307/0.032 ms

Running iperf with the server on the guest and the client on the host produces:

with virtio:

------------------------------------------------------------
Client connecting to virtio, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.1.172 port 54168 connected with 10.0.1.33 port 5001
[  3]  0.0-10.0 sec  1.34 GBytes  1.15 Gbits/sec

without virtio:

------------------------------------------------------------
Client connecting to novirtio, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.1.172 port 34414 connected with 10.0.1.13 port 5001
[  3]  0.0-10.0 sec    375 MBytes    315 Mbits/sec

So that’s better. Both guests are Ubuntu 8.04.1 running 2.6.24-19-server SMP x86_64 with 1 vcpu and 768MB of RAM. Host has 8GB of RAM, same kernel and distro, with 8 CPUs (Xeon’s with HT and all that crap).
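
For reference, the iperf runs above amount to something like this (iperf defaults to TCP on port 5001 with a 10 second test, matching the output):

```shell
# on each guest
iperf -s

# on the host, against each guest by name
iperf -c virtio
iperf -c novirtio
```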

Access is denied when saving files from Office 2007 on Vista to a redirected documents folder

The client is Office 2007 on Vista, saving to ‘Documents’, which is a redirected folder on a Server 2003 R2 file share. Whenever saving files you get “access is denied” and it creates a 0KB file, but you can copy the file there in explorer no problem. As a side note, offline files and folders is also enabled.

It’s related to SMB2, which is Vista + Server 2008 tricks. See KB 296264 about ensuring that opportunistic locks are enabled in the registry on the server.

Consumer hardware in enterprise environments

Sunday, 6:55AM: The CEO of my last startup, which laid off the entire development team over a year ago, calls my cellphone to report a total network outage. He has an international trip that he’s leaving on and needs to get into his email to coordinate it. Too bad they laid off everyone that could help, with prejudice.

I wrapped up my social plans and went in to the data center last night. The workstation in the lab couldn’t reach the domain controllers on another subnet, so it lacked a DHCP address as well. I narrowed it down to the Netgear GSM7324 L3 switch that was being used in the core. A purchase of my predecessor there, I always looked at it funny. It’s Netgear’s one foray into enterprise routing that I’ve ever seen, and I didn’t care for it much. Its CLI tried to be a Cisco, but was substantially different, so it only served to confuse people with Cisco experience (the Dell gear is much closer at the UI emulation, for the record).

I drag out a serial console to reveal:

Timebase: 33.000000 MHz, MEM: 132.000000 MHz

NetGear Boot code......

Flash size = 16MB

Testing 128MB SDRAM..............................Pass

Unknown PCI devices.
PCI devices found -
Motorola MPC8245
Select an option. If no selection in 10 seconds then
operational code will start.

1 - Start operational code.
2 - Start Boot Menu.
Select (1, 2):

Operational Code Date: Thu Aug  3 22:43:40 2006
Uncompressing.....

50%                     100%
||||||||||||||||||||||||||||||||||||||||||||||||||
Attaching interface lo0...done

Adding 36274 symbols for standalone.

Unknown box topology

This is apparently common for people who take the plunge and try to save some money over buying a tried and true piece of equipment for their core. Their lineup has a few newer, more expensive models. The GSM7324 still sits at the lowest price point for a Netgear L3 switch though, luring in those who think there’s no tradeoff at that price.

So apart came all the trunks and redundant switch links. There was enough redundancy in the physical cabling to the edge switches that I could switch to access links for each subnet. I chained all the switches back together, like it was when I first started, and set up routing on a [Juniper] Netscreen 50 instead, it being the only alternative. Everything started coming back up as I dug through the network in search of the original static routing entries that I never found the time to upgrade.

How important is network administration? Too important for system administrators to get away with not knowing. A colleague was recently complaining that he couldn’t get an interviewee to answer why having two physical separate switches is better than having one. I find all of this unfortunate and trying, when I have had to re-architect every network I’ve inherited since moving to Seattle. Sometimes I think we should go back to yelling about the 5-4-3 rule on a soapbox so we’ll at least get a sane switch topology. Hopefully by the time they realize why the 5-4-3 rule doesn’t apply anymore, they’ll have picked up why switch topology is too important to be a matter of just plugging switches together like they’re power strips. Because that’s a great pet peeve too.

rubygems super slow

On any Debian Etch or Ubuntu Hardy box, running gem tends to do a source update (‘Bulk updating Gem source index for: http://gems.rubyforge.org/’) and go super slow, especially on low memory virtual machines. I could put in a purchase order for another 40GB of RAM, oorrrr…. Rumors were that newer versions of rubygems were better, so I went about upgrading hardy the debian way. And it’s much better.

Put intrepid in your sources.list. Puppet manages mine for my network so I use sources.list.d:

/etc/apt/sources.list.d/intrepid.sources.list:
deb-src http://mirrors.cat.pdx.edu/ubuntu/ intrepid main restricted universe multiverse
deb-src http://mirrors.cat.pdx.edu/ubuntu/ intrepid-updates main restricted universe multiverse
deb-src http://security.ubuntu.com/ubuntu intrepid-security main restricted universe multiverse

Then (basically)

sudo apt-get update
mkdir /tmp/rubydeb
cd /tmp/rubydeb
sudo apt-get build-dep ruby1.8 ruby1.9 rubygems
sudo apt-get source ruby1.8 ruby1.9 rubygems
# run dpkg-buildpackage -rfakeroot in ruby1.8 and ruby1.9
# sudo dpkg -i the resulting appropriate debs (you need ruby1.9 to build rubygems)
# run dpkg-buildpackage -rfakeroot in rubygems
sudo apt-get remove libgems-ruby1.8 rubygems
# sudo dpkg -i the new rubygems deb
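
Spelled out, the commented steps above come to roughly this (a sketch; the versioned directory and .deb names are illustrative and will vary with the actual package versions):

```shell
cd /tmp/rubydeb

# build ruby1.8 and ruby1.9 first
for src in ruby1.8-*/ ruby1.9-*/; do
  (cd "$src" && dpkg-buildpackage -rfakeroot -b)
done

# install the resulting ruby debs you need (rubygems needs ruby1.9 to build)
sudo dpkg -i ruby1.8_*.deb libruby1.8_*.deb ruby1.9_*.deb libruby1.9_*.deb

# now build the new rubygems
(cd rubygems-*/ && dpkg-buildpackage -rfakeroot -b)

# swap the old packages for the new deb
sudo apt-get remove libgems-ruby1.8 rubygems
sudo dpkg -i rubygems_*.deb
```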

Everything seems good, life seems better.

BackupExec 12.0 RALUS on debian/ubuntu

A recent upgrade of BE forced out the old legacy unix agent that was a single binary and easy to script, making us use RALUS. Could have been worse. The manual install is a bit sketch; someone noted that alien could convert the RPMs, so I started there. I found the RPMs in tgz files under ‘pkgs/Linux’ and ‘RALUS64/pkgs/Linux’ for both architectures. I ran alien on an x86 and an x64 box to create a full set. The debs don’t have dependency information, but I just dealt with that with puppet because it was quicker. Ignore the errors about alien not creating postrm/postinst scripts; they’re no big deal.

RALUS wants the user that connects to have beoper as a secondary group. Also you need libstdc++5 installed. Here’s a self-documenting puppet recipe, in case there are more questions:

class ralus {

  realize User["beoper"]
  realize Group["beoper"]

  package { "vrtsvxmsa":
    ensure => installed,
  }

  package { "libstdc++5":
    ensure => installed,
  }

  package { "vrtsralus":
    ensure => installed,
    require => [ Package["vrtsvxmsa"], Package["libstdc++5"] ],
  }

  exec { "ralus-init":
    command => "/bin/cp /opt/VRTSralus/bin/VRTSralus.init /etc/init.d/ralus",
    require => Package["vrtsralus"],
    onlyif => "/usr/bin/test ! -x /etc/init.d/ralus",
  }

  service { "ralus":
    ensure => running,
    enable => true,
    require => [ Exec["ralus-init"], User["beoper"] ],
  }
}

fixing OCS 2007 for LM with lcscmd

Live Meeting cannot connect to the meeting.
Wait a few moments, and then try to join the meeting again.
If you still cannot connect, contact your administrator or technical support.

See Microsoft KB #938288.

Lots of ‘lcscmd’ options (in \Program Files\Common Files\Microsoft Office Communications Server 2007) require a pool name. I eventually found it with ‘lcscmd /forest /action:checkallpoolsstate’, where I saw a line that said ‘Executing “Check Pool BLAH”‘ where BLAH was my pool name. Even easier, in the MMC applet, the first container under ‘Standard Edition Servers’ is the pool, the one that doesn’t specify the FQDN.

Get on your full server, not the edge server, and from the above directory run:

lcscmd /web /action:clearpoolexternalurls /poolname:POOLNAME
lcscmd /web /action:updatepoolurls /externalwebfqdn:conference.example.org /poolname:POOLNAME

Their example just shows ‘contoso.com’ as the fqdn. That’s a little sketchy, being a domain name that might point to a web server. There’s little explanation as to which IP address (host) on the edge that should go to. I have the hostname of my web conference edge server there. After running this the clients could connect to live meeting without a restart of OCS (LCS).

running god inside runit

God ignores the TERM signal. Maybe this is a ruby thing? I’m not really going to bother looking. But runit normally sends a TERM when you tell it to stop a process, so this is no good. It looks like God also ignores HUP if run as a daemon:

      def run_daemonized
        # trap and ignore SIGHUP
        Signal.trap('HUP') {}

In which case the only way I could find to stop it is with an INT signal. But HUP will kill it when run non-daemonized with the “-D” option. I guess mysql and other daemons do this sorta thing too, and I got the idea from this thread. For other processes, see this archive of runit scripts for working examples. Anyways, my runit run script for god:

#!/bin/sh
exec 2>&1

trap 'kill -HUP %1' 1 2 13 15

/usr/bin/god -D -c /etc/god/master.god --log-level debug --no-syslog & wait

debianizing ruby gems

The Ruby Oniguruma gem pissed me off the other day. I was trying to install the ultraviolet gem, and it failed while compiling a dependency. Ultraviolet depends on oniguruma. gem did its thing and started installing that, but oniguruma has extensions that must be compiled, and to do so it needed some headers from oniguruma itself.

I’m not a debian developer, but I like debs. First, if I was doing this with debs instead of gems, a binary package compiled for my architecture would have been pulled in automatically and life would have moved on. If I really wanted to build this from sources, libonig-dev would have been installed when I ran ‘apt-get build-dep liboniguruma-ruby’.

I’ve built a couple ruby debs in the past by stealing from other debs, but last night in my frustration I went out, read a bunch, chattered some, and started building more. You can find a repo here:

deb http://ubuntu.ninjr.org/ hardy ninjr
deb-src http://ubuntu.ninjr.org/ hardy ninjr

And if you so desire grab the key with:

wget http://ubuntu.ninjr.org/btm@loftninjas.org.gpg.key
sudo apt-key add btm@loftninjas.org.gpg.key
sudo apt-get update

Gunnar Wolf responded to my query about packaging with an awesome tip, libsetup-ruby1.8.

  1. apt-get install libsetup-ruby1.8 ruby-pkg-tools fakeroot dpkg-dev debhelper cdbs ruby1.8-dev
  2. Grab a .tgz or .gem
  3. untar it (or ‘gem unpack foo.gem’).
  4. cp /usr/lib/ruby/1.8/setup.rb package-1.1.1/
  5. mkdir package-1.1.1/debian
  6. cd package-1.1.1
  7. dch --create -v1.1.1-1
  8. fix your email, pick a package name (libpackage-ruby is my choice), put in ‘unstable’
  9. cd debian
  10. put this in ‘rules’:
    #!/usr/bin/make -f
    # copyright 2006 by Esteban Manchado Velázquez
    
    include /usr/share/cdbs/1/rules/simple-patchsys.mk
    include /usr/share/cdbs/1/rules/debhelper.mk
    # Ruby package with setup.rb
    include /usr/share/ruby-pkg-tools/1/class/ruby-setup-rb.mk
  11. Make a ‘control’ file like this:
    Source: libtextpow-ruby
    Section: libs
    Priority: optional
    Maintainer: Bryan McLellan 
    Build-Depends: cdbs, debhelper (>> 5.0.0), ruby-pkg-tools, ruby1.8
    Standards-Version: 3.8.0
    
    Package: libtextpow-ruby
    Architecture: all
    Depends: libtextpow-ruby1.8
    Description: a library to parse and process Textmate bundles.
     .
     This is a dummy package to install the textpow library bindings for
     the default version of Ruby.
    
    Package: libtextpow-ruby1.8
    Architecture: all
    Depends: ruby1.8, libplist-ruby, liboniguruma-ruby
    Description: a library to parse and process Textmate bundles.
    
    Package: libtextpow-ruby-doc
    Section: doc
    Architecture: all
    Description: a library to parse and process Textmate bundles.
     a library to parse and process Textmate bundles
     .
     This is the documentation package, with upstream documentation as well as
     generated rdoc.
    

    On the package libpackage-ruby1.8 line, change architecture to ‘any’ if the package compiles any extensions so your package output will correctly be architecture specific. If the ruby package has no docs, pull that section out.

  12. cd ..
  13. dpkg-buildpackage -rfakeroot
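
The whole process can be sketched end to end like this (‘foo’ and the 1.1.1 version are placeholders; substitute your gem):

```shell
sudo apt-get install libsetup-ruby1.8 ruby-pkg-tools fakeroot dpkg-dev debhelper cdbs ruby1.8-dev

gem unpack foo.gem               # or untar a .tgz
cd foo-1.1.1
cp /usr/lib/ruby/1.8/setup.rb .
mkdir debian
dch --create -v1.1.1-1           # fix the email, pick 'libfoo-ruby', target 'unstable'

# write debian/rules and debian/control as shown above, then build:
dpkg-buildpackage -rfakeroot
```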

That’s about it. Contributing to debian appears difficult. I think you’ve got to know someone who knows someone to get involved. But at least this way you can start building debs.