
Replacing munin with ganglia

I've been using munin for some time for server trending. It works well out of the box, but it's really difficult to scale. The poller runs every five minutes, and if a run doesn't finish in time, the next run is simply skipped. As you add more and more data points, skipped runs become more and more common. SNMP is effectively unusable with it (well, you technically can) because polling happens in real time and is slow enough to increase the poller run time significantly.

Adam Jacob at HJK put together a replacement poller called Moonin, but they've been busy with chef and it appears to be in maintenance mode (or worse). We currently run Moonin until we find a better solution. John Allspaw talks everywhere about using Ganglia at flickr, so I've been testing that.

Ganglia definitely lacks the community that munin has, but I like its design much better. It was written for monitoring clusters and supports niceties like using multicast to share metric data within the cluster. I also like that its interface for exchanging data is XML, as opposed to the custom format in munin, which makes it easier to pass data around. It's fast, too. When you write plugins for it using gmetric, you push the data to the monitoring daemon, gmond, instead of having it poll you. Then you collect the data from your clusters using gmetad, and finally display it with the web front end.
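
As a quick example, a cron job or any script can push a metric with a single gmetric call; the metric name and value here are made up:

gmetric --name "app_active_sessions" --value 42 --type uint32 --units sessions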

One lesson I've learned so far is that, at least as of 3.1.1, you can only have one cluster per multicast address/port combination. Regardless of the cluster setting in your gmond configuration, when gmetad contacts a gmond, all nodes get reported as part of the cluster that the machine running that gmond is in. I've dealt with this by giving each cluster its own port. This isn't a big deal, because I'm using chef and the gmond configuration file is a ruby template anyhow, but I consider it a bug. In the gmetad configuration you then poll one gmond in each cluster (you can poll multiple nodes per cluster for redundancy), which forms a grid. Each gmetad instance only supports a single grid for now. The point is that this is all very scalable.
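
To sketch that out, the data_source lines in gmetad.conf end up looking something like this, one per cluster with a distinct port for each and a second host listed for redundancy; the hostnames and ports are made up:

data_source "web frontend" web01.example.com:8650 web02.example.com:8650
data_source "app servers" app01.example.com:8651 app02.example.com:8651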

The bonus of clusters for us is that you can group each type of server, say all your front end web servers, into a cluster and get aggregate graphs out of the box. They're limited to a few default metrics like CPU, but it's nice. As for aggregates of other metrics, I don't know yet whether it's possible or how to go about it.

In my first attempt at adding additional metrics, I wrote a ruby script to poll jboss for statistics, which you can then pass to gmetric from cron. I'm going to dump it here so it's on the net. If I keep writing these I'll put them on github or somewhere.


#!/usr/bin/ruby
#
# tomcat-stat - Collects statistics from tomcat via the status interface,
#   and provides the data for use in other scripts
#
# Copyright 2009 Bryan McLellan (btm@loftninjas.org)
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# To use with ganglia add a cron entry such as:
# * * * * * /usr/bin/gmetric -n 'tomcat threads max' -t uint32 -v `/usr/local/bin/tomcat-stat --thread-max`
#
require 'optparse'
require 'net/http'
require 'rexml/document'

include REXML

options = {}
OptionParser.new do |opts|
  options[:host] = "localhost"
  options[:port] = "8080"

  opts.banner = "Usage: tomcat-stat [options]"

  opts.on("-h HOST", "--host HOST", "Host to connect to") { |host| options[:host] = host }
  opts.on("-p PORT", "--port PORT", "Port to connect to") { |port| options[:port] = port }

  opts.separator " "
  opts.separator "Choose one:"
  opts.on("--memory-free", "Return free memory") { |free| options[:memoryfree] = free }
  opts.on("--memory-total", "Return total memory") { |total| options[:memorytotal] = total }
  opts.on("--memory-max", "Return max memory") { |max| options[:memorymax] = max }

  opts.on("--thread-max", "Return max threads") { |max| options[:threadmax] = max }
  opts.on("--thread-count", "Return count threads") { |count| options[:threadcount] = count }
  opts.on("--thread-busy", "Return busy threads") { |busy| options[:threadbusy] = busy }

  opts.on("--request-mtime", "Return max request time") { |mtime| options[:requestmtime] = mtime }
  opts.on("--request-ptime", "Return request processing time") { |ptime| options[:requestptime] = ptime }
  opts.on("--request-count", "Return request count") { |count| options[:requestcount] = count }
  opts.on("--request-error", "Return error count") { |error| options[:requesterror] = error }
  opts.on("--request-received", "Return bytes received") { |received| options[:requestreceived] = received }
  opts.on("--request-sent", "Return bytes sent") { |sent| options[:requestsent] = sent }
end.parse!
# build a url from options
url = "http://#{options[:host]}:#{options[:port]}/status?XML=true"

# retrieve xml document
tomcat_xml = Net::HTTP.get_response(URI.parse(url)).body
doc = REXML::Document.new(tomcat_xml)

puts doc.elements["//jvm/memory"].attributes["total"] if options[:memorytotal]
puts doc.elements["//jvm/memory"].attributes["free"] if options[:memoryfree]
puts doc.elements["//jvm/memory"].attributes["max"] if options[:memorymax]

puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["maxThreads"] if options[:threadmax]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["currentThreadCount"] if options[:threadcount]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["currentThreadsBusy"] if options[:threadbusy]

puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["maxTime"] if options[:requestmtime]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["processingTime"] if options[:requestptime]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["requestCount"] if options[:requestcount]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["errorCount"] if options[:requesterror]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["bytesReceived"] if options[:requestreceived]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["bytesSent"] if options[:requestsent]

configuration management with chef announced

Chef has been announced. Listen to this podcast at Cloud Cafe. There’s no way around comparing puppet and chef. Sure, they’re both configuration management tools. It’s simplest to put it this way:

We’re replacing puppet with chef.

And why? A little while ago I wrote about problems I’ve been having scaling puppet. Off the top of my head, the biggest issues for me working with puppet have been:

  1. Dependency graphs
  2. Limited capabilities of the language (DSL)
  3. Templates are evaluated on the server

Dependency Graphs

There's talk about vertically scaling puppet, but not a lot about horizontally scaling it. I tend to run everything under puppet. People argue that it's too much work to put single servers in puppet, and that you should only use it for systems you intend to clone. I disagree. Puppet recipes are self-documenting. The same people who don't want to take the time to write puppet recipes for single services are the people you have to beat with a sucker rod to get to document anything. Sometimes, if I don't have the time to put into fully testing a puppet recipe for a new machine, I'll at least write the recipe as I'm working, to serve both as documentation and as a starting point for if/when I come back to it.

The point is that as I scale out in this fashion, puppet more often fails with a dependency problem on one run and is fine on the next. I asked Luke about this at a BoF at OSCON 2008, and he basically told me that he really only focuses on the problems his paid customers have and was anxious to leave and get a beer. That's fine, I understand it, but since it did nothing to fix my problem it drove me away from the puppet community.

While in theory having puppet do all this work to resolve dependency issues seems fine, in practice it is more complexity and trouble than it is worth. As a systems administrator I know what the dependencies are. As you build a system, you simply write your recipe in the same order as the steps you're taking.

Chef takes this idea and runs with it. Recipes are parsed top to bottom. If a package needs to be installed before a service is started, you simply put the package in the recipe first. This not only makes a lot of sense, it makes the dependencies in a complex recipe visually understandable. With puppet you can end up with spaghetti code reminiscent of "goto", jumping around a number of recipes in an order that's difficult to understand.

Language

Before the recent 0.24.6, you could not even do:

if $ram > 1024 {
    $maxclient = 500
}

The support for conditionals was rudimentary. I run into a lot of languages and the biggest problem I have is remembering how to do the same thing in each one. The puppet language does not do what a lot of other languages do. I didn't need another language to learn, let alone one written from scratch. It was just silly doing something like:

  # iclassify script adds vmware-guest tag based on facter facts
  $is_vmware = tagged('vmware-guest')
  if $is_vmware {
    include vmware
  }

Chef uses ruby for its recipes. This makes the above stupidly simple with something like:

include_recipe "vmware" if node[:manufacturer] =~ /VMware/

Templates

Puppet evaluates recipes and templates on the server. I once ended up with this block of code when I needed to specify the client node's IP address in a configuration file:

require '/srv/icagent/lib/iclassify'
ic = IClassify::Client.new("https://iclassify", iclassify_user, iclassify_password)
query = [ "hostname:", hostname].to_s
mip = nil
nodes = ic.search(query)
nodes.each do |node|
  # node.attribs is an array of hashes. keys is 'name' value is 'values'
  node.attribs.each do |attrib|
    if attrib[:name].match(/ipaddress/)
      ip = attrib[:values].to_s
      if ip.match(/10\.0\.0\./)
        mip = ip
        break
      elsif ip.match(/10\.0\.1\./)
        mip = ip
        break
      end
    end
  end
end

This was so much work. Of course with chef you can easily get this information in the recipe, because it's parsed on the node, to say nothing of the ease of doing it in the template if that's more appropriate. Since the template is also rendered on the client, you can grab the information out of variables or directly from the system.

As time goes on I'll surely write more about using chef. We're using it in production now, and happy with it. In the interim, come to #chef on freenode if you have any questions.

Creating user vms with libvirt and kvm

I used virt-manager to create a local vm to build a debian guest. I usually use vm-builder, but it doesn’t support debian at this time.

I was a little confused at first why I could see the vm in virt-manager, but the xml file wasn’t in /etc/libvirt/qemu nor could I see it in virsh.

virt-manager appears to open a connection by default called "localhost (User)", as opposed to "localhost (System)", which you have to reach by opening a new connection to localhost from the menu. The latter is what you connect to when you run virsh. To reach the former, run 'virsh --connect qemu:///session', as opposed to 'virsh --connect qemu:///system', which is the default.
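
So to see the vm I'd created as my own user, something like this works (the guest name is just an example):

virsh --connect qemu:///session list --all
virsh --connect qemu:///session dumpxml debian-guest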

System vm definitions are stored in '/etc/libvirt/qemu'; user vm definitions are stored in '~/.libvirt/qemu'.

No valid PXE rom found for network device

Using virt-manager (libvirt) to build a KVM debian etch guest on ubuntu intrepid via pxe boot produced the error: “No valid PXE rom found for network device”.

Reading LP Bug #193531 showed the need to install the ‘kvm-pxe’ package (sudo apt-get install kvm-pxe).

Then I got "Out of space while reading console startup output", which I haven't solved. I'm probably giving up on trying to fix it by backporting, due to a number of hurdles.

Using an ISO image as an apt repository

I picked up an MSI Wind desktop recently for $140, plus $20 or so for a 2GB SO-DIMM from Frys, and put it together with a SATA hard drive I had kicking around. I didn't want to spend money on a SATA CDROM that I would use just for the install, or bother pulling the one out of my mother's identical PC I had just built for her. I did a PXE network install of Ubuntu 8.10, choosing not to install anything beyond the base system. Then I installed openssh-server and removed all input devices.

To install the ubuntu-desktop virtual package, I wanted to use apt-cdrom so I could use the iso image as a repository rather than download 500MB worth of packages.

sudo mv ubuntu-8.10-desktop-i386.iso /media
sudo mkdir /media/iso
#add  "/media/ubuntu-8.10-desktop-i386.iso /media/iso iso9660 user,loop 0 0" to /etc/fstab
sudo mount -a
sudo apt-cdrom add -d /media/iso -m

This turned out to be completely useless though, as there are only a few debs on the Ubuntu LiveCD. The LiveCD uses Ubiquity to install which just copies the CD to the new partition. I almost always use the alternative installer via PXE booting, so I never noticed this before.

gem fetch errors with Errno::ENOENT

This one isn’t too hard to figure out, but annoying and frustrating. On rubygems <= 1.3 (Including Ubuntu Intrepid’s 1.3.0~RC1really1.2.0):

$ gem fetch erubis
ERROR: While executing gem ... (Errno::ENOENT)
No such file or directory - /var/lib/gems/1.8/cache/erubis-2.6.2.gem

This is a silly permissions problem. Instead, run 'sudo gem fetch erubis'.

See Rubygems bug #21134.

Terminal Services for Remote Administration third connection refused

I keep a Windows Server 2003 virtual machine around as a workstation for administrators. Rather than installing the Support Tools, Exchange Tools, Communicator Tools, etc on all machines we just install them here and connect when we need to use them. A couple of us use Linux on the desktop too, so it makes it easier than maintaining multiple local virtual machines. It’s also great when you’re on the road and need access to these tools.

When Terminal Server is in Remote Administration mode, you’re only allowed two connections. If you disconnect, you can reconnect elsewhere but sometimes you forget. Or two administrators have left themselves logged in and you can’t get in. Normally you can use ‘mstsc /console /v:hostname’ or ‘rdesktop -0 hostname’ to connect to the console (aka session 0), where you can then use the task manager or the terminal services configuration mmc applet to logoff or disconnect a session. For a while now, every third attempt to connect has gotten “ERROR: recv: Connection reset by peer” from rdesktop or a similar error from mstsc which I’ve since lost. Wireshark shows a TCP handshake, the client sending a packet of data, and the server replying with an RST.

I eventually found the solution. If you've ever adjusted the 'Maximum Connections' value on the 'Network Adapter' tab of the 'RDP-Tcp' properties in Terminal Services Configuration, you may have inadvertently changed this setting from its default of 'ffffffff' to '2', which is the maximum value the UI will take. You can set this value back to 'ffffffff' via regedit by editing this key:

HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp\MaxInstanceCount
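
I made the change through regedit, but something along these lines from an administrative command prompt should set the same value; I haven't tested it, so double-check it before trusting it:

reg add "HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp" /v MaxInstanceCount /t REG_DWORD /d 0xffffffff /f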

I believe I had to reboot after doing so.

Removing a certificate from Terminal Services

In the Terminal Services Configuration MMC applet, in the properties for the RDP-tcp connection on the general tab is a certificate entry. Adding a certificate here allows the use of SSL for encryption. In the course of trying to debug a problem with a terminal server not allowing the third connection to the console, useful for disconnecting one of the other two, I wanted to remove this certificate. As usual, I did it the hard way since there’s no ‘Remove’ button. This is all under TS for Remote Administration.

Open up the 'Certificates MMC' applet on the computer, choose the computer store, and under personal certificates delete the server's certificate for 'server authentication'. This may break other things. Reboot. After rebooting I could not TS back into the machine and had to use the console. I opened the TS Configuration applet again and made sure the certificate field said 'None' and that the security layer was set to 'RDP Security Layer'.

To create a new certificate, open the same MMC applet. Click on 'Certificates (Local Computer)', then View -> Options, and select 'Certificate Purpose'. Right click on Server Authentication, All Tasks, Request New Certificate. Once it was installed, I rebooted once more and TS was working again.

OCS 2007 and Communicator Address Book issues

“Type your credentials to access the corporate address book”

There are a number of good articles out there for troubleshooting these issues. UCNoEvil is a good place to start. There's another on the Communicator blog. By following the steps on the latter I confirmed my issues were with authentication, specifically kerberos.

Open the OCS Snapin from Administrative tools, expand ‘Standard Edition Servers’ and click on the pool name. In the right window expand ‘Address Book Server Settings’ and copy the ‘File share location for internal connections’. It will look like “https://server.example.com/Abs/Int/Handler”

Open IIS on your OCS server and expand 'Websites', 'Default Websites', 'Int'. Right click 'Files' and choose Open. Copy the filename and create a url like: https://server.example.com/Abs/Int/Files/D-0b3e-0b3f.dabs

Put that in a web browser and when prompted enter DOMAIN\Username and your password. If you don’t get the option to save the file, such as getting another password prompt or an HTTP 401.1 error, then you have authentication or authorization (permissions) issues.
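
If you'd rather test from a command line, curl can exercise the same URL using NTLM auth; note that this only tests NTLM rather than kerberos, so treat it as a partial check, and the URL and account are of course examples:

curl -v --ntlm -u 'DOMAIN\username' -o test.dabs https://server.example.com/Abs/Int/Files/D-0b3e-0b3f.dabs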

I was seeing errors in the event log sometimes when I’d try to log in 3+ times that looked like:

Event Type:    Error
Event Source:    Kerberos
Event Category:    None
Event ID:    4
Date:        12/17/2008
Time:        10:50:03 PM
User:        N/A
Computer:    SERVER
Description:
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server host/server.example.com.  
The target name used was HTTP/server.example.com. 
This indicates that the password used to encrypt the kerberos service ticket is different than that on the target server. 
Commonly, this is due to identically named  machine accounts in the target realm (EXAMPLE.COM), and the client 
realm.   Please contact your system administrator.

These didn’t always come up though.

You can get a list of domain-wide SPNs with:

ldifde -f spn.txt -l servicePrincipalName -r "(servicePrincipalName=*)"

then view spn.txt with notepad or whatever. ldifde should exist on a domain controller.

You can see what accounts a site uses by right clicking a folder in IIS such as Abs, choosing properties and looking at ‘Application pool’ on the ‘Virtual Directory’ tab. Then right click and choose properties for that application pool under ‘Application Pools’ in IIS. On the identity tab you’ll see the username. Some users found that this user wasn’t created by OCS with Password Never Expires set, and they had to reset the account password in AD and here. The application pool being stopped was a sign of this.

You can download 'setspn.exe' from the Server 2003 Support Tools. Then you can run 'setspn -L RTCComponentService' and see what SPNs are set for that account. I had none set. The 'CWAService' account for Communicator Web Access had 'http/HOSTNAME' and 'http/fqdn'. I set these on the RTCComponentService account with 'setspn -A http/fqdn.example.com DOMAIN\RTCComponentService' and 'setspn -A http/HOSTNAME DOMAIN\RTCComponentService'. Setting these fixed my problems. CWA appears to still work, although you're not supposed to have multiple accounts with the same SPNs. If it comes to it and I have problems I'll probably delete the CWAService account and move its application pools to use the RTCComponentService account. Perhaps it was installed at a later time by another admin, or perhaps this is a bug.
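
Pulled out of the prose above, the sequence was essentially this, with your own domain, hostname, and fqdn substituted:

setspn -L RTCComponentService
setspn -A http/HOSTNAME DOMAIN\RTCComponentService
setspn -A http/fqdn.example.com DOMAIN\RTCComponentService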

Cisco Anyconnect SSL VPN on Ubuntu Intrepid

I finally got the Cisco Anyconnect SSL VPN Client working on Ubuntu Intrepid. There's an error in 2.2.x where the 'vpn' tool says "error: Connection attempt has failed due to server certificate problem." and exits. Running 2.3.x via 'vpnui' you get a pop-up window to accept the certificate, but clicking accept just brings the popup window back up.

I tried getting this working a few times; my last failed attempt appears to have been because I was running the client (which talks to a separate service that runs as root) as root. I figured that out this go around on a separate workstation, and now have 2.2.0140 and 2.3.0185 running on separate amd64 / x86_64 Ubuntu Intrepid workstations.

This should be a pretty accurate log of the steps on the latest attempt.

# downloaded the latest Linux Anyconnect client from http://www.cisco.com
tar -xvzf anyconnect-linux-2.3.0185-k9.tar.gz
cd ciscovpn/
sudo ./vpn_install.sh 

# Downloaded latest firefox from http://www.mozilla.com/en-US/firefox/
sudo tar -xvjf firefox-3.0.5.tar.bz2 -C /usr/local

for lib in libnssutil3.so libplc4.so libplds4.so libnspr4.so libsqlite3.so libnssdbm3.so libfreebl3.so
do sudo ln -s /usr/local/firefox/$lib /opt/cisco/vpn/lib/$lib
done

I didn't bother going back to check, but it looked from the strace output of the 'vpn' utility like it was looking in /lib32 for most libraries, so it sounds like the amount of hackery required may be decreasing.

multipath broken on ubuntu intrepid

After getting open-iscsi going on Ubuntu Intrepid, the next horse was multipath. I did this before on debian etch, but this time it turned into quite an adventure.

First, multipath is simply broken in Intrepid, and this doesn't seem to be considered very important. udev's arguments for /lib/udev/scsi_id changed somewhere around udev 114. Debian patched udev in 125-4; see bug 493075.

So I opened an ubuntu bug against udev, LP 306723. Ubuntu doesn’t want to go down this path (1, 2). So off to look at multipath-tools.

Package: multipath-tools
Maintainer: Ubuntu Core Developers

Last night I tried sending a message to the list, but it was moderated: “Post by non-member to a members-only list”. Today I jumped on #ubuntu-devel, but nobody took responsibility for the package or for that matter responded. So I joined the list, but once again was moderated because: “Post by non-developer to moderated list.” Well, maybe eventually my messages will get passed to the list, but I didn’t want to wait for someone else to say “not my problem”.

So I patched multipath-tools myself, and uploaded it as a PPA. It works for me. Getting a PPA set up was an ordeal of its own.

So until someone at Canonical cares about multipath being broken, use my package. It looks fixed in the multipath head, but it's been sixteen months since the last release of multipath, so I'm not holding my breath waiting for an upstream sync to fix the problem.
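
For reference, using a package out of someone's PPA on Intrepid is just a matter of adding sources.list lines along these lines and updating; the username here is a placeholder for the actual PPA owner:

# add to /etc/apt/sources.list:
deb http://ppa.launchpad.net/USERNAME/ubuntu intrepid main
deb-src http://ppa.launchpad.net/USERNAME/ubuntu intrepid main
# then
sudo apt-get update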

Setting up an Ubuntu launchpad PPA

Ubuntu Personal Package Archives (PPAs) let you upload source packages to be built. There’s help on PPAs which lists these requirements:

  1. learn Ubuntu packaging
  2. install dput – sudo apt-get install dput
  3. have imported your PGP key to your Launchpad account.
  4. become an Ubuntero (i.e. you must sign the Ubuntu Community Code of Conduct)

Becoming an Ubuntero lists:

  1. Log into your Launchpad account and click your name in the top-right corner of the page.
  2. Click Codes of Conduct in the Actions menu in the left-hand column of the screen.
  3. Follow the on-screen instructions to download the most recent Code of Conduct, then sign it using gpg.
    Note: You must import your pgp signature before signing the Code of Conduct.

First, get a GPG key into launchpad. If you don’t have a gpg key, read this. Go to: https://launchpad.net/people/+me/+editpgpkeys

For whatever reason 'gpg --list-keys' doesn't output column headers of any kind, and making it more verbose with a '-v' argument doesn't add much that's useful.

$gpg --list-keys -v
gpg: using PGP trust model
/home/bryanm/.gnupg/pubring.gpg
-------------------------------
pub   1024D/874DF056 2008-04-09
uid                  Bryan McLellan 
sub   2048g/857DA9D9 2008-04-09

The GNU Privacy Handbook discusses what the fields are for. The hexadecimal number after the / is the key-id. You’ll want to upload the pub key to the keyserver like so:

$ gpg --keyserver keyserver.ubuntu.com --send-key 874DF056
gpg: sending key 874DF056 to hkp server keyserver.ubuntu.com

Then get the key fingerprint and put it in launchpad at https://launchpad.net/people/+me/+editpgpkeys:

$ gpg --fingerprint 857DA9D9
pub   1024D/874DF056 2008-04-09
      Key fingerprint = 5056 0995 E4F5 5338 70A6  A0FD FD58 20E3 874D F056
uid                  Bryan McLellan 
sub   2048g/857DA9D9 2008-04-09

If launchpad can match the fingerprint to a key on the ubuntu keyserver, it will email you an encrypted message, with a helpful, albeit late, link to a GPG howto. Since I use Gmail and Firefox, I use FireGPG to decrypt the message. Highlight the encrypted portion of the message, then choose Tools -> FireGPG -> Decrypt. (Or, if you wait a few moments, you'll get a "Decrypt this message" option near the bottom of the message next to reply/forward.) Inside the decrypted message you'll get a link to follow to confirm the key.
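
If you're not using FireGPG, you can just as easily save the encrypted blob to a file and decrypt it on the command line; the filename here is arbitrary:

gpg --decrypt launchpad-message.txt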

Once you have a key set up, follow this link to the Ubuntu Code of Conduct. You can also get there from https://launchpad.net/people/+me/ by scrolling down to the bottom where it says "Ubuntero: No". There's a button after the 'no' that you can click on.

Read it. This entire post was triggered by the issues I'm having with multipath and udev, and by getting to the point where I'm going to have to fix them myself. It's good to be reminded to calm down.

When you choose the 'Sign It' option you'll download the text, sign the file, then upload the signed message. The directions in launchpad at this point are pretty clear. When finished, your user account in launchpad should list you as an Ubuntero.
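
The signing itself is just a clearsign; the filename below is whatever your Code of Conduct download was called, and the command leaves a .asc file next to it that you then submit to launchpad:

gpg --clearsign UbuntuCodeofConduct-1.0.1.txt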

Now go to https://launchpad.net/people/+me/+activate-ppa and read and accept the terms of service to activate your PPA.
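
From there, publishing to the PPA is a matter of building a signed source package and uploading it with dput. The package name and version below are made up, and depending on your dput version you may need a ~/.dput.cf stanza pointing at ppa.launchpad.net rather than the ppa: shorthand:

debuild -S -sa
dput ppa:USERNAME/ppa mypackage_1.0-1ubuntu1~ppa1_source.changes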

open-iscsi on ubuntu intrepid

Some time ago I started playing with iscsi on a Dell MD3000i on debian etch. I found etch was too far behind the times, and moved to lenny. The underlying problem is needing to replace a Netgear / Infrant NAS in production, which was having memory leak problems at the time. I resolved those by not running munin-node on it, and forgot about the project for a while.

Recently I started having a ton of NFS problems with the beast. Apparently someone discovered they got better NFS performance with a single server thread than with the default eight, and Netgear decided that was a good reason to make it the default. Time to pick back up on getting rid of that gear for anything meaningful. It would be a really nice place to mirror ubuntu archives, I think.

Running back through the normal commands led to an error though:

# iscsiadm -m discovery --type sendtargets --portal x.x.x.x -P 1
# iscsiadm -m node -l
# iscsiadm -m session
iscsiadm: Could not get host for sid 1.
iscsiadm: could not get host_no for session 6.
iscsiadm: could not find session info for session1
iscsiadm: Can not get list of active sessions (6)

Turns out it's a bug, LP #289470. This appears to allow a single session, but you can't view its status. Upgrading to a newer version fixes both the session status and the ability to mount multiple sessions again.

I wanted to grab the new package out of jaunty in a sane way. Adding jaunty to the sources.list alone would make apt want to upgrade all of the packages. Downloading the deb from the website and installing it by hand with dpkg wouldn’t handle any possible dependencies.

/etc/apt/preferences:
Package: *
Pin: release a=jaunty
Pin-Priority: 450

Package: *
Pin: release a=intrepid-updates
Pin-Priority: 900

Package: *
Pin: release a=intrepid-proposed
Pin-Priority: 400
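
Note this assumes jaunty is also listed in /etc/apt/sources.list alongside the intrepid entries, followed by refreshing the package lists; adjust the mirror to taste:

# add to /etc/apt/sources.list:
deb http://archive.ubuntu.com/ubuntu jaunty main universe
# then
sudo apt-get update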

Then I ran:

# apt-get install -t jaunty open-iscsi

Which failed with a few errors:

# snip
Setting up open-iscsi (2.0.870.1-0ubuntu1) ...
Installing new version of config file /etc/init.d/open-iscsi ...
Installing new version of config file /etc/iscsi/iscsid.conf ...
update-rc.d: /etc/init.d/remove: file does not exist
 * Starting iSCSI initiator service iscsid  [fail]
 * Setting up iSCSI targets
       iscsiadm: No records found!
       [ OK ]
invoke-rc.d: initscript open-iscsi, action "start" failed.
dpkg: error processing open-iscsi (--configure):
 subprocess post-installation script returned error exit status 1
Errors were encountered while processing:
 open-iscsi
E: Sub-process /usr/bin/dpkg returned an error code (1)

There are a couple of bugs in there. One is a failure to correctly run 'update-rc.d -f open-iscsi remove', LP #306678; the other is that the init script doesn't work so hot in 2.0.865-1ubuntu4, LP #181188 (init script), LP #306693 (upgrade).

After this, the initial commands worked as expected.

Edit to add that automatic login works with:

iscsiadm -m node -T TARGET --op update -n node.startup -v automatic

db_recover for slapd on ubuntu intrepid

First find out what version of bdb (Berkeley DB) slapd is using:
apt-cache show slapd
[snip]
Depends: libc6 (>= 2.4), libdb4.2, libgcrypt11 (>= 1.4.0), libgnutls26 (>= 2.4.0-0), libldap-2.4-2 (= 2.4.11-0ubuntu6), libltdl7 (>= 2.2.4), libperl5.10 (>= 5.10.0), libsasl2-2, libslp1, libtasn1-3 (>= 0.3.4), libwrap0 (>= 7.6-4~), unixodbc (>= 2.2.11-1), zlib1g (>= 1:1.1.4), coreutils (>= 4.5.1-1), psmisc, perl (>> 5.8.0) | libmime-base64-perl, adduser
[snip]

Then install the appropriate version of dbX.Y-util:
apt-get install db4.2-util
The utilities like db_recover and db_verify are actually named db4.2_recover and db4.2_verify. You can see the full list with:
dpkg -L db4.2-util
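
From there, running recovery against slapd's database is just a matter of stopping slapd and pointing db4.2_recover at the database directory; /var/lib/ldap is the default location on Ubuntu and slapd runs as the openldap user, so adjust both if yours differ:

sudo /etc/init.d/slapd stop
sudo db4.2_recover -h /var/lib/ldap
# make sure the files are still owned by the user slapd runs as
sudo chown -R openldap:openldap /var/lib/ldap
sudo /etc/init.d/slapd start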

rubygems server on Ubuntu Intrepid 8.10

Serving gems locally isn't too hard these days.

sudo apt-get install rubygems

Populate /etc/init.d/gem-server with:

#!/bin/sh

### BEGIN INIT INFO
# Provides:          gem-server
# Required-Start:    $network $local_fs $remote_fs
# Required-Stop:     $network $local_fs $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: start local gem server
### END INIT INFO

PATH=/sbin:/bin:/usr/sbin:/usr/bin

DAEMON="/usr/bin/gem"
OPTIONS="server --daemon"

# clear conflicting settings from the environment
unset TMPDIR

# See if the daemon is there
test -x $DAEMON || exit 0

. /lib/lsb/init-functions

case "$1" in
	start)
    PID=$(ps ax -o pid,command | grep "gem server" | grep daemon | awk '{print $1}')
    if test -n "$PID" ; then
		  log_daemon_msg "Ruby Gem server already running : PID $PID" "gem-server"
    else
		  log_daemon_msg "Starting the Ruby Gem server" "gem-server"
		  $DAEMON $OPTIONS
    fi

		log_end_msg $?
		;;

	stop)
    PID=$(ps ax -o pid,command | grep "gem server" | grep daemon | awk '{print $1}')
    if test -n "$PID" ; then
	  	log_daemon_msg "Stopping the Ruby Gem server" "gem-server"
      kill $PID
    else
	  	log_daemon_msg "Ruby Gem server not running" "gem-server"
    fi

		log_end_msg $?
		;;

	restart|force-reload)
		$0 stop && sleep 2 && $0 start
		;;

	*)
		echo "Usage: /etc/init.d/gem-server {start|stop|restart|force-reload}"
		exit 1
		;;
esac

Then use the initscript to start it. If you want to serve gems out of a directory other than the default /var/lib/gems/1.8, add "-d DIRECTORY" to the OPTIONS variable in the init script. Then install locally sourced gems with:

sudo gem install --source http://hostname:8808 gem_package
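
If you'd rather not pass --source every time, you can also add the local server to your gem sources; substitute the real hostname:

gem sources --add http://hostname:8808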