Monthly Archives: January 2009

Building scalable operations infrastructure with OSS

I’m the lead systems administrator at Widemile and run operations here. Sometimes I do other things, but operations is the most interesting. My LinkedIn profile may give you an idea of where I’m coming from, but it misses all those late nights working with OSS because I enjoy it. I usually blog on my own site, but that serves more as a technical journal; this blog will differ by covering what we are up to at Widemile. As a rule, I’m not a developer. Certain facts may convince you otherwise, but I try to stay out of our product. You’ll start to hear from Peter Crossley, our lead software architect, soon enough. Perhaps some other developers after that. I’ll christen this blog with a round-up of how we’re building our infrastructure at Widemile.

Most recently I’ve been in heavy development on Chef and Ohai. We’ve been using puppet for about a year and a half now. Check out my Infrastructure Engineering slide deck for where we were with that a few months ago. I’ve been happy with it except for a few issues that turned out to need major architectural changes to fix. Adam at Opscode has a good post entitled 9 things to like about Chef that outlines some of these differences. There’s a lot of e-drama around Opscode’s decision to write a new product rather than usher changes into puppet. I won’t touch that other than to say that we had problems with puppet that chef fixes.

Almost all of our servers are in configuration management, which means that no one-off work is done on the production servers and all changes are self-documenting. Granted, we’re a small shop and sometimes I’ll do minor debugging on a production server, but any changes do end up in CM.

Our servers are almost all kvm guests under libvirt running on Dell blades. There’s some information about how we got here in a slide deck I made for GSLUG entitled Virtual Infrastructure. Apparently using kvm in production isn’t all that common yet, but since we’re a small shop we’re able to leverage new technology very quickly to make the best of it. With the use of vmbuilder, libvirt, kvm and capistrano, we build out new nodes in a matter of minutes. More importantly, it’s just a couple of commands.

Once Chef is integrated with the libvirt API we expect to be able to further simplify our deployment. The idea behind it is that it will be a ghetto version of Solo, which EY built using Chef. Eventually we’ll pull out capistrano. While it’s nice for interacting with multiple machines at once, it really was written for a different purpose than what we use it for. There will be replacement functionality in Chef shortly.

Learning to cook

The chef satire will never die. Adam posted 9 things to like about chef today, which is an expanded and much better version of my original blog post on chef. AJ had an intermediate post that tried to summarize a lot of controversy and drama. Hopefully that silliness is settling down.

I’ve been coding a lot lately, contributing to both chef and ohai. We’ve been talking about trying to use chef in the NOC at Shmoocon so that next year we can reuse the recipes rather than build the servers again by hand. Most everything runs on borrowed hardware at Shmoocon, so you’re not guaranteed everything is the way you left it a year later. We use FreeBSD for some monitoring at Shmoocon, so I’ve been spending a lot of time getting chef/ohai ready for FreeBSD.

I don’t think I’ve ever contributed to a project to this degree before. Ohloh doesn’t think so either. The last time I can recall really adding code to a project that was more than a couple files was at an ISP in Maine back in the early 00’s. It was called Panax, and there’s the usual pile of silly ISP shop history. It’s funny that while it’s been sucked into an ISP conglomerate the old color scheme has been maintained. We had an in-house system for user/account management, written in Perl. It had a web front end so none of the tech support folks had to log in to any of the systems to add, remove or manage users. Usually I’m just writing glue scripts, like a good SA. Regardless, it’s been fun and I’ve been learning a lot about Ruby and rspec.

An SE at my last job (who subscribes to python and whom I still haven’t convinced that CM will change his life) said going into development would be a natural move as I got bored of SA work. Is it that, or is this a shift in what being an SA will mean? Configuration Management is still young, despite cfengine being out for some time now, and puppet getting a good following. It may take time for the old SAs to retire and the new deal to take hold. I think more and more as people work in shops with CM implemented, they’ll start to find how hard it is to live without it once you’ve had it. I noticed recently that Slashdot lacks any coverage on Configuration Management in the last few years, but I realize Slashdot is mostly fluffy news these days. While Slashdot is still talking about SCO every day, there is of course talk of new technologies in the new mediums.

It will be exciting over the next few months to see people pick up chef. There are a few very helpful individuals in #chef on freenode who want to see this used and are perfectly willing to fix any bugs you find. So give it a shot and let me know what you think.

Replacing munin with ganglia

I’ve been using munin for some time for server trending. It works well out of the box, but it gets really difficult to get it to scale. The poller runs every five minutes and if it doesn’t finish, the next run is simply skipped. As you add more and more data points, this becomes more likely and more common. SNMP is impractical with it (well, you CAN use it) because each poll happens in real time and is slow enough to increase the poller run time significantly.

Adam Jacob at HJK put together a replacement poller called Moonin, but they’ve been busy with chef and it appears to be in maintenance mode (or worse). We currently run Moonin, until we find a better solution. John Allspaw talks everywhere about using Ganglia at Flickr, so I’ve been testing that.

Ganglia definitely lacks the community that munin has, but I like its design much better. It was written for monitoring clusters and supports all sorts of business like using multicast to share traffic data about the cluster. I also like that its interface for exchanging data is XML as opposed to the custom stuff in munin. This makes it easier to share data around. It’s also fast. When you write plugins for it using gmetric, you give the data to the monitoring daemon, gmond, instead of it polling. Then you collect the data from your clusters using gmetad, and eventually display the data with the web front end.
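
To make the push model concrete, here is a minimal sketch of wrapping gmetric from ruby. The helper names and the example metric are my own assumptions for illustration; the only gmetric options used are -n (name), -v (value), -t (type) and -u (units).

```ruby
# Sketch only: gmetric_command and send_metric are hypothetical helpers,
# not part of ganglia. They build and run a gmetric call for one data point.

# Returns the argv array for gmetric.
# -n, -v, -t and -u are gmetric's name, value, type and units options.
def gmetric_command(name, value, type: "uint32", units: nil, gmetric: "/usr/bin/gmetric")
  cmd = [gmetric, "-n", name, "-v", value.to_s, "-t", type]
  cmd += ["-u", units] if units
  cmd
end

# Hands the metric to the local gmond by running gmetric.
# Returns true on success, false on a non-zero exit, and nil if the
# binary could not be executed at all.
def send_metric(name, value, **opts)
  system(*gmetric_command(name, value, **opts))
end

# Example (assumed metric name): send_metric("tomcat threads busy", 12, units: "threads")
```

Passing the arguments as an array avoids shell quoting problems with metric names that contain spaces.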

The lesson I’ve learned so far is that, at least as of 3.1.1, you can only have one cluster per multicast address/port combination. Regardless of the setting in your gmond configuration, all nodes get reported as a part of the cluster that the machine running gmond is in when gmetad contacts it. I’ve had to deal with this by setting each cluster to use a different port. This isn’t a big deal, because I’m using chef so the gmond configuration file is a ruby template anyhow, but I consider it a bug. In the gmetad configuration you then poll a gmond in each cluster (you can poll multiple nodes in each cluster for redundancy) which forms a grid. Each gmetad instance only supports a single grid for now. The point is this is all very scalable.
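
A rough sketch of the port-per-cluster workaround, rendered with plain ERB rather than through chef. The cluster names, the port numbers and the helper names are assumptions made up for the example, and the output is only a fragment of a gmond.conf, not a complete file.

```ruby
require 'erb'

# Hypothetical mapping: one port per cluster so clusters sharing a
# multicast address do not get merged. Names and ports are assumptions.
CLUSTER_PORTS = { "web" => 8649, "db" => 8650 }.freeze

# Fragment of a gmond.conf covering only the cluster name and channels.
GMOND_TEMPLATE = <<~CONF
  cluster {
    name = "<%= cluster %>"
  }
  udp_send_channel {
    mcast_join = <%= mcast %>
    port = <%= port %>
  }
  udp_recv_channel {
    mcast_join = <%= mcast %>
    port = <%= port %>
    bind = <%= mcast %>
  }
  tcp_accept_channel {
    port = <%= port %>
  }
CONF

# Renders the fragment for one cluster; raises KeyError for unknown clusters.
def render_gmond(cluster, mcast = "239.2.11.71")
  port = CLUSTER_PORTS.fetch(cluster)
  ERB.new(GMOND_TEMPLATE).result(binding)
end

# Builds the matching gmetad.conf data_source line. Listing several hosts
# gives gmetad more than one gmond to poll for the same cluster.
def gmetad_data_source(cluster, hosts)
  port = CLUSTER_PORTS.fetch(cluster)
  targets = hosts.map { |host| "#{host}:#{port}" }.join(" ")
  %(data_source "#{cluster}" #{targets})
end
```

With this layout, gmetad_data_source("web", ["web1", "web2"]) yields one line per cluster for gmetad.conf, and each cluster's gmond listens on its own port.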

The bonus of clusters for us is you can group each type of server, say all your front end web servers, into a cluster, and you get aggregate graphs out of the box. They are limited to a couple default metrics like CPU, but it’s nice. In regard to aggregates for other metrics, I don’t know yet if you can do it or how to go about it.

In my first attempt at adding additional metrics, I wrote a ruby script to poll jboss for statistics data, which you can then pass to gmetric using cron. I’m going to dump it here so it’s on the net. If I keep writing these I’ll put them on github or somewhere.


#!/usr/bin/ruby
#
# tomcat-stat - Collects statistics from tomcat via the status interface,
#   and provides the data for use in other scripts
#
# Copyright 2009 Bryan McLellan (btm@loftninjas.org)
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# To use with ganglia add a cron entry such as:
# * * * * * /usr/bin/gmetric -n 'tomcat threads max' -t uint32 -v `/usr/local/bin/tomcat-stat --thread-max`
#
require 'optparse'
require 'net/http'
require 'rexml/document'

include REXML

options = {}
OptionParser.new do |opts|
  options[:host] = "localhost"
  options[:port] = "8080"

  opts.banner = "Usage: tomcat-stat [options]"

  opts.on("-h HOST", "--host HOST", "Host to connect to") { |host| options[:host] = host }
  opts.on("-p PORT", "--port PORT", "Port to connect to") { |port| options[:port] = port }

  opts.separator " "
  opts.separator "Choose one:"
  opts.on("--memory-free", "Return free memory") { |free| options[:memoryfree] = free }
  opts.on("--memory-total", "Return total memory") { |total| options[:memorytotal] = total }
  opts.on("--memory-max", "Return max memory") { |max| options[:memorymax] = max }

  opts.on("--thread-max", "Return max threads") { |max| options[:threadmax] = max }
  opts.on("--thread-count", "Return count threads") { |count| options[:threadcount] = count }
  opts.on("--thread-busy", "Return busy threads") { |busy| options[:threadbusy] = busy }

  opts.on("--request-mtime", "Return max request time") { |mtime| options[:requestmtime] = mtime }
  opts.on("--request-ptime", "Return request processing time") { |ptime| options[:requestptime] = ptime }
  opts.on("--request-count", "Return request count") { |count| options[:requestcount] = count }
  opts.on("--request-error", "Return error count") { |error| options[:requesterror] = error }
  opts.on("--request-received", "Return bytes received") { |received| options[:requestreceived] = received }
  opts.on("--request-sent", "Return bytes sent") { |sent| options[:requestsent] = sent }
end.parse!

# build a url from options
url = "http://#{options[:host]}:#{options[:port]}/status?XML=true"

# retrieve xml document
tomcat_xml = Net::HTTP.get_response(URI.parse(url)).body
doc = REXML::Document.new(tomcat_xml)

puts doc.elements["//jvm/memory"].attributes["total"] if options[:memorytotal]
puts doc.elements["//jvm/memory"].attributes["free"] if options[:memoryfree]
puts doc.elements["//jvm/memory"].attributes["max"] if options[:memorymax]

puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["maxThreads"] if options[:threadmax]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["currentThreadCount"] if options[:threadcount]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["threadInfo"].attributes["currentThreadsBusy"] if options[:threadbusy]

puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["maxTime"] if options[:requestmtime]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["processingTime"] if options[:requestptime]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["requestCount"] if options[:requestcount]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["errorCount"] if options[:requesterror]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["bytesReceived"] if options[:requestreceived]
puts doc.elements["//connector[@name='http-0.0.0.0-#{options[:port]}']"].elements["requestInfo"].attributes["bytesSent"] if options[:requestsent]
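
To try the XPath lookups without a running server, here is a small offline sketch. The sample XML is trimmed and its values are made up; it only models the elements and attributes the script reads. connector_stat is a hypothetical helper, and unlike the script it returns nil instead of raising when no connector matches the port.

```ruby
require 'rexml/document'

# Trimmed, made-up sample shaped like the elements the script reads.
SAMPLE_STATUS = <<~XML
  <status>
    <jvm><memory free="1000" total="4000" max="8000"/></jvm>
    <connector name="http-0.0.0.0-8080">
      <threadInfo maxThreads="200" currentThreadCount="25" currentThreadsBusy="3"/>
      <requestInfo maxTime="950" processingTime="12000" requestCount="480"
                   errorCount="2" bytesReceived="0" bytesSent="123456"/>
    </connector>
  </status>
XML

# Looks up one attribute (e.g. "maxThreads") on a child element
# (e.g. "threadInfo") of the connector listening on the given port.
# Returns the attribute value as a string, or nil if anything is missing.
def connector_stat(xml, port, element, attribute)
  doc = REXML::Document.new(xml)
  connector = doc.elements["//connector[@name='http-0.0.0.0-#{port}']"]
  return nil unless connector
  info = connector.elements[element]
  info && info.attributes[attribute]
end
```

Returning nil for an unknown connector makes it easier to tell a wrong port apart from a broken status page.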

configuration management with chef announced

Chef has been announced. Listen to this podcast at Cloud Cafe. There’s no way around comparing puppet and chef. Sure, they’re both configuration management tools. It’s simplest to put it this way:

We’re replacing puppet with chef.

And why? A little while ago I wrote about problems I’ve been having scaling puppet. Off the top of my head, the biggest issues for me working with puppet have been:

  1. Dependency graphs
  2. Limited capabilities of the language (DSL)
  3. Templates are evaluated on the server

Dependency Graphs

There’s a lot of talk about vertically scaling puppet, but not a lot about horizontally scaling it. I tend to run everything under puppet. People argue that it’s too much work to put single servers in puppet, and you should only use it for systems you intend to clone. I disagree. Puppet recipes are self-documenting. The same people who don’t want to take the time to write puppet recipes for the single services are the people you have to beat with a sucker rod to get to document anything. Sometimes if I don’t have the time to put into fully testing a puppet recipe for a new machine, I’ll at least write the recipe as I’m working to serve both as documentation and a starting point for if/when I come back to it.

The point is that as I scale out in this fashion, puppet more often fails with a dependency problem on one run and is fine on the next. I asked Luke about this at a BoF at OSCON 2008, and he basically told me that he really only focuses on the problems his paid customers have and was anxious to leave and get a beer. That’s fine, I understand it, but since it does nothing to fix my problem it drove me away from the puppet community.

While in theory having puppet do all this work to resolve dependency issues seems fine, it is more complexity and trouble than it is worth. As a systems administrator I know what the dependencies are. As you build a system you simply write your recipe in the same order as the steps you’re taking.

Chef takes this idea and runs with it. Recipes are parsed top to bottom. If a package needs to be installed before a service is started, you simply put the package in the recipe first. This not only makes a lot of sense, it makes dependencies in a complex recipe visually understandable. With puppet you can end up with spaghetti code reminiscent of “goto”, jumping around a number of recipes in an order that’s difficult to understand.
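
The ordering idea can be illustrated with a toy model in plain ruby. This is not Chef’s implementation or its DSL; ToyRecipe, its package and service methods and the apache2 name are all invented for the example. It only shows that declaration order is the run order.

```ruby
# Toy model, NOT Chef: resources are recorded in the order they are
# declared and planned in that same order.
class ToyRecipe
  # Action each hypothetical resource type performs when applied.
  ACTIONS = { package: "install", service: "start" }.freeze

  attr_reader :resources

  def initialize(&block)
    @resources = []
    instance_eval(&block) # evaluate the recipe body top to bottom
  end

  def package(name)
    @resources << [:package, name]
  end

  def service(name)
    @resources << [:service, name]
  end

  # Returns the steps in the order they would run.
  def plan
    @resources.map { |type, name| "#{ACTIONS.fetch(type)} #{name}" }
  end
end

# The package is declared first, so it is installed before the service starts.
RECIPE = ToyRecipe.new do
  package "apache2"
  service "apache2"
end
```

Swapping the two lines in the block swaps the plan, which is the whole point: there is no graph to resolve, only the order you wrote.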

Language

Before the recent 0.24.6, you could not even do:

if $ram > 1024 {
    $maxclient = 500
}

The support for conditionals was rudimentary. I run into a lot of languages and the biggest problem I have is remembering how to do the same thing in each language. The puppet language does not do what a lot of other languages do. I didn’t need another language to learn, let alone one written from scratch. It was just silly doing something like:

  # iclassify script adds vmware-guest tag based on facter facts
  $is_vmware = tagged('vmware-guest')
  if $is_vmware {
    include vmware
  }

Chef uses ruby for its recipes. This makes the above stupidly simple with something like:

include_recipe "vmware" if node[:manufacturer] =~ /VMware/

Templates

Puppet evaluates recipes and templates on the server. I ended up with this block of code once when I needed to specify the client node’s IP address in a configuration file:

require '/srv/icagent/lib/iclassify'
ic = IClassify::Client.new("https://iclassify", iclassify_user, iclassify_password)
query = "hostname:#{hostname}"
mip = nil
nodes = ic.search(query)
nodes.each do |node|
  # node.attribs is an array of hashes. keys is 'name' value is 'values'
  node.attribs.each do |attrib|
    if attrib[:name].match(/ipaddress/)
      ip = attrib[:values].to_s
      # dots are escaped so they match literally; covers 10.0.0.x and 10.0.1.x
      if ip.match(/10\.0\.[01]\./)
        mip = ip
        break
      end
    end
  end
end

This was so much work. Of course with chef you can easily get this information in the recipe because it’s parsed on the node, let alone the ease of doing it in the template if that’s more appropriate. Since the template’s parsed on the client, you grab the information out of variables or directly from the system.
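
Once the node’s own addresses are available locally, picking the management address takes only a few lines. This is a sketch under assumptions: management_ip is a hypothetical helper, and the list of addresses is assumed to come from the node’s own data rather than from any particular Chef or Ohai attribute.

```ruby
require 'ipaddr'

# 10.0.0.0/23 spans 10.0.0.x and 10.0.1.x, the two ranges matched above.
MANAGEMENT_NET = IPAddr.new("10.0.0.0/23")

# addresses: IPv4 address strings for this node (hypothetical input).
# Returns the first address inside the management network, or nil.
# An unparseable string raises an error from IPAddr.new.
def management_ip(addresses)
  addresses.find { |ip| MANAGEMENT_NET.include?(IPAddr.new(ip)) }
end
```

Comparing against a network instead of a regular expression also avoids accidental matches such as 110.0.0.5.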

As time goes on I’ll surely write more about using chef. We’re using it in production now, and happy with it. In the interim, come to #chef on freenode if you have any questions.