Author Archive for btm

Knife one-liners

Knife’s exec sub-command makes it easier to interact with a Chef server from the command line. Let’s assume I’ve created a data bag named cluster as follows:

{
  "id": "www1",
  "cats": "lol",
  "hostname": "www1.example.org"
}
{
  "id": "www2",
  "cats": "lol",
  "hostname": "www2.example.org"
}
{
  "id": "www3",
  "cats": "plz",
  "hostname": "www3.example.org"
}

If I wanted to get a list of hostnames for each data bag item where the value of ‘cats’ is ‘lol’, I would run:

$ knife exec -E "search(:cluster, 'cats:lol').each {|host| puts host[:hostname] }"
www2.example.org
www1.example.org
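
The block passed to -E is ordinary Ruby. As a minimal sketch of the same filter-and-print logic that runs without a Chef server, the example below uses an in-memory array of hashes as a stand-in for the data bag items. The CLUSTER constant and the hostnames_with_cats helper are hypothetical names for illustration only; the real search call returns the items from the Chef server, in whatever order the server chooses.

```ruby
# Stand-in for the three items in the "cluster" data bag (illustrative only;
# knife's search() would fetch these from the Chef server).
CLUSTER = [
  { id: "www1", cats: "lol", hostname: "www1.example.org" },
  { id: "www2", cats: "lol", hostname: "www2.example.org" },
  { id: "www3", cats: "plz", hostname: "www3.example.org" }
].freeze

# Hypothetical helper mimicking search(:cluster, 'cats:lol'): keep the items
# whose :cats attribute matches, then return their hostnames.
def hostnames_with_cats(items, value)
  items.select { |item| item[:cats] == value }.map { |item| item[:hostname] }
end

hostnames_with_cats(CLUSTER, "lol").each { |hostname| puts hostname }
```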

Granted, I could get this data from the search sub-command as well:

$ knife search cluster cats:lol
{
  "start": 0,
  "total": 2,
  "rows": [
    {
      "id": "www2",
      "cats": "lol",
      "hostname": "www2.example.org"
    },
    {
      "id": "www1",
      "cats": "lol",
      "hostname": "www1.example.org"
    }
  ]
}

However, the output of the search sub-command is hard to manipulate. For instance, if I wanted to check the status of ntp on each of these nodes as a “one-off command”, I could run:

$ knife ssh -m \
"`knife exec -E "search(:cluster, 'cats:lol').each {|host| puts host[:hostname] }" | xargs`" \
'/etc/init.d/ntp status'
www1.example.org  * NTP server is running
www2.example.org  * NTP server is running

The quoting can get pretty tricky fast. Instead, if you leave off the -E flag to knife exec, you can pass a script file to knife where you can write clearer scripts, which makes it easier to do more.

# Script contents
$ cat /tmp/knife.exec
targets = Array.new
search(:cluster, 'cats:lol').each do |host|
  targets << host[:hostname]
end
puts targets.join(' ')

# Execute the script
$ knife exec /tmp/knife.exec
www2.example.org www1.example.org

What if you needed to reconcile your hardware support contracts with the systems currently deployed? It is no problem to get a list of hardware with chef and knife.

# Script contents
$ cat /tmp/dell.exec
search(:node, 'dmi_system_manufacturer:Dell*').each do |node|
  puts node[:dmi][:system][:serial_number] + "\t" + node[:fqdn]
end

# Execute the script
$ knife exec /tmp/dell.exec
XJS1NF1 www1.example.org
XJS1NF2 www2.example.org
XJS1NF3 www3.example.org

These are pretty simple examples, but hopefully you can see how easy it is with Chef to use knife scripts to create reports, collect data, and execute one-off commands.

Wrangling 32bit debs on a 64bit system

Typically, directions for downloading an i386 version of a library for an x86_64 system link to a specific deb package and tell you to download it with wget. A new release of that package often breaks the link, so I wanted to document how to do this using apt. Unfortunately, it looks like apt won’t download a single deb if it can’t resolve dependencies, but aptitude will, so we use them together.

I use a separate sources.list here just to speed up the process, as we need to correct apt when we’re finished.

# Download 32bit list files from the mirror specified in /tmp/sources.list
apt-get -o=APT::Architecture="i386" -o=Dir::Etc::sourcelist="/tmp/sources.list" -o=Dir::Etc::sourceparts="/dev/null" update
# Download the single library. Set libstdc++5 to whatever library you want
aptitude -o Apt::Architecture=i386 download libstdc++5
# Return apt's lists to their preconfigured state
apt-get update
# Optionally, install the package
dpkg --force-architecture -i libstdc++5_1%3a3.3.6-20~lucid1_i386.deb

Note that if you install the package, it will overwrite the 64bit version of the library if it is installed. 32bit packages meant for 64bit systems, like the ia32-libs package, install to /lib32 and /usr/lib32 to avoid this. You could also extract the package into a directory of your choice with ‘dpkg -x libstdc++5_1%3a3.3.6-20~lucid1_i386.deb <directory>’ and copy the libraries to where you like, then run ‘ldconfig’. The getlibs tool will try to repack debs more appropriately for you, if you like.

libvirtError: monitor socket did not show up

Sometimes errors don’t float to the top of stacks well.

Our virtualization stack is pretty automated: a custom script uses vmbuilder to create the guest, registers it with libvirt, creates first boot scripts that have it register with a Chef server, and starts the VM. Today we saw the error libvirtError: monitor socket did not show up.: Connection refused, and I commented that my memory contains a lot of libvirt/kvm errors and many resolutions, but the two don’t always stay connected. I checked the libvirt logs in /var/log/libvirt and even ran libvirt with LIBVIRT_DEBUG=1 libvirtd -v. When I tried running kvm by hand using the syntax in the logs, but with the -net options removed from the command line, kvm just printed Aborted. After staring at it for a bit, I noticed that instead of -m 1024, KVM was trying to run with -m 1073741824 (1024^3). This was due to a small conversion bug in our custom script.
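
The script and the bug itself aren't shown here, so the following is only a sketch of the kind of unit conversion involved. It assumes the script tracked guest memory in bytes, while kvm's -m option takes megabytes by default. BYTES_PER_MIB and bytes_to_mib are hypothetical names, not taken from our script.

```ruby
# Number of bytes in one megabyte as kvm counts them (1024 * 1024).
BYTES_PER_MIB = 1024 * 1024

# Hypothetical helper: convert a byte count into the whole number of
# megabytes to pass to kvm's -m option.
def bytes_to_mib(bytes)
  bytes / BYTES_PER_MIB
end

# 1073741824 is the value that ended up on the kvm command line.
puts "-m #{bytes_to_mib(1_073_741_824)}"
```

Passing the raw byte count straight through, without a step like this, produces exactly the -m 1073741824 seen in the logs.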

Amazon EC2 Network Subnets

For a project that exists both in Amazon Web Services EC2 US-EAST-1b and another cloud, I wanted to block network traffic between the two to ensure they didn’t affect each other. I started by doing a whois lookup via ARIN for all of the IP addresses we are currently assigned in EC2, and I ultimately got the same list that I found registered to the AMAZO-4 contact with ARIN, with the exception of AMAZON-AES, which I presume is for Amazon Enterprise Solutions. I couldn’t tell you offhand if the same IP blocks are used in other AWS zones.

Network         CIDR   Netmask          ARIN Name
72.44.32.0      /19    255.255.224.0    AMAZON-EC2-2
67.202.0.0      /18    255.255.192.0    AMAZON-EC2-3
75.101.128.0    /17    255.255.128.0    AMAZON-EC2-4
174.129.0.0     /16    255.255.0.0      AMAZON-EC2-5
204.236.128.0   /17    255.255.128.0    AMAZON-EC2-6
184.72.0.0      /15    255.254.0.0      AMAZON-EC2-7
50.16.0.0       /14    255.252.0.0      AMAZON-EC2-8

Here are the IOS commands:

name 72.44.32.0 EC2-2 description AMAZON-EC2-2
name 67.202.0.0 EC2-3 description AMAZON-EC2-3
name 75.101.128.0 EC2-4 description AMAZON-EC2-4
name 174.129.0.0 EC2-5 description AMAZON-EC2-5
name 204.236.128.0 EC2-6 description AMAZON-EC2-6
name 184.72.0.0 EC2-7 description AMAZON-EC2-7
name 50.16.0.0 EC2-8 description AMAZON-EC2-8
object-group network ec2-us-east
   network-object 174.129.0.0 255.255.0.0
   network-object 184.72.0.0 255.254.0.0
   network-object 204.236.128.0 255.255.128.0
   network-object 50.16.0.0 255.252.0.0
   network-object 67.202.0.0 255.255.192.0
   network-object 72.44.32.0 255.255.224.0
   network-object 75.101.128.0 255.255.128.0
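
Rather than converting each prefix length to a netmask by hand, the network-object lines can be generated. This is a minimal sketch using Ruby's standard ipaddr library; EC2_BLOCKS, netmask_for, and network_object are names I made up for illustration, and the block list is simply the table above.

```ruby
require "ipaddr"

# CIDR blocks copied from the table above.
EC2_BLOCKS = %w[
  72.44.32.0/19
  67.202.0.0/18
  75.101.128.0/17
  174.129.0.0/16
  204.236.128.0/17
  184.72.0.0/15
  50.16.0.0/14
].freeze

# Turn a prefix length into a dotted-quad netmask by masking an
# all-ones address down to that many leading bits.
def netmask_for(prefix_length)
  IPAddr.new("255.255.255.255").mask(prefix_length).to_s
end

# Build one "network-object" line for a CIDR block.
def network_object(cidr)
  network, prefix = cidr.split("/")
  "network-object #{network} #{netmask_for(prefix.to_i)}"
end

puts "object-group network ec2-us-east"
EC2_BLOCKS.each { |cidr| puts "   #{network_object(cidr)}" }
```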

Script hacks: waiting for the internet

Now and then the VMs (kvm, libvirt + vmbuilder) I was cranking out would start up too fast, and the “first boot” script would run before the host got an IP address and had internet access. Since the first thing I was doing was downloading the Rubygems source using wget (to install chef), and since wget lacks a retry for DNS failures, I hacked up this script to wait for the internet a bit.

#!/bin/bash

# Wait for internet to come up (DHCP)
MAXWAIT=60
WAITTIME=0
host production.cf.rubygems.org > /dev/null

# Keep retrying while the lookup fails (any non-zero exit) and we have time left
while [ $? -ne 0 ] && [ "$WAITTIME" -le "$MAXWAIT" ] ; do
  WAITTIME=$(($WAITTIME + 10))
  sleep 10
  echo -n .
  host production.cf.rubygems.org > /dev/null
done

DNS-SD, a printer, and a little luck

DNS SD, also known as Apple’s Bonjour, utilizes DNS as a configuration database for automatic service discovery. For the most part, it appears it’s used by devices more than people. The multicast implementation, or mDNS, is what makes printers automatically show up in OS X when you put them on your network. I recently moved such a printer from a flat network to one where the wired and wireless workstations were on separate subnets. In an attempt to make the printer easy to find, I implemented DNS SD over unicast so OS X laptops in the office could detect the printer with Bonjour.

First, I set the Domain Name to “office.opscode.com” using DHCP, so I would have a nice sandbox to mess around with DNS without breaking anything. Then I created a few DNS records:

OfficejetPro8500.office.opscode.com A 172.28.0.5
lb._dns-sd._udp.office.opscode.com PTR office.opscode.com.
b._dns-sd._udp.office.opscode.com PTR office.opscode.com.
_printer._tcp.office.opscode.com PTR _OfficejetPro8500._pdl-datastream._tcp.office.opscode.com.
_pdl-datastream._tcp.office.opscode.com PTR _OfficejetPro8500._pdl-datastream._tcp.office.opscode.com.
_OfficejetPro8500._pdl-datastream._tcp.office.opscode.com SRV 0 0 9100 OfficejetPro8500.office.opscode.com.
_OfficejetPro8500._pdl-datastream._tcp.office.opscode.com TXT "txtvers=1" "note=Office Entry" "usb_MFG=HP" "usb_MDL=Officejet Pro 8500 A909g" "ty=HP Officejet Pro 8500"

In order, these records do the following:

  1. Specifies the internal IP address of the resource. We use this later in the SRV record.
  2. What domain the client should browse if they haven’t specified one.
  3. What domain a client in this domain should browse.
  4. Define an LPR/LPD printer. LPR is the “Flagship” protocol and “must” be defined (Port 515)
  5. Define a PDL printer, sometimes called raw (Port 9100)
  6. Specify the printer service. The last four fields there are priority, weight, port and host, per RFC 2782.
  7. Provide additional configuration information related to the printer

There isn’t a lot of clear information regarding how you should specify multiple key/value pairs in the TXT field. RFC 1035 specifies that <character-string> is a single length octet followed by that number of characters, that it is treated as binary information, and that it can be up to 256 characters in length (including the length octet). For Microsoft DNS, check out this article. I was using DynInc’s Dynect, and was able to put all the key/value pairs in double quotes in the single input field. If you are using Dynect too, use the “Expert Editor,” which is a menu option under the “Simple Editor”; it makes it a little easier to specify the multi-part hostnames. It sounds like in BIND you put one key/value pair in double quotes per line, with the series wrapped in parentheses.
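
To make the RFC 1035 wording concrete, here is a minimal sketch of how I understand the key/value pairs to be laid out in the TXT record data: each pair becomes its own <character-string>, one length octet followed by that many bytes. TXT_PAIRS and encode_txt are hypothetical names for illustration; your DNS server or provider does this encoding for you.

```ruby
# Key/value pairs from the TXT record above.
TXT_PAIRS = [
  "txtvers=1",
  "note=Office Entry",
  "usb_MFG=HP",
  "usb_MDL=Officejet Pro 8500 A909g",
  "ty=HP Officejet Pro 8500"
].freeze

# Encode each pair as an RFC 1035 <character-string>: a single length
# octet followed by that many bytes. The length has to fit in one octet,
# so a pair can hold at most 255 bytes.
def encode_txt(pairs)
  pairs.map do |pair|
    raise ArgumentError, "pair too long: #{pair}" if pair.bytesize > 255
    [pair.bytesize].pack("C") + pair.b
  end.join
end

rdata = encode_txt(TXT_PAIRS)
puts rdata.bytesize   # total size of the encoded TXT data in bytes
puts rdata.inspect    # shows the length octet in front of each pair
```

This is why the quoting matters: each quoted string in the zone file or input field becomes one length-prefixed string on the wire.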

Dynect wouldn’t let me specify the SRV record without a preceding underscore, which is a shame, because this is what OS X detects as the device name; it was also lower-cased, making it a little difficult to read. You should be able to use spaces in these names, but I wasn’t about to try escaping that. The key/value pairs in the TXT resource record are documented in the Apple Bonjour Printing specification.

  • txtvers / Define what version of this format we are using
  • note / User-readable information about the device, OS X displays this as Location
  • usb_MFG / the Manufacturer name that the USB driver would specify. I made educated guesses at these.
  • usb_MDL / the Model that the USB device would specify. Combined with the last field this will choose the driver for the user.
  • ty / a User-readable name for the device. I had hoped this would be used in the Printer Name field in the GUI, but it wasn’t.

virt-manager keymaps on OS X

I’m not crazy about the lack of a definitive package manager for OS X. I tried for about a day to work with Open Source on OS X, then I built an Ubuntu VM. I’ve been using ssh with X forwarding when I need a graphical interface; OS X has reasonably good built-in support for X11. However, others have found that the keymap and meta keys are broken. While I got a kick out of “After some time I discovered that the number 8 is interpreted as Return,” I did need to log in to a guest to do some debugging.

The accepted solution to making Ctrl+Alt release keyboard focus correctly in the vncviewer spawned by virt-manager is to create a .Xmodmap file in your home directory with this content:

clear Mod1
keycode 66 = Alt_L
keycode 69 = Alt_R
add Mod1 = Alt_L
add Mod1 = Alt_R

I killed the X server by focusing on it and choosing quit, and it seemed to read the .Xmodmap file okay without my needing to restart the entire system.

The workaround for the broken keymap pointed me in the right direction, but I wasn’t happy with the solution. A little digging around the libvirt domain xml reference pointed out that you can add a keymap as an attribute to the vnc element in the domain xml definition. Use ‘virsh edit’ to edit the domain XML and modify the vnc line to add this attribute so it looks like so:

<graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1' keymap='en-us'/>

I destroyed the guest and restarted it and the keyboard worked now without any “8 is now enter” trickery. I’m pretty sure you can choose any keymap from /usr/share/qemu/keymaps. If you use vmbuilder you will want to add this to /etc/vmbuilder/libvirt/libvirtxml.tmpl as well.

Motorola Backflip charging

Nightmare.

Chargers:

AC1) Motorola DC4050US0301 5.1V DC 850MA
AC2) AT&T 03577 5.0V 1000ma
DC1) AT&T USB VPC03578
DC2) AT&T USB + MiniUSB MV302927

Cables:

M1) Motorola SKN6378A
M2) “Motorola” SKN6238A
M3) Monoprice generic microusb

Dead Phone, AC1, M1 OR M2 OR M3
Green light on Phone, OS starts, displays charging battery

Dead Phone, AC2, M1 OR M2 OR M3
Blue light on AC2, Green light on Phone, OS starts
Green light / OS cycle every 15 seconds

Dead Phone, DC1, M1 OR M2 OR M3
White light on DC1, Green light on Phone, OS starts
Green light / OS cycle every 15 seconds

Dead Phone, DC2, M1 OR M2 OR M3
White light on DC2, Green light alternates on/off on Phone

Phone on, AC1, M1 OR M2 OR M3
Green light on, charge symbol in battery on display

Phone on, AC2, M1
Blue light on AC2 for five seconds

Phone on, AC2, M2 OR M3
Blue light on AC2
Green light on, no charge symbol in battery on display

I have an AT&T AC charger at work that I believe works as well as the stock Motorola. The AT&T AC charger here at home, listed above, is a “five star” model that consumes 0W when not charging; I assume that is what the blue light turning off indicates. Hopefully the combinations that keep the green light on the phone on are charging, just very slowly, and are still somewhat useful. More to come.

Munin Aggregation with Multigraph

Six months ago I made a note of the pattern for referring to stacked graph data sources in munin:

load.double.stack one=localhost.localdomain:load.load two=localhost.localdomain:load.load

This syntax evaluates to:
graph.value.stack line=host.domain:plugin.value

I’ve been using multigraph more since then, which is a boon to performance, but it complicates stacked graphs a little. This hurts because it remains very difficult to tell why your graphs are not drawing when you incorrectly reference a data source. To debug, as the munin user (use ’su -l munin’, ’sudo -s -u munin’ or ‘chpst -u munin’) run:
/usr/share/munin/munin-graph --service 'load.double.stack' --debug
Be sure to replace “load.double.stack” with the name of the graph you’re trying to draw.

The munin wiki example for stacked graphs explains data source names as:

snmp_ups_current.inputtotal.sum \
---------------- ---------- ---
        |             |      |
        |             |      `-- The sum mechanism
        |             `--------- One of this virtual plugin's values
        `----------------------- The name of the virtual plugin

ups-5a:snmp_ups_ups-5a_current.inputcurrent \
ups-5b:snmp_ups_ups-5b_current.inputcurrent
------ ----------------------- ------------
   |               |                 |
   |               |                 `------ The "inputcurrent" value from the real plugin
   |               `------------------------ The real plugin's name (symlink)
   `---------------------------------------- The host name from which to seek information

However, with multigraph the name of the plugin’s symlink isn’t necessarily the name of the graph. The trick I found was to connect to the munin node and call the multigraph plugin, looking for the ‘multigraph’ line.

$ nc localhost 4949
# munin node at server.example.org
cap multigraph # tell munin-node that you are multigraph aware
cap multigraph
fetch diskstats # fetch the diskstats multigraph plugin
multigraph diskstats_latency
sdb_avgwait.value 0
multigraph diskstats_latency.sdb
avgwait.value 0
.

I’ve removed a significant portion of the returned data here. Pay attention to the fact that this plugin returned a “diskstats_latency” graph that contains data for all of the disks, as well as individual graphs for each disk, here “diskstats_latency.sdb”. In this example your stacked graph configuration would be:

disk.double.stack \
  one=localhost.localdomain:diskstats_latency.sdb.avgwait \
  two=localhost.localdomain:diskstats_latency.sdb.avgwait
  -1- ----------2---------- -----------3--------- ---4---

(1) The alias and label for this host or data point
(2) The configured node name of the host
(3) The original graph’s name, either the plugin or multigraph name
(4) The value from the plugin/graph

Notice that while the period is used to separate the value from the rest of the field, there may be periods in the rest of the field. Also keep in mind that in the past I have seen dashes in configured graph names become underscores at the end of the day.
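
As a rough sketch of how I read these references, the following splits one into the four parts numbered above. It is not how munin parses them internally, and parse_data_source is a hypothetical helper name. The point it illustrates is that only the last period separates the value from the graph name.

```ruby
# Split a stacked-graph data source such as
# "one=localhost.localdomain:diskstats_latency.sdb.avgwait" into its parts.
def parse_data_source(reference)
  label, source = reference.split("=", 2)   # (1) alias/label
  host, field = source.split(":", 2)        # (2) configured node name
  # Only the LAST period separates the value; any earlier periods
  # belong to the graph name (for example a multigraph sub-graph).
  graph, _separator, value = field.rpartition(".")
  { label: label, host: host, graph: graph, value: value }
end

p parse_data_source("one=localhost.localdomain:diskstats_latency.sdb.avgwait")
```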

Community Cooking

It’s been a year since the Opscode Cookbook site was launched and a recent project got me thinking about what parts of my hopes that I wrote about then have taken effect so far. I recently heard that a major Chef user has switched to Ubuntu from another Linux distribution because that is what most of the cookbooks that Opscode maintains are written for and tested on. Choice of distribution is typically something that is very dear to administrators and somewhere in the world there is a flame war on this topic every second. Consequently, this is huge and I’ve been thinking about it for a while.

It is one thing for a company to choose a distribution based on a software package that is significant to them; in the past I have had to battle against stakeholders that wanted to choose a particular distribution solely on the availability of support. Chef runs on a lot of platforms, but of course some get much more attention than others because that is where the community is. But here we’re seeing a company choose their distribution not because of Chef’s support for it, but for the community support for Cookbooks that run on it. This is clear evidence that what I wrote about is starting to happen.

I’ve been working on a Chef + Cobbler writeup in my spare time. I went out the other day and bought a consumer desktop to use as a libvirt/kvm host for this project. It often tends to be least painful to use cloud resources, but sometimes that which is taken care of for you is too unknown, too much of a “black box,” and you need the deep visibility into your infrastructure that building it yourself provides. There is indication that some have gotten Cobbler running on and deploying Ubuntu, but it doesn’t appear to have taken hold. There’s a launchpad spec claiming there is a package, but I couldn’t find it. Another spec makes it clear that files for debian packaging from upstream are not finished. It is here that I first ran into problems. I couldn’t get the init.d scripts provided in the debian/ directory of the upstream repository to work. They clearly needed some help, and after spending some time on them it became clear that they’re untested templates created by debhelper.

My goal wasn’t to fix these init scripts, I just wanted to get the cobbler server running. Then I remembered that we had a great existing runit cookbook that I was familiar with. The API for the cookbook site has progressed since release. Unfortunately the documentation for the cookbook API is a little behind, but the new functionality has been built into knife, the command line tool that interacts with the Chef server or Opscode platform, as well as multiple cloud providers. From within my personal chef-repo, I ran:

knife cookbook site vendor runit

This downloaded the runit cookbook from the Cookbooks site into a branch in my chef-repo git repository, then merged it into my current branch, allowing me to track changes between my local copy and the upstream copy using git. Then I added a couple templates and a call to the runit_service definition to my cookbook:

templates/default/sv-cobbler-run.erb:

#!/bin/sh
PATH=/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin
exec 2>&1
exec /usr/bin/env cobblerd -F

templates/default/sv-cobbler-log-run.erb:

#!/bin/sh
exec svlogd -tt ./main

recipes/default.rb:

# SNIP
runit_service "cobbler"

And then cobblerd was running under runit. There’s beauty in being able to take something somewhat complex like runit, and make it easy. So easy, that I used it rather than fixing up an init script.

Then I found that cobbler wanted to be called through Apache as a proxy. No problem though, I vendored the apache2 cookbook as well. I spent a few minutes determining that I needed a couple of Apache modules, as the Cobbler instructions are pretty centric to Red Hat and I got the impression that they make assumptions about what that gives you. Then I used the apache2 cookbook to proxy cobbler by adding this to the top of my recipe:

recipes/default.rb:

include_recipe "apache2"
include_recipe "apache2::mod_proxy_http"
include_recipe "apache2::mod_python"

I had some permission problems with mod_proxy, again likely a difference between Red Hat and Ubuntu, but it wasn’t out of my way to ship the Apache config provided by upstream using Chef with a small modification:

<Proxy http://localhost:25151>
    AddDefaultCharset off
    Order deny,allow
    Allow from all
</Proxy>

I’ll write about the Cobbler cookbook more later. You can, of course, follow it on the Cookbook site in the interim. I want to emphasize how I used a single command to grab existing code and leverage it to make it easier for me to do what I was trying to do: get Cobbler running. The Cookbook site combined with the awesome Chef community made this possible. If you haven’t used the “cookbook site” subcommands in knife yet, take a moment to try them out.

$ knife cookbook site --help
Available cookbook site subcommands: (for details, knife SUB-COMMAND --help)

** COOKBOOK SITE COMMANDS **
knife cookbook site vendor COOKBOOK [VERSION] (options)
knife cookbook site show COOKBOOK [VERSION] (options)
knife cookbook site share COOKBOOK CATEGORY (options)
knife cookbook site search QUERY (options)
knife cookbook site list (options)
knife cookbook site download COOKBOOK [VERSION] (options)
knife cookbook site unshare COOKBOOK

Northeast travels

I’ll be traveling in the northeastern US from 10/28 through 11/13. Current plans put me in DC from 10/28 – 11/1. Then I’ll be in Boston on Tuesday 11/2 for the Boston Devops Meetup. On Wednesday 11/3, I will be in NYC for the NYC Devops Meetup at drop.io. Following that, I will be stationed back in my hometown in Maine until returning to Seattle on 11/13. If you’d like to catch up with me about operations with Chef while I am in the region, or have me stop by somewhere and give a presentation, send me an email and we can make arrangements.

Installing powershell on Windows Server 2008 with Chef

Here is a simple example of automation on Windows using Chef. Powershell is our gateway to nearly all parts of modern Windows servers so it is one of the first packages we want to install.

This was tested on EC2 (ami-c3e40daa) running Microsoft Windows Server 2008 Datacenter edition. You will need to start with the Chef installation directions for Windows, with either your own Chef server built out or an Opscode Platform account.

Those familiar with Chef will notice this looks exactly like a recipe for a Linux system, provided that we were installing software without the benefit of a package manager. This recipe simply downloads the installer and then executes it with a switch to perform the installation silently. Also note that I pass the “/norestart” switch as well. MSU is a standalone Microsoft update installer, rather than a plain executable. This installation will trigger hooks normally associated with updates, which also means that it triggers a restart. I hadn’t thought about this the first time I ran this recipe and thought for a moment that my system had crashed. In actuality it performed flawlessly and the system restarted with powershell installed.

The remote file resource (line nineteen) only downloads the installer if it isn’t present or has changed. If Microsoft changes the placement of this package on their website, you would have to update this resource. Alternately, you could place these important files on an internal web server or file server, or place them directly in the cookbook and transfer them to the server using the cookbook file resource.

The execute resource (line twenty-three) only runs the installer if the remote file resource has downloaded the file. The “:nothing” action tells the resource not to run by default, so that the installer isn’t executed on every Chef run. However, the subscribes attribute hooks into the prior remote file resource so that the execute resource will run if that resource performs an action. You can find more information regarding resources in the Chef wiki.

Windows servers no longer need to remain in the land of “as-built” documentation, where a wiki page or word doc specified the individual steps someone ran to build out this particular system, where additional builds took days as we found the time to run all of the software installers by hand. Chef recipes can both build out your systems and simultaneously document the process for you naturally.

More to come, stay tuned.

# Cookbook Name:: powershell
# Recipe:: default
# Author:: Bryan McLellan <btm@loftninjas.org>
#
# Copyright 2010, Opscode, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

remote_file "c:/windows/temp/Windows6.0-KB968930-x86.msu" do
  source "http://download.microsoft.com/download/F/9/E/F9EF6ACB-2BA8-4845-9C10-85FC4A69B207/Windows6.0-KB968930-x86.msu"
end

execute "c:/windows/temp/Windows6.0-KB968930-x86.msu /quiet /norestart" do
  action :nothing
  subscribes :run, resources(:remote_file => "c:/windows/temp/Windows6.0-KB968930-x86.msu")
end

Silent Ruby install on Windows

I dug up unattended Ruby install directions while working on Chef installation directions for Windows. Most of the secrets can be found in the RubyInstaller discussion group, such as here and here.

Grab the RubyInstaller for Windows, then run: rubyinstaller-1.8.7-p302.exe /tasks="assocfiles,modpath" /silent. The tasks option checks (enables) the options to associate .rb files with Ruby and to add the Ruby binary directory to the path. You probably wouldn’t want these if you were installing multiple versions of Ruby.

Joining Opscode

After a brief respite over the next few weeks, I’ll be joining Opscode in September. Not only am I excited about working with such a great group of people, but also for the incredible opportunity of getting to work on problems whose solutions are already beginning to permanently change how we build systems.

After all, we should be solving life’s more difficult problems, not passing our days as a cog in a machine of repetitious activity. When skilled and respected humans become mere automatons of deployment tasks, we’ve slipped into a dismal place. There is boundless room out there for innovation, but we need a dependable platform on which to build.

I’m joining Opscode to help craft this reality. I want to help you find meaning in these tools and in how they will make your life easier. Don’t confuse this end with simply being able to work faster; it’s about working better.

Dependant Paradigms

The Systems Administrator is likely the closest technological trade to skilled manual labor. They troubleshoot complex systems that others take for granted, until they fail, with a deceptive ease. Explaining to another how they had a hunch to look at a certain part of the system is either a retrospective tale of why it made sense, or a sarcastic nod to magic. This tale attempts to work out how one could have deduced the solution, but even if someone assembled a collection of symptoms and solutions into a step-by-step guide, they would not be able to replace the role of a Systems Administrator. Like an automotive mechanic can detect a blown head gasket from the smell of the oil, a Systems Administrator can sense infrastructure issues from how systems are behaving. And like a fondness for a make of automobile, we grow attached to Linux distributions that have treated us well and editors whose dark secrets we can manipulate skillfully.

I once had a student who didn’t understand why we couldn’t repair board-level hardware issues ourselves as easily as replacing a stick of memory, as their uncle was capable of repairing any engine problems by opening up the hood and quite literally “jiggling some wires.” A mystic knowledge exists in both worlds that is challenging to articulate to the layman. It can be difficult enough to explain a single component, but when a part of a system falls over and causes cascading failures in other parts of a system, outsiders are tempted to believe that they’ve just learned a truth about the solution. That is, that when certain symptoms occur, it is always caused by the failure of a particular part and that this part should be restarted to ‘solve’ the problem. Yet, the experienced know that this only resolves the symptoms and the problem still lurks, now with fewer hints as to its origin.

The future is already here – it is just unevenly distributed. — William Gibson

The trouble with paradigm shifts is that they aren’t necessarily direct improvements on existing technology with a clear lineage. Critics ask why the new ways are better than that which they replace, and we struggle to draw the path that led us to this new place of understanding. The struggle is because instead of making a choice at a clear intersection of a path, we stepped through the bushes to another path not as obviously traveled. This alternate path may lead us to the same end, but its course has an entirely different shape.

To further exacerbate the problem, new innovations stand on the shoulders of giants. Some people have been convinced of the merits of leveraging cloud computing on a strictly financial basis, and have missed the tenets of Undifferentiated Heavy Lifting (UHL), where running servers and building networks may not be one’s core business and ultimately a distraction. Some have yet to grasp the concept of treating systems, even built on internal hardware, as disposable, still accustomed to legacy processes of maintaining a system for the lifetime of the hardware.

It is essential to realize that these new technologies are not minor improvements to business as usual. Like the birth of globalization changing business around the world, nursed by the multi-modal shipping container’s head fake as just another way of moving cargo, today’s innovations will surely reshape the face of operations permanently, in substantial and non-incremental ways.