While at Haydrian, Adam Logghe sent me this article about startup operations, which was sparked by O’Reilly’s rant on startup secret sauce. Never having built a completely automated Windows operations deployment system, I can only speculate to a degree, but I disagree with the comment about Microsoft having a leg up on open source because their server team works with their operations team.
In many open source environments, the operations team also happens to be the server operating system team; that is, many operations people in open source are contributors. When I started at Widemile we had a plan to kick-start operations. Some of the people here had worked with Adam from HJK in the past. These people are a great example: not only is HJK heavily involved with Puppet, including successful deployments, they also develop open source tools like iClassify to tie into Puppet and Capistrano.
Last night I finished getting over the largest hump for me in our new ops platform. The design is this: servers run as VMware guests, with the hosts running on blades with VLAN trunking. Working with HJK’s help (I highly recommend these guys, just don’t everyone hire them at once, I like having access to them myself) we’ve got a full Puppet deployment, and last night I finished transitioning all of the servers to VLAN trunking. Need another web server? Check Munin for a VMware host with available load, create a new guest (haven’t automated this yet) and do an automated network install. Then push Puppet and iClassify (one command) out, tag the new node in iClassify (a couple of clicks) with its role, and Puppet pushes out all the required software and configs for that server.
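As a rough sketch of the "find a VMware host with available load" step, here is one way it could be scripted, assuming the per-host load figures have already been pulled out of Munin into a plain dictionary. The function name `pick_host`, the `max_load` threshold, and the blade names and numbers are all made up for illustration; none of this is the tooling described above.

```python
def pick_host(host_loads, max_load=0.75):
    """Return the least-loaded host whose load is below max_load, or None.

    host_loads is a hypothetical mapping of host name -> load fraction,
    assumed to have been collected from monitoring beforehand.
    """
    # Keep only hosts that still have headroom for another guest.
    candidates = {host: load for host, load in host_loads.items() if load < max_load}
    if not candidates:
        return None
    # Choose the host with the smallest load value.
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Illustrative values only, not real measurements.
    sample = {"blade01": 0.82, "blade02": 0.35, "blade03": 0.51}
    print(pick_host(sample))
```

Returning `None` when every host is over the threshold makes the "we need another blade" case explicit instead of quietly overloading the least-bad host.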
What else do you get out of this? One of the servers wasn’t working today; I couldn’t get to it on the network. I jumped on the console via the VMware Server GUI and saw one of the interfaces was bridged to the wrong VLAN. Fortunately I can change which /dev/vmnet interface on the host the guest is tied to from the VMware management utility in real time without even rebooting the machine, and everything was fixed.
All the benefits of blades aside, the software solutions used here are wonderful. I’ve implemented a few hacks, like using the VMware Server ‘backdoor’ to identify which host a guest is on and having that become an iClassify attribute automatically, usable in iClassify, Puppet and Capistrano tasks. Now granted, all of this requires a very broad level of experience, but once you get it set up, it’s not much work to maintain. When you’re talking about having piles of servers dropping from the sky, this is what you want already set up, rather than a handful of admins manually doing configurations.
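To show why a guest-to-host attribute is handy, here is a minimal sketch that groups guests by the host they run on, for example to see what is affected before taking a blade down. It assumes the node records have already been exported into plain dictionaries; the `vmware_host` attribute name, the `guests_by_host` function, and the sample hostnames are hypothetical and are not iClassify's actual data model or API.

```python
from collections import defaultdict


def guests_by_host(nodes):
    """Group guest hostnames by their (hypothetical) 'vmware_host' attribute.

    nodes is assumed to be a list of dicts, each with a 'hostname' key and,
    when known, a 'vmware_host' key naming the physical host.
    """
    grouped = defaultdict(list)
    for node in nodes:
        # Guests without the attribute are collected under "unknown".
        grouped[node.get("vmware_host", "unknown")].append(node["hostname"])
    # Sort the names so the output is stable from run to run.
    return {host: sorted(names) for host, names in grouped.items()}


if __name__ == "__main__":
    # Illustrative records only.
    sample = [
        {"hostname": "web02", "vmware_host": "blade01"},
        {"hostname": "web01", "vmware_host": "blade01"},
        {"hostname": "db01", "vmware_host": "blade02"},
        {"hostname": "tmp01"},
    ]
    for host, guests in sorted(guests_by_host(sample).items()):
        print(host, guests)
```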