Monitoring Unicorn connections with munin

Unicorn doesn’t have any monitoring hooks. Typically folks either put nginx in front and monitor response time, do some backlog magic and track errors, or make guesses based on other available information. I’ve been using a modified version of the unicorn_status munin plugin for a while. It samples the CPU time of each worker process and considers that worker idle if the value hasn’t changed after sleeping for a second. This doesn’t pan out under load. Still, here it is.

#!/usr/bin/env ruby
#
# unicorn_status - A munin plugin for Linux to monitor unicorn processes
#
#  Copyright (C) 2010 Shinji Furuya - shinji.furuya@gmail.com
#  Copyright (C) 2010 Opscode, Inc. - Bryan McLellan <btm@loftninjas.org>
#    - Specify pid file via environment variable
#    - Do not assume process names
#  Licensed under the MIT license:
#  http://www.opensource.org/licenses/mit-license.php
#

module Munin
  class UnicornStatus

    def initialize
      @pid_file = ENV['UNICORN_PID']
    end

    def master_pid
      File.read(@pid_file).to_i
    end

    def worker_pids
      result = []
      ps_output = `ps w --ppid #{master_pid}`
      ps_output.each_line do |line|
        chunks = line.strip.split(/\s+/, 5)
        pid = chunks[0]
        result << pid.to_i if pid =~ /\A\d+\z/
      end
      result
    end

    def worker_count
      worker_pids.size
    end

    def idle_worker_count
      # Snapshot the worker list once so both samples cover the same pids
      # (and ps only runs once).
      pids = worker_pids
      before_cpu = {}
      pids.each do |pid|
        before_cpu[pid] = cpu_time(pid)
      end
      sleep 1
      # A worker whose CPU time did not advance during the sleep counts as idle.
      pids.select { |pid| cpu_time(pid) == before_cpu[pid] }.size
    end

    def cpu_time(pid)
      # Fields 14 and 15 of /proc/<pid>/stat are utime and stime, in clock ticks.
      usr, sys = File.read("/proc/#{pid}/stat").split(/\s+/).values_at(13, 14).collect { |i| i.to_i }
      usr + sys
    end
  end
end

case ARGV[0]
when "autoconf"
  puts "yes"
when "config"
  puts "graph_title Unicorn - Status"
  puts "graph_args -l 0"
  puts "graph_vlabel number of workers"
  puts "graph_category Unicorn"
  puts "total_worker.label total_workers"
  puts "idle_worker.label idle_workers"
else
  m = Munin::UnicornStatus.new
  puts "total_worker.value #{m.worker_count}"
  puts "idle_worker.value #{m.idle_worker_count}"
end

And the configuration file:

$ sudo cat /etc/munin/plugin-conf.d/unicorn
      [unicorn_*]
      user root
      env.UNICORN_PID /etc/sv/opscode-chef/supervise/pid
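
One caveat with the CPU-time lookup in the plugin above: it splits /proc/<pid>/stat on whitespace, but the command name in field 2 is wrapped in parentheses and may itself contain spaces, which shifts every later field. A minimal sketch of a more defensive parse follows. The helper name cpu_ticks and the sample line are made up for illustration and are not part of the original plugin.

```ruby
# Hypothetical helper, not part of the original plugin.
# Returns utime + stime (in clock ticks) from one /proc/<pid>/stat line.
def cpu_ticks(stat_line)
  # The command name (field 2) is wrapped in parentheses and may contain
  # spaces, so only split the text that follows the last ")".
  rest = stat_line.rpartition(")").last.split
  # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
  # and field 15 (stime) is rest[12].
  rest[11].to_i + rest[12].to_i
end

# Made-up sample line for a worker whose process name contains spaces.
sample = "4242 (unicorn worker[0] -c app.rb) S 4000 4000 4000 0 -1 4202496 " \
         "1500 0 0 0 37 5 0 0 20 0 1 0 123456 100000000 2500"
puts cpu_ticks(sample)
```

Splitting after the last closing parenthesis keeps the field offsets stable regardless of what the process is called.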

I wrote another plugin today that uses raindrops to collect information about the active and queued connections. It is interesting how much the number of active connections fluctuates. As a result, active connections don’t produce a stable munin graph, but having the queue depth recorded is pretty useful for tracking down latency issues.

#!/usr/bin/env ruby
#  Copyright: 2011 Opscode, Inc.
#  Author: Bryan McLellan <btm@loftninjas.org>
#
#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at
#
#       http://www.apache.org/licenses/LICENSE-2.0
#
#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.

require 'rubygems'
require 'raindrops'

def collect(port)
  # raindrops requires an array of strings, even if it denies this 
  addr = [ "0.0.0.0:#{port}" ]
  stats = Raindrops::Linux.tcp_listener_stats(addr)

  puts "active.value #{stats[addr[0]].active}"
  puts "queued.value #{stats[addr[0]].queued}"
end

if ARGV[0] == "config"
  puts "graph_title Unicorn - connections"
  puts "graph_args -l 0"
  puts "graph_printf %6.0lf"
  puts "graph_vlabel connections"
  puts "graph_category Unicorn"
  puts "active.label active"
  puts "queued.label queued"
  exit 0
end

if $0 =~ /.*_(\d+)/
  # the munin wildcard format of plugin_value
  port = $1
elsif ARGV.size > 0
  port = ARGV[0]
else
  usage = "Usage: #$0 port or #{$0}_port"
  abort usage
end

collect(port)
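
Since the active count jumps around so much between polls, one possible variation (an assumption, not something the plugin above does) is to take several samples per fetch and report the average active count alongside the peak queue depth. The sketch below keeps the raindrops call in a comment so it runs anywhere; summarize_samples, the sample count and the interval are all hypothetical.

```ruby
# Hypothetical sketch: take several samples and summarise them.
# The block must return [active, queued] for a single sample.
def summarize_samples(samples, interval)
  readings = []
  samples.times do |i|
    readings << yield
    # Pause between samples, but not after the last one.
    sleep interval if interval > 0 && i < samples - 1
  end
  active = readings.map { |a, _| a }
  queued = readings.map { |_, q| q }
  { :active_avg => active.inject(0) { |sum, v| sum + v } / active.size.to_f,
    :queued_max => queued.max }
end

# With raindrops (Linux only), assuming unicorn listens on 0.0.0.0:6880:
#   require 'raindrops'
#   addr = "0.0.0.0:6880"
#   result = summarize_samples(5, 0.2) do
#     s = Raindrops::Linux.tcp_listener_stats([addr])[addr]
#     [s.active, s.queued]
#   end

# Canned readings so the example runs without raindrops.
canned = [[5, 0], [1, 2], [3, 0]]
result = summarize_samples(canned.size, 0) { canned.shift }
puts "active.value #{result[:active_avg]}"
puts "queued.value #{result[:queued_max]}"
```

Reporting the peak rather than the average for the queue is a deliberate choice here: a brief backlog is exactly the thing worth seeing when chasing latency.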

Usage is the same as any wildcard munin plugin.

  1. Install the raindrops gem
  2. Drop the above code in “/usr/share/munin/plugins/unicorn_connections_”
  3. Create a link from “/etc/munin/plugins/unicorn_connections_UNICORNPORT” to the above script
  4. killall -HUP munin-node
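
Step 3 can also be scripted. The following is a rough sketch that creates the port-specific link and shows how the plugin recovers the port from its own name. The helper link_plugin is hypothetical, and the demo runs in a scratch directory rather than under /etc/munin/plugins, which would need root.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: link the wildcard plugin under a port-specific name.
def link_plugin(plugin_path, plugins_dir, port)
  link = File.join(plugins_dir, "#{File.basename(plugin_path)}#{port}")
  FileUtils.ln_sf(plugin_path, link)
  link
end

# Demo in a scratch directory so nothing under /etc is touched.
demo_dir = Dir.mktmpdir
source = File.join(demo_dir, "unicorn_connections_")
FileUtils.touch(source)

link = link_plugin(source, demo_dir, 6880)
is_link = File.symlink?(link)

# Same regex the plugin applies to $0 to find its port.
port = File.basename(link)[/.*_(\d+)/, 1]
puts "#{File.basename(link)} -> port #{port}"

FileUtils.remove_entry(demo_dir)
```

On a real node the arguments would be the plugin under /usr/share/munin/plugins and the /etc/munin/plugins directory, followed by the HUP in step 4.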

Graphs should start showing up in five or ten minutes. You can always test like so:

$ nc localhost 4949
# munin node at unicorn.example.org
fetch unicorn_connections_6880
active.value 5
queued.value 0
.
quit
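
The same check can be done from Ruby instead of nc. This is a minimal sketch under a few assumptions: munin-node is listening on localhost:4949 as in the session above, and the names read_values and fetch_plugin are made up for this example. The reply parser is exercised against a canned response so the code runs without a node.

```ruby
require 'socket'
require 'stringio'

# Read "name.value N" lines from a munin-node reply until the lone "." line.
def read_values(io)
  values = {}
  while (line = io.gets)
    line = line.strip
    break if line == "."
    next if line.empty?
    name, value = line.split(" ", 2)
    values[name.sub(/\.value\z/, "")] = value
  end
  values
end

# Hypothetical wrapper that does what the nc session does by hand.
def fetch_plugin(plugin, host = "localhost", port = 4949)
  TCPSocket.open(host, port) do |sock|
    sock.gets                    # discard the "# munin node at ..." banner
    sock.puts "fetch #{plugin}"
    values = read_values(sock)
    sock.puts "quit"
    values
  end
end

# Parse a canned reply so the example runs without a munin-node.
reply = StringIO.new("active.value 5\nqueued.value 0\n.\n")
parsed = read_values(reply)
puts parsed.inspect
```

Against a live node, fetch_plugin("unicorn_connections_6880") should return the same kind of hash, with the values left as strings.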

Of course, I use Chef and the munin cookbook’s munin_plugin definition, so my application’s cookbook has this additional code:

# required for unicorn_connections_ munin plugin
gem_package "raindrops"

munin_plugin "unicorn_connections_" do
  plugin "unicorn_connections_6880"
  create_file true
end

4 thoughts on “Monitoring Unicorn connections with munin”

  1. Dan

    2014/04/03-10:45:03 CONNECT TCP Peer: “[::ffff:127.0.0.1]:39903” Local: “[::ffff:127.0.0.1]:4949”
    2014/04/03-10:45:04 [4576] Error output from unicorn_memory_status:
    2014/04/03-10:45:04 [4576] /etc/munin/plugins/unicorn_memory_status:40:in `+’: nil can’t be coerced into Fixnum (TypeError)
    2014/04/03-10:45:04 [4576] from /etc/munin/plugins/unicorn_memory_status:40:in `total_memory’
    2014/04/03-10:45:04 [4576] from /etc/munin/plugins/unicorn_memory_status:79:in `’
    2014/04/03-10:45:04 [4576] Service ‘unicorn_memory_status’ exited with status 1/0.

    Looks familiar ? 🙂
