Sunday, 6:55AM: The CEO of my last startup, which laid off the entire development team over a year ago, calls my cellphone to report a total network outage. He’s about to leave on an international trip and needs to get into his email to coordinate it. Too bad they laid off everyone who could help, with prejudice.
I wrapped up my social plans and went in to the data center last night. The workstation in the lab couldn’t reach the domain controllers on another subnet, so it lacked a DHCP address as well. I narrowed it down to the Netgear GSM7324 L3 switch that was being used in the core. A purchase of my predecessor there, I always looked at it funny. It’s the only foray into enterprise routing I’ve ever seen from Netgear, and I didn’t care for it much. Its CLI tried to be a Cisco, but was different enough that it only served to confuse people with Cisco experience (the Dell gear is much closer at the UI emulation, for the record).
I drag out a serial console to reveal:
Timebase: 33.000000 MHz, MEM: 132.000000 MHz
NetGear Boot code......
Flash size = 16MB
Testing 128MB SDRAM..............................Pass
Unknown PCI devices.
PCI devices found - Motorola MPC8245
Select an option. If no selection in 10 seconds then operational code will start.
1 - Start operational code.
2 - Start Boot Menu.
Select (1, 2):
Operational Code Date: Thu Aug 3 22:43:40 2006
Uncompressing.....
50% 100%
||||||||||||||||||||||||||||||||||||||||||||||||||
Attaching interface lo0...done
Adding 36274 symbols for standalone.
Unknown box topology
This is apparently common for people who take the plunge and try to save some money instead of buying a tried-and-true piece of equipment for their core. Their lineup has a few newer, more expensive models. The GSM7324 still sits at the lowest price point for a Netgear L3 switch though, luring in those who think the lower price comes with no tradeoff.
So apart came all the trunks and redundant switch links. There was enough redundancy in the physical cabling to the edge switches that I could switch to access links for each subnet. I chained all the switches back together, like they were when I first started, and set up routing on a [Juniper] Netscreen 50 instead, it being the only alternative. Everything started coming back up as I dug through the network in search of the original static routing entries that I never found the time to upgrade.
How important is network administration? Too important for system administrators to get away with not knowing it. A colleague was recently complaining that he couldn’t get an interviewee to explain why having two physically separate switches is better than having one. I find all of this unfortunate and trying, when I have had to re-architect every network I’ve inherited since moving to Seattle. Sometimes I think we should go back to yelling about the 5-4-3 rule on a soapbox so we’ll at least get a sane switch topology. Hopefully by the time they realize why the 5-4-3 rule doesn’t apply anymore, they’ll have picked up why switch topology is too important to be a matter of just plugging switches together like they’re power strips. Because that’s a great pet peeve too.
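For anyone who never had the 5-4-3 rule yelled at them, here is a rough sketch of it as I remember it from the shared-Ethernet days: between any two hosts in one collision domain, no more than 5 segments, joined by no more than 4 repeaters, with no more than 3 of those segments populated with hosts. The function name and the way a path is represented below are my own invention for illustration, not anything out of a standard or a vendor tool.

```python
def violates_5_4_3(path):
    """Check one host-to-host path against the old 5-4-3 repeater rule.

    `path` is a hypothetical representation: one boolean per segment
    crossed, True if that segment has hosts attached (is "populated").
    Segments are assumed to be joined end to end by one repeater each.
    """
    segments = len(path)
    repeaters = max(segments - 1, 0)  # one repeater between adjacent segments
    populated = sum(1 for has_hosts in path if has_hosts)
    return segments > 5 or repeaters > 4 or populated > 3


# Five segments, four repeaters, three populated: right at the limit.
print(violates_5_4_3([True, False, True, False, True]))
# Four populated segments in a row: over the limit.
print(violates_5_4_3([True, True, True, True]))
```

The rule was about collision timing on repeated segments, which is why it stopped mattering on switched networks. The habit of actually counting hops between any two hosts is the part worth keeping.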