Consumer hardware in enterprise environments

Sunday, 6:55 AM: The CEO of my last startup, which laid off the entire development team over a year ago, calls my cellphone to report a total network outage. He is about to leave on an international trip and needs to get into his email to coordinate it. Too bad they laid off everyone who could help, with prejudice.

I wrapped up my social plans and went in to the data center last night. The workstation in the lab couldn’t reach the domain controllers on another subnet, so it lacked a DHCP address as well. I narrowed it down to the Netgear GSM7324 L3 switch that was being used in the core. It was a purchase of my predecessor there, and I always looked at it funny. It is the only foray into enterprise routing from Netgear that I’ve ever seen, and I didn’t care for it much. Its CLI tried to be a Cisco, but was substantially different, so it only served to confuse people with Cisco experience (the Dell gear is much closer at the UI emulation, for the record).

I drag out a serial console to reveal:

Timebase: 33.000000 MHz, MEM: 132.000000 MHz

NetGear Boot code......

Flash size = 16MB

Testing 128MB SDRAM..............................Pass

Unknown PCI devices.
PCI devices found -
Motorola MPC8245
Select an option. If no selection in 10 seconds then
operational code will start.

1 - Start operational code.
2 - Start Boot Menu.
Select (1, 2):

Operational Code Date: Thu Aug  3 22:43:40 2006
Uncompressing.....

50%                     100%
||||||||||||||||||||||||||||||||||||||||||||||||||
Attaching interface lo0...done

Adding 36274 symbols for standalone.

Unknown box topology

This failure is apparently common for people who take the plunge and try to save some money instead of buying a tried and true piece of equipment for their core. Netgear’s lineup has a few newer, more expensive models. The GSM7324 still sits at the lowest price point for a Netgear L3 switch though, luring in those who think that there’s no tradeoff for the price.

So apart came all the trunks and redundant switch links. There was enough redundancy in the physical cabling to the edge switches that I could switch to access links for each subnet. I chained all the switches back together, like it was when I first started, and set up routing on a [Juniper] Netscreen 50 instead, it being the only alternative. Everything started coming back up as I dug through the network in search of the original static routing entries that I never found the time to upgrade.

How important is network administration? Too important for system administrators to get away with not knowing it. A colleague was recently complaining that he couldn’t get an interviewee to explain why having two physically separate switches is better than having one. I find all of this unfortunate and trying, when I have had to re-architect every network I’ve inherited since moving to Seattle. Sometimes I think we should go back to yelling about the 5-4-3 rule on a soapbox so we’ll at least get a sane switch topology. Hopefully by the time they realize why the 5-4-3 rule doesn’t apply anymore, they’ll have picked up why switch topology is too important to be a matter of just plugging switches together like they’re power strips. Because that’s a great pet peeve too.

2 thoughts on “Consumer hardware in enterprise environments”

  1. Tom H

    Hi Bryan,

    This is hilarious. Jesus.. go cheap on anything.. anything.. except your core network devices. The whole network is useless if you have a Netgear pos that gets broken all the time. Juniper is a much better product for routing than the el cheapo Netgear L3 switch. It seems like IT decision makers never get it.. if you want reliability and quality, you’re going to have to pay for it. I don’t understand why IT decision makers always seem to fail to make good purchasing choices when it comes to budget vs the needs of the network.

    Why on earth would you set up routing on an unreliable L3 switch when the Juniper box is much better suited for this task? Speed I would guess?

  2. btm Post author

    @Tom,

    Yeah. There was an “Executive Security” attitude there. That is, the executives wanted to be told everything was above-average in terms of security, regardless of the reality of the situation. I’ll save my stories about that for some time over a beer.

    The NS50 was just there to be a firewall. It wasn’t doing anything that a Netgear home router wouldn’t do, besides having four interfaces instead of a couple.

    I rebuilt the bits last night. There’s now a Cisco 2801 on the edge, trunking back to one of the Netgear L2 switches, a GSM7248 I think, in the ‘router on a stick’ model. It’s much cleaner, although I’m always forgetting the VLAN commands on Netgear’s fake-Cisco CLIs.

    To configure a VLAN trunk it’s “vlan participation include VLAN” and “vlan tagging VLAN”, plus an option I didn’t use that tells it to accept only inbound frames that have VLAN tags. The second command is the one I always forget. I always assume that when a frame comes in an interface with “vlan pvid VLAN” set, you don’t have to tell the hardware to preserve the tags, but you do.
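
    As a memory aid, the two commands quoted above could be generated per VLAN with a small script. This is only a sketch: the command strings are copied from this comment, the function name trunk_port_commands is made up for illustration, and the interface context the commands would be typed into is left out because it isn’t given here. The 1 to 4094 check reflects the usable 802.1Q VLAN ID range.

```python
def trunk_port_commands(vlans):
    """Return the per-port trunk commands quoted in the comment, one pair per VLAN.

    Hypothetical helper: only the two command strings come from the post;
    everything else here is an illustrative assumption.
    """
    commands = []
    for vlan in vlans:
        # 802.1Q reserves VLAN IDs 0 and 4095, so usable IDs are 1..4094.
        if not 1 <= vlan <= 4094:
            raise ValueError(f"invalid VLAN ID: {vlan}")
        # Make the port a member of the VLAN.
        commands.append(f"vlan participation include {vlan}")
        # The easy one to forget: keep the tag on frames for this VLAN.
        commands.append(f"vlan tagging {vlan}")
    return commands


if __name__ == "__main__":
    # Example VLAN IDs are placeholders, not the ones used on this network.
    for line in trunk_port_commands([10, 20]):
        print(line)
```

    Emitting the commands in pairs makes it harder to include a VLAN on the trunk and then forget to tag it.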

    I don’t know what thinking was involved, but I assume it was something like, “if we use a gigabit L3 switch, data can move between subnets really fast”. Whereas now, data between say ‘business’ and ‘engineering’ has to go through a single gigabit interface to the router, rather than whatever speed the backplane of an L3 switch is.
