Brocade supportsave via BNA or DCFM

If you work with Brocade gear you might have come across their management tool, Brocade Network Advisor (BNA). It is a very nifty piece of software which lets you do just about everything you want to do with your fabrics, whether they are FOS or NOS based.

When you work in support you often need logs, and the Brocade switches provide these in two flavours: the useful and the useless (i.e. supportsave and supportshow respectively).

If you need a reference of your configuration and want to do some checking on logs and configuration settings, a supportSHOW is useful. When you work in support you most often need to dig a fair bit deeper, and that's where the supportSAVE comes into play. Basically it is a collection of the entire status of the box, including ASIC dumps, Linux panic dumps, stack-traces etc. The process runs on the CP itself, so as you can imagine it is not just a screen-grab of text.
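
If all you need is that configuration reference, a quick way to grab a supportshow from a management station is to run it over SSH and redirect the output to a file. The address and user below are just examples; if your FOS version does not accept remote command execution this way, log in interactively and let your terminal program log the session:

ssh admin@10.1.1.1 supportshow > switch01_supportshow_$(date +%Y%m%d).txt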

Normally when you collect these you log into the switch, type "supportsave", fill in the blanks, and a directory on your FTP or SSH/SCP server fills up with the files, which you then zip up and send to your vendor.
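
The same collection can be scripted, since supportsave also takes its parameters on the command line. A rough sketch from memory, so check the FOS Command Reference for your release; the user, host and path are placeholders:

switch:admin> supportsave -n -u ftpuser -p ftppassword -h 192.168.10.5 -d /uploads/switch01 -l ftp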

If you have a large fabric, please make sure you collect these files per switch in a separate subfolder and upload them separately!

If you are in the lucky position of having BNA, you can also collect these via that interface. The BNA process will create the subfolders for you and zip everything up for later.

When you look in the zip-file you'll see that all filenames have the following structure:

Supportinfo-date-time\switchname-IPaddress-switchWWN\<original filename>

Obviously, if you extract all this on a Windows box, Windows will create the subfolders for you. Since I do my work on a Linux box, I get one large dump of files with the names depicted above but without the directory structure, so the files from every switch end up in the same folder. This is because Linux (or POSIX in general) allows "\" as a valid character in a file name and uses "/" for folder separation. 🙁

I used to come across this every now and then and it never annoyed me enough to fix it, but lately the use of this collection method (BNA style) seems to be increasing, so I needed something simple to sort it out.

Since Linux can do almost everything, I wrote this little one-liner to create a subfolder per switch and move all files belonging to that switch into it:

#!/bin/bash
# Split the "\"-separated BNA supportsave names into one folder per switch
pushd "${1:-.}" || exit 1
for i in *\\*; do y=$(echo "$i" | awk -F'\\\\' '{print $2}'); x=$(echo "$i" | awk -F'\\\\' '{print $3}'); mkdir -p "$y"; mv "$i" "$y/$x"; done
popd

Symlink this into the Nautilus scripts folder and a right-click in the Nautilus file manager does the trick for me.
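
For reference, this is roughly how I hook it in; the script name and location are just examples, and older GNOME releases use ~/.gnome2/nautilus-scripts instead of the path below:

chmod +x ~/bin/bna-split.sh
ln -s ~/bin/bna-split.sh ~/.local/share/nautilus/scripts/

Select the extracted supportsave folder, right-click, Scripts, and off it goes.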

Voila, lots of time saved. Hope it helps somebody.

Cheers,
Erwin

The first law of the Time Lords | Aussie Storage Blog

A buddy of mine posted this article and it reminded me of the presentation I did for the Melbourne VMUG back in April of this year.

The first law of the Time Lords | Aussie Storage Blog:

If you have ever worked in support (or had the need to check on events in general as an administrator) you know how important it is to have an accurate timestamp. Incorrect clock settings are a nightmare if you want to correlate events that are logged at different times and dates.

When you look at the hyper-scale of virtualised environments you will see that the vertical I/O stack is almost ten layers deep. Let's have a look, from top to bottom, at where you can set the clock.

  1. The application
  2. The VM (virtual machine)
  3. The hypervisor
  4. The network switches
  5. The first-tier storage platform (NAS/iSCSI)
  6. A set of FC switches
  7. The second-tier storage platform
  8. Another set of FC switches
  9. The virtualised storage array

Which in the end might look a bit like this (pardon my drawing skills).

As you can imagine it is hard enough to figure out where an error has occurred, but when all of these layers have different time settings it is virtually impossible to dissect the initial cause.

So what do you set on each of these? That brings us to the question "what is time?". A while ago I watched a video of a presentation by Jordan Sissel (who works full-time on an open-source project called LogStash). One of his slides outlines the differences in timestamp formats:

So besides the different time formats you encounter in the different layers of the infrastructure, imagine what it is like to first get all of these back into a human-readable format and then align them across the entire stack.
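
To give an idea of what you are up against, here is the same instant rendered in a few of the formats you will typically meet across the stack (a quick sketch using GNU date on a Linux box):

date -u +"%Y-%m-%dT%H:%M:%SZ"   # ISO 8601 in UTC, e.g. 2013-06-20T04:15:30Z
date +"%b %e %H:%M:%S"          # classic syslog style, e.g. Jun 20 14:15:30 (no year, no timezone!)
date +%s                        # Unix epoch seconds, e.g. 1371701730

Same moment, three very different strings, and only one of them tells you which timezone it is in.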

While we are not always in a position to modify the time/date format, we can make sure that at least the time setting is correct. To do that, use NTP everywhere and set the correct timezone. This way the clocks in the different layers of the stack, across the entire infrastructure, stay aligned and correct.
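
Every layer has its own knob for this (ESXi has its NTP client settings, FOS switches have tsclockserver and tstimezone, the arrays have their own), but as a minimal sketch for a Linux host in the stack, assuming chrony and systemd and run as root (file and service names differ slightly per distribution):

timedatectl set-timezone Australia/Melbourne                 # pick your real timezone
echo "server 0.au.pool.ntp.org iburst" >> /etc/chrony.conf   # an example NTP source
systemctl restart chronyd
chronyc tracking                                             # verify the clock is actually syncing
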
You will help yourself and your support organisation a great deal.
Thanks,
Erwin van Londen

ViPR – Frankenstorage Revisited | Architecting IT

Fellow blogger and keen dissector of fluff Chris Evans really hit the nail on the head with this one.

ViPR – Frankenstorage Revisited | Architecting IT:

Even after reading the announcement from EMC a couple of times I really struggle to find out what has actually been announced. It looks like they crammed in every existing technology available in the storage world and overlaid that with every other piece of existing technology in the storage world.

I wonder if EMC's market watchers have been living under a rock for the past six years, because the ability to virtualise storage in multiple north- and south-bound manners has been delivered by HDS for a long time. With the introduction of HCP (Hitachi Content Platform) HDS provided multiple access methods (HTTP/NFS/REST) to object-based storage which can store and retrieve information from multiple sources including block, NFS, etc. HNAS added a high-performance file-based platform to that. HDS invented the storage virtualisation stack, and the overall management is a single pane of glass covering all of this, even for hardware from three generations back. The convergence of storage products and protocols, and making them available to hook into cloud platforms like OpenStack, vSphere, EC2 and so on, is a natural evolution. It seems, however, that EMC needs an entire marketing department to obscure the fact that they have a very disparate set of products, which were either designed in-house by different engineering teams that do not talk to each other or, and that has been EMC's preferred way, obtained through acquisitions of foreign technologies which by design never ever talk to other technologies, so as to have a market-differentiating product.

As Chris said, for years EMC have been contemptuous about storage virtualisation and didn't have an answer to what the rest of the industry, most notably HDS and IBM, was doing. The fact that cloud platforms ramped up significantly left them in a position where they would lose out on customers looking at this for their next-generation cloud and storage platforms. The speed with which ViPR has now been brought to market is almost evidence that the quality of the whole is severely less than the sum of its components, which leads you to ask: "Do I really want this?"

ViPR needed a significant project (Bourne) to a.) try and tie all this stuff together from all the EMC product dungeons and b.) ramp up the entire marketing department and create a soup which has been stirred long enough to look massive and new but leaves a very bitter after-taste.

I would go for some nice Japanese sushi instead, which is well balanced, thought through, looks great and prevents the need to take stomach pills.

Cheers,
Erwin

ipSpace.net: FCoE between data centers? Forget it!

Cisco guru and long-time networking expert Ivan Pepelnjak hits the nail on the head with the post below. It is one more addition to my series on why you should stay away from FCoE. FCoE is only good for ONE thing: staying confined within the Cisco UCS platform, where Cisco seems to obtain some benefit from it.

Today I got a question about the option of long-distance replication between enterprise arrays over an FCoE link. If there is ONE thing I would credit the HDS arrays with, it is that they are virtually bulletproof with regard to data replication; 6x6x6 replication scenarios across three data centres, in cascaded and multi-target topologies, are not uncommon. (Yes, read that again and absorb the massive scalability of such environments.)

If, however, you then start to cripple such environments with a Greek trolley from 1000 BC (i.e. FCoE) for getting the traffic back and forth, you're very much out of luck.

Read the below from Ivan and you’ll see why.

ipSpace.net: FCoE between data centers? Forget it!: Was anyone trying to sell you the “wonderful” idea of running FCoE between Data Centers instead of FC-over-DWDM or FCIP? Sounds great … un…

He also refers to a Cisco whitepaper (a must-read if you REALLY need to deploy FCoE in your company) which outlines the technical restrictions from a protocol architecture point of view.

The most important points are that the limitation is there and that Cisco has no plans to solve it. Remember, although I referred to Cisco in this article, all the other vendors such as Brocade and Juniper have the same problem. It's an Ethernet PFC restriction inherent to the PAUSE methodology: the longer the link, the more buffering is needed to absorb the frames still in flight when a PAUSE is sent, and lossless Ethernet simply was not designed with those distances in mind.

So when taking all this into consideration you have a couple of options.

  • Accept that your business continuity is next to nothing, unless the hurricane strikes with a 50-metre diameter (small chance :-))
  • Use native Fibre Channel over dark fibre or a DWDM/CWDM infrastructure
  • Use FCIP to overcome distances beyond 100 km
So ends another rant in the FCoE saga, which keeps stacking up one argument after another for NOT adopting it, and which is getting more and more support across the industry from very respected and experienced networking people.
Cheers,
Erwin

Surprise: Cisco is back in the FC market

The Cisco MDS 9506, 9509 and 9513 series director-class FC switches have been in it for the long haul. They were brought to market back in the early 2000s (2003 to be exact) and have always run a separate code-base from the rest of the Cisco switching family, i.e. the Catalyst Ethernet switches. The storage and Ethernet sides have always been separate streams, and for good reason. When Cisco opened the can of worms called FCoE a few years back, they needed a platform which could serve both the Ethernet and the Fibre Channel side.

Thus the Nexus was born, a new generation of switches that could do it all. All? Well, no, not really. For storage support you still needed native FC connectivity hooked up to the Nexus, which could then switch it via FCoE to either CNAs or other, remote, FC switches and on to the targets.

It seems the bet on FCoE didn't pay off, as the uptake of the FCoE protocol and the whole convergence bandwagon never took off. (I wonder who predicted that. :-)) This left Cisco in a very nasty situation in the storage market, since the MDS 9500 platforms were a bit at the end of the line from an architectural standpoint. The supervisors, even updated to revision 2a, plus the crossbars could only handle up to 8G FC speeds, and the huge limitation of massive over-subscription on the high port-count modules (marketing called them "host-optimised ports") did not stack up against the requirements that came into play when the server virtualisation train took off with lightning speed. Their biggest competitor Brocade was already shipping 16G switches and directors like hot cakes and Cisco did not really have an answer. The NX5000 was limited in port count and scalability, whilst the NX7000 cost an arm and a leg, so it was only a viable choice for companies with a fair chunk of money that needed to refresh both their storage and their network switching gear at the same time, and that had enough confidence in the converged FCoE stack. Obviously I don't have the exact number of NX7Ks sold, but my take is that the ones that were sold are primarily used in high-density server farms such as UCS (a great platform) with separate storage networks hanging off them via MDS storage switches, and maybe even those are directly connected to the server farms.

So it seems Cisco sat a bit in limbo in the datacentre storage space. It is a fairly profitable market, so throwing in the towel would have been a bitter pill to swallow. Cisco, being the company with one of the biggest treasure chests in Silicon Valley, rolled up its sleeves and cranked out the new box announced a few weeks ago: the all-new MDS 9710, based on new ASIC technology which now also delivers 16G FC speeds.

I must say after going through the data-sheets I’m not overly impressed. Yes, it’s a great box with all the features you need (with a couple of exceptions, see below) but it seems it was rushed to market from an engineering perspective to catch up with what was already available from Brocade. It looks like one or more execs switched into sub-state panic mode and ordered the immediate development of the MDS95xx successor.

Let's have a short look.
I'll compare the MDS 9710 and the Brocade DCX 8510-8, since these are the most aligned from a market and technical perspective.

Both support all the usual FC protocol specs, so no differentiation there. This includes FC-SP for security and FC-SB for Ficon (Ficon support was not yet available at launch, but I think that's just down to qualification; it should be there somewhat later in the year). What does strike me is that no FCoE support is available at launch of the product. Then again, they could save that money and spend it more wisely.

There is not that much difference from a performance perspective. According to the spec sheets the MDS provides around 8.2Tbps of front-end FC bandwidth, as does the DCX. Given the architectural differences you cannot really compare the overall switching capabilities: the MDS switches frames on the crossbar (or "fabric card" as they seem to call it now), whereas the DCX switches at the ASIC level. From a performance standpoint you might gain a benefit from doing it the Brocade way if you can keep all traffic local to an ASIC, however this also imposes limitations with regard to functionality. An example is link aggregation, i.e. a port-channel in Cisco terms and a trunk in Brocade terms. (Do not confuse Cisco "trunking" with the Brocade kind.) Brocade requires all members of a trunk to sit in the same port-group on a single ASIC, whereas with Cisco's method you don't really care: the members can be spread all over the chassis, so if an entire slot fails the port-channel itself keeps going. It seems Cisco has learned its lesson from the over-subscription ratios that hamper the high port-count modules on the 95xx series. The overall back-end switching capability of the 9710 seems more than sufficient to cater for a triple jump in FC generations, and it seems likely the MDS could cater for 40G Ethernet in the not so distant future without blinking an eye. The 100G Ethernet implementation will take a while, so I think this 97xx generation will see out the current and next generations of 32G FC and 40G Ethernet. Since I have no insight into roadmaps I'll direct you to your Cisco representatives for that.
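
To illustrate that difference in practice, here is a rough sketch of how the two are set up; the interface and port numbers are made up and the exact syntax differs per NX-OS and FOS release, so treat it as illustrative only. On the Cisco side the members of a port-channel can live on different line cards:

switch(config)# interface port-channel 10
switch(config)# interface fc1/1
switch(config-if)# channel-group 10 force
switch(config)# interface fc7/1
switch(config-if)# channel-group 10 force

On the Brocade side trunking is enabled per port and the trunk forms automatically, but only between ports in the same ASIC port-group:

switch:admin> portcfgtrunkport 0, 1
switch:admin> portcfgtrunkport 1, 1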

There is another thing bothering me and that is power consumption. The MDS (fully populated) draws a whopping 4615 watts of juice. Compare that to the DCX 8510-8, which needs less than half that amount, and it looks like somebody requested the ability to fry eggs on the box.

On the software side Cisco bumped the major NX-OS level to 6.2 (I don't know what happened to 6.0 and 6.1, so don't ask). Two of the features of NX-OS that I really like are VOQ and the built-in FC analysis with the fcanalyzer tool. VOQ (Virtual Output Queuing) provides a sort of separate ingress buffer per port, so that a lack of credits at the destination port of a frame does not cause back-pressure into the rest of the switch, with all sorts of nastiness as a result.

From a troubleshooting perspective the fcanalyzer tool is invaluable, and for that feature alone I would buy the MDS. It allows you to capture FC frames and store them in PCAP format for analysis with a third-party tool (like Wireshark), or to output them as raw decoded text on the console so you can take a capture in your terminal program. The latter requires that you know exactly how to interpret an FC frame, which can be a bit daunting for many people.
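
For what it is worth, this is roughly what that looks like on the switch itself; the remote address is a placeholder and the options differ a little per NX-OS release, so treat this as a sketch rather than a reference:

switch# fcanalyzer local brief         (one-line summary per captured frame on the console)
switch# fcanalyzer local               (full text decode of each captured frame)
switch# fcanalyzer remote 10.1.1.25    (stream the capture to a station running Wireshark)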

As I mentioned at the beginning, I'm not quite sure yet what to think of the new MDS, as the feature set seems fairly limited with regard to port count and innovation in NX-OS. It looks like they wanted to catch up with Brocade, in which they very much succeeded, but they have come very late to market with 16G FC. I do think this box has been developed with some serious future-proofing built in, so new modules and features can be adopted in a non-disruptive way, and the transition to 32G FC is likely to be much quicker than the jump from 8G to 16G was.

I would also advise Cisco to have a serious look at their hardware design regarding power consumption. It is over the top to need that many watts to keep this number of FC ports going whilst your biggest competitor can do it with less than half.

To conclude, I think Brocade has a serious competitor again, but Cisco really needs to crank up the options with regard to port and feature modules as well as Ficon support.

Cheers,
Erwin