Features and licenses

Ever seen these?

If not, you might just have thrown away one of the most expensive pieces of your purchase. This is the envelope that contains the transaction key, which allows you to generate the license for the specific feature you bought. If you have discarded it, I must say that was not really smart. It’s like throwing away the car keys after buying a brand new vehicle.

DO NOT THROW AWAY THIS ENVELOPE

The transaction key together with the unique switch WWN creates the license key you need. If you have a MyBrocade account (or an OEM customer service engineer) at hand, you can generate the license yourself.

The Unit Unique ID is obtained via the “licenseidshow” command. Fill in the form as shown above and the license will be emailed to you.

Of all the features that Brocade has to offer I would recommend buying the “FabricVision” (what’s in a name) license. This license enables the FLOW and MAPS features, which allow for measuring, monitoring and alerting at the switch level. It replaces the FabricWatch and Advanced Performance Monitor licenses. When used correctly you can build a bulletproof storage network with advanced pre-test options and post-event alerting capabilities.

“Compulsory settings”

Maybe the title of this post is incorrect and should state “Compulsory configuration options”. In this section I will dive into the options you need to set for better reliability and availability, and to improve recovery in scenarios that might impact traffic flow. Many minor problems which could have had a massive impact were prevented by using these options and setting them correctly.

The principal.

I’ve mentioned before it is wise to set a fixed fabric-principal switch. In our case I will use the DCX for this.

“SW02:>fabricprincipal --enable -p 0x01”

I keep the edge switches as backup principals in case some sort of segmentation arises for whatever reason. Especially when this fabric is branched out with additional leaf switches (in blade-centres, for example) we need to stay consistent in having the most powerful switches act as the principal. In effect you create a ringed layer topology where the core switches are configured with the highest priority (the lowest value) and each ring that is branched off gets a lower priority. I would advise using a stepping of 0x10 to keep some flexibility if newer switches are introduced in due course.

In the example I configure the edge switches with the value 0x0A.

“SW01:>fabricprincipal --enable -p 0x0A”

“SW03:>fabricprincipal --enable -p 0x0A”

Even if this fabric becomes meshed and an event happens whereby just the two edge switches are joined, the normal principal selection method will be used. Even though it seems the configured values would cause a collision, they are ignored in that case and the switch with the lowest WWN becomes principal.

Switches located on the far edge of the fabric, like blade-centre switches, need to be excluded from the entire selection process and should be prevented from ever becoming a principal switch.

“SW0X:>fabricprincipal --enable -p 0xFF”
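The stepping model described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the tier numbering (0 for the core, 1 for the first ring of edge switches, and so on) and the rule that the core always gets 0x01 are hypothetical choices of mine, not something FOS prescribes. The generated strings follow the fabricprincipal command form used in this section.

```python
def principal_priority(tier, step=0x10, never_principal=False):
    """Return a fabricprincipal priority for a switch tier (hypothetical scheme).

    Tier 0 is the core and gets 0x01. Every ring further out gets
    tier * step, which leaves room to slot in newer switches later.
    A lower value means a higher priority.
    """
    if never_principal:
        return 0xFF  # value used in this post to exclude a switch
    if tier == 0:
        return 0x01
    value = tier * step
    if value >= 0xFF:
        raise ValueError("tier too deep for this step size")
    return value


def principal_command(tier, **kwargs):
    """Build the command string for one switch."""
    return "fabricprincipal --enable -p 0x%02X" % principal_priority(tier, **kwargs)


if __name__ == "__main__":
    print(principal_command(0))                        # core switch
    print(principal_command(1))                        # first ring of edge switches
    print(principal_command(2, never_principal=True))  # blade-centre switch
```

With the default step the first edge ring ends up at 0x10 rather than the 0x0A used in my example; pass step=0x0A if you want to reproduce that value.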

Time and Date settings.

I’ve spent hours digging through logs where the time and date were not set correctly on each device. It takes a huge amount of extra time to align these events and figure out what’s going on after, or even during, a catastrophic event. Do yourself and your support organisation a favor and set the correct time, date and timezone.

For keeping everything in sync use NTP as I described over here. Also set the correct timezone. The commands to use are:

“tsclockserver "<ipaddress>[;<ipaddress>]"” (multiple NTP servers are separated by semicolons inside double quotes.)

“tstimezone --interactive” (this provides you with a selection list of timezones.)

SupportFTP

This command is far underutilized but is an enormous convenience when you need to collect supportsave data and core files. When set correctly, the values provided here are used by the supportsave and tracedump commands, so you don’t have to remember the FTP server, credentials and folder locations.

In our case we set this to:

SW01:>supportftp -s -h 10.10.10.11 -u <ftpserver-username> -p <that very long password> -d "/uploads/Fabric-1/SW01" -l (s)ftp

This is done for each switch and obviously the “upload-folder” correlates to each specific switch name.
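Because the command only differs in the upload folder, it is easy to generate it per switch. The following is a minimal sketch, assuming the folder layout /uploads/<fabric>/<switch> from my example; the function name and its parameters are hypothetical, and only the flags already shown above are used.

```python
def supportftp_command(switch, host, user, password,
                       fabric="Fabric-1", protocol="sftp"):
    """Build the supportftp line for one switch (illustrative helper)."""
    if protocol not in ("ftp", "sftp"):
        raise ValueError("this sketch only covers ftp and sftp")
    # The upload folder follows the switch name, as in the example above.
    remote_dir = "/uploads/{}/{}".format(fabric, switch)
    return 'supportftp -s -h {} -u {} -p {} -d "{}" -l {}'.format(
        host, user, password, remote_dir, protocol)


if __name__ == "__main__":
    for name in ("SW01", "SW02", "SW03"):
        print(name + ":>" + supportftp_command(
            name, "10.10.10.11", "<ftpserver-username>", "<password>"))
```

The placeholders for username and password are left as they are on purpose; fill them in at the moment you paste the commands rather than storing them in a script.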

Secondly, we want the switch to automatically upload the so-called trace dumps. If a process has a problem and panics, it will create a core dump. The developers need these files for debug analysis and, hopefully, to fix the problem. It would be a shame if they were lost for some reason. So after the upload parameters have been set, enable the automatic transfer of these files with:

“supportftp -e”

The last setting is to have the switch check every half day whether the FTP server is still reachable with the specified parameters. The interval is given in hours. Enable this with:

“supportftp -t 12”

That’s it.

Buffer Credit Recovery.

I’ve devoted numerous articles to buffer-to-buffer flow control, so I’m not going into details here.

In fairly recent FOS levels Brocade has built a few options into the code which check for and, if needed, recover from lost buffer credits. These need to be turned on and I advise you to do so.

In FOS 7.2 and higher use the following commands:

“creditrecovmode --cfg onLrOnly”

“creditrecovmode --fe_crdloss on --be_crdloss on --be_losync on”

Bottleneck Monitoring

The “bottleneckmon” command is obsolete since FOS 7.4; use the “creditrecovmode” command for credit-recovery purposes instead. The monitoring part has been moved to MAPS, so use MAPS to send alerts and log events.

DLS/IOD/RTE

Some cryptic acronyms here.

RTE is the routing engine being used. For open systems environments use exchange-based routing:

SW0X:>aptpolicy 3

For FICON-only environments use port-based routing:

SW0X:>aptpolicy 1

If you have an environment with open systems and mainframe connected to the same switch, use the device-based routing policy:

SW0X:>aptpolicy 2

Set this the same on all switches in the fabric.
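The three choices boil down to a small lookup table. As a rough sketch (the environment labels "open", "ficon" and "mixed" and both function names are made up for this illustration), the mapping and the rule that every switch in the fabric gets the same policy could be expressed as:

```python
# Policy numbers as used in this section.
APT_POLICY = {
    "open": 3,   # exchange-based routing, open systems
    "ficon": 1,  # port-based routing, FICON only
    "mixed": 2,  # device-based routing, open systems plus mainframe
}


def aptpolicy_command(environment):
    """Return the aptpolicy command for an environment label."""
    if environment not in APT_POLICY:
        raise ValueError("unknown environment: %r" % environment)
    return "aptpolicy %d" % APT_POLICY[environment]


def fabric_commands(switches, environment):
    """Give every switch in the fabric the same policy command."""
    command = aptpolicy_command(environment)
    return {switch: command for switch in switches}


if __name__ == "__main__":
    for switch, command in fabric_commands(["SW01", "SW02", "SW03"], "open").items():
        print(switch + ":>" + command)
```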

DLS stands for Dynamic Load Sharing. It arranges how frames are routed through a fabric when multiple equal-cost paths exist. Turn this on in each logical switch.

“SW0X:>dlsset --enable -lossless”

Make sure you use the “-lossless” parameter to prevent frames from being dropped in fabric-rerouting scenarios. The “-lossless” parameter cannot be used when XISLs are enabled in the logical switch on FOS codes older than 7.1.

IOD – In Order Delivery

This is a mandatory setting in FICON environments. In open systems environments you need to turn it off.

“SW0X:>iodreset”

Default Zone

In my blog post here I explained what the “defzone” is. Make sure you have this enabled and set to “noaccess”.

“SW0X:>defzone --noaccess”

Frame logging

If you’ve ever been in a support role you know there is nothing more valuable than having a complete FC frame, including the SCSI CDB, when something goes wrong. You’re then able to dissect the frame to see if it correlates to a particular error on a host.

That entry looks a bit like this:

Aug 21 01:57:56 7/44 7/42 
timeout 81 01 ac 00 00 01 aa 00 00 09 00 00 7c 00 00 00 
 00 3c ff ff 00 00 00 00 af b5 a5 f6 00 00 00 00 
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
 00 00 00 00 00 00 00 00 e5 e5 e5 e5 e5 e5 e5 e5

Since FOS 7 Brocade is able to log frames that have been discarded due to a timeout on all 8G (Condor 2 ASIC) platforms. On the 16G platforms (Condor 3 ASICs) this has been extended to also log frames that were discarded due to a routing issue.

On 8G turn this on via “framelog --enable” and on 16G via “framelog --enable -type <timeout|du|unroute>”. (Turn them on individually, i.e. these are 3 commands.)
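To give an idea of what dissecting such a frame looks like, here is a small Python sketch that splits the first 24 bytes of the sample entry above into the standard Fibre Channel frame header fields. It assumes that the logged bytes start at word 0 of the FC header, directly after the start-of-frame delimiter; that is my assumption about the log layout, and the function name is made up for this example.

```python
import struct

# First two rows of hex bytes from the sample framelog entry.
SAMPLE = """
81 01 ac 00 00 01 aa 00 00 09 00 00 7c 00 00 00
00 3c ff ff 00 00 00 00 af b5 a5 f6 00 00 00 00
"""


def decode_fc_header(hex_dump):
    """Split the first 24 bytes of a hex dump into FC header fields."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    if len(raw) < 24:
        raise ValueError("need at least 24 bytes for an FC frame header")
    # The header is six big-endian 32-bit words.
    words = struct.unpack(">6I", raw[:24])
    return {
        "r_ctl": words[0] >> 24,
        "d_id": "%06x" % (words[0] & 0xFFFFFF),
        "cs_ctl": words[1] >> 24,
        "s_id": "%06x" % (words[1] & 0xFFFFFF),
        "type": words[2] >> 24,
        "f_ctl": "%06x" % (words[2] & 0xFFFFFF),
        "seq_id": words[3] >> 24,
        "df_ctl": (words[3] >> 16) & 0xFF,
        "seq_cnt": words[3] & 0xFFFF,
        "ox_id": "%04x" % (words[4] >> 16),
        "rx_id": "%04x" % (words[4] & 0xFFFF),
        "parameter": words[5],
    }


if __name__ == "__main__":
    for field, value in decode_fc_header(SAMPLE).items():
        print(field, value)
```

If the offset assumption holds, the sample decodes to destination 01ac00, source 01aa00, OX_ID 003c and RX_ID ffff, with R_CTL 0x81 and TYPE 0x00, which would make it an ABTS (abort sequence) frame. The source and destination addresses are what you would then match against the host that reported the error.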

EvL Consulting
