Tag Archives: support

The DCX man has retired.

In 2008 Brocade announced the 8G director-class switches DCX and DCX-4S. It was a rather impactful release of new hardware in which the next-generation FC ASIC, the Condor, saw the light of day. Brocade’s marketing department back then had probably been smoking something, as the introduction of the platform was accompanied by a rather cringy “Marvel-like” superhero called DCX-man. (uhhhhh… shivers….)


Crackdown on FOS support

04/07/2019

Erwin van Londen

If you’ve read my articles over the last decade or so, you’ve seen that I’m keen on maintenance. From both a hardware and a software perspective, a storage environment needs to be kept in tip-top shape at all times.


BNA before 14.4.1 is no longer supported

26/06/2019

Erwin van Londen

Brocade Brocade Technical Storage Networking No Comments

I’ve already mentioned that BNA is End-of-Life and is being replaced by SANNav.

At the time of writing, any BNA version older than 14.4.1 is no longer supported. This basically means that if you have a problem with a BNA release (or any OEM version of it) older than that, support will not look at it.


Brocade “compatible” SFP’s will disrupt your fabric.

09/07/2015

Erwin van Londen

Brocade Config Guide Troubleshooting No Comments

As with all good stuff, there are always people and companies that try to jump on the bandwagon and make some dollars by selling cheap kit as “Original”, “Compatible” or “Vendor equivalent”. Don’t fall into this trap; simply buy original, vendor-branded and vendor-supported equipment. One example is FiberStore.


How to obtain a Brocade SupportSave without BNA or DCFM

24/10/2013

Erwin van Londen

Brocade No Comments

In some of my previous articles here and here I explained how to obtain a supportsave via BNA (Brocade Network Advisor) and/or DCFM (Data Centre Fabric Manager) and which one to grab. But what happens if you don’t have BNA or are not able to manage these switches via BNA? The best way to handle this is to configure the “supportftp” settings.

5-minute initial troubleshooting on Brocade equipment

09/08/2013

Erwin van Londen

Brocade Config Guide Troubleshooting No Comments

Very often I get involved in cases where a massive amount of host logs, array dumps, and FC and IP traces has been collected, easily adding up to many gigabytes of data. This is then accompanied by a very terse problem description such as “I have a problem with my host, can you check?”.
I’m sure the intention of providing us with all the data is good, but the problem is the lack of detail about the issue itself. We do require a detailed explanation of what the problem is, when it occurred and whether it is still ongoing.

There are also things you can do yourself before opening a support ticket. On many occasions you’ll find that the feedback you get from us within 10 minutes results in either the problem being fixed or a simple workaround that reduces its impact. Further troubleshooting can then be done in a somewhat less stressful time frame.

This example provides some bullet points on what you can do on a Brocade platform. (Mainly because many of the problems I see are related to fabric issues and my job is primarily focused on storage networking.)

First of all, take a look at the overall health of the switch:

switchstatusshow
Provides an overview of the general components of the switch. These all need to show up as HEALTHY and not (as shown here) as MARGINAL.

Sydney_ILAB_DCX-4S_LS128:FID128:admin> switchstatusshow
Switch Health Report Report time: 06/20/2013 06:19:17 AM
Switch Name: Sydney_ILAB_DCX-4S_LS128
IP address: 10.XXX.XXX.XXX
SwitchState: MARGINAL
Duration: 214:29

Power supplies monitor MARGINAL
Temperatures monitor HEALTHY
Fans monitor HEALTHY
WWN servers monitor HEALTHY
CP monitor HEALTHY
Blades monitor HEALTHY
Core Blades monitor HEALTHY
Flash monitor HEALTHY
Marginal ports monitor HEALTHY
Faulty ports monitor HEALTHY
Missing SFPs monitor HEALTHY
Error ports monitor HEALTHY

All ports are healthy
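If you save this output to a file (or pipe it straight out of an ssh session), picking out the components that are not HEALTHY is easy to script. What follows is only a minimal sketch, assuming the line layout shown above where every component line ends in "monitor" followed by its state. The function name unhealthy_monitors and the host name in the comment are made up for illustration.

```shell
# List every "... monitor <STATE>" line whose state is not HEALTHY.
# Reads a saved switchstatusshow output on stdin.
# Hypothetical usage: ssh admin@myswitch switchstatusshow | unhealthy_monitors
unhealthy_monitors() {
    awk 'NF >= 3 && $(NF-1) == "monitor" && $NF != "HEALTHY" {
        name = $1
        for (i = 2; i <= NF - 2; i++) name = name " " $i   # component name
        print name ": " $NF
    }'
}
```

Fed with the report above, this prints the power supplies line only, which is exactly the component that needs attention.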

switchshow
Provides a general overview of logical switch status (no physical components) plus a list of ports and their status.

The switchState should always be Online.
The switchDomain should have a unique ID in the fabric.
If zoning is configured it should be in the “ON” state.

As for the ports, all connected and operational ports should show “Online”. If you see a port showing “No_Sync” while it is not disabled, there is likely a cable or SFP/HBA problem.

If you have configured FabricWatch to enable port fencing, you’ll see indications like the one shown below for port 75.

Obviously, for any port to work it should be enabled.

Sydney_ILAB_DCX-4S_LS128:FID128:admin> switchshow
switchName: Sydney_ILAB_DCX-4S_LS128
switchType: 77.3
switchState: Online
switchMode: Native
switchRole: Principal
switchDomain: 143
switchId: fffc8f
switchWwn: 10:00:00:05:1e:52:af:00
zoning: ON (Brocade)
switchBeacon: OFF
FC Router: OFF
Fabric Name: FID 128
Allow XISL Use: OFF
LS Attributes: [FID: 128, Base Switch: No, Default Switch: Yes, Address Mode 0]

Index Slot Port Address Media Speed State Proto
============================================================
0 1 0 8f0000 id 4G Online FC E-Port 10:00:00:05:1e:36:02:bc “BR48000_1_IP146” (downstream)(Trunk master)
1 1 1 8f0100 id N8 Online FC F-Port 50:06:0e:80:06:cf:28:59
2 1 2 8f0200 id N8 Online FC F-Port 50:06:0e:80:06:cf:28:79
3 1 3 8f0300 id N8 Online FC F-Port 50:06:0e:80:06:cf:28:39
4 1 4 8f0400 id 4G No_Sync FC Disabled (Persistent)
5 1 5 8f0500 id N2 Online FC F-Port 50:06:0e:80:14:39:3c:15
6 1 6 8f0600 id 4G No_Sync FC Disabled (Persistent)
7 1 7 8f0700 id 4G No_Sync FC Disabled (Persistent)
8 1 8 8f0800 id N8 Online FC F-Port 50:06:0e:80:13:27:36:30
75 2 11 8f4b00 id N8 No_Sync FC Disabled (FOP Port State Change threshold exceeded)
76 2 12 8f4c00 id N4 No_Light FC Disabled (Persistent)
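The two checks described above (a No_Sync port that is not disabled, and a port that was fenced rather than persistently disabled) can be sketched with a bit of awk. This assumes the director-style column layout of the output above, with Index, Slot, Port, Address, Media, Speed, State and Proto in that order; on a switch without a Slot column the field numbers shift by one. The function name suspect_ports is just for illustration.

```shell
# Flag ports worth a closer look in a saved switchshow output (stdin).
suspect_ports() {
    awk '$1 ~ /^[0-9]+$/ && $4 ~ /^[0-9a-fA-F]+$/ && NF >= 8 {
        rest = ""                                   # everything after Proto
        for (i = 9; i <= NF; i++) rest = rest (rest == "" ? "" : " ") $i
        if ($7 == "No_Sync" && rest !~ /^Disabled/)
            print $2 "/" $3 ": No_Sync while enabled, check cable/SFP/HBA"
        else if (rest ~ /^Disabled/ && rest !~ /^Disabled \(Persistent\)/)
            print $2 "/" $3 ": " rest               # e.g. port fencing
    }'
}
```

On the output above this reports slot 2 port 11 (index 75) with its fencing reason and stays quiet about the persistently disabled ports.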

sfpshow slot/port
One of the most important pieces of a link, irrespective of mode and distance, is the SFP. On newer hardware and software it provides a lot of info on the overall health of the link.

With older FOS code levels there could be a discrepancy between what was displayed in this output and what was actually plugged into the port. The reason was that the SFPs only get polled every now and then for status and update information. If a port was persistently disabled it wasn’t updated at all, so you could plug in another SFP and sfpshow would still display the old info. With FOS 7.0.1 and up this has been corrected, and you can now also see the latest polling time per SFP.

The question we often get is: “What should these values be?”. The answer is “It depends”. As you can imagine, a shortwave 4G SFP requires less current than a longwave 100 km SFP, so in essence the SFP specs should be consulted. As a rule of thumb, signal quality depends on the TX power value minus the loss on the link. The result should be within the RX power specifications of the receiving SFP.

Also check the Current and Voltage of the SFP. If an SFP is broken, the indication is often that it draws no power at all and you’ll see these two drop to zero.

Sydney_ILAB_DCX-4S_LS128:FID128:admin> sfpshow 1/1
Identifier: 3 SFP
Connector: 7 LC
Transceiver: 540c404000000000 2,4,8_Gbps M5,M6 sw Short_dist
Encoding: 1 8B10B
Baud Rate: 85 (units 100 megabaud)
Length 9u: 0 (units km)
Length 9u: 0 (units 100 meters)
Length 50u (OM2): 5 (units 10 meters)
Length 50u (OM3): 0 (units 10 meters)
Length 62.5u:2 (units 10 meters)
Length Cu: 0 (units 1 meter)
Vendor Name: BROCADE
Vendor OUI: 00:05:1e
Vendor PN: 57-1000012-01
Vendor Rev: A
Wavelength: 850 (units nm)
Options: 003a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max: 0
BR Min: 0
Serial No: UAF110480000NYP
Date Code: 101125
DD Type: 0x68
Enh Options: 0xfa
Status/Ctrl: 0x80
Alarm flags[0,1] = 0x5, 0x0
Warn Flags[0,1] = 0x5, 0x0
Alarm Warn
low high low high
Temperature: 25 Centigrade -10 90 -5 85
Current: 6.322 mAmps 1.000 17.000 2.000 14.000
Voltage: 3290.2 mVolts 2900.0 3700.0 3000.0 3600.0
RX Power: -3.2 dBm (476.2uW) 10.0 uW 1258.9 uW 15.8 uW 1000.0 uW
TX Power: -3.3 dBm (472.9 uW) 125.9 uW 631.0 uW 158.5 uW 562.3 uW

State transitions: 1
Last poll time: 06-20-2013 EST Thu 06:48:28
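The rule of thumb for RX and TX power is plain arithmetic: dBm is ten times the base-10 logarithm of the power in milliwatts, and the level arriving at the receiver is the TX power minus the loss on the link. A small sketch of that calculation follows. The function names are made up for illustration and the link-loss figure in the usage note is an assumed value, not something measured on this switch.

```shell
# Convert between dBm and microwatt and compute the margin left on a link.
# awk is used because the shell itself has no floating-point math.
dbm_to_uw() {     # usage: dbm_to_uw -3.3
    awk -v dbm="$1" 'BEGIN { printf "%.1f\n", 1000 * exp(dbm / 10 * log(10)) }'
}
uw_to_dbm() {     # usage: uw_to_dbm 476.2
    awk -v uw="$1" 'BEGIN { printf "%.1f\n", 10 * log(uw / 1000) / log(10) }'
}
link_margin() {   # usage: link_margin <tx_dbm> <link_loss_db> <rx_min_dbm>
    awk -v tx="$1" -v loss="$2" -v rxmin="$3" \
        'BEGIN { printf "%.1f\n", (tx - loss) - rxmin }'
}
```

Taking the values above: the RX low alarm of 10.0 uW is `uw_to_dbm 10`, which gives -20.0 dBm. With the TX power of -3.3 dBm and an assumed 2.5 dB of loss in cable and connectors, `link_margin -3.3 2.5 -20` leaves 14.2 dB before that alarm threshold is reached. For real limits, consult the specs of the SFP in question.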

porterrshow
For link state counters this is the most useful command on the switch. However, there is a perception that this command provides a “silver bullet” to solve port and link issues, and that is not the case. Basically it provides a snapshot of the content of the LESB (Link Error Status Block) of a port at that particular point in time. It does not tell us when these counters accumulated or over which time frame. So in order to create a sensible picture of the status of the ports we need a baseline. This baseline can be created by resetting all counters so they start from zero. To do this, issue the “statsclear” command on the CLI.

There are 7 columns you should pay attention to from a physical perspective.

enc_in – Encoding errors inside frames. These are errors that happen at the FC-1 layer when encoding 8 bits to 10 and back or, with 10G and 16G FC, from 64 bits to 66 and back. Since these happen on bits that are part of a data frame, they are counted in this column.

crc_err – An enc_in error might lead to a CRC error. However, this column shows frames that have been marked as invalid because of a CRC error earlier in the datapath. According to the FC specifications it is up to the implementer whether to discard the frame right away or to mark it as invalid and send it to the destination anyway. There are pros and cons to both scenarios. So basically, if you see crc_err in this column it means the port has received a frame with an incorrect CRC, but the error occurred further upstream.

crc_g_eof – This column is the same as crc_err, except that the incoming frames are NOT marked as invalid. If you see these, most often the enc_in counter increases as well, but not necessarily. If the enc_in and/or enc_out column increases as well, there is a physical link issue which could be resolved by cleaning connectors, replacing a cable or (in rare cases) replacing the SFP and/or HBA. If the enc_in and enc_out columns do NOT increase, there is an issue between the SERDES chip and the SFP which causes the CRC to mismatch the frame. This is a firmware issue which could be resolved by upgrading to the latest FOS code. There are a couple of defects listed to track these.

enc_out – Similar to enc_in, this is the same encoding error, but it occurred outside normal frame boundaries, i.e. no host IO frame was impacted. This may seem harmless, but be aware that a lot of primitive signals and sequences travel in between normal data frames, and these are paramount for Fibre Channel operations. Primitives which regulate credit flow (R_RDY and VC_RDY) and those which signal clock synchronization are especially important. If this column increases on any port you’ll likely run into performance problems sooner or later, or you will see problems with link stability and sync errors (see below).

Link_Fail – This means a port has received a NOS (Not Operational) primitive from the remote side and it needs to change the port operational state to LF1 (Link Fail 1), after which the recovery sequence needs to commence. (See the FC-FS standards specification for that.)

Loss_Sync – Loss of synchronization. The transmitter and receiver side of the link maintain a clock synchronization based on primitive signals which start with a certain bit pattern (K28.5). If the receiver is not able to sync its baud rate to the point where it can distinguish between these primitives, it will lose sync and hence it cannot determine when a data frame starts.

Loss_Sig – Loss of Signal. This column shows a drop of light, i.e. no light (or insufficient RX power) is observed for over 100 ms, after which the port will go into a non-active state. This counter often increases when the link-loss budget is overdrawn. If, for instance, the TX side sends out light at -4 dBm and the lower sensitivity threshold of the receiver is -12 dBm, and the quality of the cable degrades the signal to a value below that threshold, you will see the port bounce very often and this counter increase. Other common culprits are unclean connectors, patch panels and badly made fibre splices. These ports should be shut down immediately and the cabling plant checked. Replacing cables and/or bypassing patch panels is often a quick way to find out where the problem is.

The other columns are more related to protocol issues and/or performance problems, which could be the result of a physical problem but not the cause. In short, look at the 7 columns mentioned above and check that no port shows an increasing value.

============================================
too_short/too_long – Indicates a protocol error where SOF or EOF is observed too soon or too late. These two columns rarely increase.

bad_eof – Bad End-of-Frame. This column indicates an issue where the sender has observed an abnormality in a frame or its transceiver while the frame header and portions of the payload were already sent to the destination. The only way for a transceiver to notify the destination is to invalidate the frame. It truncates the frame and adds an EOFni or EOFa to the end. This signals to the destination that the frame is corrupt and should be discarded.

F_Rjt and F_Bsy are often seen in FICON environments where control frames could not be processed in time or are rejected based on fabric configuration or fabric status.

c3timeout (tx/rx) – These counters indicate that a port is not able to forward a frame to its destination in time. They show either a problem downstream of this port (tx) or a problem on this port, where it has received a frame meant to be forwarded to another port inside the same switch (rx). Frames are ALWAYS discarded at the RX side (since that’s where the buffers hold the frame). The tx column is an aggregate of all rx ports that need to send frames via this port according to the routing tables created by FSPF.

pcs_err – Physical Coding Sublayer. These values represent encoding errors on 16G platforms and above. Since 16G speeds have changed to 64b/66b encoding/decoding, there is a separate control structure that takes care of this.

As a best practice it is wise to keep a record of these port errors and create a new baseline every week. This allows you to quickly identify errors and solve them before they become a problem with an elongated resolution time. Make sure you do this fabric-wide to maintain consistency across all switches in that fabric.

Sydney_ILAB_DCX-4S_LS128:FID128:admin> porterrshow
            frames    enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy     c3timeout    pcs
         tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                   tx     rx    err
  0: 100.1m  53.4m      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  1: 466.6k 154.5k      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  2: 476.9k 973.7k      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  3: 474.2k 155.0k      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
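Comparing a fresh porterrshow against the baseline is something you can script as well. The sketch below takes two saved outputs and reports, per port, which of the 7 physical columns went up. It assumes the column order of the output above (this has changed between FOS releases, so check your own header first) and it expands the k/m/g suffixes the switch uses for large counters. The function name porterr_increases and the file names are made up for illustration.

```shell
# Report physical-layer counters that increased between two porterrshow
# snapshots. usage: porterr_increases baseline.txt current.txt
porterr_increases() {
    awk '
    function num(v,   m) {               # expand k/m/g suffixes
        m = 1
        if (v ~ /k$/) m = 1000
        else if (v ~ /m$/) m = 1000000
        else if (v ~ /g$/) m = 1000000000
        return (v + 0) * m
    }
    BEGIN {                              # field numbers of the 7 columns
        n = split("4 5 6 10 12 13 14", col, " ")
        split("enc_in crc_err crc_g_eof enc_out link_fail loss_sync loss_sig", name, " ")
    }
    $1 !~ /^[0-9]+:$/ { next }           # only port rows such as "0:"
    NR == FNR {                          # first file: remember the baseline
        for (i = 1; i <= n; i++) base[$1, col[i]] = num($(col[i]))
        next
    }
    {                                    # second file: print what went up
        port = $1; sub(/:$/, "", port)
        for (i = 1; i <= n; i++) {
            d = num($(col[i])) - base[$1, col[i]]
            if (d > 0) printf "port %s %s +%.0f\n", port, name[i], d
        }
    }' "$1" "$2"
}
```

An empty result means none of the 7 columns moved since the baseline. Note that once a counter is displayed with a suffix, the switch has already rounded it, so small increases on large counters will not show up here.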

Make sure that all of these physical issues are solved first. No software can compensate for hardware problems and all OEM support organizations will give you this task anyway before commencing on the issue.

In one of my previous articles I wrote about problems, the cause and the resolution of physical errors. You can find it over here

Regards,
Erwin

Brocade supportsave via BNA or DCFM

23/05/2013

Erwin van Londen

Brocade 4 Comments

If you work with Brocade gear you might have come across their management tool called Brocade Network Advisor. A very nifty piece of software which lets you do almost 99% of all the things you want to do with your fabrics, whether they are FOS or NOS based.

When you work in support you often need logs, and the Brocade switches provide these in two flavours: the useful and the useless (i.e. supportsave and supportshow respectively).

If you need a reference of your configuration and want to do some checking on logs and configuration settings, a supportSHOW is useful for you. When you work in support you most often need to dig a fair bit deeper, and that’s where the supportSAVE comes into play. Basically it’s a collection of the entire status of the box, including ASIC and Linux panic dumps, stack traces etc. This process runs on the CP itself, so it’s not a screengrab of text as you might imagine.

Normally when you collect these you’ll log into the switch and type “supportsave”, fill in the blanks, and some directory on your FTP or SSH server fills up with these files, which you then zip up and send to your vendor.

If you have a large fabric, please make sure you collect these files per switch in a subfolder and upload these separately!

If you are in the lucky position of having BNA, you can also collect these via its interface. The BNA process will create the subfolders for you and zip these up for later.

When you look in the zip-file you’ll see that all filenames have the following structure:

“Supportinfo-date-time\switchname-IPaddress-switchWWN\”

Obviously, if you work on a Windows box to extract all this, Windows will create all the subfolders for you. Given that I do my work on a Linux box, I get one large dump of files with the file names as depicted above and without the directory structure, so every file from every switch is still located in the same folder. This is because Linux (or POSIX in general) allows “\” as a valid character in a file name and uses “/” for folder separation. 🙁

I used to come across this every now and then and it didn’t annoy me enough to fix it, but lately the use of this collection method (BNA style) seems to be increasing, so I needed something simple to fix it.

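Since Linux can do almost everything, I wrote this little script to create a subfolder per switch and move all files belonging to that switch into that folder:

#!/bin/bash
# Usage: pass the folder holding the extracted files (named prefix\switch\file).
pushd "$1" > /dev/null || exit 1
for f in *\\*\\*; do
  [ -e "$f" ] || continue                  # nothing matched the pattern
  rest=${f#*\\}; dir=${rest%%\\*}          # dir = switchname-IPaddress-switchWWN
  mkdir -p -- "$dir" && mv -- "$f" "$dir/${rest#*\\}"
done
popd > /dev/null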

Link this to a Nautilus-script softlink and a right click in the Nautilus file manager will do the trick for me.

Voila, lots of time saved. Hope it helps somebody.

Cheers,
Erwin

Brocade FOS 7.1 and the cool features

30/04/2013

Erwin van Londen

Brocade No Comments

After a very busy couple of weeks I’ve spent some time dissecting the release notes of Brocade FOS 7.1, and I must say there are some really nice features in there but also some that I REALLY think should be removed right away.

It may come as no surprise that I always look very critically at whatever comes to the table from Brocade, Cisco and others w.r.t. storage networking. The troubleshooting side, and therefore the RAS capabilities of the hardware and software, have a special place in my heart, so if somebody screws up I’ll let them know via this platform. 🙂

So first of all some generics. FOS 7 is supported on the 8G and 16G platforms, which cover the Goldeneye2, Condor2 and Condor3 ASICs plus the AP blades for encryption, SAN extension and FCoE. (cough, cough)… Be aware that it doesn’t support the blades based on the older architecture such as the FR4-18i and FC10-6 (which I think was never bought by anyone). Most importantly, this is the first version to support the new 6520 switch, so if you ever think of buying one it will come shipped with this version installed.

Software

As for the software features Brocade really cranked up the RAS features. I especially do like the broadening of the scope for D-ports (diagnostics port) to include ICL ports but also between Brocade HBA’s and switch ports. One thing they should be paying attention to though is that they should sell a lot more of these. :-). Also the characteristics of the test patterns such as test duration, frame-sizes and number of frames can now be specified. Also FEC (Forward Error Correction) has been extended to access gateways and long distance ports which should increase stability w.r.t. frame flow. (It still doesn’t improve on signal levels but that is a hardware problem which cannot be fixed by software).

There are some security enhancements for authentication such as extended LDAP and TACACS+ support.

The 7800 can now be used with VF albeit not having XISL functionality.

Finally, the E_D_TOV FC timer value is propagated onto the FCIP complex. What this basically means is that previously, even though an FC frame had long timed out according to FC specs (in general 2 seconds), it could still exist on the IP network in an FCIP packet. The remote FC side would discard that frame anyway, thus wasting valuable resources. With FOS 7.1 the FCIP complex on the sending side will discard the frame after E_D_TOV has expired.

One of the most underutilised features (besides Fabric Watch) is FDMI (Fabric Device Management Interface). This is a separate FC service (part of the new FC-GS-6 standard) which can hold a huge treasure box of info w.r.t. connected devices. As an example:

FDMI entry

-------------------------------------------------
switch:admin> fdmishow
Local HBA database contains:
10:00:8c:7c:ff:01:eb:00
Ports: 1
10:00:8c:7c:ff:01:eb:00
Port attributes:
FC4 Types: 0x0000010000000000000000000000000000000000000000000000000000000000
Supported Speed: 0x0000003a
Port Speed: 0x00000020
Frame Size: 0x00000840
Device Name: bfa
Host Name: X3650050014
Node Name: 20:00:8c:7c:ff:01:eb:00
Port Name: 10:00:8c:7c:ff:01:eb:00
Port Type: 0x0
Port Symb Name: port2
Class of Service: 0x08000000
Fabric Name: 10:00:00:05:1e:e5:e8:00
FC4 Active Type: 0x0000010000000000000000000000000000000000000000000000000000000000
Port State: 0x00000005
Discovered Ports: 0x00000002
Port Identifier: 0x00030200
HBA attributes:
Node Name: 20:00:8c:7c:ff:01:eb:00
Manufacturer: Brocade
Serial Number: BUK0406G041
Model: Brocade-1860-2p
Model Description: Brocade-1860-2p
Hardware Version: Rev-A
Driver Version: 3.2.0.0705
Option ROM Version: 3.2.0.0_alpha_bld02_20120831_0705
Firmware Version: 3.2.0.0_alpha_bld02_20120831_0705
OS Name and Version: Windows Server 2008 R2 Standard | N/A
Max CT Payload Length: 0x00000840
Symbolic Name: Brocade-1860-2p | 3.2.0.0705 | X3650050014 |
Number of Ports: 2
Fabric Name: 10:00:00:05:1e:e5:e8:00
Bios Version: 3.2.0.0_alpha_bld02_20120831_0705
Bios State: TRUE
Vendor Identifier: BROCADE
Vendor Info: 0x31000000
-------------------------------------------------

and as you can see this shows a lot more than the fairly basic nameserver entries:

-------------------------------------------
N 8f9200; 3;21:00:00:1b:32:1f:c8:3d;20:00:00:1b:32:1f:c8:3d; na
FC4s: FCP
NodeSymb: [41] "QLA2462 FW:v4.04.09 DVR:v8.02.01-k1-vmw38"
Fabric Port Name: 20:92:00:05:1e:52:af:00
Permanent Port Name: 21:00:00:1b:32:1f:c8:3d
Port Index: 146
Share Area: No
Device Shared in Other AD: No
Redirect: No
Partial: No
LSAN: No
-------------------------------------------

Obviously the end-device needs to support this and it has to be enabled. (PLEASE DO !!!!!!!!) It’s invaluable for troubleshooters like me….
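Several of the FDMI attributes in the fdmishow output are raw hex values, which makes them harder to read than they need to be. The speed fields are bitmasks, so decoding them is a small exercise. The sketch below assumes the bit assignment used for the FDMI port speed attributes as I know it from HBA driver sources (0x01 = 1G, 0x02 = 2G, 0x04 = 10G, 0x08 = 4G, 0x10 = 8G, 0x20 = 16G, 0x40 = 32G); verify against the FC-GS spec before relying on it. The function names are made up for illustration.

```shell
# Decode an FDMI speed bitmask into a readable list of speeds.
# Bit values are an assumption taken from the FDMI port speed definitions.
fdmi_speeds() {    # usage: fdmi_speeds 0x0000003a
    fdmi_mask=$(( $1 ))
    fdmi_out=""
    for fdmi_bit in 0x01:1G 0x02:2G 0x08:4G 0x10:8G 0x04:10G 0x20:16G 0x40:32G; do
        if [ $(( fdmi_mask & ${fdmi_bit%%:*} )) -ne 0 ]; then
            fdmi_out="$fdmi_out${fdmi_out:+ }${fdmi_bit#*:}"
        fi
    done
    echo "$fdmi_out"
}
# Plain hex-to-decimal conversion, e.g. for Frame Size or Max CT Payload Length.
hex_to_dec() { echo $(( $1 )); }
```

For the HBA shown above, `fdmi_speeds 0x0000003a` gives "2G 4G 8G 16G", the Port Speed of 0x00000020 decodes to 16G, and `hex_to_dec 0x00000840` turns the Frame Size into 2112 bytes.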

One thing that has bitten me a few times was the SFP problem. There has long been a problem that when a port was disabled and a new SFP was plugged in, the switch didn’t detect that until the port was enabled and it had polled for up-to-date information. In the meantime you could get old/cached info of the old SFP, including temperatures, dB values, current, voltage etc. This seems to be fixed now, so that’s one less thing to take into account.

Some CLI improvements have been made on various commands, with some new parameters which let you filter and select for certain errors etc.

The biggest idiocy in this version is allowing the administrator to change the severity level of event codes. This means that if you have a filter in BNA (or whatever management software you have) to exclude INFO level messages, and certain ERROR or CRITICAL messages start to annoy you, you could change their severity to INFO and they won’t show up anymore. This doesn’t mean the problem is less critical, so instead of just fixing the issue we now just pretend it’s not there. From a troubleshooting perspective this is disastrous, since we look at a fair chunk of supportsaves each day, and if we can’t rely on consistency in a log file it’s useless to have a look in the first place. Another one of those is the difference in deskew values on trunks when FEC is enabled. Due to a coding problem these values can differ by up to 40, which would normally indicate a massive difference in cable length. Only by executing a D-port analysis can you determine whether that is really the case. My take is that they should fix the coding problem ASAP.

A similar thing that has pissed me off was the change in sfpshow output. Since the invention of the wheel this has been the worst output in the Brocade logs, so many people have scripted their ass off to make it more readable.

Normally it looks like this:

=============
Slot 1/Port 0:
=============
Identifier: 3 SFP
Connector: 7 LC
Transceiver: 540c404000000000 2,4,8_Gbps M5,M6 sw Short_dist
Encoding: 1 8B10B
Baud Rate: 85 (units 100 megabaud)
Length 9u: 0 (units km)
Length 9u: 0 (units 100 meters)
Length 50u: 5 (units 10 meters)
Length 62.5u:2 (units 10 meters)
Length Cu: 0 (units 1 meter)
Vendor Name: BROCADE
Vendor OUI: 00:05:1e
Vendor PN: 57-1000012-01
Vendor Rev: A
Wavelength: 850 (units nm)
Options: 003a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max: 0
BR Min: 0
Serial No: UAF11051000039A
Date Code: 101212
DD Type: 0x68
Enh Options: 0xfa
Status/Ctrl: 0xb0
Alarm flags[0,1] = 0x0, 0x0
Warn Flags[0,1] = 0x0, 0x0
Alarm Warn
low high low high
Temperature: 31 Centigrade -10 90 -5 85
Current: 6.616 mAmps 1.000 17.000 2.000 14.000
Voltage: 3273.4 mVolts 2900.0 3700.0 3000.0 3600.0
RX Power: -2.8 dBm (530.6uW) 10.0 uW 1258.9 uW 15.8 uW 1000.0 uW
TX Power: -3.3 dBm (465.9 uW) 125.9 uW 631.0 uW 158.5 uW 562.3 uW

and that is for every port, which basically drives you nuts.

So with some bash, awk and sed magic I scripted the output to look like this:

Port  Speed   Long  Short  Vendor     Serial            Wave   Temp   Current  Voltage   RX-Pwr   TX-Pwr
              wave  wave              number            Length   
1/0   8G      NA    50 m   BROCADE    UAF11051000039A   850    31     6.616    3273.4   -2.8      -3.3      
1/1   8G      NA    50 m   BROCADE    UAF110510000387   850    32     7.760    3268.8   -3.6      -3.3      
1/2   8G      NA    50 m   BROCADE    UAF1105100003A3   850    30     7.450    3270.7   -3.3      -3.3      etc....
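To give an idea of how such a condensed view can be produced, here is a stripped-down sketch in awk. It is not the script that generated the table above: it assumes the pre-7.1 per-port layout with a "Slot x/Port y:" header, it skips the speed and distance columns, and it will misalign on vendor names that contain spaces. The function name sfp_table is made up for illustration.

```shell
# Condense a multi-port sfpshow dump (stdin) into one line per port.
sfp_table() {
    awk '
    function emit() {                    # print the previous port, reset fields
        if (port != "")
            print port, vendor, serial, wave, temp, cur, volt, rx, tx
        vendor = serial = wave = temp = cur = volt = rx = tx = "NA"
    }
    BEGIN { print "Port Vendor Serial Wave Temp Current Voltage RX-Pwr TX-Pwr" }
    $1 == "Slot" && $2 ~ /\/Port$/ {     # header line such as "Slot 1/Port 0:"
        emit()
        slot = $2; sub(/\/Port$/, "", slot)
        num = $3;  sub(/:$/, "", num)
        port = slot "/" num
        next
    }
    $1 == "Vendor" && $2 == "Name:" { vendor = $3 }
    $1 == "Serial" && $2 == "No:"   { serial = $3 }
    $1 == "Wavelength:"             { wave = $2 }
    $1 == "Temperature:"            { temp = $2 }
    $1 == "Current:"                { cur = $2 }
    $1 == "Voltage:"                { volt = $2 }
    $1 == "RX" && $2 == "Power:"    { rx = $3 }
    $1 == "TX" && $2 == "Power:"    { tx = $3 }
    END { emit() }'
}
```

The output is space separated, so piping it through `column -t` lines the columns up nicely.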

From a troubleshooting perspective this is so much easier since you can spot issues right away.

Now with FOS 7.1.x the FOS engineers screwed up the sfpshow output, which in turn screwed up my script and necessitates a load more work/code/lines to get this back into shape. The same thing goes for the output on the number of credits on virtual channels.

Pre-FOS 7.1 it looks like this:

C:------ blade port 64: E_port ------------------------------------------
C:0xca682400: bbc_trc 0004 0000 002a 0000 0000 0000 0001 0001

With FOS 7.1 it looks like this:

bbc registers
=============
0xd0982800: bbc_trc 20 0 0 0 0 0 0 0

(Yes, hair pulling stuff, aaarrrcchhhh)

Some more good things. The fabriclog now contains the direction of link resets. Previously we could only see an LR had occurred but we didn’t see who initiated it. Now we can and have the option to figure out in which direction credit issues might have been happening. (phew..)

The CLI history is now also saved across reboots and firmware upgrades. It has always been a PITA to figure out who had done what at a certain point in time. This should help to find out.

One other very useful thing that has been added, and it is a major plus in this release, is the remote WWNN of a switch in the switchshow and islshow output, even when the ISL has segmented for whatever reason. This is massively helpful because previously you didn’t have a clue what was connected, so you needed to go through quite some hassle and check cabling or start digging through the portlogdump with some debug flags enabled. Always a troublesome exercise.

The bonus points for this release go to the addition of the fabretrystats command. This gives us troubleshooters a great overview of statistics of fabric events and commands.

SW_ILS
----------------------------------------------------------------------------------------------------------------
E/D_Port  ELP  EFP  HA_EFP  DIA  RDI  BF  FWD  EMT  ETP  RAID  GAID  ELP_TMR  GRE  ECP  ESC  EFMD  ESA  DIAG_CMD
----------------------------------------------------------------------------------------------------------------
0         0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
69        0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
71        0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
79        0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
131       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
140       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
141       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
148       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
149       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
168       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
169       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
174       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0
175       0    0    0       0    0    0   0    0    0    0     0     0        0    0    0    0     0    0

This release also fixes a gazillion defects, so it’s highly advisable to get to this level sooner rather than later. Check with your vendor for the latest supported release.

So all in all good stuff, but some things should be reverted, NOW!!! And PLEASE BROCADE: don’t screw up more output in such a way that it breaks existing analysis scripts etc…

Cheers

Erwin

EvL Consulting
