Brocade trunks and the lack thereof

I had an interesting case this week. According to the customer, there had been some maintenance activities on their cabling infrastructure, and shortly thereafter the ISLs would come up but there was absolutely no way the two would form a trunk. The first thought is: what changed in the configuration? DWDM/CWDM changes, a modified switch configuration, etc. The customer had a passive CWDM solution in place with just optical splitters, so no TDM devices or any other interference on the FC link layer. The switch configuration was also correct on both sides, and we had confirmation that the link length was absolutely the same. I went on to check whether there was anything on the physical layer, and when I looked at the SFP output something baffled me.

Prt  Sp  LW       Vendor       W-Len  RX-Pwr  TX-Pwr

Switch 1
46   8G  25.5 km  SmartOptics  1470   -10.7   2.9
47   8G  25.5 km  SmartOptics  1490   -4.3    2.8

Switch 2
46   8G  25.5 km  SmartOptics  1470   -3.1    3.0
47   8G  25.5 km  SmartOptics  1490   -10.8   2.1

So this output showed some discrepancies in the dB drop-off values. Switch 1 transmitted on port 46 at 2.9 dBm and that signal came in on switch 2 at -3.1 dBm. Port 47 of switch 1 sent its signal out at 2.8 dBm, but that one dropped off to -10.8 dBm. In the other direction the picture was mirrored: the signal switch 2 sent on port 46 at 3.0 dBm arrived at -10.7 dBm, while the one on port 47 went from 2.1 dBm to -4.3 dBm. This led me to believe there was either a very bad link or a very long link, and that something had been cabled incorrectly.
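To make the asymmetry easier to see, here is a minimal Python sketch that subtracts each receiver's RX power from the sender's TX power. It assumes the values in the SFP output are in dBm and that port 46 on switch 1 is connected to port 46 on switch 2 (likewise for port 47). The dictionary layout and the `link_loss_db` helper are made up for illustration.

```python
# RX and TX power copied from the SFP output above (assumed to be dBm).
# Keys are (switch, port); values are (rx_dbm, tx_dbm).
sfp = {
    ("switch1", 46): (-10.7, 2.9),
    ("switch1", 47): (-4.3, 2.8),
    ("switch2", 46): (-3.1, 3.0),
    ("switch2", 47): (-10.8, 2.1),
}


def link_loss_db(tx_switch, rx_switch, port):
    """Loss in dB for one direction: sender's TX power minus receiver's RX power."""
    tx_dbm = sfp[(tx_switch, port)][1]
    rx_dbm = sfp[(rx_switch, port)][0]
    return round(tx_dbm - rx_dbm, 1)


# Compute the loss for both directions of both ISLs.
losses = {
    (src, dst, port): link_loss_db(src, dst, port)
    for src, dst in (("switch1", "switch2"), ("switch2", "switch1"))
    for port in (46, 47)
}

for (src, dst, port), loss in losses.items():
    print(f"{src} -> {dst} port {port}: {loss} dB")
```

On each ISL one direction loses roughly 6 dB and the other roughly 13.6 dB, and the lossy direction is swapped between port 46 and port 47.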

It almost looked like something was cabled this way:

[Figure: incorrect cabling]

whereby the link between the CWDM equipment had a long and a short line.

Now this in itself should not be a reason for the trunking problem, since both links showed the same issue and thus should have the same length. So this required additional digging, which led me to the fabriclog (a very useful piece of info). Normally, when a port comes up as an E-port it sends out an EMT (Exchange Mark Timestamp). The remote side should send an ACC (Accept), and when this arrives at the originator you have a good indication of the round-trip time.

Switch 1 port 47 sent the EMT at

00:36:55.031305 *EMT Send                                   D2,I0  D2,I0  47    0x39ed

which arrived at switch 2 at

11:16:30.075425 *EMT Rcv                                    F2,P2  F2,T0  47    0x39ed

The ACC was sent at

11:16:30.076477 EMT Snd ACC                                 F2,T0  F2,T0  47    0x39ed

which arrived on switch 1 at

00:36:55.070439 *EMT ACC Rcv                                D2,I0  D2,I1  47    0x39ed

This completed exchange 0x39ed, which took 0.039134 seconds.

On port 46 the result was different:

00:36:55.641044 *EMT Send                                   D2,I0  D2,I0  46    0x39fd
11:16:30.683121 *EMT Rcv                                    A0,P2  A0,T0  46    0x39fd
11:16:30.683903 EMT Snd ACC                                 A0,T0  A0,T0  46    0x39fd
00:36:55.655352 *EMT ACC Rcv                                D2,I0  D2,I1  46    0x39fd

Then I looked at the differences between the two ports:

Port 47: 55.070439 – 55.031305 = 0.039134

Port 46: 55.655352 – 55.641044 = 0.014308
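The same subtraction can be scripted. The sketch below pairs each EMT Send with its EMT ACC Rcv by port and exchange ID, using only the switch 1 entries quoted in this post so that both timestamps come from the same clock. The regular expression is an assumption based purely on the lines shown here, not on any documented fabriclog format, and `round_trips` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Switch 1 fabriclog entries quoted in this post (whitespace condensed).
LOG = """\
00:36:55.031305 *EMT Send      D2,I0  D2,I0  47    0x39ed
00:36:55.070439 *EMT ACC Rcv   D2,I0  D2,I1  47    0x39ed
00:36:55.641044 *EMT Send      D2,I0  D2,I0  46    0x39fd
00:36:55.655352 *EMT ACC Rcv   D2,I0  D2,I1  46    0x39fd
"""

# Assumed layout: timestamp, event, two state columns, port, exchange ID.
LINE = re.compile(
    r"^(?P<ts>\d\d:\d\d:\d\d\.\d{6})\s+\*?(?P<event>EMT Send|EMT ACC Rcv)\s+"
    r"\S+\s+\S+\s+(?P<port>\d+)\s+(?P<xid>0x[0-9a-f]+)\s*$"
)


def round_trips(log_text):
    """Return {port: timedelta} between EMT Send and the matching EMT ACC Rcv."""
    sent = {}
    result = {}
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip anything that does not look like an EMT entry
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        key = (int(m["port"]), m["xid"])
        if m["event"] == "EMT Send":
            sent[key] = ts
        elif key in sent:
            result[key[0]] = ts - sent[key]
    return result


rtt = round_trips(LOG)
delta = rtt[47] - rtt[46]

for port in sorted(rtt):
    print(f"port {port}: {rtt[port].total_seconds():.6f} s")
print(f"difference: {delta.total_seconds():.6f} s")
```

The timestamps carry no date, so this sketch does not handle an exchange that spans midnight.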

So the timing difference was 0.024826 seconds, which (when you do the speed-of-light maths) translates to around a 5 km difference in cable length.

This is obviously too much for trunking to work. I’ve advised the customer to review the inter-site cabling infrastructure. Results to follow. 🙂

Regards

Erwin van Londen

 

 


About Erwin van Londen

Master Technical Analyst at Hitachi Data Systems