Brocade trunks and the lack thereof

I had an interesting case this week. According to the customer there had been some maintenance activities on their cabling infrastructure, and shortly thereafter the ISLs would come up but there was absolutely no way the two would form a trunk. The first thought is what happened in the configuration: DWDM/CWDM changes, switch configuration modified, etc. The customer had a passive CWDM solution in place with just optical splitters, so there were no TDM devices or any other interference on the FC link layer. The switch configuration was also correct on both sides and we had confirmation that the link length was absolutely the same. I went on to check whether there was anything on the physical layer, and when I looked at the SFP output something baffled me.

Prt  Sp  LW       Vendor       W-Len (nm)  RX-Pwr (dBm)  TX-Pwr (dBm)

Switch 1
46   8G  25.5 km  SmartOptics  1470        -10.7         2.9
47   8G  25.5 km  SmartOptics  1490        -4.3          2.8

Switch 2
46   8G  25.5 km  SmartOptics  1470        -3.1          3.0
47   8G  25.5 km  SmartOptics  1490        -10.8         2.1

So this output showed some discrepancies in the dB drop-off values. The TX side of port 46 on switch 1 sent its signal out at 2.9 dBm and that signal came in on switch 2 at -3.1 dBm. Port 47 of switch 1 sent its signal out at 2.8 dBm, but that one dropped off to -10.8 dBm. In the other direction the pattern was reversed: port 46 of switch 2 sent at 3.0 dBm and arrived at -10.7 dBm, while port 47 sent at 2.1 dBm and arrived at -4.3 dBm. This led me to believe there was either a very bad link or a very long link, and that something had been cabled incorrectly.
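The loss per direction follows directly from the SFP output: subtract the receive power at the far end from the transmit power at the near end. A minimal sketch, assuming each port is cabled to the same port number on the other switch (as the comparison above implies); the dictionary layout and the names are purely illustrative:

```python
# (TX power at the sender in dBm, RX power at the receiver in dBm),
# taken from the SFP output of both switches.
links = {
    ("46", "sw1->sw2"): (2.9, -3.1),
    ("46", "sw2->sw1"): (3.0, -10.7),
    ("47", "sw1->sw2"): (2.8, -10.8),
    ("47", "sw2->sw1"): (2.1, -4.3),
}


def path_loss_db(tx_dbm: float, rx_dbm: float) -> float:
    """Optical loss between transmitter and receiver, in dB."""
    return round(tx_dbm - rx_dbm, 1)


losses = {key: path_loss_db(tx, rx) for key, (tx, rx) in links.items()}

for (port, direction), loss in losses.items():
    print(f"port {port} {direction}: {loss:.1f} dB loss")
```

Each port shows roughly 6 dB of loss in one direction and more than 13 dB in the other, with the lossy direction swapped between ports 46 and 47.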

It almost looked like something was cabled this way:

[Figure: incorrect-cabling]

whereby the link between the CWDM equipment had a long and a short line.

Now this in itself should not be a reason for the trunking problem, since both links showed the same pattern and should therefore have the same total length. So this required additional digging, which led me to the fabriclog. (Very useful piece of info.) Normally when a port comes up as an E-port it sends out an EMT (Exchange Mark Timestamp). The remote side should send an ACC (Accept), and when this arrives at the originator you have a good indication of the round-trip time.

Switch 1 port 47 sent the EMT at

00:36:55.031305 *EMT Send                                   D2,I0  D2,I0  47    0x39ed

which arrived at switch 2 at

11:16:30.075425 *EMT Rcv                                    F2,P2  F2,T0  47    0x39ed

The ACC was sent at

11:16:30.076477 EMT Snd ACC                                 F2,T0  F2,T0  47    0x39ed

which arrived on switch 1 at

00:36:55.070439 *EMT ACC Rcv                                D2,I0  D2,I1  47    0x39ed

This completed exchange 0x39ed, which took 0.039134 seconds in total.

On port 46 the result was different:

00:36:55.641044 *EMT Send                                   D2,I0  D2,I0  46    0x39fd
11:16:30.683121 *EMT Rcv                                    A0,P2  A0,T0  46    0x39fd
11:16:30.683903 EMT Snd ACC                                 A0,T0  A0,T0  46    0x39fd
00:36:55.655352 *EMT ACC Rcv                                D2,I0  D2,I1  46    0x39fd

Then I looked at the differences between the two ports:

Port 47: 55.070439 - 55.031305 = 0.039134

Port 46: 55.655352 - 55.641044 = 0.014308

So the timing difference was 0.024826 seconds, which (when you do the speed-of-light maths) translates to around a 5 km difference in cable length.
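The subtraction above can be reproduced straight from the fabriclog timestamps. The sketch below only compares timestamps taken on switch 1, since the two switch clocks are clearly not synchronised (00:36 versus 11:16). The `fibre_delay_us` helper is not taken from the log: it is a hypothetical helper that assumes a refractive index of roughly 1.468 for single-mode fibre, which works out to about 5 µs of one-way delay per kilometre. Keep in mind that fabriclog timestamps are taken in software and probably include processing time on both switches, so any length derived from them is a rough indication at best and depends on reading the delta in the right unit.

```python
from datetime import datetime, timedelta


def elapsed_us(start: str, end: str) -> int:
    """Microseconds between two fabriclog timestamps from the SAME switch."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta // timedelta(microseconds=1)


# EMT Send -> EMT ACC Rcv, both logged on switch 1.
rtt_47_us = elapsed_us("00:36:55.031305", "00:36:55.070439")
rtt_46_us = elapsed_us("00:36:55.641044", "00:36:55.655352")
difference_us = rtt_47_us - rtt_46_us

# Assumption: typical single-mode fibre, refractive index about 1.468.
SPEED_OF_LIGHT_KM_S = 299_792.458
REFRACTIVE_INDEX = 1.468


def fibre_delay_us(length_km: float) -> float:
    """One-way propagation delay in microseconds for a given fibre length."""
    speed_in_fibre_km_s = SPEED_OF_LIGHT_KM_S / REFRACTIVE_INDEX
    return length_km / speed_in_fibre_km_s * 1_000_000


print(f"port 47 round trip: {rtt_47_us} us")
print(f"port 46 round trip: {rtt_46_us} us")
print(f"difference:         {difference_us} us")
print(f"one-way delay per km of fibre: {fibre_delay_us(1.0):.2f} us")
```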

This is obviously too much for trunking to work. I’ve advised the customer to review the inter-site cabling infrastructure. Results to follow. 🙂

Regards

Erwin van Londen

4 responses on “Brocade trunks and the lack thereof”

  1. Bacil

    Hello,

    This may be slightly off topic. In another case the extended-distance trunks (involving DWDM) were formed successfully. However, we noticed that traffic uses almost only one link of the trunk instead of being distributed dynamically (and in order) across the two extended ISL links. We are sure, though, that it will redirect in case the first one fails. But why is it not distributing the traffic in the first place? Please advise. Thanks, Bacil.

    1. Erwin van Londen

      Hi Bacil,

      As you are aware, traffic is dispersed over ISLs when exchange-based routing is enabled. When an ISL consists of a trunk, FSPF sees this as one link, so from an official Fibre Channel standards point of view there is not much that can be done and it is up to the implementation of the vendor.

      As for your question: as of FOS 6.2.(something) Brocade changed the switching algorithm in the ASIC driver in such a way that a threshold mechanism was introduced for load balancing. The challenge has always been to keep frames in order for each FC exchange. If you just have a single link (non-trunked ISL) there is not really a problem, as FSPF will take care of that because the entire exchange is mapped onto a single ISL. If you have two or more physical links in a single ISL you need to prevent out-of-order frames as much as you can. The easiest way to do this is to keep frames on the same physical link, as it is then impossible to get frames out of order. At 1, 2 and 4G speeds the length of each individual frame was long enough for the receiving side to keep things in order, as frames would not overlap if the cable length difference was short enough. At 8G and now 16G this may not be 100% the case, and preventative measures were taken by the Brocade FOS engineering team.
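To give a feel for why frame length matters, the sketch below estimates how long a full-size frame occupies the wire at each speed and how much fibre it spans. The assumptions are mine, not from the discussion above: a maximum frame of 2148 bytes (including SOF, header, CRC and EOF), the nominal line rates with 8b/10b encoding up to 8G and 64b/66b at 16G, and roughly 204 m of fibre per microsecond.

```python
MAX_FRAME_BYTES = 2148   # assumption: full-size FC frame incl. SOF/header/CRC/EOF
FIBRE_M_PER_US = 204.0   # assumption: light in fibre, about c / 1.468

# speed label -> (line rate in GBd, encoded bits sent per data byte)
SPEEDS = {
    "1G": (1.0625, 10.0),          # 8b/10b encoding
    "2G": (2.125, 10.0),
    "4G": (4.25, 10.0),
    "8G": (8.5, 10.0),
    "16G": (14.025, 8 * 66 / 64),  # 64b/66b encoding
}


def frame_time_us(speed: str) -> float:
    """Time a full-size frame takes to be serialised, in microseconds."""
    gbaud, bits_per_byte = SPEEDS[speed]
    bits_per_us = gbaud * 1000
    return MAX_FRAME_BYTES * bits_per_byte / bits_per_us


def frame_length_m(speed: str) -> float:
    """Approximate length of fibre a full-size frame spans, in metres."""
    return frame_time_us(speed) * FIBRE_M_PER_US


for speed in SPEEDS:
    print(f"{speed:>3}: {frame_time_us(speed):6.2f} us, "
          f"about {frame_length_m(speed):5.0f} m of fibre")
```

Under these assumptions a full frame stretches over several kilometres of fibre at 1G but only a few hundred metres at 8G and 16G, so a much smaller difference in cable length is enough for frames on parallel links to overtake each other.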

      The driver now makes sure to map all frames onto the same physical link until a certain utilisation, in the form of bandwidth or latency, has been reached, and only then starts using the other links in that trunk on a per-exchange basis. This way the chance of getting out-of-order frames within the same exchange is next to nothing.
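The behaviour described above can be pictured with a toy model. This is only an illustration of the idea and in no way Brocade's actual driver logic: the class name, the percentage-based load figures and the threshold value are all made up for the example.

```python
class ToyTrunk:
    """Toy model: fill the first link up to a threshold, then spill over.

    Every exchange stays on the link it was first mapped to, so frames
    within one exchange can never overtake each other.
    """

    def __init__(self, link_count: int, threshold_pct: int) -> None:
        self.load_pct = [0] * link_count
        self.threshold_pct = threshold_pct
        self.exchange_link: dict[int, int] = {}

    def link_for(self, exchange_id: int, load_pct: int) -> int:
        # An exchange that is already mapped keeps its link.
        if exchange_id in self.exchange_link:
            return self.exchange_link[exchange_id]
        # Pick the first link that stays at or below the threshold.
        for link, current in enumerate(self.load_pct):
            if current + load_pct <= self.threshold_pct:
                break
        else:
            # All links are above the threshold: use the least loaded one.
            link = min(range(len(self.load_pct)), key=self.load_pct.__getitem__)
        self.exchange_link[exchange_id] = link
        self.load_pct[link] += load_pct
        return link


trunk = ToyTrunk(link_count=2, threshold_pct=80)
assignments = [trunk.link_for(exchange_id, load_pct=30) for exchange_id in range(4)]
print(assignments)
```

In this toy run the first two exchanges land on link 0 and the next two spill over to link 1, which is why a lightly loaded trunk can appear to use only one of its members.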

      I got this question a lot in the past, and it is just a cosmetic thing when you see only one link utilised. In reality there is no difference in the total amount of data being pushed through the entire trunk. Utilisation should be looked at for the trunk as a whole and not for the individual members.

      Hope this explains it a bit.

      Regards,
      Erwin