I attended Cisco Live this week in Melbourne, since it was very close to home and Cisco was kind enough to provide me with an entry ticket. (Many thanks for this.)
While strolling around the expo floor I ran into the nice people from Fluke Networks who were showing their testing equipment and of course I was very interested in the optical side of the fence. (I haven't seen wireless storage networks yet so I'll save that part of their impressive toolkit for later. :-)).
Since I do troubleshooting as my day-to-day job I see many issues which have characteristics of a physical nature. This can be a bad cable, patch panel, SFP or anything of that nature.
Just when I wanted to start this blog post I saw that my Melbournian buddy Anthony Vandewerdt just beat me to it and wrote the article "Semmelweiss could see the problem" in which he described the problem of unclean cables and where it might lead to. (read this first and then come back here.)
In order to complement that article I'll try to explain why this is so important.
I'm pretty sure that everyone these days knows that computers work with bits which are either a 1 or a 0. To communicate with other computers (or devices in general) we transmit bits as either an on or an off signal, whether this is an electrical current or an optical wave. Electrical signalling has the nasty habit that the energy is partially stored in the capacitance of the cable, so it takes a certain time for that charge to drain before the signal really reads as 0. You can see this very well if you use a laptop charger with a small LED. When you unplug it from the wall socket it takes a couple of seconds before the charge is completely gone from the capacitors in the transformer. This is also one of the primary reasons FC uses an 8b/10b encoding/decoding scheme to keep a balanced DC value.
The optical-to-electrical converters have the same issue, albeit not in the cable itself but in the physical characteristics of the circuitry. There is a certain fall-off and ramp-up time before the current becomes completely zero and completely one respectively. This is very important since it determines when a receiver should decide whether the incoming bit should be seen as a 1 or a 0.
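To make the "balanced DC value" a bit more tangible, here is a minimal Python sketch. It is not a real 8b/10b encoder; it only counts ones against zeros in 10-bit transmission words. The function names are my own, and the simple cumulative sum is a simplification of how a real link tracks running disparity. The bit patterns used are the two forms of the K28.5 character and the D21.5 character.

```python
def disparity(word: str) -> int:
    """Return (number of ones) - (number of zeros) for a bit string."""
    ones = word.count("1")
    return ones - (len(word) - ones)


def running_disparity(words):
    """Cumulative disparity after each word (simplified: starts at 0)."""
    total = 0
    history = []
    for word in words:
        total += disparity(word)
        history.append(total)
    return history


if __name__ == "__main__":
    k28_5_neg = "0011111010"  # K28.5 as sent when running disparity is negative
    k28_5_pos = "1100000101"  # K28.5 as sent when running disparity is positive
    d21_5 = "1010101010"      # D21.5, a perfectly balanced word

    for word in (k28_5_neg, k28_5_pos, d21_5):
        print(word, disparity(word))
    print(running_disparity([k28_5_neg, k28_5_pos, d21_5]))
```

Every valid 8b/10b word has a disparity of 0, +2 or -2, and the encoder picks the form that pulls the running total back towards zero, so the line never carries a long surplus of ones or zeros.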
The optics people and companies represented in IEEE and T11-2 do write up the official metrics so this is all being done for you. There is nothing on a switch, array or other network equipment where you can tune this.
The characteristics of a signal can be measured with an oscilloscope. The result you see looks like this:
The blue lines show the voltage on the oscilloscope, and together they form the so-called eye-pattern. The hexagon in the middle is determined by the folks of IEEE and T11-2 and can be loaded as a software feature for ease of use on most equipment. (Note: be aware that this differs per technology and optical characteristic like FC, Ethernet, DWDM etc.)
The above picture shows a perfect eye-pattern since it shows that the ramp-up time (from the bottom blue line to the top) is way before the "decision point" on becoming a 1 and the fall-off time is way after the decision point of becoming a 0.
"So what does this have to do with my fibre-cable" you may ask.
When connectors are not clean the light may be reflected back into the cable, causing jitter. It is this jitter that can significantly close the eye-pattern to a point where the receiver can no longer determine whether the incoming light should be read as a 1 or a 0. The below picture shows that this comes pretty close.
By default the receiver will keep the same value it had on the previous clock cycle. This means that a 1 remains a 1 even though it actually should have been a 0, and vice versa. The result will be that the bitstream from the receiver buffer into the serdes chip will be incorrect, thereby causing a decoding error. For FC it means that the er_enc_out or er_enc_in value on the LESB (Link Error Status Block) is incremented by one (depending on whether the 10-bit transmission word was part of an FC frame or not). On a Brocade switch in the porterrshow output this is shown in the enc_in or enc_out column.
If this happens on a bit which was part of a, normally valid, FC frame, the frame now contains an invalid byte. If we did not have a fall-back mechanism this would lead to an invalid byte being sent to the operating system and application, causing corruption and even system failures. Since we also do a CRC check on the entire frame, the destination port will discard it entirely and the upper layer SCSI stack (or whatever protocol resides on the FC4 layer) retries the IO.
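The CRC safety net can be illustrated with a few lines of Python. This is only a sketch: it uses zlib.crc32 from the standard library as a stand-in for the frame CRC, the "frame" is just a payload with four CRC bytes appended rather than a real FC frame layout, and the helper names are made up for this example.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Append a 32-bit CRC to the payload (simplified stand-in for a frame)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the received one."""
    payload = frame[:-4]
    received = int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == received


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted, as a misread would."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    frame = frame_with_crc(b"SCSI write payload")
    print("clean frame accepted:", crc_ok(frame))
    print("frame with one flipped bit accepted:", crc_ok(flip_bit(frame, 13)))
```

A single misread bit is enough to make the check fail, which is why the receiving port throws the whole frame away and leaves the retry to the upper layer.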
The problem is that with distance you get loss of power (remember that optical power is measured in dBm and loss in dB). Depending on the type of cable (OM1, 2, 3, 4) the loss per length of cable is fixed. Every connection or splice (two optical cables welded together) adds to the link loss and decreases the optical power received on the other side of the link. The problem with dirty connections is that they significantly decrease the optical power, to the point where the power arriving at the receiver falls outside the specification of that particular SFP. This can cause link losses and port flapping, causing all sorts of other nasty issues.
The link loss budget can be calculated based on the launch power of the transmitter, the number of connectors and splices in the cable-plant plus the margin on the receiver side. If the received power falls below the receiver sensitivity mark the receiver will drop the link and the ports will go offline.
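As a rough sketch of that calculation in Python: the function and parameter names below are my own, and all the numbers in the example are made-up illustration values, not figures from any SFP or cable datasheet. Use the values from your own SFP and cable specifications.

```python
def link_margin_db(tx_power_dbm, rx_sensitivity_dbm,
                   fibre_length_km, fibre_loss_db_per_km,
                   connectors, loss_per_connector_db,
                   splices=0, loss_per_splice_db=0.0):
    """Return the remaining margin in dB; a negative value means the
    received power is below the receiver sensitivity."""
    total_loss_db = (fibre_length_km * fibre_loss_db_per_km
                     + connectors * loss_per_connector_db
                     + splices * loss_per_splice_db)
    received_dbm = tx_power_dbm - total_loss_db
    return received_dbm - rx_sensitivity_dbm


if __name__ == "__main__":
    # Hypothetical link: 100 m of multimode fibre with four connectors.
    clean = link_margin_db(-4.0, -12.0, 0.1, 3.5, 4, 0.5)
    # Same link, but every connector is dirty and loses 2 dB (assumed).
    dirty = link_margin_db(-4.0, -12.0, 0.1, 3.5, 4, 2.0)
    print(f"margin with clean connectors: {clean:.2f} dB")
    print(f"margin with dirty connectors: {dirty:.2f} dB")
```

With these assumed numbers the clean link has a few dB to spare, while the same link with dirty connectors ends up below the receiver sensitivity, which is exactly the situation where ports start to flap.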
On a Brocade switch you can see the transmitter and receiver value with the "sfpshow" command:
The specifications of the SFP determine what the transmit and receive power should be. If the actual values of the RX power fall outside the specification of the SFP you should start to look at your cables and connectors and start cleaning them. If this doesn't help there might be another problem, like a crack in the cable or a broken laser in the SFP. In this case replace the cable and/or the SFP.
I hope this helps to explain why you might see strange things in your Fibre Channel network if the connectors are not clean, and why your support organisation keeps stressing that you fix and maintain your cable plant. I did mention I work in support, and I see many connectivity issues resulting in flapping ports, overall performance issues and even data loss or corruption.
If you want to know the characteristics of optical cables or SFPs I suggest you have a look at the JDSU, Finisar or Avago websites. Also check out the FOA YouTube channel, which has some nice videos that explain in detail the ins and outs of fibre optics.
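Checking a reading against the specification is simple arithmetic, sketched below in Python. The helper names and the limits in the example are assumptions for illustration only; take the real minimum and maximum RX power from the datasheet of your SFP. The conversion helpers are there because power is sometimes quoted in microwatts rather than dBm.

```python
import math


def uw_to_dbm(power_uw: float) -> float:
    """Convert microwatts to dBm (0 dBm equals 1 mW, i.e. 1000 uW)."""
    return 10 * math.log10(power_uw / 1000.0)


def dbm_to_uw(power_dbm: float) -> float:
    """Convert dBm back to microwatts."""
    return 1000.0 * 10 ** (power_dbm / 10)


def rx_in_spec(rx_dbm: float, min_dbm: float, max_dbm: float) -> bool:
    """True when the received power lies inside the given range."""
    return min_dbm <= rx_dbm <= max_dbm


if __name__ == "__main__":
    # Hypothetical limits and reading, not taken from a real datasheet.
    rx_min_dbm, rx_max_dbm = -12.0, 0.0
    reading_uw = 40.0
    reading_dbm = uw_to_dbm(reading_uw)
    print(f"{reading_uw} uW = {reading_dbm:.1f} dBm,",
          "in spec" if rx_in_spec(reading_dbm, rx_min_dbm, rx_max_dbm)
          else "out of spec: start cleaning")
```

Because the scale is logarithmic, every 3 dB of extra loss roughly halves the power that reaches the receiver, so a couple of dirty connectors eat into the range much faster than the small numbers suggest.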