Fill Words. What are those, what do they do and why are they needed?

There has been quite some confusion around the use of fill words since the adoption of the 8G Fibre Channel standard. Some admins have reported problems connecting devices at this speed, as well as numerous headaches with long-distance replication, especially when DWDM/CWDM equipment is involved.

An ordered set is a transmission word used to perform control and signaling functions. There are 3 types of ordered sets defined:

1. Frame delimiters. These identify the start and end of frames.
2. Primitive signals. These are normally used to indicate events or actions (like IDLE).
3. Primitive sequences. These are used to indicate state or condition changes and are normally transmitted continuously until something causes the current state to change. Examples are NOS, OLS, LR and LRR.

So what is a fill word? A fill word is a primitive signal which is needed to maintain bit and word synchronization between two adjacent ports. It doesn’t matter what port type (F-port, E-port, N-port etc.) it is. Fill words are not data frames in the sense that they transport user data; instead they communicate status between these two ports. If no user data is transmitted, the ports send so-called IDLE primitives. These are just words with a known bit pattern which allow the ports to keep their synchronization on a bit level as well as a word level. Like any ordered set, IDLE is a transmission word made up of four 10-bit transmission characters on the wire: it starts with K28.5 (the 8b/10b notation for a special control character) followed by three data characters, and for IDLE the last 20 bits are 1010101010 1010101010. Depending on the content of these transmission characters, the ordered set is either a fill word or a non-fill word.

Examples of fill words are IDLE, ARB(F0) and ARB(FF). Examples of non-fill words are R_RDY, VC_RDY etc.
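
To make the notation a bit more tangible, here is a minimal Python sketch that writes out the four transmission characters of a few ordered sets in Kxx.y/Dxx.y notation and tells whether each one is a fill word. The byte values are the unencoded (pre-8b/10b) values from the FC framing standard to the best of my knowledge. The table and function names are made up for this illustration, and VC_RDY is left out because its last two characters depend on the virtual channel.

```python
# Unencoded byte values of the four transmission characters of each ordered set.
# Every ordered set listed here starts with the special character K28.5 (0xBC).
ORDERED_SETS = {
    "IDLE":    (0xBC, 0x95, 0xB5, 0xB5),
    "ARB(F0)": (0xBC, 0x94, 0xF0, 0xF0),
    "ARB(FF)": (0xBC, 0x94, 0xFF, 0xFF),
    "R_RDY":   (0xBC, 0x95, 0x4A, 0x4A),
}

# Classification as given in the text above.
FILL_WORDS = {"IDLE", "ARB(F0)", "ARB(FF)"}


def char_name(byte, is_control=False):
    """Return the 8b/10b name of a byte: low 5 bits, a dot, then the high 3 bits."""
    prefix = "K" if is_control else "D"
    return f"{prefix}{byte & 0x1F}.{byte >> 5}"


def describe(name):
    """Spell out an ordered set and say whether it is a fill word."""
    chars = ORDERED_SETS[name]
    names = [char_name(chars[0], is_control=True)]
    names += [char_name(b) for b in chars[1:]]
    kind = "fill word" if name in FILL_WORDS else "non-fill word"
    return f"{name}: {' '.join(names)} ({kind})"


if __name__ == "__main__":
    for ordered_set in ORDERED_SETS:
        print(describe(ordered_set))
```

The D21.5 character at the end of IDLE is the one that encodes to the alternating 1010101010 pattern on the wire.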

So what happened recently with the introduction of the 8G standard?

In the 1, 2 and 4G standards the IDLE primitive signal was used to keep bit and word synchronization. This bit pattern was OK at those speeds, however it has been observed that with an increased clock speed this pattern causes high emissions, which in turn can cause problems on adjacent ports and links. In order to reduce that, the standard now requires links running at 8G to use the ARB(FF) fill word. This is a different bit pattern which doesn’t have this high-emission characteristic.

You might wonder what this has to do with your connection problem. If links negotiate at 8G speed, both sides have to use the ARB(FF) fill word. If that doesn’t happen for some reason, the ports cannot maintain word synchronisation and therefore cannot bring the port into the active state. This leaves both ports in a sort of deadlock, and although you may see a green status light on your HBA and switch port, the link is still not able to transfer data.

The standard defines that ports that connect at 8G speed first have to initialize with IDLE fill words, and as soon as the port changes to the active state it should change the fill word to ARB(FF).
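
The rule described here is simple enough to write down as a few lines of Python. This is only a sketch of the behaviour as explained in this post, not code from any switch or HBA firmware; the function names and the way speed and state are passed in are my own assumptions.

```python
def expected_fill_word(speed_gbps, link_active):
    """Fill word a port should send, following the rule described in this post."""
    if speed_gbps in (1, 2, 4):
        return "IDLE"                      # older speeds always use IDLE
    if speed_gbps == 8:
        # 8G: initialise with IDLE, switch to ARB(FF) once the port is active
        return "ARB(FF)" if link_active else "IDLE"
    raise ValueError("this sketch only covers 1, 2, 4 and 8G links")


def link_usable(speed_gbps, local_active_fill, remote_active_fill):
    """True if both ends send the fill word expected in the active state."""
    wanted = expected_fill_word(speed_gbps, link_active=True)
    return local_active_fill == wanted and remote_active_fill == wanted


if __name__ == "__main__":
    print(expected_fill_word(8, link_active=False))
    print(expected_fill_word(8, link_active=True))
    # One side stuck on IDLE at 8G: the deadlock situation described above
    print(link_usable(8, "ARB(FF)", "IDLE"))
```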

It becomes even more complicated with DWDM and CWDM equipment particularly when multiplexers are used. These TDM devices normally crack open the fibre-channel link on a frame boundary level and then are able to multiplex this on a higher clock-rate so they are able to send data from multiple links into one wavelength. If however these TDM devices cannot open the fibre-channel link because they only look for IDLE fillwords then the end-to-end link will fail.

If you use TDM devices, verify with your manufacturer whether they support ARB(FF) fill words. If not, then you may have to force the link speed to a lower level like 4G.

The importance of clean fibre optics

I attended Cisco Live this week in Melbourne, since it was very close to home and Cisco was kind enough to provide me with an entry ticket. (Many thanks for this.)

While strolling around the expo floor I ran into the nice people from Fluke Networks who were showing their testing equipment and of course I was very interested in the optical side of the fence. (I haven’t seen wireless storage networks yet so I’ll save that part of their impressive toolkit for later. :-)).

Since troubleshooting is my day-to-day job, I see many issues which have characteristics of a physical nature. This can be a bad cable, patch panel, SFP or anything of that nature.

Just when I wanted to start this blog post I saw that my Melbournian buddy Anthony Vandewerdt had just beaten me to it and written the article “Semmelweiss could see the problem”, in which he describes the problem of unclean cables and where it might lead. (Read this first and then come back here.)

In order to complement that article I’ll try to explain why this is so important.

I’m pretty sure that everyone these days knows that computers work with bits which are either a 1 or a 0. To be able to communicate with other computers (or devices in general) we transmit those bits as an on or off signal, whether this is an electrical current or an optical wave. Electrical signals have the nasty habit that part of the energy is stored in the capacitance of the cable, so it takes a certain time before the voltage has dropped to zero. You can see this very well if you use a laptop charger with a small LED. When you unplug it from the wall socket it takes a couple of seconds before the charge is completely gone from the capacitors in the transformer. This is also one of the primary reasons FC uses an 8b/10b encoding/decoding scheme: it keeps a balanced DC value.
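
What “balanced DC value” means can be shown by simply counting ones and zeros in the 10-bit characters. The small Python sketch below does that for the two encodings of K28.5 and for D21.5. The bit strings are the commonly published 8b/10b code groups; the constant and function names are my own, and this is an illustration of the idea rather than a full encoder.

```python
# 10-bit code groups as they appear on the wire (bit order a b c d e i f g h j).
K28_5_RD_NEG = "0011111010"   # sent when the running disparity is negative
K28_5_RD_POS = "1100000101"   # sent when the running disparity is positive
D21_5 = "1010101010"          # perfectly balanced: five ones, five zeros


def disparity(bits):
    """Number of ones minus number of zeros in one transmission character."""
    return bits.count("1") - bits.count("0")


def total_disparity(characters):
    """Sum of the disparities of a sequence of transmission characters."""
    return sum(disparity(c) for c in characters)


if __name__ == "__main__":
    print(disparity(K28_5_RD_NEG))
    print(disparity(K28_5_RD_POS))
    # An excess of ones in one character is cancelled by an excess of zeros later
    print(total_disparity([K28_5_RD_NEG, D21_5, K28_5_RD_POS]))
```

Because the encoder always picks the variant that pulls the running disparity back towards zero, the signal never carries a long-term surplus of ones or zeros.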

The optical-to-electrical converters have the same issue, albeit not in the cable itself but in the physical characteristics of the circuitry. There is a certain fall-off and ramp-up time before the current becomes completely zero or completely one respectively. This is very important since it determines the moment at which a receiver should decide whether the incoming bit is to be seen as a 1 or a 0.

The optics people and companies represented in IEEE and T11-2 do write up the official metrics so this is all being done for you. There is nothing on a switch, array or other network equipment where you can tune this.

The characteristics of a signal can be measured with an oscilloscope. The result you see looks like this:

 
The blue lines show the voltage on the oscilloscope and together they form the so-called eye pattern. The hexagon in the middle is determined by the folks of IEEE and T11-2 and can be loaded as a software feature for ease of use on most equipment. (Note: be aware that this differs per technology and optical characteristic like FC, Ethernet, DWDM etc.)
 
The above picture shows a perfect eye pattern, since it shows that the ramp-up (from the bottom blue line to the top) completes well before the “decision point” for becoming a 1 and the fall-off starts well after the decision point for becoming a 0.
 
“So what does this have to do with my fibre-cable” you may ask.
When connectors are not clean, the light may be reflected back into the cable, causing jitter. It is this jitter that can significantly close the eye pattern, to a point where the receiver can no longer determine whether the incoming light should be read as a 1 or a 0. The picture below shows that this comes pretty close.
 
 
 
By default the receiver will keep the same value it had on the previous clock cycle. This means that a 1 remains a 1 even though it actually should have been a 0, and vice versa. The result is that the bitstream from the receiver buffer into the SerDes chip is incorrect, thereby causing a decoding error. For FC it means that the er_enc_out or er_enc_in value in the LESB (Link Error Status Block) is incremented by one (depending on whether the 10-bit transmission character was part of an FC frame or not). On a Brocade switch this is shown in the enc_in or enc_out column of the porterrshow output.
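
The “keep the previous value” behaviour can be mimicked with a toy model in Python. Everything here is an assumption made for illustration: the signal level is normalised between 0.0 and 1.0, the two thresholds stand in for the edges of the eye, and real receivers are of course analogue circuits, not a loop over floats.

```python
def decide_bits(samples, low=0.3, high=0.7, initial=0):
    """Turn sampled signal levels into bits.

    A sample at or above `high` is a 1, at or below `low` is a 0.
    Anything in between is undecidable, so the previous bit is kept.
    """
    bits = []
    previous = initial
    for level in samples:
        if level >= high:
            previous = 1
        elif level <= low:
            previous = 0
        # else: the eye is closed at this point, keep the previous value
        bits.append(previous)
    return bits


def count_bit_errors(sent, received):
    """Number of positions where the received bit differs from the sent bit."""
    return sum(1 for s, r in zip(sent, received) if s != r)


if __name__ == "__main__":
    sent = [1, 0, 1, 0, 0]
    # The fourth sample should have dropped to a clear 0 but only reached 0.5
    samples = [0.9, 0.1, 0.9, 0.5, 0.1]
    received = decide_bits(samples)
    print(received, count_bit_errors(sent, received))
```

In this example the fourth bit is read as a 1 instead of a 0, which is exactly the kind of single-bit error that ends up as a decoding error.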
 
If this happens on a bit which was part of an otherwise valid FC frame, the frame now contains an invalid byte. If we did not have a fall-back mechanism, this would lead to an invalid byte being sent to the operating system and application, causing corruption and even system failures. Since we also do a CRC check on the entire frame, the destination port will discard it entirely and the upper-layer SCSI stack (or whatever protocol resides on the FC4 layer) retries the IO.
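
The effect of the CRC check is easy to demonstrate in Python. The sketch below uses zlib.crc32 as a stand-in; the real FC frame layout and the exact bit ordering of the FC CRC are not modelled, and the helper names are invented for this example.

```python
import zlib


def build_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (simplified frame, no FC header)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def receive_frame(frame: bytes):
    """Return the payload if the CRC matches, or None if the frame is discarded."""
    payload = frame[:-4]
    received_crc = int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) != received_crc:
        return None   # discarded; the upper layer has to retry the IO
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single mis-read bit somewhere in the frame."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    frame = build_frame(b"some SCSI data")
    print(receive_frame(frame))
    print(receive_frame(flip_bit(frame, 10)))
```

A clean frame comes back with its payload, while the frame with one flipped bit is dropped before it can reach the operating system.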
 
The problem is that with distance you get loss of power (remember that optical power loss is measured in dB). The attenuation of the cable itself is a fixed value that depends on the type of cable (OM1, 2, 3 or 4). Every connection or splice (two optical cables welded together) adds to the link loss and decreases the optical power received on the other side of the link. The problem with dirty connections is that they significantly decrease the optical power, to the point where the power arriving at the receiver falls outside the specification of that particular SFP. This can cause loss of the link and port flapping, causing all sorts of other nasty issues.
 
The link loss budget can be calculated based on the launch power of the transmitter, the number of connectors and splices in the cable plant, plus the margin on the receiver side. If the received power falls below the receiver sensitivity mark, the receiver will drop the link and the ports will go offline.
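
As a worked example, the calculation can be written as a few lines of Python. All numbers used below (launch power, loss per kilometre, loss per connector and splice, receiver sensitivity) are made-up illustration values and not taken from any datasheet; always use the figures from your own SFP and cable specifications.

```python
def received_power_dbm(tx_power_dbm, fibre_km, fibre_loss_db_per_km,
                       connectors, connector_loss_db,
                       splices, splice_loss_db):
    """Launch power minus all losses in the cable plant, in dBm."""
    total_loss_db = (fibre_km * fibre_loss_db_per_km
                     + connectors * connector_loss_db
                     + splices * splice_loss_db)
    return tx_power_dbm - total_loss_db


def link_margin_db(rx_power_dbm, rx_sensitivity_dbm):
    """Positive margin: the link should stay up. Negative: the receiver drops it."""
    return rx_power_dbm - rx_sensitivity_dbm


if __name__ == "__main__":
    # Clean plant: 300 m of fibre, 4 connectors at 0.5 dB each, no splices
    clean = received_power_dbm(-4.0, 0.3, 3.0, 4, 0.5, 0, 0.1)
    # Same plant, but dirty connectors now lose 2.0 dB each
    dirty = received_power_dbm(-4.0, 0.3, 3.0, 4, 2.0, 0, 0.1)
    for label, rx in (("clean", clean), ("dirty", dirty)):
        print(label, round(rx, 2), round(link_margin_db(rx, -11.0), 2))
```

With these assumed values the clean link keeps a few dB of margin, while the dirty connectors alone are enough to push the received power below the sensitivity mark.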
 
 
 
 
On a Brocade switch you can see the transmitter and receiver value with the “sfpshow” command:
 
 
The specifications of the SFP determine what the transmit and receive power should be. If the actual RX power values fall outside the specification of the SFP, you should start to look at your cables and connectors and start cleaning them. If this doesn’t help, there might be another problem, like a crack in the cable or a broken laser in the SFP. In this case replace the cable and/or the SFP.
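
If your tooling reports the power in microwatts instead of dBm, the conversion and the check against the specification look like this in Python. The minimum and maximum values in the example are placeholders I picked for illustration; the real limits come from the datasheet of your SFP.

```python
import math


def microwatts_to_dbm(power_uw):
    """Convert optical power from microwatts to dBm (0 dBm equals 1 mW)."""
    return 10 * math.log10(power_uw / 1000.0)


def rx_power_ok(rx_dbm, min_dbm, max_dbm):
    """True if the received power lies within the SFP's specified range."""
    return min_dbm <= rx_dbm <= max_dbm


if __name__ == "__main__":
    # Hypothetical receiver range of -11.0 dBm to 0.0 dBm
    for power_uw in (500.0, 20.0):
        dbm = microwatts_to_dbm(power_uw)
        print(power_uw, round(dbm, 1), rx_power_ok(dbm, -11.0, 0.0))
```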
 
I hope this helps to explain why you might see strange things in your Fibre Channel network if the connectors are not clean, and why your support organisation keeps stressing that you fix and maintain your cable plant. I did mention I work in support, and I see many connectivity issues resulting in flapping ports, overall performance issues and even data loss or corruption.
 
If you want to know the characteristics of optical cables or SFPs, I suggest you have a look at the JDSU, Finisar or Avago websites. Also check out the FOA YouTube channel, which has some nice videos that explain in detail the ins and outs of fibre optics.
Regards,
Erwin