In the storage world there have mainly been two environments: FCP and SBCCS, which stand for Fibre Channel Protocol and Single Byte Command Code Sets.
For those of you less immersed in the FC protocol stack, these basically carry SCSI for open systems and FICON for mainframe, respectively.
These two environments have traditionally taken very different approaches to storage. SCSI is by design a protocol in which initiators “discover” targets and logical units. FICON is the opposite: it is deterministic in the sense that initiators (or Channel Adapters, CHAs, in mainframe terminology) are statically configured to connect to LDEVs (Logical Devices). This includes the end-to-end path with FC domain IDs and ports.
This approach has its benefits but also drawbacks. The biggest benefit is that connectivity schemas and traffic flow can be configured and monitored at a very granular level. This allows you to adjust configurations in the operating system (via HCD, the Hardware Configuration Definition) without having to touch the fabrics or storage arrays.
SCSI is quite the opposite. The SCSI stack in the operating system simply “looks” around and discovers targets plus LUNs. It will take ownership of each LUN it sees without scruples (cluster environments aside).
By configuring zones in a fabric and masks on a storage array you can control which HBA sees which LUN and build your storage environment that way. It also gives you the flexibility that IO operations can be switched per command, or in FC terms per “exchange”. If there are multiple inter-switch links (ISLs) between two switches, it is up to the switch to decide which physical link will carry the series of frames belonging to that exchange. This is called “Exchange Based Routing” or EBR.
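To picture how zoning and LUN masking together decide what an HBA can discover, here is a minimal sketch. The zone names, port names and LUN numbers are hypothetical, and real fabrics and arrays hold this information in their own configuration databases rather than in Python dictionaries.

```python
# Hypothetical zone names, port names and LUN numbers, for illustration only.
ZONES = {
    "zone_a": {"hba1", "array_port1"},
    "zone_b": {"hba2", "array_port1"},
}

# LUN masking on the array: target port -> initiator -> LUNs presented to it.
LUN_MASKS = {
    "array_port1": {"hba1": {0, 1}, "hba2": {2}},
}


def visible_luns(hba, zones, masks):
    """Return {target_port: set_of_luns} that the given HBA can discover."""
    result = {}
    for members in zones.values():
        if hba not in members:
            continue  # the fabric hides everything outside the HBA's zones
        for port in members - {hba}:
            # The array only presents LUNs that are masked to this initiator.
            luns = masks.get(port, {}).get(hba, set())
            if luns:
                result.setdefault(port, set()).update(luns)
    return result


print(visible_luns("hba1", ZONES, LUN_MASKS))
print(visible_luns("hba2", ZONES, LUN_MASKS))
```

An HBA that is in no zone at all simply gets an empty result, which is the "discovery finds nothing" case.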
Because in FICON environments you have to configure each CHA, domain and port ID on the ingress side as well as the destination domain ID, the switch is unable to make routing decisions based on workload or other characteristics. This is called “Port-Based Routing”. The switch assigns an ISL egress port on a round-robin basis and the route remains static thereafter. The result is that ISL traffic can become noticeably imbalanced.
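The round-robin-then-static behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not how any switch firmware is actually written; the class name, the ISL labels and the choice of (ingress port, destination domain) as the route key are assumptions.

```python
from itertools import cycle


class PortBasedRouter:
    """Toy model: pick an ISL round-robin once, then keep it forever."""

    def __init__(self, isls):
        self._next_isl = cycle(isls)  # round-robin over the available ISLs
        self._routes = {}             # (ingress port, dest domain) -> ISL

    def route(self, ingress_port, dest_domain):
        key = (ingress_port, dest_domain)
        if key not in self._routes:
            # First time this ingress port talks to that domain: assign an ISL.
            self._routes[key] = next(self._next_isl)
        # Every later frame follows the same ISL, regardless of load.
        return self._routes[key]


router = PortBasedRouter(["isl0", "isl1"])
print(router.route(1, 20))  # isl0
print(router.route(2, 20))  # isl1
print(router.route(1, 20))  # isl0 again, the route is static
```

If the device behind ingress port 1 happens to be the busy one, isl0 carries nearly all the traffic while isl1 sits idle, which is exactly the imbalance mentioned above.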
Looking at an open-systems environment where exchange-based routing is active, the switch checks the destination ID, source ID and originator exchange ID (OXID) in the FC frame header. Based on these three fields an IO is sent over a particular ISL. The next IO from that HBA to the same target port has a different OXID and thus can traverse a different ISL.
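As a rough sketch of that selection, the three header fields can be combined into an index over the available ISLs. The XOR-and-modulo below is an assumption standing in for whatever hash the switch vendor really uses, and the addresses are made up.

```python
def select_isl_ebr(sid, did, oxid, isls):
    """Pick an ISL from source ID, destination ID and originator exchange ID.

    Assumption: a plain XOR plus modulo stands in for the vendor's real hash.
    """
    return isls[(sid ^ did ^ oxid) % len(isls)]


isls = ["isl0", "isl1"]
sid, did = 0x010200, 0x020300  # hypothetical 24-bit FC addresses

first = select_isl_ebr(sid, did, 0x0001, isls)   # "isl1"
second = select_isl_ebr(sid, did, 0x0002, isls)  # "isl0"
print(first, second)
```

All frames of one exchange share the same three values, so they stay on one ISL, while the next exchange between the same two ports is free to land on another.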
A while ago Brocade came up with a compromise between these two distinctly different mechanisms: “Device Based Routing”. This mechanism uses a feature called DLS (Dynamic Load Sharing) to determine the egress ISL on a switch, which allows the switch to balance traffic flows more dynamically. From a switching perspective it does not use the OXID, so as soon as the routes are set they stay in place unless something changes.
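To contrast this with exchange-based routing, here is a minimal sketch in which the ISL choice depends on the device pair only. Keying on the source and destination ID, and the XOR-and-modulo used to do it, are assumptions for illustration and not Brocade's actual algorithm; the addresses are made up.

```python
def select_isl_dbr(sid, did, oxid, isls):
    """Pick an ISL per device pair; the OXID is deliberately ignored.

    Assumption: XOR plus modulo stands in for the real selection logic.
    """
    return isls[(sid ^ did) % len(isls)]


isls = ["isl0", "isl1"]

# Every exchange between the same two devices lands on the same ISL.
same_pair = {select_isl_dbr(0x010200, 0x020300, oxid, isls) for oxid in range(1, 100)}
print(same_pair)  # {'isl0'}

# A different source device may be placed on another ISL.
print(select_isl_dbr(0x010201, 0x020300, 0x0001, isls))  # isl1
```

The spread is per device pair rather than per exchange, so one very busy pair can still saturate a single ISL.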
Both port-based routing and device-based routing could still result in relatively unbalanced ISL utilization, up to the point where a CHA or control unit may choke because a single ISL is overflowing. When the remaining ISLs are simply picking their noses (figure of speech...), you can imagine this is an unwanted situation.
The IBM mainframe guys acknowledged this problem a while ago but were more or less bound by interoperability and backwards-compatibility straitjackets, and were not able to switch to the EBR policy until January last year (2015), when they introduced the z13 with FICON Express16S channel adapters. The EBR routing functionality that IBM z/OS now utilizes is called FIDR (FICON Dynamic Routing).
Although it greatly improves the utilization of existing real estate, by definition it is also subject to issues that until now were only observed in open-systems SANs. Problems like slow-drain devices and ISL errors, which could previously be determined relatively easily from an OS perspective, now depend more on the diagnostic functions and features in the fabrics.
Nevertheless, I think moving to a single routing mechanism for both open systems and FICON is a great move forward and will result in more consolidated environments and better efficiency.
For more information, IBM released a whitepaper on FIDR written by PJ Catalano and good friend Dr Steve Guendert. The paper can be downloaded here.
Another very good (Brocade-branded) whitepaper on FICON can be downloaded here. It was written by three FICON / mainframe gurus: David Lytle, Fred Smit and, as above, Dr Steve Guendert.