Preventing Information Overload in SAN Management

The stance of wanting to be notified of every event in a SAN environment, at all costs, may need to be reviewed. As many of you know, the Broadcom and Cisco switches let you customise various thresholds in their respective FOS and NX-OS operating systems.

Brocade uses the MAPS framework, which follows a policy-based setup.

Read more »

Brocade, Uncategorized

FPIN – The Holy Grail of SAN Stability

As some of you may recall, about a decade ago I made a proposal to incorporate more intelligence into the end-devices so that they could react better to changing conditions in fabrics. I called it the “Error Reporting with Integrated Notification” framework (mind the acronym here. :-))

Basically the intention was to have end-devices check for errors along the paths their frames traverse by sending a “query-frame” to the remote device. Each hop along the way could then add its values (errors, counters) to that frame. The remote device would, upon reception of that frame, also add its counters, reverse the SID (Source ID) and DID (Destination ID) and send that same frame back to the original sender. That sender would then be able to decide whether to use that same path for subsequent frames, hold off using it temporarily, or not use it at all. Read on.

Read more »

Fibre Channel

Interesting Flow Vision bug

OK, truth be told, it is not a Flow Vision bug, but in FOS 9.0.0 it is flagged as such under defect 653188.

As you know, Flow Vision can be configured to monitor certain traffic flows. SCSI commands such as read, write, xfer-rdy, status etc. can be viewed with the “flow --show” command.

Read more »

Brocade Technical, Fibre Channel

Using systemd-resolved to optimise DNS resolution.

When you work from home and are required to use the corporate network, you’re often shoved into a dilemma: the VPN configuration that is pushed to your PC results in one of two modes, full-tunnel or split-tunnel.

Digging tunnels

A full-tunnel configuration is by far the most dreadful, especially when your VPN access-point is on the other side of the planet. Basically all traffic to and from your system is pushed through that tunnel. This is even the case when a web-page is hosted next door to where you are sitting. Your requests to that webserver will first traverse your VPN connection to the other side of the planet, where your company’s proxies will retrieve the page via the public web, only to send it back to you via that VPN again. Obviously the round-trip and other delays result in abominable performance and a user-experience that is excruciatingly painful.

A split-tunnel, however, is far more friendly. As I explained in one of my previous articles (here), only traffic destined for systems inside your corporate network will be routed over the VPN and requests to other systems will just traverse the public interweb.

Domain Name resolution

There is however one exception: DNS, i.e. the name to (IP) number translation. Traditionally Linux uses a system-wide resolver that looks in “/etc/resolv.conf” to find out what your DNS servers are and which domains to search, plus a few other options. That basically means that as soon as you have any VPN tunnel active you would always need to use your corporate DNS servers for every request, as your system does not really know which server is located where. There may even be a situation where your corporate DNS servers point to a different host for the same domain. You often see this where employees get more functionality than external users, or where credential verification may be bypassed as you already have an authorised session to the internal systems.

The drawback, however, is that sites outside your corporate network are also resolved via your company’s DNS servers. This may not only limit performance from a resolver standpoint (remember that these DNS requests also have to traverse the same VPN tunnel), but the system you end up on may also not be the most appropriate one.

As an example.

If you have an active VPN to your company, even DNS queries for a web-site in your own country will first go to “Corp DNS” which, if it does not already have a cached address itself, will forward that request to whatever “Corp DNS” has configured as its upstream DNS server (in this case Google). You could have asked Google’s DNS servers yourself, but as your VPN session has set your resolver to use the Corp DNS, that does not happen. An additional point to be aware of is that, no matter which website you visit, your company will have a record of it, as most corporate regulations stipulate that actions done on their systems will be logged for whatever purpose they deem necessary. This may sometimes conflict with privacy policies in different countries, but that is most often swept under the carpet and hidden in legal obscurity.

The above also means that when you have requests for sites that span geographies, you may not always get to the most optimal system. Many DNS systems are able to determine where the request is coming from and subsequently provide an IP address of a system that is closest to the requestor. As your request is fulfilled by your company’s DNS server on the other side of the planet, that web-server may also be there. No need to panic, as many of these environments have built-in smarts to re-direct you to a more local system, but it nevertheless means this situation is far from optimal. What you’re basically after is the ability, in addition to that split-tunnel configuration, to direct DNS queries to the DNS servers which actually host the domains behind that VPN, and nothing else.

In the above case your Linux system has two interfaces: one physical (WIFI or Ethernet) and one virtual (the VPN, most often called tunX where X is the VPN interface number).

Meet systemd-resolved

There are some Linux (or Unix) purists who shudder at the sight of systemd based services but I think most of them are actually pretty OK. Resolved is one of them.

What resolved allows you to do is assign specific DNS configurations to different interfaces in addition to generic global options.

As an example

LLMNR setting: yes
MulticastDNS setting: yes
DNSOverTLS setting: no
DNSSEC setting: allow-downgrade
DNSSEC supported: no
Fallback DNS Servers:

Link 22 (tun0)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server:
DNS Servers:
DNS Domain:

Link 3 (wlp0s20f0u13)
Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: yes
DNSSEC supported: yes
Current DNS Server:
DNS Servers:
DNS Domain: ~.

As you can see it has three sections. The global section caters for many default settings which can be superseded by per-interface settings. I think the overview speaks for itself. All requests to domain “” and “” will be sent to one of the two DNS servers with the 10.15.230.[6-7] address. All my home internal requests as defined by the “” domain are sent to the address. The “~.” means all other requests.

That will result in queries being returned like:

[1729][[email protected]:~]$ resolvectl query 10.xx.16.9 -- link: tun0
10.xx.16.8 -- link: tun0
172.xx.24.164 -- link: tun0
172.xx.24.162 -- link: tun0
10.xx.100.4 -- link: tun0
10.xx.148.66 -- link: tun0
10.xx.7.221 -- link: tun0
10.xx.7.34 -- link: tun0
10.xx.7.33 -- link: tun0
10.xx.100.5 -- link: tun0

-- Information acquired via protocol DNS in 243.1ms.
-- Data is authenticated: no

If I used an external DNS system for that domain, it would return different addresses.

[1733][[email protected]:~]$ dig @ +short

(The above are not the real domains I queried, but I think you get the drift.)

Queries to non-corporate websites will be retrieved via the WIFI interface (wlp0s20f0u13)

[1733][[email protected]:~]$ resolvectl query 2404:6800:4006:809::200e -- link: wlp0s20f0u13 -- link: wlp0s20f0u13

-- Information acquired via protocol DNS in 121.0ms.
-- Data is authenticated: no
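The routing decision behind the two queries above can be modelled roughly as a longest-suffix match on the per-link routing domains. The sketch below is a simplified model and not the actual resolved code; the domain names and the `pick_link` helper are made up for illustration, only the interface names come from the output above.

```python
def pick_link(name, link_domains):
    """Return the link whose routing domain is the longest suffix match for name.

    link_domains maps an interface name to its list of routing domains,
    written the way resolvectl shows them ("~corp.example.com", "~.").
    """
    labels = name.rstrip(".").lower().split(".")
    best_link, best_len = None, -1
    for link, domains in link_domains.items():
        for domain in domains:
            stripped = domain.lstrip("~").strip(".").lower()
            # "~." becomes the empty suffix, which matches every name
            d_labels = stripped.split(".") if stripped else []
            n = len(d_labels)
            if n <= len(labels) and labels[len(labels) - n:] == d_labels:
                if n > best_len:
                    best_link, best_len = link, n
    return best_link


# Hypothetical routing domains; the real ones are not shown in this post.
links = {
    "tun0": ["~corp.example.com", "~intranet.example.net"],
    "wlp0s20f0u13": ["~home.example", "~."],
}

print(pick_link("wiki.corp.example.com", links))  # tun0
print(pick_link("www.example.org", links))        # wlp0s20f0u13
```

Anything that matches a corporate domain goes to the tunnel interface, and the catch-all “~.” on the WIFI interface picks up the rest.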

As my home router has a somewhat more sophisticated setup this also allows me to have all external DNS requests, not destined to or, use a DNSoverHTTPS or DNSoverTLS configuration to bypass any ISP mangling.


Systemd-resolved is a systemd service (duhh) which needs to be enabled first with “systemctl enable systemd-resolved” (add “--now” to start it immediately as well). The main configuration file is /etc/systemd/resolved.conf; individual drop-in configuration files can be stored in the resolved.conf.d subdirectory next to it.

-rw-r--r-- 1 root root 784 Oct 20 14:32 resolved.conf
drwxr-xr-x 2 root root 4096 Oct 20 14:24 resolved.conf.d/

The settings can also be applied interactively via the “resolvectl” command, which is what I have done. If your distro has NetworkManager installed then NM can also automatically configure resolved via D-Bus calls.

There is more involved than I can easily simplify here, as it would pretty quickly become a re-wording of the man-page, which I try to avoid. At least I hope it has given you some idea of what you can do with “systemd-resolved”.

Kind regards,


Linux

Enabling Verbose Logging on Linux with Emulex Host Bus Adapters

Where did my disks go?

Every now and then you may run into an issue which cannot be explained properly by just looking at the standard events that show up in “/var/log/messages”.

Issues such as

Oct 7 18:24:20 centos8 kernel: lpfc 0000:81:00.0: 0:1305 Link Down Event xc received Data: xc x20 x800110 x0 x0
Oct 7 18:24:24 centos8 kernel: rport-11:0-4: blocked FC remote port time out: removing target and saving binding
Oct 7 18:24:24 centos8 kernel: lpfc 0000:81:00.0: 0:(0):0203 Devloss timeout on WWPN 50:06:0e:80:07:c3:70:00 NPort x01ee40 Data: x0 x8 x2

are fairly common and the above simply shows a Link Down event. These are the easiest to troubleshoot when your remote switch log tells you

18:26:59.565715 SCN Port Offline;rsn=0x10004,g=0x12 A2,P0 A2,P0 93 NA 
18:26:59.565721 *Removing all nodes from port A2,P0 A2,P0 93 NA 
18:28:07.998318 SCN LR_PORT(0);g=0x12 A2,P0 A2,P0 93 NA 
18:28:08.006029 SCN Port Online; g=0x12,isolated=0 A2,P0 A2,P1 93 NA 
18:28:08.007307 Port Elp engaged A2,P1 A2,P0 93 NA 
18:28:08.007331 *Removing all nodes from port A2,P0 A2,P0 93 NA 
18:28:08.007594 SCN Port F_PORT A2,P1 A2,P0 93 NA 
18:28:08.099107 SCN LR_PORT(0);g=0x12 LR_IN A2,P0 A2,P0 93 NA 
18:28:20.669283 SCN Port Offline;rsn=0x10004,g=0x14 A2,P0 A2,P0 93 NA 
18:28:20.669288 *Removing all nodes from port A2,P0 A2,P0 93 NA

as a result of

Wed Oct 7 18:28:07 2020 admin, FID 43,, portenable 4/29
Wed Oct 7 18:28:20 2020 admin, FID 43,, portdisable 4/29

Diagnostics become more problematic when all you have are events that show the links bouncing but give no further information. Obtaining extended information from the HBA drivers may then be very helpful.

Update Drivers and Firmware

As you know I’m very picky when it comes to maintenance. If I see cases where system and/or storage administrators have basically been slacking for a long time, the chances are very high that I will tell you so and only commence diagnosing issues once these things are all up to date. You wouldn’t believe the sheer number of issues that get resolved in firmware and drivers over any given time-period.

That being said, let’s go to the Linux side of the Emulex (or Broadcom) drivers for the LPe31000/LPe32000 cards, which are very popular in many form-factors.

The driver shows up as the lpfc module and is, by default, built into the initramfs image when installed. This allows the card to be used in a boot-from-SAN setup if needed. The module loads as shown here and registers with the SCSI subsystem:

lpfc 978944 81
nvmet_fc 32768 1 lpfc
nvme_fc 45056 1 lpfc
scsi_transport_fc 69632 1 lpfc

With the most recent versions of the driver it will also provide an NVMe-oF initiator and target so that NVMe equipment can be utilized when attached to an FC fabric.

Logging with an Emulex card can be done at the driver level as well as in the HBA firmware. Unless you are instructed to do so, leave the firmware logging as is. Changing these parameters requires a reload of the driver, which then instructs the firmware logging facility to capture data in a host memory region. Interpreting that data involves engineering effort anyway, so it will not be very helpful to yourself or your OEM support organisation unless the case needs escalating to Emulex.

Changing the logging verbosity of the driver itself is much easier but may also incur some performance impact, so don’t just flick on the “0xFFFFFFFF debug” button. The driver logging facility is a bitmap value based on the table below:

LOG Message Verbose Mask Definition   Verbose Bit   Verbose Description
LOG_ELS                               0x00000001    ELS events
LOG_DISCOVERY                         0x00000002    Link discovery events
LOG_MBOX                              0x00000004    Mailbox events
LOG_INIT                              0x00000008    Initialization events
LOG_LINK_EVENT                        0x00000010    Link events
LOG_IP                                0x00000020    IP traffic history
LOG_FCP                               0x00000040    FCP traffic history
LOG_NODE                              0x00000080    Node table events
LOG_TEMP                              0x00000100    Temperature sensor events
LOG_BG                                0x00000200    BlockGuard events
LOG_MISC                              0x00000400    Miscellaneous events
LOG_SLI                               0x00000800    SLI events
LOG_FCP_ERROR                         0x00001000    Log errors, not underruns
LOG_LIBDFC                            0x00002000    Libdfc events
LOG_VPORT                             0x00004000    NPIV events
LOG_SECURITY                          0x00008000    Security events
LOG_EVENT                             0x00010000    CT,TEMP,DUMP, logging
LOG_FIP                               0x00020000    FIP events
LOG_FCP_UNDER                         0x00040000    FCP underruns errors
LOG_SCSI_CMD                          0x00080000    ALL SCSI commands
LOG_NVME                              0x00100000    NVME general events
LOG_NVME_DISC                         0x00200000    NVME discovery/connect events
LOG_NVME_ABTS                         0x00400000    NVME ABTS events
LOG_NVME_IOERR                        0x00800000    NVME I/O Error events
LOG_EDIF                              0x01000000    External DIF events
LOG_AUTH                              0x02000000    Authentication events

If you don’t know what these mean, or have no clue on how to interpret the output, it’s not much use mucking around with these. The output will only confuse you and if you don’t know what the commands and responses should be it’s only a bunch of hex values.

The values displayed above can be summed depending on which verbose logging needs to be enabled. For instance, if your OEM asks you for Link events, ELS and Initialization events, you may get asked to enable verbose logging with either “hbacmd” or via “sysfs”. The value of the parameter will then be “0x19” (0x10 + 0x01 + 0x08).

hbacmd or sysfs

If you have hbacmd installed, any change made to the logging preferences also automatically kicks off dracut and builds a new boot image. The command has a few additional parameters:

hbacmd setdriverparam 10:00:00:90:fa:c7:cd:f9 G P log-verbose 0x135661

The first three are fairly obvious: the command sets a driver parameter for the adapter with PWWN 10:xxxxxx. The G stands for Global, basically meaning it is valid for all adapters, and the P stands for Permanent, which ensures the parameter that follows is also applied after reboots. The log-verbose parameter is the setting we’re adjusting, and 0x135661 is a combination of values obtained from the table above.
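To see how such a value is put together, or to find out what a value you were handed actually enables, the table can be turned into a small lookup. This is a minimal sketch: the bit values are copied from the table above, while the `build_mask` and `decode_mask` helpers are my own names and not part of hbacmd or the lpfc driver.

```python
# Bit values copied from the verbose mask table above.
LPFC_LOG_BITS = {
    "LOG_ELS": 0x00000001,
    "LOG_DISCOVERY": 0x00000002,
    "LOG_MBOX": 0x00000004,
    "LOG_INIT": 0x00000008,
    "LOG_LINK_EVENT": 0x00000010,
    "LOG_IP": 0x00000020,
    "LOG_FCP": 0x00000040,
    "LOG_NODE": 0x00000080,
    "LOG_TEMP": 0x00000100,
    "LOG_BG": 0x00000200,
    "LOG_MISC": 0x00000400,
    "LOG_SLI": 0x00000800,
    "LOG_FCP_ERROR": 0x00001000,
    "LOG_LIBDFC": 0x00002000,
    "LOG_VPORT": 0x00004000,
    "LOG_SECURITY": 0x00008000,
    "LOG_EVENT": 0x00010000,
    "LOG_FIP": 0x00020000,
    "LOG_FCP_UNDER": 0x00040000,
    "LOG_SCSI_CMD": 0x00080000,
    "LOG_NVME": 0x00100000,
    "LOG_NVME_DISC": 0x00200000,
    "LOG_NVME_ABTS": 0x00400000,
    "LOG_NVME_IOERR": 0x00800000,
    "LOG_EDIF": 0x01000000,
    "LOG_AUTH": 0x02000000,
}


def build_mask(*names):
    """OR together the bits for the given LOG_* names."""
    mask = 0
    for name in names:
        mask |= LPFC_LOG_BITS[name]
    return mask


def decode_mask(mask):
    """List the LOG_* names that are set in mask, in table order."""
    return [name for name, bit in LPFC_LOG_BITS.items() if mask & bit]


print(hex(build_mask("LOG_LINK_EVENT", "LOG_ELS", "LOG_INIT")))  # 0x19
print(decode_mask(0x135661))
```

Decoding 0x135661 this way shows it enables ELS, IP, FCP, BlockGuard, miscellaneous, FCP error, NPIV, CT/TEMP/DUMP, FIP and NVME general events.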

The value can also be applied dynamically via sysfs in the “/sys/class/scsi_host/host<X>” directory (where <X> is the adapter ID). The lpfc driver creates its attribute files in that folder, one of which is indeed the “lpfc_log_verbose” file. The 0x<123456> value can be echoed to that file and the driver will pick it up dynamically.

[[email protected] host11]# cat lpfc_log_verbose 
[[email protected] host11]# echo 0x135661 > lpfc_log_verbose

The change is immediately logged

Oct 8 17:03:15 centos8 kernel: lpfc 0000:81:00.0: 0:(0):3053 lpfc_log_verbose changed from 0 (x0) to 1267297 (x135661)

When you change all of them with

[[email protected] scsi_host]# echo 0x135661 > host11/lpfc_log_verbose 
[[email protected] scsi_host]# echo 0x135661 > host12/lpfc_log_verbose 
[[email protected] scsi_host]# echo 0x135661 > host13/lpfc_log_verbose 
[[email protected] scsi_host]# echo 0x135661 > host14/lpfc_log_verbose

The message log will show something similar to this

Oct 8 17:28:28 centos8 kernel: lpfc 0000:81:00.0: 0:(0):3053 lpfc_log_verbose changed from -1 (xffffffff) to 1267297 (x135661)
Oct 8 17:28:50 centos8 kernel: lpfc 0000:81:00.1: 1:(0):3053 lpfc_log_verbose changed from 1267297 (x135661) to 1267297 (x135661)
Oct 8 17:28:58 centos8 kernel: lpfc 0000:83:00.0: 2:(0):3053 lpfc_log_verbose changed from 1267297 (x135661) to 1267297 (x135661)
Oct 8 17:29:04 centos8 kernel: lpfc 0000:83:00.1: 3:(0):3053 lpfc_log_verbose changed from 1267297 (x135661) to 1267297 (x135661)
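If you have more than a handful of ports, the per-host echo commands can be wrapped in a small loop. The sketch below assumes only what is described above: that every lpfc-driven host directory exposes an “lpfc_log_verbose” file. The function name is made up, and it needs to run as root against the real sysfs tree.

```python
from pathlib import Path


def set_lpfc_log_verbose(mask, base="/sys/class/scsi_host"):
    """Write mask to every hostN/lpfc_log_verbose under base.

    Hosts that are not driven by lpfc have no such file and are skipped.
    Returns the names of the hosts that were changed.
    """
    changed = []
    for attr in sorted(Path(base).glob("host*/lpfc_log_verbose")):
        attr.write_text(f"{mask:#x}\n")  # same effect as: echo 0x... > file
        changed.append(attr.parent.name)
    return changed


# Example (as root): set_lpfc_log_verbose(0x135661)
```

The `base` parameter is only there so the function can be tried against a scratch directory before pointing it at the live system.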

The interesting part is that the PCI addresses of the adapter ports are used here. This is reflected in the “0000:81:00.0:”, “0000:83:00.0:” etc. entries.

Remember that in normal circumstances you would not need to change these values. The basics are logged anyway and only in specific circumstances would you need to adjust them. Also be aware that using a debug value of 0xFFFFFFFF can incur a significant performance overhead on busy systems as a lot needs to be logged.

Another thing that I often get queried about is which HBA port belongs to which SCSI host number.

Identification of the respective HBAs can be done by looking at the adapter entries in the event log as mentioned above. In this case the 81 and 83 values are a reflection of the PCI ID and the 00.0 and 00.1 are the individual ports on those adapters.

81:00.0 Fibre Channel: Emulex Corporation LPe31000/LPe32000 Series 16Gb/32Gb Fibre Channel Adapter (rev 01)
81:00.1 Fibre Channel: Emulex Corporation LPe31000/LPe32000 Series 16Gb/32Gb Fibre Channel Adapter (rev 01)
83:00.0 Fibre Channel: Emulex Corporation LPe31000/LPe32000 Series 16Gb/32Gb Fibre Channel Adapter (rev 01)
83:00.1 Fibre Channel: Emulex Corporation LPe31000/LPe32000 Series 16Gb/32Gb Fibre Channel Adapter (rev 01)

You can see these entries coming back in the /sys/class/fc_host directory where logical links to the PCI devices are created

lrwxrwxrwx. 1 root root 0 Sep 4 15:08 host11 -> ../../devices/pci0000:80/0000:80:03.0/0000:81:00.0/host11/fc_host/host11
lrwxrwxrwx. 1 root root 0 Sep 4 15:08 host12 -> ../../devices/pci0000:80/0000:80:03.0/0000:81:00.1/host12/fc_host/host12
lrwxrwxrwx. 1 root root 0 Sep 4 15:08 host13 -> ../../devices/pci0000:80/0000:80:03.2/0000:83:00.0/host13/fc_host/host13
lrwxrwxrwx. 1 root root 0 Sep 4 15:08 host14 -> ../../devices/pci0000:80/0000:80:03.2/0000:83:00.1/host14/fc_host/host14

As soon as you know this you can associate the respective WWN of the adapter to the one you see on the switch:

[[email protected] ~]# cat /sys/class/fc_host/host11/port_name
Sydney_ILAB_X6_43_TEST:FID43:admin> switchshow
switchName: Sydney_ILAB_X6_43_TEST
switchType: 165.0

Index Slot Port Address Media Speed State Proto
66 4 2 01ef40 id N8 Online FC F-Port 50:06:0e:80:10:13:b5:b8 
93 4 29 010000 id 16G Online FC F-Port 10:00:00:90:fa:c7:cd:e8

The above shows you where to look when you see errors on a SAN-attached disk, and how to associate the Emulex adapters with the respective WWNs on your SAN.
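The same association can be scripted. The sketch below walks the fc_host links shown earlier, takes the last PCI address in each resolved path as the HBA port, and reads the port_name next to it. It assumes port_name holds a 0x-prefixed hex string, which is what I normally see; the function names are my own.

```python
import os
import re
from pathlib import Path

# Matches PCI addresses such as 0000:81:00.0
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def format_wwn(raw):
    """Turn 0x10000090fac7cde8 into 10:00:00:90:fa:c7:cd:e8."""
    digits = raw.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    return ":".join(digits[i:i + 2] for i in range(0, len(digits), 2))


def fc_host_map(base="/sys/class/fc_host"):
    """Map each hostN to the PCI address and WWPN of its HBA port."""
    result = {}
    for entry in sorted(Path(base).glob("host*")):
        target = os.path.realpath(entry)
        addrs = PCI_ADDR.findall(target)
        port_name = entry / "port_name"
        wwpn = format_wwn(port_name.read_text()) if port_name.is_file() else None
        # The last PCI address in the path is the HBA function itself.
        result[entry.name] = {"pci": addrs[-1] if addrs else None, "wwpn": wwpn}
    return result


# Example: for host, info in fc_host_map().items(): print(host, info)
```

The resulting WWPN is in the same colon-separated notation that switchshow uses, so it can be matched against the switch output directly.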

From there on you can also identify which disks are presented to that adapter. As you’ve seen above the PCI subsystem creates a host interface per FC port. In my case these are host11 to host14.

A simple way to check is to just do an “ls” on the /sys/class/scsi_disk/*/device/block tree.

[[email protected] scsi_disk]# ls */device/block/




As you can see, the 11:xxxxx entries list the respective “/dev/sd*” entries that are being used for mounting volumes, MPIO listings etc.

Obviously there are heaps of tools available to ease your troubleshooting efforts. I would advise installing the Emulex OCManager tools, which are provided as a free separate package. It can be installed in an agent-based or agent-less mode. Other tools like “lsblk”, “blockdev”, the sg-tools package and a few more are there to make your life a bit easier so you don’t have to crawl through the sysfs tree yourself.

Let me know if this was helpful. Your feedback is much appreciated.



General Info, Linux, Storage

Reducing MFA/2FA requests on cloud apps


Third-party authentication and authorisation providers like Okta, Azure, GCS or AWS often have a trusted connection to their tenants. This sometimes allows MFA/2FA authentication requests to be bypassed, as the authentication has already occurred from inside the tenant’s network.
When employees work from remote locations they can set up a VPN to their company’s network in one of two modes.

  • Full Tunnel – this causes ALL traffic to traverse the VPN to the company’s network, from where it is propagated to their internal servers or, via firewalls and proxies, to the internet.
  • Split Tunnel – Only traffic destined for the subnet routes that get pushed from the VPN server will traverse the VPN tunnel.
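The difference between the two modes comes down to a route lookup. Here is a minimal sketch of that decision, assuming a made-up set of pushed subnets; the `uses_vpn` helper is illustrative only and is not taken from the script mentioned in this post.

```python
import ipaddress


def uses_vpn(dest_ip, pushed_routes):
    """Return True if dest_ip falls inside one of the routes pushed by the VPN server."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in ipaddress.ip_network(route) for route in pushed_routes)


# Hypothetical routes pushed by a split-tunnel VPN server.
split_routes = ["10.0.0.0/8", "172.16.0.0/12"]
# A full tunnel behaves as if a default route was pushed.
full_routes = ["0.0.0.0/0"]

print(uses_vpn("10.15.230.6", split_routes))  # True
print(uses_vpn("8.8.8.8", split_routes))      # False
print(uses_vpn("8.8.8.8", full_routes))       # True
```

With a split tunnel, a cloud application that lives outside the pushed subnets is reached directly from your home connection, so the provider no longer sees the request coming from the trusted corporate network.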

The full tunnel setup may be helpful if you only work with systems inside your corporate network. Given that a vast number of applications are now published in some obscure place called “The Cloud”, you basically have no clue where they reside.

I’ve created a script, pushed to GitHub (over here), that creates specific routes based on your settings, which may reduce the number of MFA/2FA requests you need to validate.

Have a look at the “README” for more info.


General Info, Linux

NXDOMAIN hijacking and ISP behaviour

Something non-storage related. This article at “The Register” triggered me to write this post and explain why I don’t see this behaviour in my household. The trick is to configure DNS-over-HTTPS in your network.

For the non-technical people who read this the title may already be an incentive to not read any further but please bear with me.

Read more »

General Info, Linux

FOS version 9 – Gen 7 Fibre-Channel is here

Yahooo… (No not the company) FOS version 9 is here. The one that starts to support Gen 7 (64/256 Gbit) Fibre Channel. Now, just in case you’re getting excited and want to go <<<< hold on!!.

Read more »

Brocade, Brocade Technical

PortFencing – the hard or soft way?

As you can read in my previous articles (here, here and here), having a physical issue on any of your FC links is detrimental to your entire FC infrastructure. Not only does it corrupt frames and primitives, it also results in traffic flow issues which may even propagate to other fabrics that have a so-called air-gap. (See here)

Read more »

Brocade Technical, Uncategorized

Dynamic connectivity overview in “switchshow” output

In a Brocade environment “switchshow” is one of the most used commands out there. It provides a quick overview of the state of the switch, the switch name, switch attributes and a list of all ports and their states. It did, however, have its limitations which, with later code levels, can be corrected.

Read more »

Brocade, Brocade Technical