Category Archives: General Info

Troubleshooting Linux Storage 2023

As you may have seen, the first release of my book was published in late September, and I’ve received a great deal of positive feedback. To those who provided constructive feedback, I am most grateful, and I’ll make sure these points are addressed in the 2023 release.

One of the most frequently asked questions was whether I could expand on the Fibre-Channel and NVMeoFC side, as that seems to be an area where many Linux administrators who also deal with storage infrastructure management run into problems. The main reason people asked is that I’ve been doing this for over 20 years, so I must have some decent knowledge of it. They’ve followed my blog for a long time and would like to see how issues in an FC network correlate with, and propagate onto, the various layers of the operating system. Whether it shows up as path management, IO issues, security, discovery or other problems on Linux hosts, when a problem originates somewhere in the FC network it is often difficult to pinpoint its exact location.

I would be very happy to expand on this and share the knowledge that I have and provide examples with problems and resolutions.

Hardware

Even though I’ve worked with the most complex and expensive equipment out there, I do not have a $100K home-lab sitting in my study. Recent FC equipment is relatively expensive when compared to Ethernet, and there is no such thing as a freebie Wireshark that can do line-rate FC traffic capturing, or error injection like we have with the “tc qdisc” options. Host bus adapters and a 16G or newer switch that can talk NVMeoF and has FPIN capabilities would already need a fairly recent chipset and software. The same goes for an FC array.

I’m currently in touch with some good friends in the industry to see what the options are and if they are able to accommodate my request. I know from experience that there are hurdles and roadblocks in the form of financial or legal restrictions, so I need to take things as they come. I’m grateful for any effort people take to help me out.

If there are past, current or future customers who have “spare/superfluous” equipment in this area and are able/willing to help I would be extremely pleased.

It looks like this post has turned into some begging exercise, but that is not the intention. I am really committed to providing the best information that I can give to my readers, and I hope that they are able to prevent, or resolve, storage related issues in a Linux environment as much as possible. Having the proper tools is obviously a prerequisite for that.

If you are able to help and want to get in touch to see what we can do, just email me or make an appointment via the contact page over here.

Kind regards,

Erwin

TP-Link TL-SG1218MPE Small Business switch (Product review)

TL-SG1218MPE

A while ago, I planned to update my home network for a couple of reasons. As I’ve been working from home for a while, most of the interactions I had with customers ran over secured VPN links, but still over the same local Wi-Fi network as everything else I had hooked up, including two dozen IoT devices of various sorts, a media server, speakers etc. As the communication with external parties changed a bit, with private, employee as well as business links, I needed to change the way I worked. I decided on a few things.

Continue reading

Using systemd-resolved to optimise DNS resolution.

When you work from home and are required to use the corporate network you’re often shoved into a dilemma where the VPN configuration that is pushed to your PC results in one of two modes, Full-tunnel or Split-Tunnel.

Digging tunnels

A full-tunnel configuration is by far the most dreadful, especially when your VPN access-point is on the other side of the planet. Basically all traffic to and from your system is pushed through that tunnel. This is even the case when a web-page is hosted next-door to where you are sitting. Your requests to that webserver will first traverse your VPN connection to the other side of the planet, where your company’s proxies will retrieve the page via the public web only to send it back to you via that VPN again. Obviously the round-trip and other delays will result in abominable performance and a user-experience that is excruciatingly painful.

A split-tunnel however is far more friendly. As I explained in one of my previous articles (here) only traffic destined for systems inside your corporate network will be routed over the VPN and requests to other systems will just traverse the public interweb. 

Domain Name resolution

There is however one exception: DNS, i.e. the name to (IP) number translation. Traditionally Linux uses a system-wide resolver that looks in “/etc/resolv.conf” to find out what your DNS servers are and which domains to search, plus a few other options. That basically means that as soon as you have any VPN tunnel active you would always need to use your corporate DNS servers for any request, as your system does not really know which server is located where. There may even be a situation where your corporate DNS servers point to a different host for the same domain. You often see this where employees get additional functionality compared to external users, or where credential verification may be bypassed as you already have an authorised session to the internal systems.
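To see what the traditional resolver would use, a small sketch like the one below lists the servers and search domains from a resolv.conf-style file. The helper name show_resolver_config is made up for this illustration. Keep in mind that on systems running systemd-resolved, /etc/resolv.conf is often just a symlink to a stub file pointing at 127.0.0.53.

```shell
# Print the DNS servers and search domains from a resolv.conf-style file.
# show_resolver_config is a hypothetical helper name, not a standard tool.
show_resolver_config() {
    file=${1:-/etc/resolv.conf}
    awk '
        $1 == "nameserver" { print "server: " $2 }
        $1 == "search"     { for (i = 2; i <= NF; i++) print "search: " $i }
    ' "$file"
}
```

Calling show_resolver_config without an argument reads /etc/resolv.conf; pass another path to inspect a different file.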

The drawback is however that sites outside your corporate network are also resolved via your company’s DNS servers. This may not only limit performance from a resolver standpoint (remember that these DNS requests also have to traverse the same VPN tunnel), but the system where you end up may also not be the most appropriate one.

As an example.

If you have an active VPN to your corp.com, even DNS queries for a web-site in your own country will first go to “Corp DNS” which, if it does not already have a cached address itself, will forward that request to whatever “Corp DNS” has configured as its upstream DNS server (in this case Google). As you can see, you could’ve asked Google’s DNS servers yourself, but as your VPN session has set your resolver to use the Corp DNS that does not happen. An additional point to be aware of is that no matter which website you visit, your company will have a record of it, as most corporate regulations stipulate that actions done on their systems will be logged for whatever purpose they deem necessary. This may sometimes conflict with the privacy policies of different countries, but that is most often swept under the carpet and hidden in legal obscurity.

The above also means that when you have requests for sites that span geographies, you may not always get to the most optimal system. Many DNS systems are able to determine where the request is coming from and subsequently provide an IP address of a system that is closest to the requestor. As your request is fulfilled by your company’s DNS server on the other side of the planet, that web-server may also be there. Not to panic, as many of these environments have built-in smarts to redirect you to a more local system, but it nevertheless means this situation is far from optimal. What you’re basically after is the ability to, in addition to that split-tunnel configuration, direct DNS queries for the domains behind that VPN to the DNS servers which actually host them, and nothing else.

In the above case your Linux system has two interfaces: one physical (Wi-Fi or Ethernet) and one virtual (the VPN, most often called tunX where X is the VPN interface number).

Meet systemd-resolved

There are some Linux (or Unix) purists who shudder at the sight of systemd based services but I think most of them are actually pretty OK. Resolved is one of them.

What resolved allows you to do is assign specific DNS configurations to different interfaces in addition to generic global options.

As an example

Global
       LLMNR setting: yes
MulticastDNS setting: yes
  DNSOverTLS setting: no
      DNSSEC setting: allow-downgrade
    DNSSEC supported: no
Fallback DNS Servers: 9.9.9.9
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      <snip>
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 22 (tun0)
      Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
       LLMNR setting: yes
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
  Current DNS Server: 10.15.230.6
         DNS Servers: 10.15.230.6
                      10.15.230.7
          DNS Domain: corp.com
                      internal.corpnew.com

Link 3 (wlp0s20f0u13)
      Current Scopes: DNS LLMNR/IPv4 LLMNR/IPv6
DefaultRoute setting: yes
       LLMNR setting: yes
MulticastDNS setting: no
  DNSOverTLS setting: no
      DNSSEC setting: yes
    DNSSEC supported: yes
  Current DNS Server: 192.168.1.1
         DNS Servers: 192.168.1.1
          DNS Domain: ~.
                      ourfamily.int

As you can see it has three sections. The global section caters for many default settings which can be superseded by per-interface settings. I think the overview speaks for itself. All requests for the domains “corp.com” and “internal.corpnew.com” will be sent to one of the two DNS servers with the 10.15.230.[6-7] address. All my home internal requests, as defined by the “ourfamily.int” domain, are sent to the 192.168.1.1 address. The “~.” means all other requests.
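The routing decision described above can be pictured with a toy model: the link whose domain is the longest suffix of the queried name wins, and “~.” catches whatever is left. This is only a sketch of the idea, not how resolved is implemented; the function name pick_link is made up and the table simply mirrors the example output above. The real service takes more settings into account.

```shell
# Toy model of per-link DNS routing: the link whose routing domain is the
# longest suffix of the queried name wins; "." stands for the "~." catch-all.
# The link/domain table mirrors the example output and is hard-coded here.
pick_link() {
    name=${1%.}                      # drop a trailing dot, if any
    best_link=""
    best_len=-1
    while read -r link domain; do
        if [ "$domain" = "." ]; then
            len=0                    # catch-all matches everything, weakest
        else
            case ".$name" in
                *".$domain") len=${#domain} ;;
                *) continue ;;
            esac
        fi
        if [ "$len" -gt "$best_len" ]; then
            best_len=$len
            best_link=$link
        fi
    done <<EOF
tun0 corp.com
tun0 internal.corpnew.com
wlp0s20f0u13 ourfamily.int
wlp0s20f0u13 .
EOF
    echo "$best_link"
}
```

With this table, pick_link www.corp.com selects tun0, while pick_link google.com falls through to the catch-all on wlp0s20f0u13.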

That will result in queries being returned like:

[1729][erwin@monster:~]$ resolvectl query zzz.com
zzz.com: 10.xx.16.9 -- link: tun0
         10.xx.16.8 -- link: tun0
         172.xx.24.164 -- link: tun0
         172.xx.24.162 -- link: tun0
         10.xx.100.4 -- link: tun0
         10.xx.148.66 -- link: tun0
         10.xx.7.221 -- link: tun0
         10.xx.7.34 -- link: tun0
         10.xx.7.33 -- link: tun0
         10.xx.100.5 -- link: tun0

-- Information acquired via protocol DNS in 243.1ms.
-- Data is authenticated: no

If I used an external DNS system for that domain it would return different addresses.

[1733][erwin@monster:~]$ dig @9.9.9.9 +short zzz.com
169.xx.75.34

(The above are not my real domains I queried but I think you get the drift)

Queries for non-corporate websites will be resolved via the Wi-Fi interface (wlp0s20f0u13):

[1733][erwin@monster:~]$ resolvectl query google.com
google.com: 2404:6800:4006:809::200e -- link: wlp0s20f0u13
            216.58.203.110 -- link: wlp0s20f0u13

-- Information acquired via protocol DNS in 121.0ms.
-- Data is authenticated: no

As my home router has a somewhat more sophisticated setup this also allows me to have all external DNS requests, not destined to corp.com or corpnew.com, use a DNSoverHTTPS or DNSoverTLS configuration to bypass any ISP mangling.

Setup

Systemd-resolved is a systemd service (duhh) which needs to be enabled first with “systemctl enable systemd-resolved”. The configuration is located in /etc/systemd/resolved.conf, or in individual configuration files stored in the resolved.conf.d subdirectory next to it.

-rw-r--r-- 1 root root 784 Oct 20 14:32 resolved.conf
drwxr-xr-x 2 root root 4096 Oct 20 14:24 resolved.conf.d/
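As an illustration of such a drop-in file, the sketch below writes the global settings from the example output above into the resolved.conf.d directory. The function name and the file name 10-global.conf are arbitrary choices for this example; the keys themselves are standard [Resolve] options.

```shell
# Write a global drop-in for systemd-resolved. The file name 10-global.conf
# is a hypothetical choice; the target directory defaults to the real one.
write_resolved_dropin() {
    dir=${1:-/etc/systemd/resolved.conf.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/10-global.conf" <<'EOF'
[Resolve]
FallbackDNS=9.9.9.9
DNSSEC=allow-downgrade
DNSOverTLS=no
LLMNR=yes
MulticastDNS=yes
EOF
}
```

Run as root, write_resolved_dropin places the file in the default directory; a “systemctl restart systemd-resolved” afterwards makes the service pick it up.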

The settings can also be applied interactively via the “resolvectl” command, which is what I have done. If your distro has NetworkManager installed, then NM can also automatically configure resolved via D-Bus calls.
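A minimal sketch of what that interactive route could look like, using the interface names and addresses from the example output above. The wrapper functions run and configure_split_dns are made up for this illustration; by default they only print the commands so you can review them first. As far as I know, settings applied this way are runtime-only and will not survive a restart of the service or the interface.

```shell
# Print (default) or run (APPLY=1) the resolvectl calls for a split-DNS setup.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

configure_split_dns() {
    vpn_if=$1
    lan_if=$2
    # Corporate resolvers only answer for the corporate domains.
    run resolvectl dns "$vpn_if" 10.15.230.6 10.15.230.7
    run resolvectl domain "$vpn_if" corp.com internal.corpnew.com
    # Home router handles the local domain and, via "~.", everything else.
    run resolvectl dns "$lan_if" 192.168.1.1
    run resolvectl domain "$lan_if" '~.' ourfamily.int
}
```

Calling configure_split_dns tun0 wlp0s20f0u13 prints the four commands; prefixing the call with APPLY=1 (as root) executes them.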

There is more involved than I can easily simplify here, as it would pretty quickly become a re-wording of the man-page, which I try to avoid. At least I hope it has given you some idea of what you can do with “systemd-resolved”.

Kind regards,

Erwin

Enabling Verbose Logging on Linux with Emulex Host Bus Adapters

Where did my disks go?

Every now and then you may run into an issue which cannot be explained properly by just looking at the standard events that show up in “/var/log/messages”.

Issues such as

Oct 7 18:24:20 centos8 kernel: lpfc 0000:81:00.0: 0:1305 Link Down Event xc received Data: xc x20 x800110 x0 x0
Oct 7 18:24:24 centos8 kernel: rport-11:0-4: blocked FC remote port time out: removing target and saving binding
Oct 7 18:24:24 centos8 kernel: lpfc 0000:81:00.0: 0:(0):0203 Devloss timeout on WWPN 50:06:0e:80:07:c3:70:00 NPort x01ee40 Data: x0 x8 x2

are fairly common and the above simply shows a Link Down event. These are the easiest to troubleshoot when the remote switchlog tells you

Continue reading

Reducing MFA/2FA requests on cloud apps

Intro

Third party authentication and authorisation providers like Okta, Azure, GCS or AWS often have a trusted connection to their tenants. This sometimes allows authentication requests via MFA/2FA options to be bypassed, as the authentication has already occurred from inside the tenant’s network.
When employees work from remote locations they can set up a VPN to their company’s network in one of two modes.

  • Full Tunnel – this causes ALL traffic to traverse the VPN to the company’s network, from where it is propagated to internal servers or, via firewalls and proxies, to the internet.
  • Split Tunnel – Only traffic destined for the subnet routes that get pushed from the VPN server will traverse the VPN tunnel.

The full tunnel setup may be helpful if you only work with systems inside your corporate network. Given the fact that a vast number of applications are now published in some obscure place called “The Cloud”, you basically have no clue where they reside.

I’ve created a script pushed to github (over here) that creates specific routes based on your settings that may result in a reduction on your MFA/2FA requests to be validated.

Have a look at the “README” for more info.
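To give a rough idea of the principle (this is not the script from the repository, just a minimal sketch under my own assumptions), the snippet below prints host routes that would send traffic for a set of addresses through the VPN interface, so that requests to those addresses appear to come from the corporate network. The function names and the host name login.example.com used further down are hypothetical.

```shell
# Print "ip route" commands that send traffic for the given IPv4 addresses
# through the VPN interface. Nothing is executed; review, then run as root.
vpn_route_cmds() {
    dev=$1
    shift
    for ip in "$@"; do
        echo "ip route replace $ip/32 dev $dev"
    done
}

# Resolve a host name to its unique IPv4 addresses (uses getent from glibc).
host_ipv4() {
    getent ahostsv4 "$1" | awk '{ print $1 }' | sort -u
}
```

Something like vpn_route_cmds tun0 $(host_ipv4 login.example.com) would then list the routes for that one host. Cloud providers change their addresses regularly, so such routes need refreshing.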

 

Preventing client DNS leaking on OpenWRT

A while ago I wrote an article whereby I provided an OpenDNS resolver server via DHCP to the computers, tablets and phones of my kids. (See here). This worked very well and I have been able to keep the nastiness of the web out of sight. Plus it gave me the option to block certain sites which were not captured under a certain category or, if those domains fell under a category that also included a lot of useful domains, exclude them.

Continue reading

Getting rid of whitespace

No, not storage related but more towards coding scripts etc and assuring your git repositories do not show up with huge diff sections you need to correct. Just a little tip and a “note to self”.

If you’ve ever been keen enough to not use an IDE for whatever language you use and kept to a real editor (VIM obviously.. :-)) you may have encountered the phenomenon that whitespace at the end of lines is a nasty thing to look at when you start putting stuff into version control repositories like Subversion or Git. A little change from some copy or paste action may leave you with a “git diff” of a couple of hundred lines you need to correct.

To fix that simply let VIM clear out all trailing whitespace (tabs, spaces, etc.) by having it removed before the actual write to disk.

To do that simply add

autocmd BufWritePre *.sh :%s/\s\+$//e

to your ~/.vimrc and with every :w the substitute function driven by the regex after the colon will remove all trailing whitespace in all shell scripts (*.sh). Obviously you can add every extension you need here. Very handy.
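For files that never pass through VIM, the same clean-up can be sketched on the command line with sed. The function name strip_trailing_ws is just an illustrative choice.

```shell
# Remove trailing spaces and tabs from every line read on stdin.
strip_trailing_ws() {
    sed 's/[[:space:]]*$//'
}
```

For example, strip_trailing_ws < script.sh > script.sh.clean writes a cleaned copy, and “git diff --check” will point out any whitespace errors that are still left in your changes.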

Cheers,

Erwin

Brocade is no more

Exactly 20 years ago I installed my first Fibre Channel switch. It was from a startup company founded by Kumar Malavalli, Paul Bonderson and Seth Neiman, who wanted to dive into the storage networking business based on a protocol developed by one of my long term mentors and the godfather of Fibre-Channel, Horst Truestedt, together with a bunch of highly skilled engineers.

Continue reading