Tag Archives: performance

FPIN – The Holy Grail of SAN Stability

As some of you may recall, about a decade ago I made a proposal to incorporate more intelligence into end-devices so they could react better to changing conditions in fabrics. I called it the “Error Reporting with Integrated Notification” framework (mind the acronym here. :-))

Basically the intention was to have end-devices check for errors along the paths their frames traverse by sending a “query-frame” to the remote device. Each hop along the way would add its values (errors, counters) to that frame, and the remote device would, upon reception, also add its counters, reverse the SID (Source ID) and DID (Destination ID) and send the same frame back to the original sender. That sender would then be able to decide whether to use the same path for subsequent frames or to hold off on using it, temporarily or entirely. Read on.
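To make the idea a bit more concrete, here is a minimal sketch of that round-trip in Python. All names, fields and thresholds (QueryFrame, hop_errors, the 10-error limit) are made up for illustration and are not the actual ERIN or FPIN frame format.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of the query-frame idea; names and thresholds
# are invented for this example and do not reflect the ERIN/FPIN wire format.

@dataclass
class QueryFrame:
    sid: str                                          # Source ID of the original sender
    did: str                                          # Destination ID of the remote device
    hop_errors: list = field(default_factory=list)    # per-hop (name, error count) entries

def traverse(frame: QueryFrame, hops):
    """Each hop along the path appends its own error/counter snapshot."""
    for hop_name, crc_errors in hops:
        frame.hop_errors.append((hop_name, crc_errors))
    return frame

def turn_around(frame: QueryFrame, remote_errors: int) -> QueryFrame:
    """The remote device adds its counters, swaps SID and DID and returns the frame."""
    frame.hop_errors.append((frame.did, remote_errors))
    frame.sid, frame.did = frame.did, frame.sid
    return frame

def path_is_healthy(frame: QueryFrame, threshold: int = 10) -> bool:
    """Sender-side decision: avoid the path if any hop reports too many errors."""
    return all(errors <= threshold for _, errors in frame.hop_errors)

# Example: a path through two switches towards target "0x0200"
frame = QueryFrame(sid="0x0100", did="0x0200")
frame = traverse(frame, [("switch-A", 0), ("switch-B", 42)])
frame = turn_around(frame, remote_errors=1)
print(path_is_healthy(frame))   # False -> hold off on this path for now
```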

Continue reading

Performance misconceptions on storage networks

The piece of spinning Fe3O4 (i.e. rust) is by far the slowest piece of equipment in the IO stack. Heck, they didn’t invent SSD and flash for nothing, right? To overcome the terrible latency involved when a host system requests a block of data, there are numerous layers of software and hardware that try to reduce the impact of physical-disk-related drag.

One of the most important is using cache, whether that is CPU L2/L3 cache, DRAM cache, some hardware buffering device in the host system or the huge caches in the storage subsystems. All of these can, and will, be used by numerous layers of the IO stack, as each cache hit prevents a fetch from disk. (As an intro to this post you might read one I’ve written over here which explains what happens, and where, when an IO request reaches a disk.)
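As a back-of-the-envelope illustration of why those cache layers matter, the sketch below computes the effective access time for a few hit rates. The latency figures (0.1 µs for a cache hit, 8 ms for a spindle) are ballpark assumptions, not measurements.

```python
# Rough illustration of why caches matter so much; the latency figures are
# ballpark assumptions (DRAM-class cache vs a rotating disk), not measurements.

def effective_latency_us(hit_rate, cache_us=0.1, disk_us=8000.0):
    """Average latency when hit_rate of requests are served from cache."""
    return hit_rate * cache_us + (1.0 - hit_rate) * disk_us

for hit_rate in (0.90, 0.99, 0.999):
    print(f"{hit_rate:.1%} hits -> {effective_latency_us(hit_rate):8.1f} us average")
# Even at 99% hits the average is still around 80 us, dragged up by the 1% of misses.
```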

Continue reading

1.1 – MAPS – Know what’s going on.

I’ve written about Fabric Watch quite a lot and have always stressed the usefulness of this licensed add-on in FOS. This post outlines the major characteristics of MAPS and why you should migrate now. As of FOS 7.2 there has been a transition from Fabric Watch to MAPS (Monitoring and Alerting Policy Suite), and over the past few FOS versions it has seen a huge improvement in overall RAS (Reliability, Availability and Serviceability) monitoring features. As of FOS 7.4 Fabric Watch is no longer incorporated in FOS, so MAPS is the only option if you want this kind of monitoring. MAPS is one half of a two-part suite called Fabric Vision, together with its performance companion Flow Vision. The MAPS part can interact with Flow Vision based on criteria you specify and can monitor and alert on performance-related events.

Continue reading

Cross-fabric collateral damage

Ever since the dawn of time storage administrators have been indoctrinated with redundancy: you have to have everything at least twice in order to maintain uptime to a level of around 99.999%. This is true on many occasions; however, there are exceptions when even dual fabrics (i.e. physically separated) share components like hosts, arrays or tapes. If a physical issue in one fabric impacts such a shared component, the degraded performance and host IO can spill over to totally unrelated equipment on the other fabric and spin out of control.
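A simplified availability calculation shows why a shared component undoes much of what dual fabrics buy you. The 99.9% figures below are assumptions for illustration only, and repair times and partial failures are ignored.

```python
# Simplified availability arithmetic; the 99.9% per-fabric figure is an
# assumption for illustration, not a vendor number.

fabric = 0.999            # availability of a single fabric
shared = 0.999            # availability of a component both fabrics depend on

independent = 1 - (1 - fabric) ** 2        # two truly independent fabrics
with_shared = independent * shared         # both paths still rely on the shared part

print(f"independent dual fabrics : {independent:.6f}")   # ~0.999999 (six nines)
print(f"with one shared component: {with_shared:.6f}")   # back down to ~0.999 territory
```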

Continue reading

Short stroking disk drives to improve performance

I was reading a post from Hans DeLeenheer (VEEAM) which ramped up quite a bit, including responses from Calvin Zito (HP), Alex McDonald (NetApp) and Nigel Poulton. The discussion started with a comment that X-IO had “special” firmware which improved IO performance. Immediately the term “short-stroking” came up, which led some to believe X-IO is cheap-skating on its technology. I was under the same impression at first, right until the moment I saw that Richard Lary is (more or less) the head of tech at X-IO, together with Clark Lubbers and Bill Pagano, who come out of the same DEC stable. For those of you who don’t know Richie, he’s the one who ramped up Digital StorageWorks back in the late ’70s/early ’80s and also stood at the cradle of VAX/VMS. (Yeah yeah, I’m getting old, google it if you don’t know what I’m talking about.)

Continue reading

Queue-depth

I was recently involved in a discussion around QD (queue-depth) settings and vendor comparisons. It almost got to the point where QD capabilities were inherently linked to the quality and capabilities of the arrays. Total and absolute nonsense, of course. Let me make one thing clear: “QUEUING IS BAD“. In the general sense, that is. Nobody wants to wait in line, and neither does an application.

Whenever an application requests data or writes the results of a certain process, the request goes down the I/O stack. In the storage world there are multiple locations where such a data portion can get queued. When looking at the specifics from a host level, the queue-depth is set on two levels.

(P.S. I use the terms device, port and array interchangeably, but they all refer to the device receiving commands from hosts.)
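To put a rough number on why queuing hurts, here is a small Little’s-law sketch. The 0.5 ms service time per IO is an assumed figure, not a measurement of any particular array or port.

```python
# Little's law illustration: latency = outstanding IOs / throughput.
# The 0.5 ms service time per IO is an assumption for the example,
# not a measurement of any particular array.

def avg_latency_ms(queue_depth, service_ms=0.5):
    """Average time an IO spends in the system when queue_depth IOs are kept
    outstanding against a port that completes one IO every service_ms."""
    iops = 1000.0 / service_ms          # what the port can actually complete
    return queue_depth / iops * 1000.0  # Little's law: L = lambda * W  ->  W = L / lambda

for qd in (1, 8, 32, 128):
    print(f"QD {qd:3d} -> ~{avg_latency_ms(qd):6.1f} ms per IO")
# Throughput stays capped at 2000 IOPS; a deeper queue only buys you latency.
```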

Continue reading