SoE, SCSI over Ethernet.

It may come as no surprise that I'm not a fan of FCoE. I have nothing against the underlying idea of converged networking, but I do feel that encapsulating multiple protocols in yet another frame is overkill: it adds complexity, requires additional skills, training and operating procedures, and introduces risk, so as far as I'm concerned it shouldn't be needed. The main reason FCoE was invented is the ability to carry traffic from Fibre Channel environments through gateways (called FCFs) to an Ethernet-connected Converged Network Adapter in order to save on some cabling. Yeah, yeah, I know many will say you'll save a lot more, but I'm not convinced.
After staring at some ads from numerous vendors I still wonder why nobody ever came up with the ability to map the SCSI protocol directly onto Ethernet, the same way it is done with IP. After all, with the introduction of 10G Ethernet most of the reliability issues appear to have gone (have they??), so it shouldn't be such a problem to address this directly. Reliability was the main reason Fibre Channel was invented in the first place. From a development perspective I think transporting SCSI directly on Ethernet should take roughly the same effort as transporting it on Fibre Channel, and from an interface perspective it shouldn't be a problem either. I think storage vendors would be just as happy to add an Ethernet port next to the FC ports, without needing any of the FCoE or iSCSI machinery.

Since all, or at least most, development effort these days seems to have shifted to Ethernet, why keep investing in Fibre Channel? Ethernet still sits in the 7-layer OSI stack, but you should be able to use just three layers: physical, data link and network. That should be enough to shove frames back and forth in a flat Ethernet network (or Ethernet Fabric, as Brocade calls it). For other protocols like TCP/IP this is no problem, since they already use the same stack and just travel a bit higher up it. This would allow you to run a routable iSCSI environment (over IP) as well as a native SCSI protocol on the same network. The biggest problem is then security. If SCSI runs on a flat Ethernet network there is no way (yet) to stop SCSI packets from arriving at every port in that network segment. That would be the same as having no zoning active and all LUN masking disabled on the arrays. The only way around this would be to invent some sort of "Ethernet firewall" mechanism; I'm not aware of any product or vendor that provides this today. And since it's pretty easy to spoof a MAC address, MAC-based filtering is no good as a security precaution.
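
Just to make the idea a bit more concrete, here is a minimal sketch (in Python) of what a hypothetical "native SCSI over Ethernet" frame could look like. Nothing like this is standardised: the EtherType used below is the IEEE "local experimental" one, and the shim header layout is entirely made up for illustration.

    # Purely illustrative: a hypothetical "raw SCSI over Ethernet" frame.
    # 0x88B5 is the IEEE "local experimental" EtherType; it is used here only
    # because no real EtherType for native SCSI exists.
    import struct

    ETHERTYPE_EXPERIMENTAL = 0x88B5

    def build_frame(dst_mac: bytes, src_mac: bytes, scsi_cdb: bytes, lun: int) -> bytes:
        """Wrap a SCSI CDB directly in an Ethernet II frame (no IP, no FC)."""
        eth_header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_EXPERIMENTAL)
        # Made-up 4-byte shim: 2-byte LUN + 1-byte CDB length + 1-byte flags.
        shim = struct.pack("!HBB", lun, len(scsi_cdb), 0)
        return eth_header + shim + scsi_cdb

    # Example: a 10-byte READ(10) CDB for LBA 0, 8 blocks, sent to LUN 0.
    read10 = struct.pack("!BBIBHB", 0x28, 0, 0, 0, 8, 0)
    frame = build_frame(b"\xaa\xbb\xcc\xdd\xee\xff", b"\x11\x22\x33\x44\x55\x66", read10, 0)
    print(len(frame), "bytes on the wire (before FCS)")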

As usual this should also come with all the other security features like authentication, authorisation and so on. Fibre Channel already provides authentication based on DH-CHAP, which is specified in the FC-SP standard. DH-CHAP exists in the Ethernet world as well, but there it is strictly tied to higher layers like TCP. It would be good to see this functionality at the lower layers too.
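
For what it's worth, the CHAP half of DH-CHAP boils down to a simple challenge-response that is easy to sketch. This shows the concept only, not the FC-SP wire format; the hash and field choices below are my own assumptions, and the Diffie-Hellman part that gives DH-CHAP its name is left out.

    # Bare-bones CHAP-style authentication: the responder proves it knows a
    # shared secret without ever sending it. DH-CHAP additionally runs a
    # Diffie-Hellman exchange to derive secret material; that part is omitted.
    import hashlib, os

    shared_secret = b"not-a-real-wwpn-secret"   # provisioned on both ends

    # Authenticator side: send a random challenge.
    challenge = os.urandom(16)

    # Responder side: hash the challenge together with the shared secret.
    response = hashlib.sha256(challenge + shared_secret).digest()

    # Authenticator side: recompute and compare.
    expected = hashlib.sha256(challenge + shared_secret).digest()
    print("authenticated" if response == expected else "rejected")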

I'm not an expert on Ethernet, so I would welcome comments that provide some more insight into the options and possibilities.

Food for thought.

Regards,
Erwin

Why disk drives have become slower over the years

What is the first question vendors get (or at least used to get) when a non-technical customer calls?
I'll spare you the guesswork: "What does a TB of disk cost at your place??". Usually I google around for the cheapest disk at a local PC store and say "Well sir, that would be about 80 dollars". I then hear somebody falling off their chair, getting up again, reaching for the phone and asking with a resonating voice "Why are your competitors so expensive then?". "They most likely did not give a direct answer to your question," I reply.
The thing is, an HDD should be evaluated on multiple factors. When you spend 80 bucks on a 1TB disk you get capacity and that's about it. Don't expect performance or extended MTBF figures, let alone all the stuff that comes with enterprise arrays like large caches, redundancy in every sense and a lot more. That is what makes up the price per GB.


"OK, so why have disk drives become so much slower in the past couple of years?". Well, they haven't. The RPM, seek time and latency have stayed the same over the last couple of years. The problem is that capacity has increased so much that the so-called "access density" (the number of IOPS available per gigabyte stored) has dropped dramatically, so the disk has to service a massive amount of bytes with the same nominal IOPS capability.
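
To put a number on that, here is a quick IOPS-per-gigabyte calculation. The ~125 random IOPS per 10,000 RPM spindle is an assumed ballpark figure, the same one used in the baseline further down.

    # IOPS per stored gigabyte ("access density") for a small old drive versus
    # a large current one, both spinning at roughly the same speed.
    # The ~125 random IOPS per spindle is an assumed ballpark figure.
    for capacity_gb in (36, 600):
        iops_per_disk = 125
        print(f"{capacity_gb:4d} GB drive: {iops_per_disk / capacity_gb:.2f} IOPS per GB")
    # 36 GB -> ~3.5 IOPS/GB, 600 GB -> ~0.2 IOPS/GB: same spindle,
    # roughly 17x less performance per stored gigabyte.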


I did some simple calculations which show the decrease in performance on larger disks. I didn't assume any RAID or cache accelerators.


I first calculated a baseline using a 100GB disk drive (I know, they don't exist, but it's just for the calculations) and 500GB of data that I need to read or write.


The assumption is a 100% random read profile. Although the host can theoretically read or write in increments of 512 bytes, this doesn't mean the disk will handle that IO in one sequential stroke. An 8K host IO can be split up into the smallest supported sector size on disk, which is currently around 512 bytes. (Don't worry, every disk and array will optimise this, but again, this is just to show the nominal differences.)


So a 100GB disk drive translates to a little over 190 million sectors. Reading 500 GB of data would take a theoretical 21.7 minutes. The number of disks is calculated based on the capacity required for that 500GB. (Also remember that disk vendors use base-10 capacity values whereas operating systems, memory chips and other electronics use base-2 values, so that's 10^3 versus 2^10 per step.)

Baseline
Capacity (GB)         100
Sectors               190,734,863
RPM                   10000
Avg. delay (ms)       8
Max IOPS per disk     125
Disks required        6
Total IOPS            750
Time required (sec)   1302
Time required (min)   21.7

If you now take this baseline and map it to some previous and current disk types and capacities, you can see the differences.

GB    Sectors        RPM    Disks for 500GB  Total IOPS  Time (sec)  Time (min)  % of baseline  x baseline (speed)
9     17,166,138     7200   57               4731        206         3.44        15.83          6.32
18    34,332,275     7200   29               2407        406         6.77        31.19          3.21
36    68,664,551     10000  15               1875        521         8.69        40.02          2.50
72    137,329,102    10000  8                1000        977         16.29       75.04          1.33
146   278,472,900    10000  4                500         1953        32.55       150            0.67
300   572,204,590    10000  2                250         3906        65.1        300            0.33
450   858,306,885    10000  2                250         3906        65.1        300            0.33
600   1,144,409,180  10000  1                125         7813        130.22      600.08         0.17

You can see here that, capacity-wise, you need fewer disks to store the same 500GB on 146 GB drives, but you also get fewer total IOPS. That translates into slower performance. As an example, a 300GB drive at 10000 RPM triples the time needed to read that 500 gigabytes compared to the baseline.
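
If you want to play with the numbers yourself, the arithmetic behind the table can be reproduced in a few lines. The exact IO size and rounding the original spreadsheet used are my own guesses (roughly 976,563 IOs for the 500 GB and one base-10/base-2 correction when counting disks), so the results land close to, but not exactly on, the figures above.

    # Rough reconstruction of the table above. Assumptions (mine): 125 random
    # IOPS per 10,000 RPM spindle, 83 per 7,200 RPM spindle, ~976,563 IOs to
    # move the 500 GB, and disk count rounded up after a base-10/base-2 fudge.
    import math

    IOS_FOR_500GB = 500e9 / 512 / 1000          # ~976,563 IOs (assumption)
    IOPS_PER_RPM = {7200: 83, 10000: 125}

    drives = [(9, 7200), (18, 7200), (36, 10000), (72, 10000),
              (146, 10000), (300, 10000), (450, 10000), (600, 10000)]

    baseline_minutes = 21.7
    for capacity_gb, rpm in drives:
        disks = math.ceil(500 * 1.024 / capacity_gb)   # base-10/base-2 fudge
        total_iops = disks * IOPS_PER_RPM[rpm]
        minutes = IOS_FOR_500GB / total_iops / 60
        print(f"{capacity_gb:4d} GB: {disks:3d} disks, {total_iops:5d} IOPS, "
              f"{minutes:7.2f} min ({baseline_minutes / minutes:.2f}x baseline)")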


Now, these are relatively simple calculations, but they do apply to all disks, including the ones in your disk array.


I hope this makes you start thinking about performance as well as capacity. I'm pretty sure your business finds it most annoying when your users need to get a cup of coffee after every database query. 🙂

Why not FCoE?

You may have read my previous articles on FCoE, as well as some comments I've posted on Brocade's and Cisco's blog sites, so it won't surprise you that I'm no fan of FCoE. Not because of the technology itself, but because of the enormous complexity and organisational overhead involved.

So let's take a step back and try to figure out why this has become such a buzz in the storage and networking world.

First, let's make it clear that FCoE is driven by the networking folks, most notably Cisco. The reason is that Cisco has around 90% market share on the data centre networking side but only around 10 to 15% on the storage side. (I don't have the actual numbers at hand but I'm sure that's not far off.) Brocade, with its FC offerings, has the storage part pretty well covered. Cisco hasn't been able to eat more out of that pie for quite some time, so it had to come up with something else, and so FCoE was born. This allowed Cisco to slowly but steadily get a foot in the storage door by offering a so-called "new" way of doing business in the data centre and convincing customers to go "converged".

I already explained that there is no, or at best a negligible, benefit from an infrastructure and power/cooling perspective, so the cost-effectiveness from a capex perspective is nil and maybe even negative. I also showed that the organisational overhaul required is tremendous. Remember, you're trying to glue two different technologies together by adding a new one. The June 2009 FC-BB-5 document (where FCoE is described) is around 1.9 MB and roughly 180 pages; FC-BB-6 is 208 pages and 2.4 MB thick. How does this decrease complexity?
Another thing you have to look at is backward compatibility. The Fibre Channel standard went up to 16Gb/s a while ago and most vendors have already released products for it. The FC standard specifies backward compatibility across at least two speed generations, so I'm perfectly safe linking a 16Gb/s SFP to an 8Gb/s or 4Gb/s SFP and the speed will be negotiated to the highest possible. This means I don't have to throw away older, not yet depreciated, equipment. How does Ethernet play in this game? Well, it doesn't: 10G Ethernet is incompatible with 1G, so they don't marry up. You have to forklift your equipment out of the data centre and get new gear from top to bottom. How's that for investment protection? The network vendors will tell you this migration comes naturally with equipment refresh, but how is it a "natural process" if you refresh one or two director-class switches and the rest of your equipment can't connect to them? It means you have to buy additional gear to bridge between the old and the new, resulting in you paying even more. That is probably what is meant by "naturally": naturally you have to pay more.

So it's pretty obvious that Cisco needs to pursue this path if it is ever to get more traction in the data centre storage networking club. They've also proven this with UCS, which looks like it is falling off a cliff as well if you believe the publications in the blogosphere. Brocade is not pushing FCoE at all. The only reason they are in the FCoE game is to hedge their risk: if for some reason FCoE does take off, they can say they have products to support it. Brocade has no intention of giving up an 80 to 85% market share in Fibre Channel just to risk handing it over to the other side, being Cisco networking. Brocade's strategy is somewhat different from Cisco's. Both companies have outlined their ideas and plans on numerous occasions, so I'll leave that for you to read on their websites.

"What about the other vendors?" you'll say. Well, that's pretty simple. The array vendors couldn't care less. For them it's just another transport mechanism, like FC and iSCSI, and there is no gain or loss for them whether FCoE makes it or not. They won't tell you this to your face, of course. The connectivity vendors like Emulex and QLogic have to be on the train with Cisco as well as Brocade, but their main revenue comes from the server vendors who build products with Emulex or QLogic chips in them. If the server vendors demand an FCoE chip, either party builds one and is happy to sell it to any server vendor. For connectivity vendors it's just another revenue stream they tap into, and they cannot afford to sit outside a technology the competition is picking up. Given that chip development requires significant R&D, these vendors also have to market their kit to get some return on that investment. This is normal market dynamics.

"So what alternative do you have for a converged network?" was a question someone asked me a while ago. My response was: "Do you have a Fibre Channel infrastructure? If so, then you already have a converged network." Fibre Channel was designed from the ground up to transparently move data back and forth irrespective of the upper layer protocol, including TCP/IP. Unfortunately SCSI has become the most common one, but there is absolutely no reason why you couldn't add a networking driver and the IP protocol stack as well. I've done this many times and never had any trouble with it.

The questions now are: "Who do you believe?" and "How much risk am I willing to take to adopt FCoE?". I'm not on the sales side of the fence, nor am I in marketing. I work in a support role and have many of you on the phone when something goes wrong. My background is not in the academic world; I worked my way up, have been in many roles where I've seen technology evolve, and I know how to spot the bad ones. FCoE is one of them.

Comments are welcome.

Regards,
Erwin

HP ends Hitachi relationship

Well, this may be a bit premature, and I don't have any insight into Leo's agenda, but when you apply some common sense and logic you cannot draw any other conclusion than that this will happen within the foreseeable future. "And why would that be?" you say. "They (HP) have a fairly solid XP installed base, they seem to sell enough of it to make it profitable, and they have also embarked on the P9500 train."

Yes, indeed, but take a look at it from the other side. HP currently has four lines of storage products: the MSA, inherited through the Compaq merger, which comes out of Houston and is specifically targeted at the SMB market; the EVA, from the Digital/Compaq StorageWorks stable, which has been the only HP-owned modular array and has done well in the SME space; the XP/P9500, obviously through their Hitachi OEM relationship; and, since last year, the 3-Par kit. When you compare these products there is a lot of overlap in many areas, especially in the open systems space, so the R&D budgets for all four products eat up a fair amount of dollars. Besides that, HP also has to set aside a huge amount of money for sales, pre-sales, services and customer support in training, marketing and so on, to be able to offer solutions of which a customer will only ever choose the one that fits their needs. So just from a product perspective there is a 1:4 sales ratio, and that is not even mentioning the choices customers have from the competition. For the lower part of the pie (MSA and small EVA) HP relies heavily on its channel, but from a support and marketing perspective this still requires a significant investment to keep those product lines alive. HP has just released the latest generation of the EVA but, as far as I know, has not commented on future generations. It is to be expected that as long as the EVA sells like it always has, its development will continue.

With the acquisition of 3-Par last year HP dived very deep into its money pit and paid 2.3 billion dollars for it. You don't make such an investment just to keep a certain product out of the hands of a competitor (Dell in this case); you want this product to sell like hotcakes to earn back the investment as quickly as possible, and Leo has quite some shareholders to answer to. It then depends on where you get the most margin, and when you combine the need to recoup the 3-Par acquisition with the margins HP will obviously make on that product, it is very clear that HP will most likely prefer to sell 3-Par before the XP/P9500, even if the latter would be a better fit for the customer's needs. When you put it all together you'll notice that even within HP's storage division there is a fair amount of competition between the product lines, and none of their R&D departments wants to lose. So who needs to give?

There are two reasons why HP would not end its relationship with Hitachi: mainframe support and customer demand. Neither of the native HP products has mainframe support, so if HP ends the Hitachi relationship it will certainly lose that piece, and it runs the risk that those same customers choose the competition for the rest of the stack as well. Also, XP/P9500 customers who have already made significant investments in Hitachi-based products will most certainly not like a decision like this. HP, however, is not reluctant to make these harsh decisions; history proves they've done it before (abruptly ending an OEM relationship with EMC, for example).

So, if you are an HP customer who has just invested in Hitachi technology, rest assured that you will always have a fallback scenario, and that of course is to deal with Hitachi itself. Just broaden your vision and give HDS a call to see what they have to offer. You'll be very pleasantly surprised.

Regards,
Erwin

(post-note 18-05-2011) Some HP customers have already been told that 3-Par equipment is now indeed the preferred solution HP will offer, unless mainframe is involved.

(post-note 10-07-2011) Again, more proof is surfacing. See Chris Mellor's post on El Reg over here.

Will FCoE bring you more headaches?

Yes, it will!

A bit of a blunt statement, but here's why.

If you look at the presentations all the connectivity vendors (Brocade, Cisco, Emulex, etc.) will give you, they pitch FCoE as the best thing since sliced bread. Reduction in cost, cooling, cabling and complexity will solve all of today's problems! But is this really true?

Let's start with cost. Are the savings really as big as they promise? These days a 1G Ethernet server port sits on the motherboard and is more or less a freebie. The expectation is that the additional cost of 10GE will initially be added to a server's cost of goods but, as usual, will decline over time. Most servers come with multiple of these ports. On average a CNA is about twice as expensive as 2 GE ports plus 2 HBAs, so that's no reason to jump to FCoE. Every vendor has different price lists, so that's something you need to figure out yourself. The CAPEX is the easy part.

An FCoE-capable switch (CEE or FCF) is significantly more expensive than an Ethernet switch plus an FC switch. Also be aware that these are data centre switches, and the current port count on an FCoE switch is not sufficient to deploy large-scale infrastructures.

Then there is the so-called power and cooling benefit (?!?!?). I searched my butt off to find the power requirements of HBAs and CNAs but no vendor publishes them. I can't imagine an FC HBA chip eats more than 5 watts; a CNA will probably use more given that it runs at a higher clock speed, and for redundancy reasons you need two of them anyway, so in general I think they will roughly equate to the same power requirements, or an Ethernet+HBA combination may even be more efficient than CNAs. Now let's compare a Brocade 5000 (a 32-port FC switch) with a Brocade 8000 FCoE switch on BTU and power ratings. I used their own specs according to their data sheets, so if I made a mistake don't blame me.

A Brocade 5000 uses a maximum of 56 watts and has a BTU rating of 239 at 80% efficiency. An 8000 FCoE switch uses 206 watts when idle and 306 watts when in use, with a heat dissipation of 1044.11 BTU per hour. I struggled to find any benefit here. Now you can argue that you also need an Ethernet switch, but even if that has the same ratings as a 5000, the two separate switches together still use a lot less power and cooling than the single FCoE switch. I haven't checked the Cisco, Emulex and QLogic equipment but I assume I'm not far off on those either.
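
Putting the data sheet figures quoted above side by side makes the point. The assumption that a comparable Ethernet switch draws roughly what a 5000 does is mine, purely for the sake of argument.

    # Power comparison using the data-sheet figures quoted above. The guess
    # that a comparable Ethernet switch draws roughly what a Brocade 5000
    # does (~56 W) is an assumption.
    brocade_5000_w = 56          # 32-port FC switch, maximum draw
    ethernet_switch_w = 56       # assumed comparable Ethernet switch
    brocade_8000_w = 306         # FCoE switch, in use

    separate = brocade_5000_w + ethernet_switch_w
    print(f"FC + Ethernet separately: {separate} W")
    print(f"Single FCoE switch:       {brocade_8000_w} W")
    print(f"Difference: {brocade_8000_w - separate} W in favour of separate switches")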

Now, hang on, all vendors say there is a "huge benefit" in FCoE-based infrastructures. Yes, there is: you can reduce your cabling plant. But even there is a snag: you need very high quality cables, so an OM1 or OM2 cabling plant will not do. As a minimum you need OM3, and OM4 is preferred. Do you have this already? If so, good, you need less cabling; if not, buy a completely new plant.

Then there is complexity, also an FCoE sales pitch: "Everything is much easier and simpler to configure if you go with FCoE." Is it??? Where is the reduction in complexity when the only benefit is that you can get rid of some cabling? Once a cabling plant is in place you only need to administer the changes, and there is some extremely good and free software to do that. So even if you consider this a huge benefit, what do you get in return? A famous Dutch football player once said "Elk voordeel heb z'n nadeel" (that's Dutch with an Amsterdam dialect spelling :-)), which more or less means that every advantage has its disadvantage, i.e. there is a snag with each benefit.

The snag here is that you get all the nice features like CEE, DCBX, LLDP, ETS, PFC, FIP, FPMA and a lot more new terminology introduced into your storage and network environment (say what???). This more or less means that each of these abbreviations needs to be learned by your storage administrators as well as your network administrators, which means additional training requirements (and associated costs). This is not a replacement for your current training and knowledge; it comes on top of that.
Also, these settings are not a one-time setup that can be configured centrally on a switch; they need to be configured and managed per interface.

In my previous article I also mentioned the complete organisational overhaul needed between the storage and networking departments. From a technology standpoint these two "cultures" have a different mindset. Storage people need to know exactly what is going to hit their arrays, from the applications down to the operating systems, firmware, drivers and so on. Network people don't care: they have a horizontal view and transport IP packets from A to B irrespective of the content of those packets. If the pipe from A to B is not big enough they create a bigger pipe and off we go. In the storage world it doesn't work like that, as described before.

Then there is the support side of the fence. Let's assume you've adopted FCoE in your environment. Do you have everything in place to solve a problem when it occurs (mind the term "when", not "if")? Do you know exactly what it takes to troubleshoot a problem? Do you know how to collect logs the correct way? Have you ever seen a Fibre Channel trace captured by an analyzer? If so, were you able to make sense of it, actually pinpoint an issue if there is one and, more importantly, work out how to solve it? Did you ever look at fabric, switch or port statistics on a switch to verify whether something is wrong? For SNIA I wrote a tutorial (over here) in which I describe the overall issues support organisations face when a customer calls in for support, and also what to do about it. The thing is that network and storage environments are very complex. By combining them and adding all the three- and four-letter acronyms mentioned above, the complexity increases five-fold if not more. It therefore takes much, much longer to pinpoint an issue and advise on how to solve it.

I work in one of those support centres for a particular vendor and I see FC problems every day. Quite often they are due to administrator error, but far more often they are caused by a problem in software or hardware. These can be very obvious, like a cable problem, but in most cases the issue is not so clear and it takes a lot of skill, knowledge, technical information AND TIME to sort it out. By adding complexity it simply takes more time to collect and analyse the information and advise on resolution paths. I'm not saying it becomes undoable, it just takes more time. Are you prepared, and are you willing, to give your vendor that time to sort out issues?

Now, you probably think I must hold a major grudge against FCoE. On the contrary: I think FCoE is a great piece of technology, but it has been created for technology's sake and not to help you, as customer and administrator, really solve a problem. The entire storage industry is stacking protocols upon protocols to work around a very hard problem it created a long time ago. (Huhhh, why's that?)

Remember that today's storage infrastructure still runs on a three-decade-old protocol called SCSI (or SBCCS for z/OS, which is even older). Nothing wrong with that, but it implies that the shortcomings of this protocol need to be worked around. SCSI originally ran on a parallel bus that was 8 bits wide and hit performance limitations pretty quickly, so "wide SCSI" was created, running on a 16-bit bus. By increasing clock frequencies the speed was pumped up, but the problem of distance limitations became more pressing, and so Fibre Channel was invented. By decoupling the SCSI command set from the physical layer, the T10 committee came up with SCSI-3, which allowed the SCSI protocol to be transported over a serialised interface like FC, with a multitude of benefits in speed, distance and connectivity. The same thing happened with ESCON in the mainframe world. Both the ESCON command set (SBCCS, now known as FICON) and SCSI (on FC known as FCP) are able to run on the FC-4 layer. Since Ethernet back then was extremely lossy, it was no option for a strictly lossless channel protocol with low latency requirements. Now that Ethernet has been fixed up a bit to allow lossless transport over a relatively fast interface, the entire stack is mapped into a mini-jumbo frame: the SCSI command and data sit in an FCP information unit, which sits in an FC frame, which in turn sits in an Ethernet frame. (I still can't find the reduction in complexity; if you can, please let me know.)
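
To put some numbers on that nesting, here is a rough count of what gets wrapped around a single SCSI command by the time it hits the wire as FCoE. The byte counts are my reading of Ethernet II and FC-BB-5 and ignore FIP and any optional fields.

    # Approximate per-frame overhead of the FCoE nesting described above:
    # SCSI command -> FCP IU -> FC frame -> FCoE -> Ethernet. Byte counts are
    # my reading of Ethernet II and FC-BB-5 and ignore FIP/optional fields.
    layers = [
        ("Ethernet header (+ VLAN tag)", 14 + 4),
        ("FCoE header (version/reserved/SOF)", 14),
        ("FC frame header", 24),
        ("FC CRC", 4),
        ("FCoE EOF + padding", 4),
        ("Ethernet FCS", 4),
    ]
    overhead = sum(size for _, size in layers)
    payload = 2112                      # maximum FC data field
    for name, size in layers:
        print(f"{name:38s} {size:3d} bytes")
    print(f"{'Total overhead':38s} {overhead:3d} bytes")
    print(f"Frame on the wire: {overhead + payload} bytes, hence the mini-jumbo frames,")
    print("while a native FC frame carries the same payload with ~36 bytes of overhead.")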

Instead of introducing a fixer-upper like FCoE, the industry should have come up with an entirely new concept for managing, transporting and storing data, created around today's requirements, which include security (authentication and authorisation), retention, (de-)duplication, location independence and so on. Your data should reside in a container that is a unique entity at all levels, from the application down to the storage and every mechanism in between. This container should be treated according to the policy requirements encapsulated in that container, and those policies are based on the content residing in it. This would allow a multitude of properties to be applied to that container, as described above, and would allow for far more effective transport.
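
Purely as a thought experiment, such a container might look something like this; every field and policy below is invented for illustration and nothing of the sort exists today.

    # Thought experiment only: a self-describing data container that carries
    # its own policies, independent of where it is stored or transported.
    from dataclasses import dataclass, field
    from uuid import uuid4

    @dataclass
    class Policy:
        retention_days: int
        encrypt: bool
        dedupe: bool
        allowed_principals: list[str] = field(default_factory=list)

    @dataclass
    class DataContainer:
        container_id: str
        policy: Policy
        payload: bytes

        def may_read(self, principal: str) -> bool:
            # Authorisation travels with the data, not with the LUN or the port.
            return principal in self.policy.allowed_principals

    doc = DataContainer(
        container_id=str(uuid4()),
        policy=Policy(retention_days=2555, encrypt=True, dedupe=True,
                      allowed_principals=["payroll-app"]),
        payload=b"...",
    )
    print(doc.may_read("payroll-app"), doc.may_read("random-host"))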

Now this may sound like trying to boil the ocean, but try to think 10 years ahead. What will be beyond FCoE? Are we creating FCoEoXYZ? Five years ago I wrote a little piece called "The Future of Storage" which more or less introduced this concept. Since then nothing has happened in the industry to really solve the data growth issue. Instead the industry is stacking patch upon patch to circumvent current limitations (if any) or trying to generate a new revenue stream with something like FCoE.

Again, I don't hold anything against FCoE from a technology perspective, and I respect and admire what Silvano Gai and the others at T11 have accomplished in little over three years, but I think it's a major step in the wrong direction. It had the wrong starting point and it tries to answer a question nobody asked.

For all the above reasons I still do not advise adopting FCoE, and I urge you to push your vendors and their engineering teams to come up with something that will really help you run your business instead of patching up "issues" you might not even have.

Constructive comments are welcome.

Kind regards,
Erwin van Londen