The end of spinning disks (part 2)

Maybe you found the previous article a bit hypothetical, not substantiated by facts but merely based on some guesstimates?

To put some beef into the equation I’ll try to substantiate it with some simple calculations. Read on.


As shown in the Cornell University report, the expected amount of data generated will reach 1700 exabytes in 2011, growing further to 2500 exabytes in 2012. 1700 exabytes equates to 1.7 trillion gigabytes in US notation, or 1.7 billion gigabytes in EU long-scale notation (say what…? Look here).

So number-wise it looks like this: 1.700.000.000.000 GB

The average capacity of a disk drive in 2011 is around 1400 GB (the average of high-RPM enterprise drives at 600 GB and the largest commercially available enterprise HDD at 2 TB). In consumer land WD has a 6 TB drive, but these will not become mainstream until the end of 2011 or the beginning of 2012. Maybe storage vendors will use the 3 and 4 TB versions, but I have no visibility of that at the moment.

1700 EB / 1400 GB = 1.214.285.714 disk drives are needed to store this amount of information. (Oh, and in 2012 we need 1.785.714.286 units :-))

This leads us to have a look at production capabilities and HDD vendors. Currently there are two major vendors in the HDD market: Seagate (which shipped 50 million HDDs in FQ3 2011) and WD (which shipped 49 million). (WD is acquiring HGST and Seagate is taking over the HDD division of Samsung.) Those four companies combined have a production capacity of around 150 million disk drives per quarter. This means, on an annual basis, a shortage of: 1.214.285.714 – 600.000.000 = 614.285.714 HDDs.
So who says the HDD business isn’t a healthy one? 🙂
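
For those who want to check the arithmetic, here is the same back-of-envelope calculation as a small Python sketch (all figures are the estimates quoted above):

    # Back-of-envelope check of the numbers above (all figures are estimates).
    data_2011_eb = 1700              # estimated data generated in 2011 (exabytes)
    data_2012_eb = 2500              # estimated data generated in 2012 (exabytes)
    avg_drive_gb = 1400              # assumed average drive capacity in 2011 (GB)
    quarterly_production = 150e6     # combined output of the four HDD vendors (drives/quarter)

    eb_to_gb = 1_000_000_000         # 1 EB = 10^9 GB (decimal)
    drives_2011 = data_2011_eb * eb_to_gb / avg_drive_gb
    drives_2012 = data_2012_eb * eb_to_gb / avg_drive_gb
    annual_production = quarterly_production * 4

    print(f"Drives needed in 2011: {drives_2011:,.0f}")
    print(f"Drives needed in 2012: {drives_2012:,.0f}")
    print(f"Annual production    : {annual_production:,.0f}")
    print(f"Shortfall in 2011    : {drives_2011 - annual_production:,.0f}")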

OK, I agree, not everything is stored on HDD, and the offload to secondary media like DVD, Blu-ray, tape etc. will cut a significant piece out of this pie; however, the instantiation of new data will primarily be done on HDDs. Adoption of newer, larger-capacity HDDs is restricted for enterprise use because the access density, the amount of I/O that has to be served per gigabyte stored, becomes too high for a single spindle to handle as capacities grow, which equates to higher latency and lower performance, and that is not acceptable in these kinds of environments.
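
Put differently, the bigger the drive, the fewer IOPS it can deliver per gigabyte it stores. A rough sketch with ballpark figures (the IOPS numbers are my own assumptions, not vendor specs):

    # Rough access-density illustration (IOPS per GB); IOPS figures are ballpark assumptions.
    drives = {
        "15K RPM enterprise 600 GB": (600, 180),    # (capacity in GB, ~random IOPS)
        "7.2K RPM nearline 2 TB":    (2000, 80),
        "7.2K RPM nearline 4 TB":    (4000, 80),    # same spindle speed, double the capacity
    }

    for name, (cap_gb, iops) in drives.items():
        print(f"{name:28s} -> {iops / cap_gb:.3f} IOPS per GB")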

This means new techniques will need to be adopted in all areas. From a performance perspective a lot can be gained with SSDs (Solid State Drives), which have extremely good read performance but still lag somewhat in write performance as well as long-term reliability. I’m sure this will be resolved over time. SSDs will, however, not fill the capacity gap needed to accommodate the data growth.

As mentioned before, my view is that this gap can and will be filled by advanced 3D optical media, which provides new levels of capacity, performance, reliability and cost savings.

I’m open to constructive comments.

Cheers,
Erwin

The end of spinning disks

Did you ever wonder how long this industry will rely on spinning disk? I do, and I think that within 5 to 15 years we will have reached the end of disks’ ability to keep up with demand and data growth rates. A report from Andrey V. Makarenko of Cornell University estimates that around 1700 exabytes (yes, EXA-bytes) will be generated in 2011 alone, growing to over 2500 exabytes next year.

With new technologies invented and implemented in science, space exploration, health care and, last but not least, consumer electronics, this growth rate will increase exponentially. Although disk drive technology has kept pace with Moore’s law pretty well, you can see that the advances in the development of this technology are declining. Rotational speed has been steady for years and the limits of perpendicular recording have almost been reached. This means that within the foreseeable future there will be a tipping point where demand outgrows capacity. Even if production facilities were scaled up to keep up with demand, do we as a society want these massive infrastructures, which are very expensive to build and maintain and place a huge burden on our environment? So where does this leave us? Do we have to stop generating data, generate it in a far more efficient way, or combine this with aggressive data life cycle management? I wrote an article earlier in this blog which shows how this could be achieved, and it doesn’t take a scientist to understand it.
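
As a thought experiment, the small projection below shows why I think that tipping point is not far away. The growth rates and the constant production figure are my own assumptions for illustration, not numbers from the report:

    # Thought experiment: data generated per year vs. what annual disk production can store.
    # Growth rates and the constant production figure are assumptions for illustration only.
    demand_eb       = 1700     # data generated in 2011 (EB)
    drive_tb        = 1.4      # average drive capacity in 2011 (TB)
    demand_growth   = 1.45     # assumed ~45% more data generated each year
    capacity_growth = 1.35     # assumed ~35% yearly gain in capacity per drive
    production      = 600e6    # drives produced per year, held constant

    for year in range(0, 11, 2):
        generated = demand_eb * demand_growth ** year      # EB generated that year
        per_drive = drive_tb * capacity_growth ** year     # TB per drive that year
        needed    = generated * 1e6 / per_drive            # 1 EB = 1e6 TB
        print(f"{2011 + year}: need {needed / 1e9:5.2f} billion drives, "
              f"can build {production / 1e9:.2f} billion")

Even with per-drive capacity growing aggressively, the number of drives you would have to build keeps climbing, and that is before counting the factories, floor space, power and cooling behind them.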
To go back to the subject: there is talk that SSDs will take over a significant share of magnetic drives, and maybe they will, but they still lack reliability in one form or another. I’m sure this will be resolved in the not-so-distant future, but will this technology be as cost-effective as spinning disks have been over the last decades? I think it will take a significant amount of time to reach that point. So where do we go from here? My take is that, in addition to the uptake of SSD-based drives, significant advances will be made in 3D optical storage. This will not only allow a massive increase in capacity per cubic inch, but also a reduction in cost and energy, as well as a massive increase in performance.
Advances in laser technology, photonics and optical media will clear the path to adoption in data centres the moment this becomes commercially attractive.

There are numerous scientific studies as well as commercial entities working on this type of technology, and market demand adds significant pressure to its development. Check out the Wikipedia article on 3D optical storage for more information on the technicalities.

Let me know your opinion.

Regards,
Erwin van Londen

Fibre Channel improvements

So what is the problem with storage networking these days? Some of you might argue that it’s the best thing since sliced bread and the most stable way to shove data back and forth, and maybe it is, but this is not always the case. The problem is that some gaps still exist which have never been addressed, and one of them is resiliency. A lot has been done to detect errors and to try to recover from them, but nobody ever thought about how to prevent errors from occurring. (Until now, that is.) Read on.

So how does a standard like Fibre Channel evolve? It is normally born out of a need that isn’t addressed by current technologies. The primary reason FC was created is that the parallel SCSI stack had a huge problem with distance. It did not scale beyond a couple of metres and was very sensitive to electrical noise, which could disturb the reliable transmission needed for a data-intensive channel protocol like SCSI. So somebody came up with the idea to serialise the data stream, and FC was born. A lot of very smart people got together and cooked up the nifty things we now take for granted, like a massive address space, zoning, huge increases in speed and lots of other goodies which could never have been achieved with a parallel interface.

The problem is that these goodies are all created in the dark dungeons of R&D labs. These guys don’t speak much (if at all) to end-user customers, so the stuff coming out of these labs is very often extremely geeky.
If you follow a path from the creation of a new thing (whether technology or anything else) you see something like this:

  1. Market demand
  2. R&D
  3. Product
  4. Sales
  5. Customers
  6. Post sales support

The problem is that very often there is no link between #5/#6 and #2. Very often for good reason, but this also introduces some serious challenges. Since I’m not smart enough to work in #2, I’m at the bottom of the food chain working in #6. 🙂 But I do see the issues that arise along this path, so I cooked something up. Read on.

Going back to Fibre Channel, there is one huge gap, and that is fault tolerance and acting upon failures in an FC fabric. The protocol defines how to detect errors and how to try to recover from them, but it does not define anything about how to prevent errors from recurring. This means that if an error has been detected and frames get lost, we just say "OK, let’s try it again and see if it succeeds now". It doesn’t take a genius to see that if something is broken, this will fail again.

So on the practical side there are a couple of things that most often go wrong, and that is the physical side of things, like SFPs and cables. These result in errors such as encoding/decoding failures, CRC errors, and signal and synchronisation errors. If these occur, the entire frame, including your data payload, gets dropped, and we ask the initiator of that frame to resend it. If, however, the initiator no longer has this frame in its buffers, we rely on the upper-layer protocol to recover. Most of the time it succeeds; however, as previously mentioned, if things are really broken this will fail again. From an operating system perspective you will see this as SCSI check conditions and/or read/write failures. In a tape environment this will often result in failed backup/restore jobs.
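
To see why retrying alone can’t save you when a component is really broken, here is a toy simulation (the frame-loss probabilities are made up): on a healthy link the occasional corrupted frame is recovered by a retry, but on a marginal or broken link the chance that the frame and all its retries fail becomes very real.

    import random

    def send_with_retries(p_frame_loss, retries=3):
        # Toy model: each transmission attempt is lost with probability p_frame_loss.
        for attempt in range(retries + 1):
            if random.random() > p_frame_loss:
                return True     # the frame (or one of its retries) got through
        return False            # every attempt failed -> the upper layer sees an error

    def failure_rate(p_frame_loss, frames=100_000):
        failed = sum(not send_with_retries(p_frame_loss) for _ in range(frames))
        return failed / frames

    random.seed(42)
    for label, p in [("healthy link", 0.0001), ("marginal SFP", 0.2), ("broken cable", 0.9)]:
        print(f"{label}: {failure_rate(p):.3%} of I/Os still fail after retries")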

Now, you’re going to say "Hold on buddy, that’s why we have dual redundant fabrics, multiple paths to our LUNs, multipathing, etc.", i.e. redundancy. True, BUT what if it is just partially broken? A dodgy SFP or HBA might send out mostly good signals but also a certain amount of not-so-good signals. This results in intermittent failures, leading to the above-mentioned errors, and if these happen often enough you will get these problems. So, although you have every piece of the storage puzzle redundant, you might still run into problems which, if severe enough, can affect your entire storage infrastructure. (And it does happen, believe me.)

The underlying problem is that there is no communication between N-Ports and F-Ports about link quality, and there is no end-to-end path error verification to check whether these errors occur in the fabric and, if so, how to mitigate or circumvent them. If an N-Port sends out a signal to an F-Port which gets corrupted along the way, there is no way for the F-Port to notify the N-Port and say "Hey dude, you’re sending out crap, do something about it". There is a similar issue in meshed fabrics. Since 1998 we have all grown up with FSPF (Fabric Shortest Path First), an FC protocol extension to determine the shortest path from A to B in an FC fabric based on a least-cost routing algorithm. Nothing wrong with that, but what if this path is very error-prone? Does the fabric have any means to make a decision and say "OK, I don’t trust this path, I’ll direct that traffic via another route"? No, there is nothing in the FC protocol which provides this option. The only way routes are redefined is when there are changes in the fabric, such as an N-Port coming online/offline, registering/de-registering itself with the fabric name server, and RSCNs (Registered State Change Notifications) being sent out.
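
To make that a bit more concrete, below is a small sketch (my own illustration in Python, with made-up switch names, link costs and error counts; this is not how FSPF is specified or implemented) of plain least-cost path selection versus a hypothetical error-aware variant that penalises a flaky ISL:

    import heapq

    def least_cost_path(links, src, dst):
        # Plain least-cost (Dijkstra-style) path selection, roughly what FSPF
        # does with static link costs.
        graph = {}
        for (a, b), cost in links.items():
            graph.setdefault(a, []).append((b, cost))
            graph.setdefault(b, []).append((a, cost))
        queue, seen = [(0, src, [src])], set()
        while queue:
            cost, node, path = heapq.heappop(queue)
            if node == dst:
                return cost, path
            if node in seen:
                continue
            seen.add(node)
            for nxt, c in graph.get(node, []):
                if nxt not in seen:
                    heapq.heappush(queue, (cost + c, nxt, path + [nxt]))
        return None

    # Hypothetical 4-switch fabric; both paths SW1->SW4 have equal static cost.
    links = {("SW1", "SW2"): 500, ("SW2", "SW4"): 500,
             ("SW1", "SW3"): 500, ("SW3", "SW4"): 500}

    # Observed CRC errors per ISL (made-up numbers); FSPF has no notion of these.
    crc_errors = {("SW1", "SW2"): 0, ("SW2", "SW4"): 4000,
                  ("SW1", "SW3"): 2, ("SW3", "SW4"): 1}

    print("FSPF view:       ", least_cost_path(links, "SW1", "SW4"))

    # What an error-aware fabric *could* do: penalise links by observed errors,
    # so traffic is steered away from the flaky ISL.
    penalised = {l: c + 10 * crc_errors[l] for l, c in links.items()}
    print("Error-aware view:", least_cost_path(penalised, "SW1", "SW4"))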

For this reason I submitted a proposal to the T11 committee, via my teacher and the father of Fibre Channel, Horst Truestedt, to extend the FC-GS services with new ways to solve these problems. (The proposal can be downloaded here.)

The underlying thoughts are to have port-to-port communication so each side can notify the other that the link is not stable, as well as an end-to-end error verification and notification algorithm so that hosts, HBAs and fabrics can act upon errors seen in the path to their end devices. This allows active redirection of frames so they avoid passing via that route, as well as the option to extend management capabilities so that storage administrators can act on these failures and replace or update hardware and/or software before the problem becomes imminent and affects the overall stability of the storage infrastructure. In the end this will result in far greater storage availability and application uptime, and it will prevent all the other nasty stuff like data corruption.
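
As a toy illustration of the first part of that idea (my own sketch, not the actual FC-GS proposal; the thresholds and error types are made up), a port could keep a sliding window of link errors and notify its peer or flag the path once a threshold is crossed:

    from collections import deque
    import time

    class LinkMonitor:
        # Toy sketch: flag a port as unstable once the number of link errors in a
        # sliding time window crosses a threshold, so the peer port / fabric can
        # be notified before things get worse.
        def __init__(self, threshold=5, window_seconds=60.0):
            self.threshold = threshold      # hypothetical: 5 errors...
            self.window = window_seconds    # ...within 60 seconds
            self.events = deque()

        def record_error(self, kind, now=None):
            now = time.time() if now is None else now
            self.events.append((now, kind))
            # Drop events that have fallen out of the window.
            while self.events and now - self.events[0][0] > self.window:
                self.events.popleft()
            return len(self.events) >= self.threshold

    # Simulated stream of link errors on one port (timestamps in seconds).
    mon = LinkMonitor()
    for t, kind in [(1, "crc"), (5, "crc"), (12, "loss_of_sync"), (20, "crc"), (31, "crc")]:
        if mon.record_error(kind, now=t):
            print(f"t={t}s: threshold exceeded -> notify peer port / mark path suspect")
        else:
            print(f"t={t}s: {kind} recorded, link still considered healthy")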

The proposal was positively received, with an 8:0 vote, so now I’m waiting for a company to take this further and actually start developing this extension.

Let me know what you think.

Regards
Erwin

SNW 2011 Thursday

Today was the last day of the spring edition of SNW. I think it’s been a good conference, although the customer attendee numbers could have been better. I think these days the vendors are all keeping their gigs to themselves to prevent customers from wandering around and seeing other solutions as well; after all, SNW is an event from Computerworld and SNIA.

This conference revolved around three major subjects: cloud, virtualisation and convergence. The funny thing is that I have the impression people left the conference with more questions than answers. A lot of technologies are not fleshed out, and nobody seems to have an accurate description of what cloud actually is. There is clear evidence that customers are not ready to adopt new technologies, see here. Earlier in the week Greg Schulz and I had a chat, and I mentioned that to me cloud computing is the total abstraction of business applications from the infrastructure, yet it seems that when I put my ear to the ground everyone comes up with another definition which suits their own needs.

Anyway, today was also my turn for a presso, and I had a pretty good attendee turnout considering it was the last day. Let’s wait for the evaluations. I tried to keep it tangible, with issues people are facing today in their data centres, and gave some tips and hints on how to overcome them. I had another presso lined up for submission but it was not scheduled for a slot. Both of them are downloadable from the SNIA tutorial website, here and here. I do appreciate feedback, so don’t hesitate to comment below.

At noon we went for lunch in a little Mexican restaurant just on the east side of Santa Clara. Had a great time with Marc Farley, J Michel Metz and Greg Knieriemen. I hope to see them again soon. Great guys to hang around with.

Tomorrow is travel time for me again, back to Down Under, after which I wind down and relax a bit with my family in Fiji.

That wraps up the reporting from my side covering SNW Spring 2011.

Keep in touch.

Cheers
Erwin

Is there anything Linux does not have??

I’ve been using Linux since 1997, and back in the "good old days" it could take weeks to get a proper setup which actually had some functionality beyond the Royal Kingdom of Geekness. It was a teeth-pulling exercise to get the correct firmware and drivers for a multitude of equipment, so if a driver didn’t exist you were relying on the willingness of hardware vendors to open up their specs so you could work on it yourself.


So much has changed over the last 15 years; even my refrigerator and phone run Linux, as do the Large Hadron Collider and even space stations. A vast number of manufacturing consortiums are actively developing on and for Linux, and it looks like the entire IT industry is shifting from proprietary operating systems to this little open-source project Mr. Torvalds kicked off almost two decades ago. His fellowship in the IT Hall of Fame is well deserved.

One area where Linux is hardly seen is the regular desktop in people’s home offices, and this is one of the big shortfalls Linux still has. All of the above-mentioned examples are really specialised and tailored environments where Linux can be "easily" adapted to suit exactly that particular need, and it does an incredible job at it. The people who use Linux either have a more than average interest in computing or fall into the coke and chips/pizza category (yes, geeks, that is). Just walk into a computer store and ask for a PC or laptop, but have them remove the Windows operating system, subtract the MS licence fee from the invoice and ask for Fedora/Ubuntu/"name one of the hundreds of distros" to be installed instead. Chances are fairly high you’ll get some glaring eyes staring at you. This is the big problem Linux faces.

From a hardware support perspective most of it is fairly well covered. Maybe not under open-source licences, but from a usability perspective this doesn’t really matter.

Although the Linux Foundation does a good job of promoting and evangelising Linux, it will never have the operational and financial power of companies like Microsoft, so a commercial head-on attack is doomed to fail. The best approach, I think, although admittedly long-term thinking, is via the educational system: make sure young children get in touch with different operating systems so they can choose what to use in the future. I recently knocked Windows off my somewhat older laptop and installed Ubuntu. My kids are now using it for all sorts of things. My son discovered the command line and he’s getting curious. (He thinks he’s smart, so I use SELinux; pretty annoying for him :-))
The thought behind all this is that they also get another view of what computers can do, and that there is more than MS.

As for day-to-day apps, I think Linux still falls short on office automation. Regarding functions and features it still can’t compete with MS, but the catch-up game has begun.

Cheers
Erwin van Londen

SNW 2011 Wednesday

Wednesday started off with four keynotes and the Best Practices awards; some of them gave useful insight into data centre transformation and how to approach it. They also, again, touched on the organisational challenges that the inevitable change will incur if you want, or need, to adopt new strategies.

After the break I had a good conversation with Greg Schulz of StorageIO. (Very nice guy; if you have the chance, go look him up some time.) We shared some insights and discussed primarily the ongoing question of where cloud computing in general can or should reside and how to attack the challenges. After that we both joined the session of J Michel Metz, known in short as J. J is the product manager for FCoE at Cisco, and he elaborated on the options you now have with this new technology. I’ve attended a couple of the FCoE sessions, and in the end everyone seems to agree that although the technology is there, it will be extremely hard to adopt since it requires an immense amount of organisational change to take away the fear, risk and mindset of the people who have worked on these very different technologies. Well, let me rephrase that: the technologies in themselves are not that different with respect to packaging frames and switching or routing them through a network, but the purposes these technologies serve are totally different. I wrote about this before in the article "Why FCoE will die a silent death". I very much like the technology and admire what Silvano has accomplished, but it will be up to the customer to decide whether or not to adopt it. The surprising thing I got out of these sessions is that although most vendors acknowledge Fibre Channel will be around for quite some time, they also hint that its development will slow down significantly. All development effort these days seems to go into Ethernet, so at some point in time customers won’t have this choice anymore. Only time will tell if customers will accept that. I think it still requires a lot of engineering effort to fill in the gaps that remain and to overcome reliability and stability issues.
After lunch I joined a Wikibon panel session with FCoE advocates from the different vendors. It’s quite obvious they want you to adopt this, and although they don’t admit it, they can’t afford for FCoE to fail. It has been on the market for two years now and it looks like the adoption ratio is still extremely low, or at least not as good as they wanted it to be. They also have to be very careful not to step on toes, so the general sense was: start now, but do it in a controlled manner to protect your current assets and increase over time with natural technology refreshes.

The last breakout session of the day for me was the one with @YoClaus Mickelsen, @virtualheff Mike Heffernan and Jerry Sutton from TSYS, who elaborated on end-to-end server-to-storage virtualisation and the advantages HDS has to offer. Very good session with good questions and feedback. I also finally met Michael Jaslowski, who works for Northern Trust. I have had the pleasure of analysing and fixing some of the issues they’ve had in the past on their SAN infrastructure. It’s always good to meet up with people you normally only speak to on the phone.

To close the day, Miki Sandorfi had a general session on taking things at your own pace and looking out for good solutions before adopting cloud technologies. Keep an open mind and explore what’s right for your business.

See ya tomorrow.

Cheers
E

SNW 2011 Tuesday

The second day at SNWUSA kicked off with three keynotes. One was from Randy Mott, CIO of HP, who had a very good session on how they changed their internal infrastructure and processes to increase efficiency and the ability to innovate. The second was from my boss, or actually my boss’s boss’s boss, so that gives some perspective on where I fit in the organisation :-), Jack Domme, who entered the world of massive data storage, preservation, and the ability to harvest and analyse these enormous amounts of data.


After that I went to the expo to check on the guys in the HDS booth and the setup of the demos. Always very nice to see all the software features and functions we provide.
My first breakout session was from John Webster, who provided an overview of "Big Data". Now what the heck is big data? John went through some definitions, products and some freaky stuff that’s on the horizon. Then we’re talking about brain implants and things you really don’t want to talk or think about, since they might touch the real essence of life. Anyway, when we got our feet back on the ground again, I went over to Kevin Leahy, who runs the cloud strategy at IBM’s CTO office. I think this was the best-visited breakout session; some people actually had no seat. As Kevin pointed out, there are a lot of points that need to be looked at when either adopting or providing a cloud service. But on the other side it also gives you a lot of flexibility and options to provide IT services to your business and customers.

During lunch I received a message from Horst Truestedt, who presented my FC-GS proposal on Fibre Channel resiliency at the T11 committee meeting in Philadelphia. He mentioned it was extremely well received and got accepted unanimously. This was the best news of the day. I’ve been waiting a long time for this and it seems it’s coming now. Yeahh.

After lunch I had the opportunity to meet a lot of people who were here long before I got into this business, which brought some pretty interesting discussions into the equation. Because of this I missed a session, so I’ll have to dig up the presso from the SNIA website.

The last two keynotes were from Krishna Nathan, who runs IBM’s storage development; he had some interesting points, but also some things that are already in the marketplace and have been done and dusted by HDS for a long time. I’m wondering if IBM is trying to play catch-up here. The last session was from Dave Smoley, who runs internal IT at Flextronics. His message was that even with a very tight budget and well-defined standards, innovation doesn’t have to suffer. Instead he encourages companies to push their employees to be creative and come up with ideas that might have high value to the business but not necessarily cost an arm and a leg.

During dinner I met up with Robin Harris, who runs the StorageMojo blog and a column on ZDNet, where we had some interesting discussions around SSDs and the life expectancy of disks. Not how long a single disk or SSD keeps running; we were questioning when disk would become obsolete. Robin thinks he’s going to be retired long before disks go out of existence, whereas my view is that disks will cease to exist within 10 to 15 years. We’ll see, time will tell. 🙂

A lot of buzz has been going around regarding #storagebeers. It’s a social gathering of people of all shapes, sizes, backgrounds and mindsets, all working in the storage industry. It’s a great event with a fantastic turnout. Word goes around that this was the best #storagebeers ever. I can’t tell, I haven’t been to one before, but it was indeed a great evening.

 THE Marc Farley aka StorageRap at #storagebeers.

The best part is that there is absolutely no competitive mindset; everyone talks to everyone about everything but storage. (Well, a little bit, but all geek stuff. :-))

As you can see on the right, a great turnout.

That more-or-less sums up my day.

CU tomorrow.
Cheers, E

SNW 2011 Monday

The usual Monday at a storage conference. It’s quite obvious all vendors are pushing cloud storage; it seems to be the next big thing, but what I couldn’t find out is what cloud storage actually is. It seems all vendors have a different mindset and try to align their product set with the very cloudy definition of cloud storage. The second big thing is converged networking. Well, you know my thoughts on that: mixing two totally different cultures is doomed to fail, but we’ll see in the long term. The third big thing is solid state disks. Not new, and I’m wondering what the fuss is all about. Just another layer of blocks which happens to be very fast, but only for specific purposes. When the pricing comes down significantly and vendors are able to sort out the back-end performance, we’ll see more adoption, I guess.

I had a great day though. I met some of the most notorious bloggers, like Stephen Foskett, Stu Miniman, Marc Farley, Calvin Zito and others. Marc took us out for lunch at a sushi restaurant in downtown San Jose together with Calvin, Enrico and Fabio.

From left to right Calvin Zito, Marc Farley, Fabio Raposseli and Enrico Signoretti.

The day started off for me with a session on storage security by Gordon Arnold from IBM. It’s very clear that the biggest problem is, again, not the technical side but the legislative side, and how to adapt the technology to that legislation. Some countries are very restrictive on data locality, recoverability, authenticity and other labels you can hook this up to. There are quite a few challenges in this particular arena.

David Dale from NetApp followed with a session on solid state, with some very nice comparisons of price per IO and price per capacity. As always these two don’t go well together, so again, once the pricing on SSDs is sorted we’ll see some more adoption in the marketplace.

Dennis Martin from Demartek had a session in which he outlined the benefits and current landscape of FCoE. I think he and I both agreed that the organisational issues as well as the post-sales support issues might have big ramifications for the adoption of FCoE in the market. Generally speaking the storage guys don’t like the networking guys and vice versa. It’s a different mindset.

After that it was time for The Woz (and that’s not an acronym for The Wizard of Oz :-)): Steve Wozniak, co-founder of Apple and these days tied to Fusion-io. What can you say about the man who more or less invented the PC as we know it? He had some good jokes, still drives people to think about technology, and encourages students to discover its capabilities.

After lunch I had a catch-up with Virtual Instruments, but unfortunately there was nothing new for me. The day concluded with a small cocktail party and some chats with the people around. It is still a very small business and everyone knows everyone, but it’s good to see so many new things coming, although I still think a lot of it has to be formalised and standardised before customers should pick up any of it. It’s still a lot of buzz but no real meat that will actually fix stuff.

The Apple iPad equivalent of storage has yet to be found, and I still don’t see any vendor able to solve the puzzle. Maybe the next couple of days will bring more. At least we have #storagebeers coming up. 🙂

See ya tomorrow.

Cheers,
E

SNWUSA – The tourist sunday

Yesterday I met up with Mike Heffernan, aka @virtualheff, and had some dinner in Palo Alto. (Oh, btw Mike, thanks for picking up the bill.) We discussed the usual technicalities we could do better, so it was a useful brainstorming session as well.


Today I first went to the Computer History Museum in Mountain View. Very nice to see the legacy and evolution of computer science. After that I drove to San Francisco and spent some time over there. Very nice city, but you don’t want to ride a bike there. Man, its hills are steep. So you do the usual tourist stuff, and here are some pictures.

 Google’s First.

 Doesn’t need any intro. Unfortunately the tickets were sold out so I couldn’t visit this time.

Maybe the most famous bridge in the world.
More pictures are posted on Fotki and here.
Tomorrow SNW is going to kick off. I’m very excited and I hope it’s going to be worth the trip. I’m curious how many people have registered for my presso on Thursday.
Cheers
Erwin van Londen

SNWUSA Pre-conference schedule

As this week at SNWUSA in Santa Clara is going to be fairly interesting, I thought I might give some short daily updates. In the twittersphere there is already some great noise being spread, with all the schedules around #storagebeers and other social interactions. Yesterday was travel day for me. Getting on a massive Airbus A380 is always quite an experience. Since I used my points on other gadgets (mainly for my kids),
I had to fly tourist class, and 14 hours cranked into a chair is not the best way to travel, but on an A380 it’s still doable. At least they have a pretty good entertainment (or annoyance-distraction) system in that aircraft, so time "flew" by pretty fast. The flying Kangaroo brought me safely to Los Angeles.
After a short wait at LAX, the next plane (an Embraer, what a difference) took me to San Jose, and I stopped by the office on Central Expressway to catch up on some emails and check out the HDS corporate HQ.

Tonight I’m going to have a catch-up with buddy Heff (no, not that Heff; Michael Heffernan, or @virtualheff for insiders) and have some dinner.

I think it’s going to be a good and exciting week.

More to come.

Cheers
E