What is the first question vendors get (or at least used to get) when a customer (non-technical) calls???
I’ll spare you the guesswork: “What does a TB of disks do at your place ??”. Usually I’ll goofle around for the cheapest disk at a local PC store and say “Well Sir, that would be about 80 dollars”. I then hear somebody falling of the chair, trying to get up again, reach for the phone and with a resonating voice asking “Why are your competitors so expensive then?”. “They most likely did not gave a direct answer to your question.”, I reply.
The thing is an HDD should be evaluated on multiple factors and when you spend 80 bucks on a 1TB disk you get capacity and that’s about it. Don’t expect performance or extended MTBF figures let alone all the stuff than comes with enterprise arrays like large caches, redundancy in every sense and a lot more. This is what makes up the price per GB.
“Ok, so why have disk drives become so much slower in the past couple of years?”. Well, they haven’t. The RPM, seek time and latency have remained the same over the last couple of years. The problem is that the capacity has increased so much that the so called “access density” has increased linearly so the disk has to service a massive amount of bytes with the same nominal IOPS capability.
I did some simple calculations which shows the decrease in performance on larger disks. I didn’t assume any raid or cache accelerators.
I first calculated a baseline based on a 100GB disk drive (I know, they don’t exist but it just for the calculations) with 500GB of data that I need to read or write.
The assumption was to have a 100% random read profile. Although the host can read or write in increments of 512 bytes IO size theoretically this doesn’t mean the disk will write this IO in one sequential stroke. An 8K host IO can be split up in the smallest supported sector size on disk which is currently around 512 bytes. (Don’t worry, every disk and array will optimize this but again this is just to show the nominal differences)
So when I have 100GB disk drive this translates to a little over 190 million sectors. In order to read 500 GB of data this would take a theoretical 21.7 minutes. The number of disks are calculated based on the capacity required for that 500GB (Also remember that disks use a base10 capacity value whereas operating systems,memory chips and other electronics use a base2 value so that’s 10^3 vs 2^10.)
|Sectors||RPM||Avrg delay in ms||Max IOPS||Disks required||6|
If you now take this baseline and map this to some previous and current disk types and capacities you can see the differences.
|GB||Sectors||RPM||# Disk per A29||Num IOPS||Time required in sec||in min||%pcnt of base line||* times base value|
You can see here that capacity wise to store the same 500GB on 146 GB disks you need less disks but you also get fewer total IOPS. This then translates into slower performance. As an example a 300GB drive with 10000RPM triples the time compared to the baseline disk to read this 500 gigabyte.
Now these a re relatively simple calculations however they do apply to all disks including the ones in your disk array.
I hope this also makes you start thinking about performance as well as capacity. I’m pretty sure your business finds it most annoying when your users need to get a cup of coffee after every database query. 🙂