Earlier in 2015, NetApp announced the results for the brand-new EF560 All-Flash Array. If you read the press release, we didn’t make a single mention of raw IOPS; instead, we focused on $/IOPS as well as the ridiculously low latencies we were able to achieve. We also had Mike Phelan on the NetApp podcast to go over the results, and he put an extreme focus on low latency there too. I’ll never forget him saying, “The race to zero microseconds is on!”
You’ll understand why by the end of this post, I promise. It’s a long one with a lot of data. Coffee and focus required.
Whether you want to believe in it or not, and whether some vendor who doesn’t participate tells you it’s irrelevant for some reason that conveniently makes them look bad, the Storage Performance Council is the top dawg when it comes to performance benchmarking of enterprise storage systems. There is always a lot of hemming and hawing over these results, their validity, how vendors game the system, “anyone can build a dragster,” etc., etc. It’s old and tired. So part of this post is not only going to be a brag, but also a breakdown of WHY these results matter, not just the results themselves. Why it’s not just about sheer IOPS delivered, but the cost in a $/IOPS figure with real workload simulations (not just random 4k reads), and the low latencies required to deliver those IOPS at those costs.
What is the Storage Performance Council?
The goal of the SPC is to serve as a catalyst for performance improvement in storage subsystems. It works to foster the free and open exchange of ideas and information, and to ensure fair and vigorous competition between vendors as a means of improving the products and services available to the general public.
In support of these goals, the SPC develops benchmarks focusing on storage subsystems. These subsystems include components like:
- electronic disks
- magnetic disks
- magnetic tapes
- optical disks
- media robots
- media robot software systems
- media library software systems
- backup/archival software systems
- hierarchical storage management systems
… as well as all the adapters, controllers, and networks that connect storage devices to the computer system.
Key Understanding of the SPC-1 Benchmark
Let’s review some common talking points about the SPC-1 right off the bat, and put to bed any myths or misconceptions that may be floating around out there, by truly defining what this benchmark encompasses.
(1) SPC-1 Simulates an OLTP workload
This is intentionally set up to be extremely repeatable, but it also tends to be a very “write-intensive” workload; far more so than just about any other benchmark. In a way, that invalidates a lot of the dragster/supercar comparisons out there. This test is not just about straight reads.
(For the record, Josh, I agree with everything you said, but this is not at all a BS test of 4k random reads)
(2) SPC-1 is not very “cache friendly”
Though not impossible, it is extremely difficult for vendors to “overengineer” their systems to effectively use cache to increase results, preventing vendors from gaming the system. This particular benchmark has been designed to simulate conditions that customers frequently encounter. This is done independently outside of any influence from storage vendors.
(3) SPC-1 reports “Value” as $/SPC-1 IOPS
Sure, the SPC-1 allows systems of different sizes to be compared to one another, but at the end of the day, it’s all about the value. Not every single storage system is alike, and most vendors worth their salt have more than one result showing a chain of value across their entire portfolio.
Smaller and less-costly systems are certainly going to produce lower performance on paper, and larger more costly systems are going to produce higher performance … BUT … it’s still all about the $/IOPS.
NetApp is no different.
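To make the value math concrete, here’s a minimal sketch in Python; the price and IOPS figures are made up purely for illustration, not pulled from any published SPC-1 result:

```python
# Hypothetical numbers for illustration only -- NOT taken from any published SPC-1 result.
total_tested_system_price = 125_000.00  # total price of the tested configuration (USD)
spc1_iops = 250_000                     # SPC-1 IOPS reported for that configuration

price_per_iops = total_tested_system_price / spc1_iops
print(f"${price_per_iops:.2f} per SPC-1 IOPS")  # -> $0.50 per SPC-1 IOPS
```

Same formula regardless of whether the box is a small hybrid or a monster all-flash cluster, which is exactly why it makes systems of different sizes comparable.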
(4) SPC-1 continuously tracks Response Time
You can build the biggest, baddest IOPS-churning dragster all.day.long, but the result is completely and utterly meaningless unless you also know the response time associated with the IOPS achieved.
Perspective: Would you rather have a system that does 2 Billion IOPS, but requires a bajillion disks and runs at 20ms latencies, or would you rather have a system that runs 650,000 IOPS in 2U that delivers it in <1ms?
Common Mistakes regarding SPC-1 Results
(1) Focusing solely on the final number
This is probably the most common. “OMG! Vendor XYZ did 2.2 Billion IOPS! LOL UR SLOW! ONTAP/WAFL/NetApp sucks!”
What you have to understand, in order for these tests to mean anything to you, is that this top-end number says nothing about how responsive the system will [or will not] be to application demands. That doesn’t discredit the test as a whole, though. Don’t jump there! I’m simply commenting on the “top speed of your dragster” and its irrelevance to your real-world datacenter. Those numbers aren’t captured for bragging rights; they’re captured to feed a $/IOPS value formula.
(2) Response Time!
Ask any app owner, DBA, Exchange admin, MSSQL engineer … it’s highly likely that your performance metric means nothing if the storage system cannot keep up with the response time requirements they and their applications demand. This is hands-down THE most important and consequential metric related to performance.
(3) RAID
Did you forget about RAID? I bet you did. It’s certainly not the first thing that comes to mind when talking about performance drag races, but it carries real overhead. Do you think every vendor used the most resilient configuration, or the most conveniently performant one? Again, these numbers mean jack squat if the tests were run on anything less resilient than a double-disk-parity solution. Why? Because it’s 99.999% unlikely you’re ever going to run anything less than that in production, regardless of vendor.
NetApp will always run RAID-DP in FAS.
(4) Capacity Used
Why would this matter? This is another one of the big mistakes … or should I say, one of the things most often overlooked. Without calling anyone out, I’ll simply tell you that you need to be sure you’re examining and understanding the amount of hardware required for a particular vendor to achieve its measured performance levels. Understanding the application utilization percentage and the unused storage ratio helps reveal efficiency and validate the test.
Bottom line, there are many things to keep in mind as we review the results below. If you’re unsure about any of these results or why they should matter to you, leave us a comment at the bottom, and we’ll be happy to address them!
Kieran … Dim the Lights … Here We Go!
I’m going to start by just throwing a chart up here of relevant results sorted by their SPC-1 IOPS and then we’ll dive into some analysis…
This is all information available in the full disclosure reports of each of these results, current as of April 22, 2015. We’re not disclosing anything that isn’t already publicly available. We’ve simply organized it so that it’s sorted by the total SPC-1 IOPS number, as well as highlighting some of the other key metrics, such as what we were calling attention to earlier in the post.
Before the chart, I referred to these as relevant results. To me, they’re relevant for several reasons, but primarily because these are often the vendors we find ourselves up against or compared to in competitive situations, or because they have simply posted some astounding results, and I wanted to call attention to them and discuss some additional points around performance and capacity as they relate to the benchmark.
As I was saying before, it’s often too easy to just look at the SPC-1 IOPS column and walk away, but you would be doing yourselves a disservice. So, if you’re willing to come along for the ride with me, let’s dive into the weeds a bit more …
Examining the Lowest Response Times (SPC-1 LRT)
The ultimate capability of a storage subsystem to provide minimum I/O request response times in an OLTP environment is documented by the SPC-1 LRT result.
The final reported SPC-1 LRT metric is computed as the Average Response Time of the 10% load level Test Run.
Bottom Line: The ultimate capability of a storage subsystem to provide minimum I/O request response times in the SPC-1 environment is documented by the SPC-1 LRT result. The lower (smaller) the LRT, the quicker, or more responsive, a system is considered to be.
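If you’re curious how that number is actually produced, here’s a minimal sketch; the response-time samples below are hypothetical, not from any actual test run:

```python
# The reported SPC-1 LRT is simply the average response time measured during the
# 10% load level test run. Sample values below are hypothetical, not from any result.
response_times_ms_at_10pct_load = [0.21, 0.19, 0.20, 0.22, 0.18]  # per-I/O response times (ms)

spc1_lrt_ms = sum(response_times_ms_at_10pct_load) / len(response_times_ms_at_10pct_load)
print(f"SPC-1 LRT: {spc1_lrt_ms:.2f} ms")  # -> 0.20 ms (200 microseconds)
```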
Let’s drill into this a little further. Since I’m a NetApp employee, I’m going to show the NetApp graphs and leave it to the other vendors to show or write about theirs if they so choose. Or I could simply tell you the information is freely available on the SPC website if you’re willing to dig a little, and if your storage vendor participates. Oh, they don’t? Tsk tsk …
First, let’s take a look at the EF560 all-flash array result from earlier this year …
This response time chart demonstrates a very desirable curve. The amazing performance was achieved with a single drive enclosure and a mere (24) 400GB solid state drives.
Among the systems being reported on, this one has the best $/SPC-1 IOPS with sub-millisecond latencies. Of all the systems we are examining, the EF560 has the lowest LRT, at only 180 µs. The response curve may look deceptively unattractive, but that is due to the expanded response-time scale along the left side.
100% of measured response time is delivered at less than one millisecond. Stew on that for a second. Have you asked your account rep about the E/EF-Series arrays yet? You should!
NEW! NetApp FAS8080EX All-Flash FAS
Today, SPC posted an official result for the All-Flash FAS configuration of our flagship 8080EX box, ranking 5th in total SPC-1 IOPS delivered [as of April 22, 2015], with awesome overall value and one of the best SPC-1 LRT numbers the industry has ever seen.
This response time chart demonstrates zero “hockey stick”. It is the ideal curve because there is NO CURVE.
The FAS8080EX has an extremely flat, very consistent response-time curve. It is linear from beginning to end, which makes it very easy to plan application deployments and workloads, and to build budgets. While the cost is higher than some of the less feature-rich systems, you must remember it is never about just a single number. FAS systems deliver multiple system efficiencies through very feature-rich software management, and this result is based on our true list price. We didn’t discount. Neither did Kaminario. Kudos to them for that. None of the vendors should be discounting, in my opinion, but unfortunately, discounting is allowed within the testing parameters. Yet another thing to keep in mind as you digest this torrent of information.
Examining Capacity and Value
If we take the SPC-1 LRT chart from earlier and build it out, we begin to expose some troubling information around the data points. Let’s add a column for total configured capacity …
It is very important to examine the size of the systems a vendor tests. The portion of capacity used in the benchmark is broken into three Application Storage Units (ASUs), which is where the SPC-1 workloads run. Typically, systems running the SPC-1 benchmark are configured in some variation of RAID 10, which often has a severe impact on usable capacity and value.
Bottom Line: The Storage Performance Council mandates that vendors meticulously document their storage configurations. Vendors are required to specify how much of the storage system they are using. This is important information because it helps demonstrate how efficient various systems truly are. I’ll let you go and find the configuration your favorite vendor used to test. I bet you’ll be shocked at just how much gear it took to achieve some of the numbers. We’ll get to that in a bit.
For now, let’s keep building this out and take a look at Application Utilization …
Application Utilization is determined by dividing the Total ASU Capacity by the total Physical Storage Capacity. This illustrates how much of the actual capacity in the storage system was used to achieve the benchmark result. You’re on the lookout here for larger percentages!
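As a quick, hypothetical illustration of that math (the capacities below are made up, not taken from any vendor’s full disclosure report):

```python
# Application Utilization = Total ASU Capacity / Physical Storage Capacity
# Hypothetical capacities (GB) -- real values live in each vendor's full disclosure report.
total_asu_capacity_gb = 3_000
physical_storage_capacity_gb = 5_000

application_utilization = total_asu_capacity_gb / physical_storage_capacity_gb
print(f"Application Utilization: {application_utilization:.0%}")  # -> 60%
```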
That 62% number for the FAS8080EX essentially means that we were using 62% of our total physical capacity to reach the numbers we did. A lower percentage means tons more gear and a much higher price, which is why you see some vendors heavily discounting their test results.
I can’t imagine only using 20-30% of my capacity in my storage array just to maintain performance characteristics…
Can you?
Moving on … Protected Application Utilization …
Protected Application Utilization is calculated as (Total ASU Capacity + Data Protection Capacity [typically 2x] - Unused Data Protection Capacity) / Physical Storage Capacity.
Bottom Line: This helps to demonstrate just how much of the total capacity is being eaten up through whatever RAID scheme is being used by each vendor.
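Here’s the same kind of hypothetical sketch for this metric; again, the capacities are invented for illustration, and the ~2x data protection figure simply reflects a mirrored (RAID 10-style) layout:

```python
# Protected Application Utilization =
#   (Total ASU Capacity + Data Protection Capacity - Unused Data Protection Capacity)
#     / Physical Storage Capacity
# Hypothetical capacities (GB). Data protection capacity is ~2x under a RAID 10 layout.
total_asu_capacity_gb = 3_000
data_protection_capacity_gb = 3_000
unused_data_protection_capacity_gb = 400
physical_storage_capacity_gb = 8_000

protected_application_utilization = (
    total_asu_capacity_gb
    + data_protection_capacity_gb
    - unused_data_protection_capacity_gb
) / physical_storage_capacity_gb
print(f"Protected Application Utilization: {protected_application_utilization:.0%}")  # -> 70%
```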
Which leads me to … Unused Storage Ratio
This is the heart of efficiency, right here, folks.
Unused Storage Ratio is: Total Unused Capacity / Physical Storage Capacity
As a hard line in the SPC-1 test, this may not exceed 45%. You’re looking for the smallest percentages of unused storage in the tests.
Bottom Line: This is an excellent way to quickly determine which systems are delivering the best capacity efficiency, and which systems must be needlessly large to achieve performance.
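And one last hypothetical sketch, this time including a check against the 45% hard limit (the capacity figures are, again, made up for illustration):

```python
# Unused Storage Ratio = Total Unused Capacity / Physical Storage Capacity
# SPC-1 rules say this ratio may not exceed 45%. Hypothetical capacities (GB).
total_unused_capacity_gb = 2_000
physical_storage_capacity_gb = 8_000

unused_storage_ratio = total_unused_capacity_gb / physical_storage_capacity_gb
print(f"Unused Storage Ratio: {unused_storage_ratio:.0%}")  # -> 25%
print("Within the 45% limit" if unused_storage_ratio <= 0.45 else "Exceeds the 45% limit")
```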
Final Thoughts
First, NetApp’s EF560 has officially set the bar and established a #1 ranking in $/SPC-1 IOPS for systems in the sub-millisecond arena, which is arguably most of them at this point. The EF560 is the PERFECT all-flash system for minimal overhead and all horsepower. If you’re looking for the best-value all-flash array to run the applications that need the most extreme performance, this is the absolute best you’re going to find in overall value and efficiency.
Second, NetApp delivered some of the most amazingly consistent performance with the “All-Flash FAS (AFF)” in an 8-node clustered Data ONTAP configuration. This is the sheer definition of no compromise, high performance, clustered scale-out shared storage, bringing along with it all of the advanced features you’ve known and loved for years.
We have the #1 Storage OS, the #1 replication technology, and the only unified-architecture All-Flash solution, and we can include hybrid and integrated data protection with any hypervisor integration. On top of that, we can throw in software-defined, scale-up, scale-out, X-as-a-service, converged, and whatever technology comes out next, with more resources behind it than any other competitive offering.
Consider the stake(s) officially in the ground.
Lastly, this is now public knowledge. Via these public benchmarks, NetApp has proven it has a highly successful portfolio with the right solutions to handle a vast variety of customer needs.
Additional Resources:
SPC-1 Full Disclosure Report for NetApp All-Flash FAS8080EX
SPC-1 Executive Summary for NetApp All-Flash FAS8080EX
Credits:
- The data used here was compiled from public reports via the Storage Performance Council results.
- Big thank you to the analysis team at NetApp for compiling all of this information and allowing me to share it with my readers!
Sources:
Per SPC regulatory requirements for publication, I am required to list the results from other vendors used in this post.
- Hitachi VSP G1000
- Kaminario K2
- IBM Power 780
- NetApp FAS8080 EX
- Hitachi VSP with Accel Flash
- IBM FlashSystem 840
- HP 3PAR StorServ 7400
- EMC XtremIO ISE 820 G3
- NetApp EF560
- Dell SC4020
UPDATE [20150504]: Added cited sources SPC-1 results
UPDATE [20150501]: Post brought back online with updates
UPDATE [20150430]: Updated with new and more accurate graphics tables
UPDATE [20150422]: Apparently there is some confusion about the ordering in this post. No matter, as the numbers don’t change; it’s simply a sorting issue that should be fixed, and we will make any changes necessary here to reflect SPC-1 updates. /Nick
Comments:
Nick – Great post. Great breakdown of the stats. I know this benchmark has been in the works for quite some time, but having auditors go over this stuff with a fine-tooth comb is a double-edged sword.
Up front – long-time NetApp employee here. That said, I am glad we’re finally in the position of being on the inside looking out for a change. I spoke to a lot of people yesterday as this news was breaking and I heard a lot of – well, the best way to characterize it for me was – whining. “Oh, we can’t post because our systems have ‘always on’ dedupe and compression and SPC-1 doesn’t allow us to post, and that’s silly because if we were allowed to post we would show everyone how good our stuff really is and… and… it’s simply not fair to use our absence against us…”
Alright, already. I get it: you made certain design choices with your product and now those design choices are working against you. You built something “from the ground up” and found out that flexibility and adaptability can be strengths and not complexities. Heh, NetApp has been there. We thought it was massively unfair that we couldn’t get on the Microsoft HCL. For years, we showed customers how to make a multitude of applications more resilient, secure, stable, and better-performing, and yet we couldn’t convince the application vendors themselves. We would whine and complain about the lists and clubs we weren’t allowed to join due to the way we did things. Well, everyone makes choices and there are implications to those choices.
I like being in front of the engineering purists in this discussion. They have to explain why they can’t do something; meanwhile, I can explain why I can. I like being in that position. I can’t wait for the “re-built from the ground up” campaigns that are sure to come as competitors look to participate in more discussions like this, versus whining that they can’t be a part of them in the first place.