EMC Boots 500 VMs in 5 minutes!

Wow!  Big congrats to EMC on this achievement!



<insert record needle scratching sound>

Welcome to 2007!

In 2007, we [at NetApp] did a 1200-client cold boot of 1024 Windows XP desktops in 14 minutes.  Looks like they might be a touch faster than that.

Last year at VMworld 2010, we heard some more chest-thumping from them about booting 1000 clients in 45 minutes.

We did 50,000.  Video here.  Reference Architecture here.

This year, we get yet another press release at VMworld 2011 (cited above) saying “500 client boot in 5 minutes” with:

  • Discussed as ‘early results’
  • No config or deployment details
  • Was it a warm boot (i.e. was FASTCache pre-warmed or the ESX cache warmed?)
  • Was it a cold boot from a standing start?
  • Was it really a boot, or was it just a “Log In” time, as is seen discussed throughout the referenced press release?
  • Do they know the difference between Login and Boot?  (Hey Look!  A Unicorn!)
  • What platform? (I’ll assume VNX?)
  • Was the platform loaded with SSDs to somehow achieve such a benchmark?
  • If so, what did this do to the cost of the “solution?”

Vaughn Stewart wrote a very thought-provoking post on this same topic back in June. 

Look, you’re all customers out there in a very competitive market.  I’m not trying to ride EMC too hard here.  They’re doing good stuff.  There’s a reason they are the current market leader.

Those who read my blog know I believe in transparency.  Here is a link to the Reference Architecture TR published by EMC.  Please read for yourself and formulate your own opinions!  But I will share with you mine.

According to the TR, they’ve used 2x 100GB SSDs (RAID 1)  configured as FAST Cache (~43GB per Controller), 25x SAS Drives, 9x NL-SAS, 1x SAS/SSD/NL-SAS spares, and different RAID Levels (1,5,6)  for their layout. The entire layout with the different colors and RAID levels is something out of a children’s Christmas book! (see pg 26 of the linked Ref Arch above)

What they do not seem to be divulging is the warm-up period required. Initially, with an empty FAST Cache, the response times are variable and in fact are worse than with no FAST Cache at all. Furthermore, depending on the size of the working set and its locality of reference, it can take from *minutes* to *hours* to “warm up.”

With high certainty, I would say that they had pre-warmed the cache when they ran the demo.  Because of its persistent nature, the data is still in the FAST Cache after a reboot. So for all we know, they could have run the test in Hopkinton and shipped the box to Vegas.

I have no reason to believe they are able to accomplish this in 5 minutes without a pre-warmed cache, and I have considerable doubts they can do it with an empty one.

But I think the most valuable point out of their TR is the layout, which shows the architectural lengths and considerable time required to properly configure this thing using Best Practices. Never mind the operational overhead required to manage this type of layout, or what sort of exercise you would have to go through to rebuild this whole thing if something went wonky.  (Yes, you’d basically have to tear it all down and start from scratch reconfiguring.)


So, let’s talk reality, and what we’re doing at NetApp.  Chris Gebhardt & Co. recently published NetApp TR-3949, detailing... let me say that again... DETAILING... exactly how we are able to boot 5000 Windows 7 VMs in 21 minutes, with NO SSD.  The other kicker is that this level of performance can be achieved with or without FlashCache.  Our abilities in this area have improved almost 80% since 2007!
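If you want to line up the headline numbers quoted in this post, here is a minimal sketch that normalizes each claim to boots per minute. The labels and the `boots_per_minute` and `summarize` helpers are my own illustrative names, and I am assuming the 2007 figure refers to the 1024 desktops. A raw rate says nothing about hardware, cache state, or boot versus login, so treat it as a rough yardstick only.

```python
# Headline figures as quoted in this post: (label, machines booted, minutes).
# Assumption: the 2007 result is counted as 1024 desktops.
CLAIMS = [
    ("NetApp 2007, Windows XP", 1024, 14),
    ("EMC VMworld 2010", 1000, 45),
    ("EMC VMworld 2011", 500, 5),
    ("NetApp TR-3949, Windows 7", 5000, 21),
]


def boots_per_minute(machines, minutes):
    """Average number of machines booted per minute of wall-clock time."""
    return machines / minutes


def summarize(claims):
    """Return (label, rate rounded to one decimal) for each claim."""
    return [(label, round(boots_per_minute(m, t), 1)) for label, m, t in claims]


if __name__ == "__main__":
    for label, rate in summarize(CLAIMS):
        print(f"{label:28s} {rate:6.1f} boots/min")
```

The rate is only an average over the whole run. It hides how the storage behaves in the first few minutes, which is exactly where a cold cache would hurt.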

The biggest thing you need to remember, Mr. Customer, is that there is so much more to a successful VDI deployment than boot times.  And we are the best at it.  Hands-down.

