Data consolidation will be the next big move for IT departments to regain control of their data.

I’m back! It’s time to dust off DatacenterDude.com, and get back to work. After spending the first few weeks in CohesityHQ assimilating and driving our first VMworld, I have many a thought to share with you from my interactions at the show. One final thanks to the community for your massive outreach and support the last few months. I’m kidding…I’ll never stop thanking the community! -Nick

I’ll start off this post with a tweet from the Founder and CEO of Puppet Labs …

https://twitter.com/puppetmasterd/status/638917490702053376

Anyone who has attended VMworld the last couple of years has noticed the absolute sprawl of storage vendors and solutions available to attendees on the floor of the Solutions Exchange. We had a running internal joke at NetApp a few years ago that VMworld was our equivalent of “NetApp World,” because we did not have our own show, and NetApp Insight was not open to the public yet. While I am a geeky fan of shiny new things, this kind of sprawl is not the right way forward.

And if you’ll humor me, I’ll take you down the rabbit hole with me.

The 2000s, Virtualization, and Server Consolidation

Let’s travel back in time to 2006. Roughly ten years ago, VMware introduced ESX 3.0 with VirtualCenter. A couple of years later, we had vSphere and vCenter. Over the last decade, VMware, Microsoft, and the various pieces of the Linux community made server virtualization mainstream.

Millions of users around the world began to consolidate their servers into virtual instances stacked up onto physical hosts. But to understand WHY they did this, we need to go back to the fundamentals of server management, and to what drove hypervisor platforms to success and mad adoption around the world.

From the 90s and into the 2000s, servers were still very much segregated on a per-app or per-use-case basis, and were often accompanied by a piece of storage, typically fibre-attached as either a DAS brick/JBOD or a separate storage array (think Symmetrix/Clariion/Hitachi Thunderbolt, etc). We made do for a while, but over time, as data grew into longer life cycles, and as drive technology grew exponentially in capacity, we saw people begin to keep higher amounts of data that were larger in size and quality for longer periods of time.

This presented all sorts of scaling issues around server procurement. The vendors loved it. 90% of the time, only 10% of the server’s capability was used, but we had to design for the “worst-case” scenario where that big batch would run at 3AM, and that server better be able to handle it.

What if we could create a compute “pool” of physical resources and abstract the management and provisioning layer on top of that larger, more robust pool of resources?
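To put rough numbers on that idea, here’s a quick back-of-the-napkin sketch in Python. Everything in it is hypothetical: the server count, the load figures, and especially the assumption that each server’s batch job runs in a different hour. It simply compares sizing every server for its own worst case against sizing one shared pool for the busiest combined hour.

```python
# Hypothetical illustration: all numbers and names below are made up.

HOURS = 24
NUM_SERVERS = 10
IDLE_LOAD = 10    # percent of one server's capacity used most of the day
PEAK_LOAD = 100   # percent used while that server's batch job runs


def hourly_load(batch_hour):
    """Load profile (percent of one server) for a server whose batch runs at batch_hour."""
    return [PEAK_LOAD if hour == batch_hour else IDLE_LOAD for hour in range(HOURS)]


def dedicated_capacity(profiles):
    """Siloed design: every server is sized for its own worst case."""
    return sum(max(profile) for profile in profiles)


def pooled_capacity(profiles):
    """Pooled design: size the shared pool for the busiest combined hour."""
    combined = [sum(loads) for loads in zip(*profiles)]
    return max(combined)


# Assumption: each server's batch runs in a different hour (staggered windows).
profiles = [hourly_load(batch_hour=i) for i in range(NUM_SERVERS)]

siloed = dedicated_capacity(profiles)
pooled = pooled_capacity(profiles)

print(f"Siloed capacity: {siloed / 100:.1f} servers' worth")
print(f"Pooled capacity: {pooled / 100:.1f} servers' worth")
```

The catch is the staggering assumption. If every batch hit at the same 3AM, the pool would need the full ten servers’ worth, so the savings depend entirely on the peaks not lining up.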

While Microsoft deserves some credit for Virtual Server being one of the first-used mainstream versions of virtualization, VMware really took the ball and ran with it when they introduced ESX and VirtualCenter.

The rest, as they say, is history, and you most likely still use it today.

Server Consolidation vs Data Consolidation

We’re entering an interesting new era where IT, yet again, is about to repeat history. We learned a lot of lessons about consolidation during the last decade, specifically how it relates to servers, but it’s my belief we’re about to go through a lot of the same motions for getting our data under control.

Here’s a quote from an anonymous end-user I spoke with at VMworld last week:

“…well, we use a mix of Pure and NetApp for our VMware environment, but for backup, we have Avamar proxies going to a VNX, and then it gets archived off to a Data Domain. Oh, and we also use Isilon in some various use-cases, and now we’re looking at a Hadoop farm for our reporting and analysis. We have piles of servers in the racks that our test/dev department never uses, and we honestly have no idea who owns what.”

So, this one very large end-user has:

Pure Storage for Tier1 apps and VDI
NetApp FAS for Tier2 virtualization
EMC Avamar, VNX, and Data Domain for data protection
Isilon for NAS
Runaway test/dev sprawl that has overtaken the datacenter footprint
A new platform (Hadoop) under evaluation for analytics

First, I’m no fan of this kind of design, but it’s very easy over time for this kind of purpose-built single-solution approach to turn into what it is today: absolute bloat and runaway sprawl. Politics get in the way, DBAs insist on having dedicated systems, and it’s harder to undo something once it’s in place.

This attendee had a sort of “self-epiphany” moment where he realized just how much stuff they had in their datacenter and, I think, in that same moment realized how much overhead it generated. It was a common theme throughout the week at VMworld.

Does this sound familiar to anyone? Aren’t these the same fundamentals that drove us to do server virtualization? The saving grace of virtualization, to me, has been the management layers that wrap around it: vCenter for VMware, System Center and Hyper-V Manager for Microsoft. Truly extensible management platforms providing a single point of configuration for all the things. And I hope it goes without saying how much the ecosystem and community those platforms and solutions created matter, too!

Enter Cohesity…

Many of you have come to me and asked what it was about Cohesity that made me join. While there are several culture, leadership, and team-based reasons I like Cohesity, the tech has to come first. Especially in my role as Evangelist, there has to be a natural passion. I interviewed with 15 different companies looking for the best opportunity. Took my time. But I knew the minute the following conversation happened…

One of the first things I asked Mohit when interviewing was,

“What’s your ‘one thing’? What’s your ‘HOLY $#@%’ that people are going to lose their minds over?”

After he was done explaining a few different approaches to me, I got it. It sank in. I remember thinking to myself … “He has created the vSphere-equivalent of a data consolidation platform!”

Remember how I described those management platforms earlier?

Truly extensible management platforms providing a single point of configuration for all the things.

For those still uncertain about what Cohesity is building, that’s a great way to sum it up. You’ll hear more and more from me on Cohesity specifically in the coming months, but it’s my belief that Mohit and the amazing team around him are building the next great extensible platform to clean up the mess of the last 20 years of sprawl. It lets us make actual USE of all of those backups we’re keeping, in a new and efficient way called SnapTree®, and leverage those snaps as insta-clone, customizable, automated test/dev environments in a true DevOps approach. It can also run native and 3rd-party analytics. All of this sits on a brand-new, built-from-scratch distributed file system that leverages infinitely-scalable, top-tier commodity hardware from Intel, with a full REST API and even a published Open SDK.

It’s a lot more than “just backup,” the conclusion several people were quick to jump to in recent weeks. It’s not just backup at all. That is simply a means to get data on the box, and only one at that. It’s much bigger-picture than backup. For more thoughts around data protection, check out this stellar panel from VMworld!

Oh, by the way … did I mention it’s also going to have native SMB 2.1/3.0 support? Hello file shares, Win2012, and Hyper-V!

Let’s go back to that end-user I mentioned talking to at VMworld. His new footprint would look something like:

Pure Storage for Tier1 apps and VDI
Cohesity for “everything else”

Imagine having a single extensible data platform to consolidate all of your secondary data and workflows onto, in order to centralize and standardize configuration, reduce overall complexity, and completely eliminate massive amounts of annual spend from your business on various piece-part IT solutions that only do one thing…

Welcome to the Data Consolidation era, folks!
