Data Immortality and the Death of Backup

By Nick on 09/26/2011.

Keypunch card from Columbia University Computer Center’s IBM 360


This post is full of a lot of my own personal hypothesizing, theorycrafting, and general soap-box-ish-ness.  But I figured we should set the stage and start out with a bit of a history lesson, for those of you who may need bringing up to speed.

Punch Cards.

I suppose you could truly refer to this as the first form of “backup.”  When Mauchly and Eckert delivered the UNIVAC I in 1951, it was one of the first commercial digital computers.  Machines of that era used magnetic drums or delay lines for internal storage and punched cards for external storage, which, technically, could be considered a form of backup.

In the 1960s, punch cards were superseded by something more capable and more efficient…


IBM 3410 Magnetic Tape Subsystem


Magnetic Tape.

Since the typical spool of magnetic tape could store the equivalent of roughly 10,000 punch cards, it was an instant hit, and it stuck around as the leading form of external storage media until the mid-1980s.  Companies big and small (and even some home users) began to create tape backups, and the first backup traditions and strategies started to arise in the early 1960s.  Tape backup became the most widespread approach because of tape drives’ reliability, scalability, and low cost, and those same advantages keep tape an attractive solution even today.

We all know IBM led the way in the beginning, and in the mid-1950s they introduced another brilliant piece of technology that really didn’t take hold, arguably, until the mid-1980s…

Hard Disk Drives.

While IBM introduced the first hard drive in 1956 with the IBM 305 RAMAC, it wasn’t until the PC/XT came along in 1983 that the hard drive became a standard component in most personal computers.  Other vendors, such as Hitachi, were equally important in the evolution of the HDD; Hitachi was the first to release a drive with more than 1GB of capacity, in 1982.  Still, in the 1960s and 1970s hard drives were not suitable for backups because of their high price, large size, and low capacity.

However, what I believe is the most underrated achievement of the late 1980s and early 1990s, and what made HDDs what they are today, is the introduction of RAID (Redundant Array of Inexpensive Disks).  It enabled essentially what we have today: hard disk drives as the primary mechanism for storing data.
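To make the RAID idea a bit more concrete, here’s a minimal Python sketch of single-parity striping in the spirit of RAID 4/5.  The function names are my own and purely illustrative, but the core trick is real: XOR parity lets an array reconstruct whatever was on a single failed disk from the surviving members, which is exactly what made disks trustworthy enough to become primary storage.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (the parity operation behind RAID 4/5)."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

def make_stripe(data_blocks):
    """Return the data blocks plus one parity block, forming a single-parity stripe."""
    return data_blocks + [xor_blocks(data_blocks)]

def rebuild(stripe, failed_index):
    """Reconstruct the block on a failed disk by XOR-ing all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)

# Toy example: three "data disks" plus one parity disk.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
assert rebuild(stripe, failed_index=1) == b"BBBB"  # disk 1 "failed" and was rebuilt
```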


Removable Media.


There’s no need to rehash the 1990’s and early 2000’s, as most of us in IT today lived through that period and can personally relate to how those technologies evolved, as well as how they directly affected our lives, both personally and professionally.


Enterprise Storage Arrays, Virtualization, and the Kitchen Sink.

And here we are today, virtualizing everything, sharing workloads on the same (or dissimilar!) storage arrays, and still arguing about how things are supposed to get backed up.  Nothing fundamental has changed, I would argue.  It’s still all just 1s and 0s; the capacities, required throughputs, and all of that stuff have increased exponentially, and really the only thing that’s changed is the devices we use to back up these increased demands.

I will definitely argue, probably to the point of my death, that virtualization is the single most pivotal development of the last 10 years in bringing shared storage arrays to the mainstream.  We all know they’ve been around for 20-ish years now, but let’s be realistic:  it was all about DAS and siloed application stacks until virtualization came along.

But this is where I want to throw you a giant curveball…

What would it take to get to the point that backup is no longer a talking point?

Think about that for a second or two.  What would it really take?  What makes backups so compelling and “necessary?”  I’m sure we would all prioritize a similar set of answers in different orders, but it usually revolves around several key things:

  • Human error (users deleting/corrupting files)
  • Disasters
  • Hardware failures
  • The ability to somehow sleep better at night, for whatever reason

What if we had the ability to eliminate all of these needs?  Well, I would argue that, for the most part, we have, which leads most people to continue taking ridiculous amounts of backups purely because of Item #4 in the list above.

For the Human Factor, most of us have some sort of snapshot-based technology, with the ability to recover a file to a certain point in time.  Some argue that this IS a form of backup; some argue that it isn’t.  I won’t get into that argument here.
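To illustrate the point-in-time idea (and nothing more than the idea; this isn’t any particular vendor’s implementation, and every name in it is hypothetical), here’s a small Python sketch: a “snapshot” freezes the current version of each file, and a restore rolls one file back to whatever a chosen snapshot captured.

```python
import copy

class SnapshotStore:
    """Toy point-in-time recovery: snapshots preserve file versions, restores roll back."""

    def __init__(self):
        self.live = {}        # current file contents: path -> bytes
        self.snapshots = []   # list of (label, frozen copy of self.live)

    def write(self, path, data):
        self.live[path] = data

    def snapshot(self, label):
        # Freeze the current state; a real system would use copy-on-write, not a full copy.
        self.snapshots.append((label, copy.deepcopy(self.live)))

    def restore(self, path, label):
        # Recover a single file as it existed at the chosen point in time.
        for snap_label, frozen in self.snapshots:
            if snap_label == label:
                self.live[path] = frozen[path]
                return
        raise KeyError(f"no snapshot named {label!r}")

store = SnapshotStore()
store.write("/docs/report.txt", b"v1")
store.snapshot("nightly-2011-09-25")
store.write("/docs/report.txt", b"oops, user overwrote it")   # the human error case
store.restore("/docs/report.txt", "nightly-2011-09-25")        # back to v1, no tape needed
assert store.live["/docs/report.txt"] == b"v1"
```

A real implementation would use copy-on-write rather than a full copy, which is exactly why snapshots are cheap enough to take constantly.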

For Disasters, we have vendors like Cisco and F5 with Global Traffic Managers, which make long-distance Active/Active datacenters more of a reality.  We also have VMware with its Site Recovery Manager product, which enables automated failover capabilities, and even automated “failback” in the latest version.
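Conceptually, an automated failover plan of the kind those products provide boils down to an ordered list of recovery steps that an orchestrator executes for you when disaster strikes.  The sketch below is a hedged, hypothetical Python illustration of that shape only; the classes and step names are mine, not SRM’s or any vendor’s API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    """One ordered action in a recovery plan."""
    name: str
    action: Callable[[], None]

def run_plan(steps: List[Step]) -> None:
    """Execute the plan's steps in order, the way an orchestrator would on a DR event."""
    for step in steps:
        print(f"running: {step.name}")
        step.action()   # in real life: storage, hypervisor, and DNS API calls

# Hypothetical plan for failing a site over to a recovery location.
failover_plan = [
    Step("break replication and promote the replica volumes", lambda: None),
    Step("power on protected VMs in priority order", lambda: None),
    Step("repoint DNS / the global traffic manager at the surviving site", lambda: None),
]

run_plan(failover_plan)
```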

Hardware failures are pretty much a no-brainer these days with virtualization taking hold the way it has, although some, myself included, would argue that it introduces net-new complications and educational requirements to the repertoire of today’s Datacenter Engineer.

But there is one key idea that makes all of these possible.  It is the underlying thing that turns these technologies into “reality.”

Abstraction.

ab·stract: to consider apart from application to or association with a particular instance.

Linux did it with LVM.  VMware did it with Clusters.  NetApp has done it with MetroCluster.  EMC is doing it with VPLEX.
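To put a little code behind the word, here’s a minimal, hypothetical Python sketch of what that abstraction buys you (my own toy classes, not LVM’s, MetroCluster’s, or VPLEX’s actual interfaces): the workload writes to one logical volume and never knows, or cares, which physical backends, or which sites, hold the bits.

```python
class Backend:
    """A physical device or remote site; the caller never sees these directly."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class LogicalVolume:
    """The abstraction layer: one address space mirrored across many backends."""

    def __init__(self, backends):
        self.backends = backends

    def write(self, block_id, data):
        # Mirror every write to all backends (sites): the federation idea in miniature.
        for backend in self.backends:
            backend.write(block_id, data)

    def read(self, block_id):
        # Read from whichever backend can answer; survivors keep serving.
        for backend in self.backends:
            try:
                return backend.read(block_id)
            except KeyError:
                continue
        raise KeyError(block_id)


# A workload talks to "the volume"; the physical sites behind it can come and go.
volume = LogicalVolume([Backend("site-a"), Backend("site-b")])
volume.write(0, b"payroll db page")
volume.backends.pop(0)                       # lose an entire site
assert volume.read(0) == b"payroll db page"  # data is still there: no restore needed
```

Lose a backend and the data is still there, served from whatever survives; that, scaled up to federated datacenters, is the “immortality” I’m talking about.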

It is my personal belief that continued (if not complete) Abstraction is the gateway to true Federation of Datacenters.  Immortality.

Guess what comes along with that immortality?  The death of backup.  Yup, I said it.  Data[center] immortality brings along the imminent death of backups.

*climbs up on soap box*

Most companies were slow to adopt virtualization, and even today I talk to some of the biggest customers who don’t take advantage of things as simple as native plugins that would make their lives so much easier.  It’s not the first time you’ll hear me say it, and it won’t be the last, but People, Process, & Politics run datacenters, or ultimately determine HOW they are run.  And it is typically not the most efficient, beneficial way to do so.  It is typically at the whim of some 50-something C-level or VP deciding whether he’s willing to “bet the farm,” likely with a slim understanding of what the engineer is actually explaining to him.  It reminds me a bit of how the government is run today.

Don’t get me wrong, there are some visionary executives out there who believe in tech, trust their engineers, and generally push the industry forward by adopting innovations from companies like the one I work for.  But what the rest don’t understand is that by “playing it safe,” they make their lives so much more difficult than they need to be, and, in the grand scheme of things, stall further adoption.

Example?  We’re about to release the latest version (8.1) of our DataONTAP operating system here at NetApp, and there are still customers running 7.2 and below, versions of the software that are more than five years old.  The old “If it ain’t broke, don’t fix it” attitude must die.

Now.  Today.  We have got to get past the point where petty fears prevent further adoption of these technologies.

*steps down off soap box*


Reality.

The reality is that we will get to a point, I would wager within the next FIVE years (look at where we were five years ago!), where datacenters become federated worldwide, ushering in an age of “data immortality” and truly eliminating the need for holistic backup.  I think the first iteration of this will be cloud-based DR.

What if I told you that you could build SRM DR Failover plans to a hosted Cloud provider, and no longer needed to stand up disparate datacenters around the world?

And that this would all failover automatically in the event of local or regional disasters?

What if I told you that in the future we could incorporate plex’ing technologies like MetroCluster with cloud providers for things like Hybrid-Cloud DRS, enabling you to “vMotion” a server or workload to a hosted cloud provider, or, even further, “load balance” between your private and public clouds?

Why in the world would you burden yourself with tape rotations anymore?

I sure as hell wouldn’t bother.  Neither should you or your company.

We’ve been talking about backup for 60 years.  Let’s find a way to innovate out of the need to back up entirely, rather than continuing to innovate new ways to back up.

-Nick
