NetApp DataONTAP 8.1: A Response

Yesterday, I put up a post announcing the official General Availability of NetApp’s latest version of DataONTAP.  I expected several things that have already happened: a huge number of hits compared to my usual posts, and some naysayers or competitors coming in and bashing.  All in good fun.  Kinda like those times when you were a kid and you’d dig cat poop out of the sandbox and throw it at someone?  Doesn’t really hurt, but it’s still gross and a bit much.  (Get my point?)

But what I wasn’t expecting was an incredibly thorough and detail-oriented comment, like the one I got from “nate.”  You can see the full comment on the post linked above.

It was lengthy and thought-provoking, and once I started responding to it, it started turning into its own post.  So here you go, Nate!

From what I see the 3210 was released at the same time as the 6200 series – I don’t understand why the 3210 would be considered old when the rest of the 3200 and 6200 series is not ? Maybe saying it’s old is the polite way of saying the 3210 doesn’t have enough memory to run it or something?

Bingo.  You nailed it.  While the 3210 did come out at the same time as the 62xx series, the technical reason behind shunning the 3140 and 3210 is a lack of system memory: they can’t support the enhancements made to the OS, which require more memory utilization, and FlashCache at the same time.

Can you clarify on the cluster mode how data is distributed? Is a particular volume spread across all cluster members? Can the data of a volume be re-striped as cluster nodes are added?  I looked through your previous two posts and is seems like more of a collection of arrays that can transparently move data(in the form of volumes) back and forth, I assume with at least a global name space.

Sort of along the same lines – can an aggregate(assuming that concept still exists) span a cluster? Can the de-dupe stuff span the cluster?

Excellent question.  I could almost write an entire post (and likely will) on this topic alone.  Architecture.  I’d like to start with the second question first, before addressing data distribution.  You read my previous two posts correctly.  DataONTAP 8.1 Cluster-Mode architecture, at its simplest, is a collection of our traditional HA pairs (for now), connected via a 10GbE “backbone.”  This can be heterogeneous, but it is limited to a specific set of model numbers, for reasons such as the memory one above.  These HA pairs still operate as each other’s first line of defense in a failure.  Node failover still happens at the local level, the same way it always has.  Software ownership is used to define aggregates, and in the event of a failure, ownership can be quickly switched over to the partner controller.  This covers almost every instance of “failure.”
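To make the HA relationship concrete, here’s a minimal sketch from the cluster shell.  This is illustrative only, and the node name (node1) is hypothetical:

```sh
# Show the HA partner relationships and their current state
::> storage failover show

# Manually take over node1's storage from its partner (e.g., for maintenance)
::> storage failover takeover -ofnode node1

# Return node1's storage once it's healthy again
::> storage failover giveback -ofnode node1
```

The point is that takeover and giveback remain a local, pair-level operation; the rest of the cluster isn’t involved.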

Data distribution.  As is common, most people over-think this.  Aggregates are owned by controllers, and volumes live within those aggregates.  Aggregates do not span nodes in the cluster, and neither do volumes.  Data does not stripe between nodes.  It sounds cool on paper to be able to do that, but the reality is that it would make a HUGE mess if one thing went wrong, or if you picked up extra latency on one of the nodes such that writes couldn’t be committed as quickly.  The one key thing to remember here is that we are no longer exporting data directly from controller nodes, and clients are no longer directly accessing the nodes themselves.  All of this is handled through Vservers (deep-dive detail on this in my next post, “Anatomy of a Vserver”).
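A quick cluster-shell sketch of that ownership model (the Vserver, aggregate, and volume names here are hypothetical): every volume is created into exactly one aggregate, and every aggregate is owned by exactly one node.

```sh
# Each aggregate is listed with a single owning node
::> storage aggregate show

# A volume lives in one aggregate (and therefore on one node),
# but is mounted into the Vserver's namespace at a junction path
::> volume create -vserver vs1 -volume vol1 -aggregate aggr1_node1 -size 500g -junction-path /vol1
```

Clients see the namespace path (/vol1) through the Vserver, not the node that physically holds the volume.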

Does 8.1 do away with the “7 mode” altogether ? I mean cluster mode has been available in some form for a while now.

Nope.  7-Mode is the traditional run-mode of NetApp DataONTAP; we needed a name to distinguish it from Cluster-Mode systems.  If you’re still running version 7.x of DataONTAP, you’ve likely heard that referred to as “7G.”  These are all just naming conventions.  While there are some who would say, “In the future, everything will be Cluster-Mode,” I am not one of them.  I believe 7-Mode will be around for some time.  And let’s please not dismiss DataONTAP 8.1 as “Cluster-Mode-only,” folks.  There have been some incredible enhancements in DataONTAP 8.1 that make it worth upgrading to immediately: the way partial writes are handled, for example, or the increased performance from code optimization in WAFL.  All of this is alive and continues to live, so please don’t skip 8.1 just because you’re not interested in Cluster-Mode.  You SHOULD be interested, because holistically it’s the way of the future, and you should consider expanding and learning about it, but I’m not telling you to drop everything today and completely rearchitect your environment on Cluster-Mode (my competitors are going to take this one and run with it).  You should definitely upgrade your current systems to DataONTAP 8.1, and if you’re interested in building out a Cluster-Mode solution, reach out to your account SEs, get them involved, and let them know you’re interested.

Does 8.1 cluster mode support V-series ? (your past posts implies it does), my last NetApp (about 1.5 years ago) I noticed cluster mode did not support V-series (and my box was a V-series), or at least the docs said if I recall right the disks needed to be “netapp disks”, not sure why it would of cared.

Yes.  You can have a Cluster-Mode implementation with any combination of 3240s, 3280s, 6240s, and 6280s, and those can be standard nodes or V-Series nodes virtualizing other arrays.  That’s the short-and-sweet answer; if you need more detail, I can certainly get that to you.  In general, any previous iteration of “Cluster-Mode,” be that 8.0 or the original ONTAP GX, should be considered deprecated in favor of DataONTAP 8.1.

How is cache handled? I assume it is at least mirrored as with a traditional HA pair in 7-mode. In cluster mode if a node goes off line either by fault or say software or hardware change does the system go to write through mode or does it mirror the data to another member in the cluster?

This, somewhat, goes back to the architecture discussion above.  A cluster is a collection of HA pairs, so you’ve always got that initial mirror to the partner.  If an entire pair needs to be removed, then all volumes and Vservers will be evacuated first.  NVRAM mirroring still happens between the HA pair as it always has.  I’m unaware of any significant changes to this.  Great question, though!

How is access to the cluster load balanced?

It’s not.  Access is all controlled through one or more Vservers.  This can be split into a delegated scenario with multiple Vservers, or just one being accessed as a global namespace.  There are two sides to the cluster: the front-end access and the back-end data management.  This is the core of what came from the Spinnaker acquisition years ago.

I was an Exanet customer at one time(I do miss that platform it was a nice cluster) and the way they handled things was primarily a round robin DNS entry for the cluster with each interface having it’s own IP. Each node in the cluster was responsible for a subset of the data, if a request came in for data for a node that did not have ownership of the data it transparently sent the request to the node that did have access across the cluster interconnect. Files were fairly evenly distributed across all cpu cores in the cluster.

Some of this I cannot comment on (no Exanet experience), but what you’re describing towards the end is similar.  A Vserver has logical interfaces (LIFs) that are linked to the physical ports or ifgrps of each controller node.  Let’s say a storage admin moves a volume and its contents from one node in the cluster to another.  Traffic would continue from the Vserver to this volume, but it would traverse the cluster backbone to the destination node until the admin also re-homed the LIF to a port on the destination node.  We see this as a huge opportunity to do some slick things with load balancing, and that will likely come down the road.  Which takes us to your next and final point…
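A sketch of that workflow from the cluster shell (all names here are hypothetical): first the volume is moved non-disruptively to an aggregate on the destination node, then the LIF is re-homed so client traffic stops traversing the cluster interconnect.

```sh
# Move the volume, live, to an aggregate owned by the destination node
::> volume move start -vserver vs1 -volume vol1 -destination-aggregate aggr1_node2

# Re-home the LIF to a port on the destination node, then revert it
# so traffic flows to the new home port directly
::> network interface modify -vserver vs1 -lif lif1 -home-node node2 -home-port e0c
::> network interface revert -vserver vs1 -lif lif1
```

Until that second step, nothing breaks; the cluster backbone simply carries the traffic for you.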

Can the cluster balance itself based on load? I was talking to Compellent recently and was pretty surprised to hear that their live volume software can move volumes between arrays automatically based on I/O load and stuff. Though this functionality is only sort of a stop gap vs a real cluster where all controllers are participating in the I/O for all volumes eliminating the need to have to move data between arrays for this purpose.

The quick answer is no, it cannot.  What I will say is that as of 8.1, we have a 1:1 mapping of ZAPIs to command-line functions, as well as near-complete coverage with the PowerShell Toolkit.  In my own fantasy world, this would be a great opportunity for the Ciscos and F5s of the world (not to call them out, but…) to take advantage of that fact and begin to co-develop solutions where they could use BIG-IPs to manage a NetApp cluster.  It would be pretty slick.

I also cannot comment on the NetApp CORE team’s roadmap.  I asked if it was something they were looking into, and the general response was “not yet.”  Frankly, I’m fine with that.  I can do everything I need to with Vserver design.  I don’t think the driving force behind Cluster-Mode is the ability to span all that data spread thinly like butter on toast, but more to aggregate resources into single or multiple logical namespaces that prevent you from ever having to take an outage again because you need to take the storage down.

Let’s not get too far ahead of ourselves; we need to get the foundation of the house right before we start hanging drywall.

Nate, again, thanks for the excellent comment, and I hope this answers all of your questions.


keith
04/20/2012 13:42

Nick, your candor is refreshing in this post; good work.  NetApp is going to have to dig in its heels and refuse to compare C-Mode to Isilon and EqualLogic.  But it is possible to do that without being coy, and you nailed it.

Reply to keith
04/20/2012 14:02

Keith, you’re exactly right.  It’s going to be easy for the industry to try to draw comparisons, and it’s something we’re going to have to fight tooth and nail to resist.  But I believe we’re putting our best foot forward, and the industry will eventually come along, and I’ll continue to fight the good fight bringing true awareness with posts like these.

Thanks so much for the compliments, and I’m glad you enjoyed reading!

Reply to that1guynick
05/14/2012 11:58

I’d guess the term Infinivol won’t be used by NetApp in public circles.  Were you under NDA when it was discussed?

You can probably ask about hybrid aggregates though (since 8.1C will be happy to tell you that an existing aggregate isn’t Hybrid).

Louw Pretorius
04/20/2012 15:27

What sounds to me the most interesting in 8.1 is what used(?) to be called InfiniVols and what that brings to the way I plan my architecture.  

Reply to Louw Pretorius
04/21/2012 13:11


I’ve never heard of anything referred to as an InfiniVol.  Could you maybe let me in on the secret?  :)

Reply to that1guynick
04/21/2012 13:45

Hi Nick, with C-Mode you can stretch a volume across controllers… it used to be called Infinivol, but I’m sure it has a new name now.

Reply to Louw
05/14/2012 12:07

GX could stretch data across nodes, but that wasn’t an Infinivol; it was a High Performance Option (HPO) volume.  That was dropped in 8.0 C-Mode (with the exception of GX customers who couldn’t drop it), and there were grand plans to extend its use in the future.

As far as I’m aware (and I’m looking at the NGSH shell of a 8.1 cluster) there’s no ability to stripe across nodes in the current release.

04/21/2012 00:01

Thanks for the quick turnaround on the questions! I have three more questions! I recall reading a few years ago about the ability to move a volume or something between netapp arrays (live w/o impact though it was, if I recall right limited to NFS and iSCSI) – was it a metro cluster or something(I don’t recall seeing any requirement for the newer cluster mode at the time I read it – sorry my memory is foggy I may be totally inaccurate of what I am saying here)? It was an ability I really haven’t seen NetApp talk about very…

Reply to nate
04/21/2012 13:10

1) Yes, we’ve been able to move volumes between aggregates and controllers live for some time now with DataMotion, and you’ll only see this grow as we get deeper into Cluster-Mode. 2) It’s all about disk ownership: volumes are always going to live within an aggregate, and aggregates can only be owned by a single controller at a time, which makes sense, I hope. 3) It’s definitely “really here now.”  I expect a migration process that will take a couple of years, during which people will begin to use Cluster-Mode as the de facto configuration, even on traditional…

Reply to that1guynick
05/02/2012 12:34

Regarding 3: SyncMirror is not yet ready for Cluster-Mode in the 8.1 code, though I heard it may be available as early as 8.2. For me, this is a critical decision (do you want Cluster-Mode, or do you want SyncMirror?), as it has to be made at the time of purchase.

Nick Howell
Reply to justpaul
05/24/2012 17:06

justpaul,

Thanks for the comment.  You’re right.  We couldn’t boil the ocean and get every single product into the initial implementation.  Think of this along the lines of how VMware VI3 came about, and the leaps that were made between that and vSphere 4 & 5.  I imagine you’ll see similar leaps in functionality, and feature-parity in the future, but we’ve got to lay the groundwork first.  

Dimitris Krekoukias
Reply to nate
04/23/2012 13:22

Hi All, Dimitris from NetApp here.

Regarding the SPEC SFS numbers: see my post, which includes costs.  Regarding multiple controllers hitting a volume: look up FlexCache – it’s been around forever and is used by huge graphics houses, for instance.  We also have something in Cluster-Mode called Load-Sharing Mirrors, which wasn’t used in the SPEC SFS test because that would be cheating… Plus more stuff coming in the future.


Dan Pancamo
08/24/2012 14:17

A Few Questions…

1. Is Volume SnapMirror (VSM) from 7-Mode to Cluster-Mode supported?  If VSM is not supported, how are volume migrations from 7-Mode to Cluster-Mode accomplished?
2. Is SnapVault capability supported in Cluster-Mode?

07/14/2014 02:34

Can we have a metro setup in Cluster-Mode? How can we migrate an existing MetroCluster setup from 7-Mode to Cluster-Mode?
