Yesterday, I put up a post announcing the official General Availability of NetApp’s latest version of DataONTAP. I expected several things that have already happened. Huge amount of hits compared to my usual post. Some naysayers or competitors coming in and bashing. All in good fun. Kinda like those times when you were a kid and you’d dig cat poop out of the sandbox and throw it at someone? Doesn’t really hurt, but it’s still gross and a bit much. (Get my point?)
But what I wasn’t expecting was an incredibly thorough and detail-oriented comment like the one I got from “nate.” You can see the comment at the link above.
It was lengthy and thought-provoking, and once I started responding to it, the response started turning into its own post. So here you go, Nate!
From what I see the 3210 was released at the same time as the 6200 series – I don’t understand why the 3210 would be considered old when the rest of the 3200 and 6200 series is not ? Maybe saying it’s old is the polite way of saying the 3210 doesn’t have enough memory to run it or something?
Bingo. You nailed it. While the 3210 did come out at the same time as the 62xx’s, you’re exactly right. The technical reason behind shunning the 3140 and 3210 is lack of system memory to support the enhancements made to the OS that require more memory utilization AND FlashCache at the same time.
Can you clarify on the cluster mode how data is distributed? Is a particular volume spread across all cluster members? Can the data of a volume be re-striped as cluster nodes are added? I looked through your previous two posts and is seems like more of a collection of arrays that can transparently move data(in the form of volumes) back and forth, I assume with at least a global name space.
Sort of along the same lines – can an aggregate(assuming that concept still exists) span a cluster? Can the de-dupe stuff span the cluster?
Excellent question. I could almost write an entire post (and likely will) on this topic alone. Architecture. I’d like to start with the second question first, before addressing data distribution. You read my previous two posts correctly. DataONTAP 8.1 Cluster-Mode architecture, in its simplest form, is a collection of our traditional HA pairs (for now), connected via a 10GbE “backbone.” This can be heterogeneous, but is limited to a specific set of model numbers, for reasons such as the memory one above. These HA pairs still operate as each other’s first line of defense in a failure. Node-failover still happens at the local level, the same way it always has. Software ownership is used to define aggregates, and in the event of a failure, ownership can be quickly switched over to the partner controller. This covers almost every instance of “failure.”
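If it helps to picture the takeover logic, here is a toy sketch in Python. To be clear, this is not NetApp code and not how DataONTAP implements it; the class and method names (HaPair, takeover, giveback) are my own inventions for illustration. All it shows is the idea that each aggregate has an owner, and that ownership flips to the partner when a node fails.

```python
class HaPair:
    """Toy model of one HA pair: two nodes that back each other up.

    Hypothetical illustration only; none of these names are ONTAP APIs.
    """

    def __init__(self, node_a, node_b):
        self.nodes = (node_a, node_b)
        self.home = {}   # aggregate name -> node that normally owns it
        self.owner = {}  # aggregate name -> node that owns it right now

    def partner_of(self, node):
        a, b = self.nodes
        if node == a:
            return b
        if node == b:
            return a
        raise ValueError(f"{node} is not part of this HA pair")

    def add_aggregate(self, aggr, node):
        self.partner_of(node)  # raises if the node is not in this pair
        self.home[aggr] = node
        self.owner[aggr] = node

    def takeover(self, failed_node):
        """Switch ownership of the failed node's aggregates to its partner."""
        partner = self.partner_of(failed_node)
        for aggr, current in self.owner.items():
            if current == failed_node:
                self.owner[aggr] = partner
        return partner

    def giveback(self, repaired_node):
        """Return aggregates to their home node once it is healthy again."""
        for aggr, home in self.home.items():
            if home == repaired_node:
                self.owner[aggr] = repaired_node


pair = HaPair("node1", "node2")
pair.add_aggregate("aggr1", "node1")
pair.add_aggregate("aggr2", "node2")
pair.takeover("node1")   # node2 now owns both aggregates
pair.giveback("node1")   # aggr1 goes back home to node1
```

Notice that nothing outside the pair is involved in the failover, which is the point: the cluster is built out of these pairs, and each pair handles its own failures locally.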
Data Distribution. As is common, most people over-think this. Aggregates are owned by controllers, and volumes live within those aggregates. Aggregates do not span nodes in the cluster, and neither do volumes. Data does not stripe between nodes. Sounds cool on paper to be able to do this, but the reality is that it would make a HUGE mess if one thing went wrong, or if you picked up some extra latency on one of the nodes and the write couldn’t complete as fast, and so on. No. The one trick thing to remember here is that we are no longer exporting data directly from controller nodes. Clients no longer access the nodes themselves directly. All of this is handled through Vservers (deep-dive detail on this in my next post, “Anatomy of a Vserver”).
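Here is a rough way to picture that layout, again as a toy Python sketch and not anything resembling real ONTAP internals. The names (Aggregate, Volume, Vserver, mount, locate) and the path-prefix lookup are assumptions I made up for the example. The only things it is meant to show are that a volume lives in exactly one aggregate, an aggregate belongs to exactly one node, and the client only ever talks to the Vserver’s namespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    name: str
    owner_node: str  # an aggregate belongs to exactly one node


@dataclass(frozen=True)
class Volume:
    name: str
    aggregate: Aggregate  # a volume lives in exactly one aggregate


class Vserver:
    """Toy namespace: maps paths in one namespace onto volumes.

    Hypothetical illustration only; not an ONTAP API.
    """

    def __init__(self, name):
        self.name = name
        self._junctions = {}  # path in the namespace -> Volume

    def mount(self, path, volume):
        self._junctions[path] = volume

    def locate(self, path):
        """Return (volume name, owning node) for the deepest matching mount."""
        matches = [
            j for j in self._junctions
            if path == j or path.startswith(j.rstrip("/") + "/")
        ]
        if not matches:
            raise KeyError(path)
        volume = self._junctions[max(matches, key=len)]
        return volume.name, volume.aggregate.owner_node


aggr1 = Aggregate("aggr1", owner_node="node1")
aggr3 = Aggregate("aggr3", owner_node="node3")

vs = Vserver("vs1")
vs.mount("/", Volume("root_vol", aggr1))
vs.mount("/projects", Volume("proj_vol", aggr3))

print(vs.locate("/projects/report.txt"))  # ('proj_vol', 'node3')
print(vs.locate("/home/nate"))            # ('root_vol', 'node1')
```

The client sees one tree. Which node actually holds the blocks is a lookup detail behind the Vserver, and no single volume is ever smeared across nodes.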
Does 8.1 do away with the “7 mode” altogether ? I mean cluster mode has been available in some form for a while now.
Nope. 7-mode is the traditional run-method of NetApp DataONTAP. We needed to create a name that would distinguish Cluster-Mode systems. If you’re still running version 7.x of DataONTAP, you’ve likely heard that referred to as “7G.” These are all just naming conventions. While there are some that would say, “In the future, everything will be Cluster-Mode,” I am not one of those. I believe 7-mode will be around for some time. And let’s please not “dismiss” DataONTAP 8.1 as “Cluster-Mode-only,” folks. There have been some incredible enhancements in DataONTAP 8.1 for customers to upgrade to immediately! The way partial writes are handled for example, or the increased performance because of code optimization with WAFL. All of this stuff is alive, and continues to live, so please don’t NOT upgrade to 8.1 just because you’re not interested in Cluster-Mode. You SHOULD be, because holistically, it’s the way of the future, and you should consider expanding and learning about it, but I’m not telling you to drop everything today and completely rearchitect your environment on Cluster-Mode (my competitors are going to take this one and run with it). You should definitely upgrade your current systems to DataONTAP 8.1, and if you’re interested in building out a Cluster-Mode solution, reach out to your Account SE’s, and get them involved, and let them know you’re interested.
Does 8.1 cluster mode support V-series ? (your past posts implies it does), my last NetApp (about 1.5 years ago) I noticed cluster mode did not support V-series (and my box was a V-series), or at least the docs said if I recall right the disks needed to be “netapp disks”, not sure why it would of cared.
Yes. You can have a Cluster-Mode implementation that has any combination of 3240’s, 3280’s, 6240’s, 6280’s, and those can be standard nodes or V-series nodes virtualizing other arrays. Short-and-sweet answer there; if you need more detail I can certainly get that to you. In general, any previous iteration of “Cluster-Mode,” be that 8.0 or the original ONTAP-GX, should be deprecated in favor of DataONTAP 8.1.
How is cache handled? I assume it is at least mirrored as with a traditional HA pair in 7-mode. In cluster mode if a node goes off line either by fault or say software or hardware change does the system go to write through mode or does it mirror the data to another member in the cluster?
This, somewhat, goes back to the architecture discussion above. A cluster is a collection of HA pairs, so you’ve always got that initial mirror to the partner. If an entire pair needs to be removed, then all volumes and Vservers will be evacuated first. NVRAM mirroring still happens between the HA pair as it always has. I’m unaware of any significant changes to this. Great question, though!
How is access to the cluster load balanced?
It’s not. Access is all controlled through one or more Vservers. This can be split into a delegated scenario with many Vservers, or just one being accessed as a global namespace. There are two sides to the cluster: the front-end access and the back-end data management. This is the core stuff that came from the Spinnaker acquisition years ago.
I was an Exanet customer at one time(I do miss that platform it was a nice cluster) and the way they handled things was primarily a round robin DNS entry for the cluster with each interface having it’s own IP. Each node in the cluster was responsible for a subset of the data, if a request came in for data for a node that did not have ownership of the data it transparently sent the request to the node that did have access across the cluster interconnect. Files were fairly evenly distributed across all cpu cores in the cluster.
Some of this I cannot comment on (no Exanet experience), but what you’re describing towards the end is similar. A Vserver has logical interfaces (LIFs) that are linked to the physical ports or ifgrp’s of each controller node. Let’s say a storage admin executes a move of a volume and its content from one node in the cluster to another. Traffic would continue from the Vserver to this volume, but it would traverse the cluster backbone to the destination node until the admin also re-homed the LIF to a port on the destination node. We see this as a huge opp to do some trick things with load-balancing, and that will likely come down the road. Which takes us to your next and final point….
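To make the volume-move scenario concrete, here is one more toy Python sketch. Same caveat as always: Lif, Volume, data_path, move_volume and rehome_lif are names I invented for the illustration, not ONTAP commands or APIs. It just tracks which node hosts the LIF, which node holds the volume, and whether a request has to hop across the cluster backbone to get from one to the other.

```python
from dataclasses import dataclass


@dataclass
class Lif:
    name: str
    node: str  # node whose physical port currently hosts this LIF


@dataclass
class Volume:
    name: str
    node: str  # node that currently holds the volume


def data_path(lif, volume):
    """Return the hops a client request takes from the LIF to the volume."""
    if lif.node == volume.node:
        return [lif.node]
    # The LIF and the volume sit on different nodes, so the request
    # is forwarded over the cluster backbone.
    return [lif.node, "cluster-backbone", volume.node]


def move_volume(volume, dest_node):
    volume.node = dest_node


def rehome_lif(lif, dest_node):
    lif.node = dest_node


lif = Lif("vs1_data1", node="node1")
vol = Volume("proj_vol", node="node1")

print(data_path(lif, vol))  # ['node1']

move_volume(vol, "node2")
print(data_path(lif, vol))  # ['node1', 'cluster-backbone', 'node2']

rehome_lif(lif, "node2")
print(data_path(lif, vol))  # ['node2']
```

The client never loses access at any of the three steps. The only thing that changes is whether the request takes the extra hop, and that is exactly the knob a future load-balancing feature could turn.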
Can the cluster balance itself based on load? I was talking to Compellent recently and was pretty surprised to hear that their live volume software can move volumes between arrays automatically based on I/O load and stuff. Though this functionality is only sort of a stop gap vs a real cluster where all controllers are participating in the I/O for all volumes eliminating the need to have to move data between arrays for this purpose.
The quick answer is no, it cannot. What I will say is that as of 8.1, we have 1:1 mapping of zAPI’s to command-line functions, as well as near-complete coverage with the PowerShell Toolkit. In my own fantasy world, this would be a great opp for the Cisco’s and F5’s of the world (not to call them out but…) to take advantage of this fact and begin to co-develop solutions where they could use something like BIG-IP to manage a NetApp cluster. Would be pretty slick.
I also cannot comment on the NetApp CORE team’s roadmap. I asked if it was something they were looking into, and the general response was “not yet.” Frankly, I’m fine with that. I can do everything I need to with Vserver design. I don’t think the driving force behind Cluster-Mode is the ability to span all that data spread thinly like butter on toast, but more to aggregate resources into single or multiple logical namespaces that prevent you from ever having to take an outage again because you need to take the storage down.
Let’s not get too far ahead of ourselves here. Let’s get the foundation of the house right before we start hanging drywall.
Nate, again, thanks for the excellent comment, and I hope this answers all of your questions.