NetApp Clustered ONTAP Design Deep Dive

I’ve been waiting a long time to get to this post.  I knew it needed to be talked about, I wanted to make sure we did it right, and I finally found the right person to speak as an SME on the content.  This is, to me, the culmination of understanding one of the most important parts of Clustered ONTAP…


I’d like to introduce you all to Karl Rautenstrauch.  He is a Solutions Architect with NetApp, focusing on Clustered ONTAP.  I first met Karl last year, and the first thing you notice about this man is his outgoing passion.  We got into a conversation about his session at Insight while in Dublin, and turns out he was presenting a session on one of the most popular questions of late:

How do I design my Vserver layout?

Karl is supporting our largest customers as they move to Clustered ONTAP.  He eats, sleeps, and breathes Clustered ONTAP, and has done so for the last two years.  That’s earned him the name “Klustered Karl” amongst his peers.

With that, I’m going to hand you over to Karl…  Enjoy!


First of all, please allow me to air some dirty laundry.

I hate the name Vserver.

Ok, hate is not a strong enough word.  How about – loathe?  Detest?  Despise?

Why does one simple word rile me up so much?  It does an injustice to what we have delivered to our customers with Clustered ONTAP.  What we have built and delivered is a virtual storage array that abstracts data access (networking – SAN/NAS) and data layout (volumes/LUNs) from the client.

Let’s start with a brief review of Clustered ONTAP.  A cluster is anywhere from one to twelve highly available storage pairs connected via a high-speed, low-latency private fabric (aka the Cluster Interconnect).  The Vserver (or Virtual Array; pass it on) is the storage array, as the client sees it, that resides on top of all the available hardware.  It maps to the IP addresses, host names, and/or Fibre Channel addresses that clients attach to.  Those addresses are mapped to virtual adapters that can share physical interfaces or be placed on dedicated interfaces.  We call these virtual adapters LIFs (short for Logical Interfaces).  Mapping a LUN is done through the Virtual Array.  A file share or export is mapped or mounted via the Virtual Array.  Those LUNs and NAS volumes?  They can reside on any piece of physical hardware in the cluster.  The Virtual Array abstracts where they physically reside from the NAS or SAN client thanks to the federated namespace each Vserver offers.  This allows us to provide powerful and transparent data mobility capabilities no matter what protocol you use to access your data.
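To make the abstraction concrete, here is a minimal sketch in Python.  It is a toy model, not ONTAP code or any NetApp API: the class names, field names, addresses, and node names are all made up for illustration.  It shows only the idea described above, that a client addresses a LIF and a namespace path while the hosting node can change underneath.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    junction_path: str  # where the volume appears in the Virtual Array namespace
    node: str           # physical node currently hosting the volume


@dataclass
class Lif:
    name: str
    address: str        # what the client actually connects to
    node: str           # physical node currently hosting the logical interface


@dataclass
class VirtualArray:
    """Toy stand-in for a Vserver: clients see only its LIFs and namespace."""
    name: str
    lifs: list[Lif] = field(default_factory=list)
    volumes: dict[str, Volume] = field(default_factory=dict)

    def add_volume(self, vol: Volume) -> None:
        self.volumes[vol.name] = vol

    def client_view(self, vol_name: str) -> list[str]:
        """Mountable paths for a volume: one per LIF, with no node names in them."""
        vol = self.volumes[vol_name]
        return [f"{lif.address}:{vol.junction_path}" for lif in self.lifs]

    def move_volume(self, vol_name: str, new_node: str) -> None:
        """Relocate a volume; only its physical placement changes."""
        self.volumes[vol_name].node = new_node


# Hypothetical Virtual Array with two LIFs and one NAS volume.
v1 = VirtualArray("V1", lifs=[Lif("lif1", "192.0.2.11", "node1"),
                              Lif("lif2", "192.0.2.12", "node2")])
v1.add_volume(Volume("home", "/home", node="node1"))

before = v1.client_view("home")
v1.move_volume("home", "node4")
after = v1.client_view("home")
print(before == after)  # the client-facing paths are the same before and after
```

The design point is that nothing the client holds (address, path) encodes a node, which is what makes the move transparent in this model.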

[Diagram 1: clustered ONTAP virtual storage array]


Oh, and as the next diagram illustrates, you can have one or more Virtual Arrays sharing some, all, or none of that backend clustered hardware.  Think of the power and flexibility you can provide to your end users by creating secure, logically separated storage arrays versus deploying separate physical arrays.  Isolated islands versus a dynamic pool that can make the best use of your available pooled capacity and performance headroom.  Is it better to distribute the workload evenly or isolate portions of it to limit impact and/or guarantee a level of performance unaffected by the remainder of apps in the cluster?  That’s the power an Agile Data Infrastructure delivers.  The power to choose the best solution without having to purchase separate products to deliver it.


[Diagram 2: clustered ONTAP SAN/NAS client connections]

So, how can you best reap the benefits of what clustered ONTAP, and more specifically the Virtual Array, offers?  When it is time to help a customer architect their clustered ONTAP solution, I focus on a few important areas of Virtual Arrays:

  • The number of Virtual Arrays to create
  • Whether or not to physically isolate some physical components
  • Data Access patterns
  • Naming conventions – boring, I know, but I am a former admin.  Old habits die hard!


The Number of Virtual Arrays

The answer is four.  Wait, the question was not how many consecutive Super Bowls my beloved Buffalo Bills lost?  I’m kidding of course!  The real answer is one I always hate to give any customer, but here it rings true.  “It depends.”  Do you host infrastructure for customers of your own?  Do you need to provide secure separation between lines of business or divisions in your corporation?  Do you separate management of SAN and file resources between different teams?  Or, are you consolidating multiple storage systems into one?  All are excellent reasons to have more than one Virtual Array.

Each Virtual Array can join a separate Active Directory or LDAP domain.  Each can have different admin accounts defined.  The volumes, LIFs, and LUNs on one Virtual Array cannot be seen or accessed from another.  Access to storage pools can be limited to a specific Virtual Array (or arrays).  The Virtual Arrays in clustered ONTAP are truly separate from one another even though they share the same hardware.
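The separation rules above can be sketched as a small Python model.  This is an illustration only, assuming made-up names (`Tenant`, `can_provision`, `can_see`, the domains, and the pool names); it does not represent how ONTAP stores or enforces any of this.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical record for one Virtual Array and what belongs to it."""
    name: str
    directory_domain: str                       # its own AD or LDAP domain
    admins: set[str] = field(default_factory=set)
    allowed_pools: set[str] = field(default_factory=set)
    volumes: set[str] = field(default_factory=set)


def can_provision(tenant: Tenant, pool: str) -> bool:
    """A Virtual Array may only place data on pools assigned to it."""
    return pool in tenant.allowed_pools


def can_see(tenant: Tenant, volume: str) -> bool:
    """A Virtual Array sees only its own volumes, never another tenant's."""
    return volume in tenant.volumes


finance = Tenant("finance", "fin.example.com",
                 admins={"fin_admin"}, allowed_pools={"pool_a"},
                 volumes={"fin_vol1"})
engineering = Tenant("engineering", "eng.example.com",
                     admins={"eng_admin"}, allowed_pools={"pool_a", "pool_b"},
                     volumes={"eng_vol1"})

print(can_provision(finance, "pool_b"))   # False: pool_b is not assigned to finance
print(can_see(engineering, "fin_vol1"))   # False: that volume belongs to finance
```

Note that both tenants share `pool_a` here, which mirrors the point that separation is logical even when the hardware is shared.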


Physical Segmentation

A Virtual Array can access all storage pools and interfaces (LAN and SAN) that are available in a cluster.  But what if you have SLAs for an application or consumer that require guaranteed available capacity or performance?  Dedicating storage pools, interfaces, and even storage controllers to that application will allow you to meet those requirements.  I’m sure you are asking, “But wait a minute.  Why would I not just return to the old days of dedicating an entire array to that application?”  At some point, you could run out of capacity in that dedicated array and be forced to add a second.  That would present a new array to manage and new connections from the client to the array.  Clustered ONTAP allows you to expand the existing Virtual Array and offer the application more of what it needs from the same array.  Goodbye sheet metal boundaries – the beauty of the Virtual Array!


Data Access Patterns

Let’s stay on the above example for one more minute.  What if the application I was discussing above changed over time?  It went from a gusher of reads and writes to a trickle.  My handy OnCommand suite will help me see that and I can take advantage of the resources in the cluster as a whole to move the data set to a lesser cost / lower performance storage pool, on a lesser cost / lower performance node, with a lesser cost / lower performance interface – or any combination of the three!

NAS-specific workloads offer another design decision.  Will I use all the network interfaces at my disposal or access my share / export in a direct manner?  Remember, a Virtual Array is a cluster-wide entity and CAN use resources from anywhere in the cluster.  It also abstracts the volume location from the client.  So in a cluster, I can access a volume from any node where I have created a LIF in that Vserver.  In diagram two, LIFs 1, 2, 3, or 4 can all provide read/write access to the volumes belonging to V1 on the backend.  The Cluster Interconnect, or backend fabric, and Namespace abstraction allow me to get from “A to Z,” no matter where I start out.  The cluster knows where the data resides and how to get you there.

“But Karl, wouldn’t that add extra latency?”  Not really.  Take a look at our SPECsfs results:  23 out of every 24 read and write requests traversed the interconnect.  High-speed and low-latency.  Nothing in life is free though.  There’s always a price to pay, right?  In this case, we prioritize data access requests over system-level activities.  Shame on us for wanting to service reads and writes as fast as possible!  As a result, heavy use of the interconnect means less available bandwidth for volume mobility.  It will take longer to move a volume between nodes in the cluster.  So, I advise customers to map high bandwidth workloads to direct data paths (LIF and Volume on the same node) and use the interconnect for workloads with many clients but low bandwidth, like home directories and departmental shares.  Again, the beauty of the Agile Data Infrastructure shines.  Two different workload requirements serviced by one shared infrastructure.  Workload separation and distributed workloads in one platform!
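A short Python sketch of the direct versus indirect distinction, plus the arithmetic behind a figure like 23 out of 24.  The function names are hypothetical, and the arithmetic assumes requests land uniformly on the nodes of a 24-node cluster (twelve pairs); that assumption is mine and is not a description of how the benchmark was configured.

```python
from fractions import Fraction


def is_direct_path(lif_node: str, volume_node: str) -> bool:
    """Direct path: the LIF and the volume live on the same node."""
    return lif_node == volume_node


def remote_fraction(node_count: int) -> Fraction:
    """Share of requests crossing the interconnect if a request is equally
    likely to arrive on any node and the data sits on exactly one of them."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return Fraction(node_count - 1, node_count)


def suggested_path(high_bandwidth: bool) -> str:
    """The rule of thumb from the text, reduced to a single decision."""
    return "direct (LIF and volume on same node)" if high_bandwidth else "any LIF"


print(is_direct_path("node1", "node1"))  # True
print(is_direct_path("node2", "node1"))  # False
print(remote_fraction(24))               # 23/24
print(suggested_path(high_bandwidth=False))
```

With a single node the fraction is zero, and it approaches one as the cluster grows, which is why the placement rule matters more in larger clusters.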

SAN workloads leverage the magic of ALUA to find a LUN and do not factor into Vserver component design.  We can focus on SAN best practices at another time.


Naming Conventions

Traditional storage systems have one thing in common.  When you create something, it stays put.  So why not put the location in the name?  Controller1_Aggr1_Vol1 makes sense, right?  I can produce an easy-to-read report that way!

LUNs, Volumes, and even LIFs can be moved in clustered ONTAP.  Why put a physical location in their name?  They do belong to a Virtual Array, so feel free to use that name in their title and you can reap the rewards of Virtual Array immortality during a hardware refresh.  You replaced a Year 2012 storage controller pair with a Year 2015, but the Virtual Array stayed the same.  No one knows you did what you did and the naming convention still applies.

Now some resources are still attached to a physical location, so naming accordingly makes sense to me.  It makes the provisioning process easier.  Node1_Aggr1_FlashPool tells me EXACTLY where I am going to place a new or existing LUN.  Node1_LACP_10Gb lets me know my new LIF will offer smokin’ performance for an application if I place it on that channel group.
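The convention can be sketched with two small helpers in Python.  The helper names and the `V1_Vol1` style for movable objects are assumptions for illustration; only the `Node1_...` examples come from the text above.

```python
def logical_name(virtual_array: str, kind: str, index: int) -> str:
    """Name for objects that can move (volumes, LUNs, LIFs):
    tie them to the Virtual Array, never to a node."""
    return f"{virtual_array}_{kind}{index}"


def physical_name(node: str, *parts: str) -> str:
    """Name for objects pinned to hardware (aggregates, interface groups):
    lead with the node so placement is obvious at provisioning time."""
    return "_".join((node, *parts))


print(logical_name("V1", "Vol", 1))                  # V1_Vol1
print(physical_name("Node1", "Aggr1", "FlashPool"))  # Node1_Aggr1_FlashPool
print(physical_name("Node1", "LACP", "10Gb"))        # Node1_LACP_10Gb
```

Keeping the two helpers separate makes the rule hard to break by accident: a movable object never gets a node in its name.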


Goodbye for now

Thanks for taking the time to read this post and thanks to Nick for granting me some virtual real estate on his blog.  Look for more posts going forward on clustered ONTAP best practices.  I’m happy to share what I have learned with you.


Klustered Karl, eh?  I think with continued content like this, it is certainly going to stick.

Many thanks to Karl for this kind of content.  We need it more and more these days in a world filled with Buzzword Bingo.  I’m just as guilty of it as anyone, so it makes me very happy when I can return to more technical depths such as this.

If you have any Clustered ONTAP questions for Karl, I’m sure he’ll be monitoring the comments section!

Jonathan Adair
12/11/2012 07:02

Thanks Karl! Very few people have the passion for NetApp that you do and it shows in your work. Keep it up my friend, very much appreciated!

Captain KVM
12/11/2012 07:58

Thanks Nick & Karl for the great post.

Mike Cabibbo
12/11/2012 09:26

Great post Karl, clustered ONTAP is amazing!

Jodey Hogeland
12/11/2012 10:46

Great job Karl — It is great being on the same team!

Ian Erikson
12/12/2012 13:34

Nice post – I will call it Varray from now on…. I agree Vserver stinks.

Dave Silvestri
12/19/2012 21:03

Having just taken the Admin course for Clustered ONTAP, I have to say I’m impressed by the scope of features… but honestly there are missing features that I am more concerned about. Synchronous SnapMirror. In today’s day and age, where the loss of minutes of data could potentially mean disaster for some companies, I don’t understand how this can be missing. Tie this together with Load Sharing SnapMirror, where I think Synchronous SM should be essential, and it makes me even more concerned that it’s missing. SnapVault – missing. Promised in 8.2 MetroCluster – missing. Seems to me that…

Nick Howell
Reply to  Dave Silvestri
12/21/2012 15:53

Dave,

Couldn’t agree more, but let’s keep it all in perspective.  Those things will get here at some point.  Everything gets a priority weight, and those get divided up amongst the engineering resources available.

Personally, I think we’re all going to see 8.3 be the “Mecca” release, where all of this massive transition comes together, and we begin to look forward at what we can now do, rather than behind us at what we left behind.

05/23/2013 23:31

Guys, can anyone share the student guide for ONTAP cluster mode? My email ID is [email protected]. I cannot afford to attend the training. Your help will be greatly appreciated.

Jonathan Hewitt
11/20/2013 17:10

Hi – we are currently looking at the option of moving our 7-Mode systems (HA pairs) to 8.2 cluster, but it seems we still have the issue of using HA pairs! Surely this is not yet true clusterisation? In a dream world, I want 4 controllers to start with, spread across two equipment rooms. If one room is flooded or powered down then the other room should take over – a bit messy with HA pairs.

Matt D
Reply to  Jonathan Hewitt
09/22/2014 23:19

Still can be done depending on the room locations.
Each head of the HA pair to be in each room.
Being a four node Cluster there will be a cluster switch for private intracluster traffic. (removal of the infiniband requirement)

09/29/2014 18:51

Not sure if this blog is still live. I would also need a student’s guide for Cluster mode, if someone can share it to my email address: [email protected]


07/10/2016 13:11

The name Vserver really comes from what it is: a virtual server. The term was around long before NetApp decided to do the GX version of ONTAP, which led to what Clustered ONTAP is today. So is it a server or not? In reality it is a server, because it contains an operating system called ONTAP using the WAFL filesystem. Is it a Virtual Array as you describe it? Yes, that’s true too. But given what it does internally, it is really a server. I still hate the fact that my former peers at NetApp still call them filers when in…
