UPDATE [Jul 18 23:30 PDT]:  We continue to receive a lot of calls from customers hitting this issue, so I wanted to share the official position from NetApp CSS (support) with our customers, as well as our field and partner community, on the current status of the 5.5U1 APD issue as of today:

Build 1881737 (ESXi 5.5 Express Patch 4) corrects this issue; however, it has not yet been fully qualified by NetApp. We anticipate it will be on our IMT (Interoperability Matrix Tool) in the next week or two.

Our current fully supported recommendation remains to back the ESXi hosts down to 5.5 flat (build 1331820) and, if the SSL Heartbleed issue is of concern, apply patch ESXi550-201404401-SG, located here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2076121

Hopefully, within the next week or so, QA/Interop will complete testing and get this latest Patch 4 listed on the IMT. When it does, I’ll be sure to update you all here.
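If you want a quick way to see which hosts are still sitting on something other than the two builds above, here is a minimal sketch using pyVmomi (VMware's Python SDK for the vSphere API). The vCenter hostname and credentials are placeholders, and certificate verification is disabled purely to keep the sketch short:

```python
#!/usr/bin/env python
# Minimal sketch: flag ESXi hosts that are not on 5.5 GA (build 1331820)
# or 5.5 Express Patch 4 (build 1881737). Hostname/credentials are placeholders.
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

KNOWN_GOOD_BUILDS = {"1331820", "1881737"}  # builds referenced in the update above

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme",
                  sslContext=ssl._create_unverified_context())  # lab use only
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        about = host.config.product  # vim.AboutInfo for this host
        status = "ok" if about.build in KNOWN_GOOD_BUILDS else "REVIEW"
        print("{}: {} (build {}) -> {}".format(
            host.name, about.fullName, about.build, status))
    view.Destroy()
finally:
    Disconnect(si)
```

Anything flagged REVIEW is worth checking against the IMT before deciding whether to patch or roll back.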


UPDATE [Jun 12 11:40 PDT]: VMware has released Patch 4 {link} to address the NFS APD issue. Updating immediately from current 5.5U1 installs is recommended.


UPDATE [Apr 18 17:00 PDT]: VMware has released KB 2076392, noting that this is a known issue affecting ESXi 5.5 Update 1 hosts with connected NFS storage. VMware is working towards providing a resolution to customers. To work around this issue, VMware recommends using ESXi 5.5 GA.  It was also brought to my attention today that 5.5U1 had not made it onto the NetApp IMT yet, as the QA teams had not finished their thorough interop testing.  This is one of those lessons about paying attention to the IMTs for all of your gear and software before upgrading on a whim.  #FoodForThought


UPDATE [Apr 17 16:25 PDT]: For now, if you are experiencing this condition, the recommendation is to downgrade ESXi to 5.5 GA.

"REMAIN CALM!"
“REMAIN CALM!”

Recently, several NetApp customers running NFS on vSphere 5.5U1 uncovered an issue where their datastores would randomly go offline, multiple times throughout the day. If you have not yet upgraded to 5.5 U1, DON’T! There is an ongoing internal thread at NetApp about this issue, so if you’re a NetApp employee, make sure you’re following the server-virt distribution list. When I heard the news, my first inclination was to post an alert on Twitter.  Little did I know how widespread this had become.

My first troubleshooting thought was that this was another iteration of the vSphere 5.5 change that took the NFS queue depth from 64 to 4 billion.  I can confirm that it is NOT related to the issue described in KB 2016122. VMware has confirmed the issue in vSphere and is working closely with NetApp to determine root cause, and we should expect a public KB very soon. This post will be updated with findings as they’re released.  Stay tuned…
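In the meantime, if you want to tell the two symptoms apart in your own environment, here is a rough pyVmomi sketch that reports whether any NFS datastores are currently showing up as inaccessible (the APD symptom) and reads back the NFS.MaxQueueDepth advanced setting that KB 2016122 deals with. The connection details are placeholders again:

```python
#!/usr/bin/env python
# Rough sketch: report NFS datastore accessibility and the per-host
# NFS.MaxQueueDepth setting. Connection details are placeholders.
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="changeme",
                  sslContext=ssl._create_unverified_context())  # lab use only
try:
    content = si.RetrieveContent()

    # Any NFS datastore reporting accessible == False is in (or near) APD.
    ds_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in ds_view.view:
        if ds.summary.type == "NFS":
            print("{}: accessible={}".format(ds.name, ds.summary.accessible))
    ds_view.Destroy()

    # Read NFS.MaxQueueDepth (the KB 2016122 tunable) on each host.
    host_view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in host_view.view:
        opts = host.configManager.advancedOption.QueryOptions("NFS.MaxQueueDepth")
        for opt in opts:
            print("{}: {} = {}".format(host.name, opt.key, opt.value))
    host_view.Destroy()
finally:
    Disconnect(si)
```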


9 thoughts on “NFS Disconnects in VMware vSphere 5.5 U1”

      1. This is correct. If you are using Nutanix hardware, it is also affected. I believe it affects any vendor that uses NFS.

  1. Yes, same with Nutanix systems. We opened a case with VMware at the beginning of April and with Nutanix at the same time. Nutanix did thorough testing, and the assumption was that it was most probably an issue with vSphere 5.5 U1 (Nutanix was in direct contact with VMware at that time). We did the ESXi downgrade on April 10th, which immediately solved the problem. The official VMware KB was created about a week later, I believe.

  2. Between this and NetApp KB 1014463 (IMT note 7291), I’ve been recommending against ESXi 5.5 for my customers. Am I being overly cautious?

  3. Nick, any chance you can post an update on this topic? Has Patch 4 been officially validated by NetApp CSS?

  4. Sorry to bump this, Nick, but is there any news on this? 5.5 U2 is released now, and I was wondering if NetApp has qualified any of these updates or resolved the issues with VMware. Thanks!
