With this blog post, I will try save a few hours of your life!
I’ve recently setup a simple 2 node failover cluster that will be used for AlwaysOn Availability Group. After the installation, I suggested the customer to perform failover testing, not only to see the behavior of the Windows Failover Cluster but also to see how the application(s) would respond to a failover. One of the tests was disabling the Production LAN.
When I disabled the NIC on the node that was hosting the cluster group, it did a nice failover as expected.
I activated the NIC again and did the same test on the other node (which was hosting the cluster group now). To my surprise, there was no failover at all but the cluster name and the IP address just went offline
This is not what I expected.
I checked the cluster events and saw the following error “Cluster role ‘Cluster Group’ has exceeded its failover threshold”
That’s right…, I forgot all about that. The maximum failover limit is set to 1 time in 6 hours.
I knew that you had to change this for testing. This is really good explained in this blog post http://blogs.msdn.com/b/arvindsh/archive/2012/03/09/failover-cluster-group-maximum-failures-limit.aspx .
The blog is using Windows Server 2008, but my installation is using Windows Server 2012 R2. It really took me a long time to figure out how I could change the maximum failover limit of the cluster group in Windows Server 2012…
You have to right click on the cluster name and select the properties. Now if you look closely, there is a “link” in the first windows called “Manage Core Cluster Resource Group”.
I didn’t notice that link because most of the links are just to open the help files…
Go to the Failover tab, and there you will find the Maximum failures in the specified period. By default it is set to 1 in a period of 6 hours. This is not enough during failover cluster testing and will give you the failed/offline issue that I had.
Change it to a higher value, like 100 for example.
Don’t forget to set it back to the default value after you finish your tests
I did a few more tests and it was working fine now!
Well, I hope that I have saved a few hours of your life with this 😉