Restore System Databases

Did you ever had to restore your system databases? It’s not that difficult when the databases are still available (or at least some version of it). However, what do you need to do when all system databases are gone?
For example when your system database LUN is missing J. Now, Books Online is not really explaining this in detail so I’ve made this blog post to make it clearer.

Before you read on, please make sure that you have a full backup of all your system databases.

Restore MASTER database

SQL Server requires the master database before a SQL Server instance can run at all. Some version of a master database must exist. Without that “version” the service will not start and it will be impossible to restore the backup.
So, the first step is to rebuild the system databases.

To rebuild the system databases, you can follow these steps on http://msdn.microsoft.com/en-us/library/dd207003.aspx

Insert the SQL Server installation media into the disk drive or mount the iso-file.

From a command prompt window (run as administrator), enter the following command.

Setup /QUIET /ACTION=REBUILDDATABASE /INSTANCENAME=<InstanceName> /SQLSYSADMINACCOUNTS=<accounts>

<InstanceName> Is the name of the instance of SQL Server. For the default instance, enter MSSQLSERVER.
<accounts> Specifies the Windows groups or individual accounts to add to the sysadmin fixed server role.

When the rebuild is finished, you will find new (empty) data and log files in the data folder of the instance.

Once you have the “dummy” master.mdf and ldf file in place, you need to start up the instance in single user mode. This is done by adding the –m startup parameter. Open the SQL Server Configuration Manager to add it. (see screenshot below)

In the SQL Server Configuration Manager, start the SQL Server service. Once it is started we finally can restore the master database. The example starts sqlcmd and executes a RESTORE DATABASE statement that restores a full database backup of master from a disk device.

The REPLACE option will replace the current master database with the one in the backup file. Also notice that after the restore, the instance is shutting down. Remove the –m option again from the startup parameters before you restart SQL Server. You will see that all your user databases are back online and that all your server logins exist.

Restore MASTER database on a failover cluster

The procedure is slightly different than the restore of a stand-alone instance. The rebuild of the system databases must be performed on the active node and the SQL Server resource in the corresponding cluster application group must be taken offline before performing the procedure. I’m trying to restore the master database on a Windows 2012 R2 Failover Cluster and I’m running SQL Server 2012 SP2 Enterprise Edition.

I started with a rebuild of the system databases and started up the SQL Server instance with –m parameter (single user mode), as described above.

To restore the master database, I open a SQL CMD but I always get the error that only one administrator can connect to the instance. The problem is that as soon as I startup the service, the sp_server_diagnostics process kicks in and takes the first connection…
After searching for a while, I figured out that you have to startup a cluster instance in single user mode in a different way. More information can be found on http://msdn.microsoft.com/en-us/library/ms188236.aspx . Check the Note for Clustered Installations.

Restore MASTER database alternative

Another alternative method to restore the MASTER database could be the following:

  • Startup the instance with the new empty master database after the rebuild of the system databases
  • Restore the original master database as a user database (use a different name)
  • Shut down the service again
  • Remove the master.mdf and ldf files.
  • Rename the restored master database back to master.mdf and mastlog.ldf
  • Restart the service

The method above is also working but the above mentioned procedures are the best practices for SQL Server.

Restore MODEL database

The model database can be restored as a normal database. However, if the database should be corrupt, you have to start up the SQL Server Instance with trace flag –T3608. It prevents SQL Server from automatically starting and recovering any database except the master database.

To activate the trace flag, open the SQL Server Configuration manager and add it to the startup parameters.

(Re)start your SQL Server Instance. Do not forget to remove this trace flag after the model database has been restored.

Open the SQL Server Management Studio and restore the original mode database from the backup

Restore MSDB database

Before you can restore the MSDB, you’ll have to stop the SQL Server Agent. You can do this with the SQL Server Configuration manager or just right click on the agent in the SSMS and click stop

The MSDB database can be restored like a normal user database.

Once the restore is finished, you can restart the SQL Server Agent and you should see all your jobs again in the SSMS.

Restore Tempdb database

You cannot backup or restore the tempdb. This database is re-created every time an instance of SQL Server is started. When the server instance is shut down, any data in tempdb is deleted permanently.

If your tempdb is corrupt, just restart your SQL Server Instance. In case, your disk is corrupt, you will have to move your tempdb. More information on how to move system databases can be found here http://msdn.microsoft.com/en-us/library/ms345408.aspx

Restore RESOURCE database

It is a hidden database. The Resource database resides in the mssqlsystemresource.mdf file, which contains only code. Therefore, SQL Server cannot back up the Resource database. It can be restored by a file-level restore in Windows or by running the setup program for SQL Server.

 

HAPPY RESTORE!

Maximum Failover Limit

With this blog post, I will try save a few hours of your life!

I’ve recently setup a simple 2 node failover cluster that will be used for AlwaysOn Availability Group. After the installation, I suggested the customer to perform failover testing, not only to see the behavior of the Windows Failover Cluster but also to see how the application(s) would respond to a failover. One of the tests was disabling the Production LAN.
When I disabled the NIC on the node that was hosting the cluster group, it did a nice failover as expected.

I activated the NIC again and did the same test on the other node (which was hosting the cluster group now). To my surprise, there was no failover at all but the cluster name and the IP address just went offline

This is not what I expected.
I checked the cluster events and saw the following error “Cluster role ‘Cluster Group’ has exceeded its failover threshold”

That’s right…, I forgot all about that. The maximum failover limit is set to 1 time in 6 hours.
I knew that you had to change this for testing. This is really good explained in this blog post http://blogs.msdn.com/b/arvindsh/archive/2012/03/09/failover-cluster-group-maximum-failures-limit.aspx .
The blog is using Windows Server 2008, but my installation is using Windows Server 2012 R2. It really took me a long time to figure out how I could change the maximum failover limit of the cluster group in Windows Server 2012…

You have to right click on the cluster name and select the properties. Now if you look closely, there is a “link” in the first windows called “Manage Core Cluster Resource Group”.
I didn’t notice that link because most of the links are just to open the help files…

Go to the Failover tab, and there you will find the Maximum failures in the specified period. By default it is set to 1 in a period of 6 hours. This is not enough during failover cluster testing and will give you the failed/offline issue that I had.
Change it to a higher value, like 100 for example.

Don’t forget to set it back to the default value after you finish your tests

I did a few more tests and it was working fine now!

Well, I hope that I have saved a few hours of your life with this 😉

Cloud Witness in Windows Server v.Next

Last week, I attended the session “Did you vote today? A DBA’s guide to cluster quorum” at PASS Summit, by Allan Hirt.
Interesting stuff, and I learned that my quorum configuration is already done correctly according to what Allan explained.

Another interesting part was that Allan announced that there is a new type of quorum in Windows Server v.Next.

Cloud Witness!

Instead of using a File Share Witness or Disk Witness, you can now also choose Cloud Witness.
Basically it’s going to create a file on your Azure Storage which counts as a quorum vote.
This cloud witness can be really helpful in case of multi-site clusters.
With multi-site clusters, there is always the question in which datacenter you are going to put the file share witness.
In fact, to configure it correctly, you should put your file share witness in a third datacenter. But that’s really too expensive for most companies just for a file share witness. The cloud witness provides a good alternative for this problem.

The only “issue” that I see with this cloud witness, is that your cluster nodes must have connection with the internet. And honestly, I haven’t seen much SQL Servers or cluster nodes that are able to connect to the internet.
But with the appropriate firewall settings, you should be OK.

I’ve already installed a 2 node Windows vNext Failover Cluster and tried it out.
It’s actually really easy to configure.

Start with opening the Failover Cluster Manager and connect to your cluster.
Right click on the cluster name à More Actions à Configure Cluster Quorum Settings…

In the “Select Quorum Configuration Option” windows, select “Select the quorum witness”

In the “Select Quorum Witness” windows, select “Configure a cloud witness”

To configure the cloud witness, you need to specify your storage account and your Azure storage account key.

This information can be found on the Azure Portal. Just go to the storage option. On the bottom of your screen you will see a button called Manage Access Keys.

Click on that button, copy one of the 2 keys and paste it in the Azure storage field of the cloud witness configuration

Your configuration should look similar like this screen shot below

Finally, complete the wizard and if all went well you have now configured your cloud witness.

When you look at your storage account in the Azure Portal, you will notice that a new container, “msft-cloud-witness”, is created with 1 blob file inside.

Pretty cool if you ask me 😀 !

Pieter

 

Create an empty AlwaysOn Availability Group

Last week I had to install a 6 node failover cluster at a customer. 4 nodes were used for high availability solution in site A and 2 nodes were in site B for disaster recovery. Each site was hosting 9 SQL Server instances.

On my last day, I had to configure the AlwaysOn Availability Groups (AVG) as disaster recovery solution.  I have chosen the following settings for my AVG’s

  • Asynchronous commit mode
  • Backups on primary replica
  • Readable secondary

There was just one small “problem”. There were no databases to include in the AVG ‘s because this was a complete new platform. I couldn’t use the AlwaysOn Availability Group Wizard because that expects a database. But I wanted to prepare the AVG’s in advance so that the customer only needed to add the databases to the groups at a later stage.
Since I had to install 9 AVG’s I created a script that I could reuse every time I want to create an empty availability group.
The script will create an Availability Group on the primary replica, will add a group listener and will join one secondary replica all in one execution.

You can find the script below but I must mention a few remarks and restrictions

  • You have to execute the script in the SQLCMD mode in the SQL Server Management Studio. To do this go in the main menu to Query –> SQLCMD Mode
  • Only one secondary replica can be used
  • The settings for the replica’s are the one mentioned above
  • Only one IP address can be linked to the group listener
  • Named instances are supported

Before you can execute the script you only need to specify a few parameters:

Parameter Description
PrimaryReplica The primary replica. Default and Named instances are supported eg. ServerInst01
SecondaryReplica The secondary replica. Default and Named instances are supported eg. ServerInst01
EndPointPort Specify the port for the endpoints
AvailabilityGroup The name of the availability Group
Listener The name of the group listener
IPListener The IP address of the listener
ListenerPort Specify the listener port
ListenerSubnet Specify the subnet of the listerner IP Address

You can find my script below. If you should have questions, remarks or suggestions, please do leave a comment so that I can improve it.

Have fun!

--- YOU MUST EXECUTE THE FOLLOWING SCRIPT IN SQLCMD MODE.
:setvar PrimaryReplica Node1 --Specify the primary replica
:setvar SecondaryReplica Node2 --Specify the secondary replica
:setvar EndPointPort 5022 --Specify the port for the endpoints
:setvar AvailabilityGroup AVG_Group --Specify the name of the availability group
:setvar Listener AGL --Specify the name of the group listener
:setvar IPListener 192.168.0.20 --Specify the IP Address of the group listener
:setvar ListenerPort 1433 -- Specify the listener port
:setvar ListenerSubnet 255.255.240.0 --Specify the subnet of the listeren IP Address

:Connect $(PrimaryReplica)

USE [master]
GO
declare @AVGLogin as sysname
declare @CMD nvarchar(max)

select @AVGLogin=service_account from sys.dm_server_services
where servicename like 'SQL Server (%'

if not exists(select name from sys.syslogins where name=@AVGLogin)
begin
 set @CMD=N'CREATE LOGIN [' + @AVGLogin + '] FROM WINDOWS WITH DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english]'
 print 'Create service account as server login...'
 exec sp_executesql @CMD
end

if not exists (select name from sys.endpoints where name='Hadr_endpoint')
begin
print 'Create HA/DR endpoint...'
CREATE ENDPOINT [Hadr_endpoint] 
 AS TCP (LISTENER_PORT = $(EndPointPort))
 FOR DATA_MIRRORING (ROLE = ALL, ENCRYPTION = REQUIRED ALGORITHM AES)
end

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
 ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END


set @CMD=N'GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [' + @AVGLogin + ']'
print 'GRANT CONNECT ON ENDPOINT...'
exec sp_executesql @CMD

GO

:Connect $(SecondaryReplica)

USE [master]
GO

declare @AVGLogin as sysname
declare @CMD nvarchar(max)

select @AVGLogin=service_account from sys.dm_server_services
where servicename like 'SQL Server (%'

if not exists(select name from syslogins where name=@AVGLogin)
begin
 set @CMD=N'CREATE LOGIN [' + @AVGLogin + '] FROM WINDOWS WITH DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english]'
 print 'Create service account as server login...'
 exec sp_executesql @CMD
end

if not exists (select name from sys.endpoints where name='Hadr_endpoint')
begin
print 'Create HA/DR endpoint...'
CREATE ENDPOINT [Hadr_endpoint] 
 AS TCP (LISTENER_PORT = $(EndPointPort))
 FOR DATA_MIRRORING (ROLE = ALL, ENCRYPTION = REQUIRED ALGORITHM AES)
end

IF (SELECT state FROM sys.endpoints WHERE name = N'Hadr_endpoint') <> 0
BEGIN
 ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED
END

set @CMD=N'GRANT CONNECT ON ENDPOINT::[Hadr_endpoint] TO [' + @AVGLogin + ']'
print 'GRANT CONNECT ON ENDPOINT...'
exec sp_executesql @CMD

GO

:Connect $(PrimaryReplica)

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='AlwaysOn_health')
BEGIN
 ALTER EVENT SESSION [AlwaysOn_health] ON SERVER WITH (STARTUP_STATE=ON);
END
IF NOT EXISTS(SELECT * FROM sys.dm_xe_sessions WHERE name='AlwaysOn_health')
BEGIN
 ALTER EVENT SESSION [AlwaysOn_health] ON SERVER STATE=START;
END

GO

:Connect $(SecondaryReplica)

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='AlwaysOn_health')
BEGIN
 ALTER EVENT SESSION [AlwaysOn_health] ON SERVER WITH (STARTUP_STATE=ON);
END
IF NOT EXISTS(SELECT * FROM sys.dm_xe_sessions WHERE name='AlwaysOn_health')
BEGIN
 ALTER EVENT SESSION [AlwaysOn_health] ON SERVER STATE=START;
END

GO

:Connect $(PrimaryReplica)

USE [master]
GO

declare @SystemAddressPrimary nvarchar(256)
declare @SystemAddressSecondary nvarchar(256)

if charindex('','$(PrimaryReplica)')=0
 set @SystemAddressPrimary='$(PrimaryReplica)'
else
 select @SystemAddressPrimary=substring('$(PrimaryReplica)',0,charindex('','$(PrimaryReplica)'))

if charindex('','$(SecondaryReplica)')=0
 set @SystemAddressSecondary='$(SecondaryReplica)'
else
 select @SystemAddressSecondary=substring('$(SecondaryReplica)',0,charindex('','$(SecondaryReplica)'))

declare @CMD nvarchar(max)
set @CMD='
CREATE AVAILABILITY GROUP [$(AvailabilityGroup)]
WITH (AUTOMATED_BACKUP_PREFERENCE = PRIMARY)
FOR 
REPLICA ON N''$(PrimaryReplica)'' WITH (ENDPOINT_URL = N''TCP://' + @SystemAddressPrimary + ':$(EndPointPort)'', FAILOVER_MODE = AUTOMATIC, AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50, SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY)),
 N''$(SecondaryReplica)'' WITH (ENDPOINT_URL = N''TCP://' + @SystemAddressSecondary + ':$(EndPointPort)'', FAILOVER_MODE = AUTOMATIC, AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, BACKUP_PRIORITY = 50, SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY));'

print 'Create the availability group...'
exec sp_executesql @CMD

print 'Create the availability group listener...'
ALTER AVAILABILITY GROUP [$(AvailabilityGroup)]
ADD LISTENER N'$(Listener)' (
WITH IP
((N'$(IPListener)', N'$(ListenerSubnet)')
)
, PORT=$(ListenerPort));

GO

:Connect $(SecondaryReplica)

ALTER AVAILABILITY GROUP [$(AvailabilityGroup)] JOIN;

GO

Adding a crappy backup network to your cluster

Last week, I got a really nice challenge of a customer.
They wanted to add a backup network in their 2 node cluster.
No problem, that’s pretty straight forward but… they mentioned that the backup network is crappy and unstable…
Despite that the network was unstable, they still want it to use it.

The tricky part here was that the cluster could not failover in case of a network failure of the backup LAN.
On the other hand, SQL Server must be dependent on the backup IP address if you want to use it.

This is how I worked out my solution
First of all, I added the backup IP address in my cluster configuration. I added it under the SQL Server Network Name.

At this moment, the SQL Server Network Name is now dependent on both IP addresses.
The SQL Server resource is dependent on the SQL Server Network Name, which means that it will listen on both IP addresses.
I’ve verified it by restarting the SQL Server and checked the SQL Server Log after the reboot.

So far so good. But with this configuration, a failover will occur when the backup network will go offline.
SQL Server will go offline because it’s dependent on the SQL Server Network Name, which is also offline because it is dependent on both IP addresses.
To fix this problem, I changed the dependencies of the SQL Server Network Name. Instead of being dependent on both IP addresses, only one IP address should be up and running.
I did this by changing the dependencies from AND into OR.

With this configuration, the SQL Server Network Name will stay online if the backup network goes offline.
But the reverse is also true… if the production LAN should go offline and the backup LAN stays up, the SQL Server Network Name will also stay online.
In this particular case, we want of course that a failover is initiated.

To make the failover possible, I added an extra dependency. The backup IP address must be dependent on the production LAN.

Let’s follow the chain what happens when the production LAN goes offline

backup network will go down because it’s dependent on the production LAN

SQL Server Network Name will go down because both production and backup LANs are down

SQL Server will go down because it’s dependent on the SQL Server Network Name

Failover is initiated!

 

To be sure that no failover will occur in case of restart failure of the backup IP address, I’ve unchecked the option “If restart is unsuccessful, fail over all resources in this Role” in the Policies tab of the backup IP address.

With this configuration the resource will try to restart every 2 minutes for 1 hour. If it is still not possible to restart, it will stay in a failed status.

I’ve tested this configuration really good and it’s working perfectly.

But… the real solution in this case was to get the backup network stable but this configuration is a nice workaround J

How to move the Available Storage group in a Cluster

I’m working on a fully automatic silent installation of an SQL 2008 Failover cluster.  This include a lot of testing, believe me Knipogende emoticon After each uninstall, I always had to move the available storage to the correct node to start my installation.  I found it very annoying that this is not possible in the Failover Cluster Manager tool.  As you can see below,when you click on the storage or on a specific disk, there is no option available to move the Available Storage group to another node.

Storage1

Storage2

So the only option that I had was rebooting the server so that the storage group did a failover Verwarde emoticon
It takes a lot of time to reboot so there must be a faster way.

And there is one! Pretty simple in fact…

Just open a command prompt and execute the command below where <Nodename> is the name of the server where you want to move your storage.

cluster group “Available Storage” /MOVE:<Nodename>

Conclusion:The cluster command has more options available than the failover cluster manager. So, if you want to perform a certain action on your cluster and it’s not available in the failover cluster manager, don’t forget to check the cluster command.  Maybe you’ll find the solution there.

Configure Cluster MSDTC with Powershell

Today, I configured a new SQL Server Failover Cluster with command line. The goal is to have a fully automatic installation of a SQL Server Failover Cluster.  By using Powershell, I was already able to add the MSDTC resource.  My next challenge was to configure it automatically Knipogende emoticon
I figured out that the MSDTC settings were in the registry under HKEY_LOCAL_MACHINEClusterResources. The problem was that the folders below this registry key are having an ID, so I had to find a way to know which one is used for MSDTC.
Well, I managed to configure it with the following script.
Import-Module FailoverClusters;
$InstanceName=$args[0]
$msdtcKey = Get-ChildItem “Registry::HKEY_LOCAL_MACHINEClusterResources” | Where-Object {([string](get-itemproperty -path “Registry::$_”).”Name”) -match “MSDTC-SQL Server ($InstanceName)”}
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTC” -name “AllowOnlySecureRPCCalls” -value 0
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTC” -name “TurnOffRpcSecurity” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccess” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessClients” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessInbound” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessOutbound” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessTransactions” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “XATransactions” -value 1
stop-ClusterResource “MSDTC-SQL Server ($InstanceName)”
start-ClusterResource “MSDTC-SQL Server ($InstanceName)”

Let’s have a detailed look
$msdtcKey = Get-ChildItem “Registry::HKEY_LOCAL_MACHINEClusterResources” | Where-Object {([string](get-itemproperty -path “Registry::$_”).”Name”) -match “MSDTC-SQL Server ($InstanceName)”}
First, I get all the child items of the registry HKEY_LOCAL_MACHINEClusterResources.  In each of the child items, I’m searching for a name that matches “MSDTC-SQL Server (<MyinstanceName>)”
Of course, you should change the “match” parameter when you have used another name for your MSDTC. Note: The parameter $InstanceName should be the name of your SQL Server Instance
As a result, I get the correct key!
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTC” -name “AllowOnlySecureRPCCalls” -value 0
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTC” -name “TurnOffRpcSecurity” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccess” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessClients” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessInbound” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessOutbound” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “NetworkDTCAccessTransactions” -value 1
Set-ItemProperty -path “Registry::$msdtcKeyMSDTCPRIVATEMSDTCSecurity” -name “XATransactions” -value 1
I have used that key to set the correct registry values, according to my customer needs.
More information can also be found on
stop-ClusterResource “MSDTC-SQL Server ($InstanceName)”
start-ClusterResource “MSDTC-SQL Server ($InstanceName)”
Finally, you need to stop and start the cluster resource in order to take the changes into effect.
Note: The parameter $InstanceName should be the name of your SQL Server Instance
Have fun with it!