DBA Morning Checklist with PBM 3.0

At PASS Summit I presented my session Run Your DBA Morning Checklist with Policy-Based Management. I started my session by asking the audience what they wanted to check every morning on their SQL environment.
I noted the list down and compared it to my existing morning checklist. The good thing about the audience list was that I got some new ideas to add to my DBA Morning Checklist. I’m pleased to share my new DBA Morning Checklist 3.0 with the following checks added.

Default Backup Compression

The policy is going to check if the server setting Compress Backup is switched on. Backup compression increases CPU usage, and the additional CPU consumed by the compression process might adversely impact concurrent operations. Before enabling this setting, make sure that the CPU increase has no impact on your production environment. In my experience, I had never had any issue by using backup compression by default. In most cases, the full or differential backups are taken during low business hours, where the server can handle the extra CPU load.

Max degree of parallelism

The policy is going to check the server setting Max Degree of Parallelism. The condition is built according to the Microsoft recommendation

  • For servers that use more than eight processors, use the following configuration: MAXDOP=8.
  • For servers that have eight or less processors, use the following configuration where N equals the number of processors: MAXDOP=0 to N.
  • For servers that have NUMA configured, MAXDOP should not exceed the number of CPUs that are assigned to each NUMA node.
  • For servers that have hyper-threading enabled, the MAXDOP value should not exceed the number of physical processors.

Maximum Server Memory

I’m just going to check if the Maximum Server Memory setting is different than the default setting 2147483647 MB. It’s difficult to check other values because the policy cannot check if there are other instances or other software that need memory.

Optimize for Ad Hoc Workloads

The policy is going to check if the server setting Optimize for Ad Hoc Workloads, is set to True. By default, this setting is set to False.
The optimize for ad hoc workloads option is used to improve the efficiency of the plan cache for workloads that contain many single use ad hoc batches. More information can be found on https://msdn.microsoft.com/en-us/library/cc645587.aspx

I don’t see any downside by enabling this setting.

Outdated Statistics

I’m going to check if every statistic has been updated since the last 7 days.

Remote DAC Connection

This policy is going to check if the server setting Remote DAC Connection is enabled.
The DAC lets an administrator access a running server to execute diagnostic functions or Transact-SQL statements, or to troubleshoot problems on the server, even when the server is locked or running in an abnormal state and not responding to a SQL Server Database Engine connection. More information can be found on https://msdn.microsoft.com/en-us/library/ms190468.aspx I recommend to always enable this setting after installation.

Tables with Fragmented indexes

This policy is verified against every table in every database. The policy will fail if the index has a fragmentation percentage which is higher than 10% and page count > 1000

Trace Flags Enabled

By default, no trace flags are enabled. However, some trace flags might be useful to enable on production servers. I have blogged about the trace flags I enable by default.
In this policy, I’m going to check if all these trace flags are enabled.

Perform volume maintenance task

Perform volume maintenance task is a local group policy that can be set to improve the Instant File Initialization. More information can be found on https://msdn.microsoft.com/en-us/library/ms175935.aspx

However, I was not able to define any policy to check this GPO… I thought it was possible to check this with WMI but didn’t find a solution for it yet. So, if somebody can help me out with this, please let me know 😉

But again, you can download my full DBA Morning Checklist 3.0 from http://aka.ms/dbamorningchecklist Feel free to adjust J

Enjoy!

Pieter

Last known good DBCC CHECKDB not always correct

On the internet you can find a lot of methods to find out when DBCC CHECKDB has last run on your databases. The 2 most common queries are the based on DBCC DBINFO() and DBCC PAGE.
Below you can find 2 examples of this (replace “YourDatabase” with the correct database name):

USE YourDatabase
GO
CREATE TABLE #tmp
	(ParentObject varchar(1000) NULL,Object varchar(1000) NULL,Field varchar(1000) NULL,Value varchar(1000) NULL)
insert into #tmp
EXEC ('DBCC DBINFO() WITH TABLERESULTS')
select cast(value as datetime) from #tmp where field='dbi_dbcclastknowngood'
drop table #tmp

OR

DBCC TRACEON (3604); 
GO
DBCC PAGE (YourDatabase, 1, 9, 3);
GO

In my morning DBA checklist, I’m using the first query to check if there has been a database integrity check for the last 7 days. This query works well, however, last week I’ve noticed an issue. As soon as a database is in read-only or emergency mode, the DBCC CHECKDB command will succeed but the dbi_dbcclastknowngood field will (or can) not be updated. For that reason, my check indicates a failed status for my read-only databases…

Let’s take an example. First of all, I’m going to perform a DBCC CHECKDB for my AdventureWorks database

DBCC CHECKDB(AdventureWorks2012)

Once the check is finished, let’s verify the last dbcc check date with the DBCC DBINFO() query.

DBCC

Now, let’s set the database to read-only and do the test again.

DBCC read only

As you might notice, the last DBCC CHECKDB time is the same in both cases.

I found a good tip (old blog post) of Jonathan Kehayias to find the correct information. Just scan the SQL Error Log for the last 7 days. If I check the logs from my previous tests, you can see that there was a DBCC CHECKDB at 14:57:22 which was on my read-only database.

Error Log

I’ve modified the script of Jonathan. I’m using the xp_readerrorlog and I’m not checking the SQL 2000 version anymore… I’ve also changed my DBCC Check in my DBA checklist with the query below.

CREATE TABLE #ErrorLog
	(
	  Logdate datetime ,
	  ProcessInfo VARCHAR(12) ,
	  MsgText VARCHAR(3000)
	)
CREATE TABLE #NumberOfErrorLogFiles
	(
		ArchiveID tinyint,
		FileDate datetime,
		FileSize int
	)
insert into #NumberOfErrorLogFiles
EXEC master..xp_enumerrorlogs

DECLARE @SinceDate datetime  
SET @SinceDate = DATEADD(dd, -7, GETDATE())
DECLARE @ErrorFile tinyint
DECLARE @Severity tinyint
DECLARE @SeverityString nvarchar(3)
DECLARE @LastLogFile tinyint
select @LastLogFile=min(ArchiveID) from #NumberOfErrorLogFiles
where FileDate<@SinceDate
DECLARE @DBCCStatement nvarchar(1024)
set @DBCCStatement=N'DBCC CHECKDB (' + DB_NAME() + ')'

set @ErrorFile=0
while @ErrorFile<=ISNULL(@LastLogFile,0)
begin
    insert into #Errorlog
	EXEC master.dbo.xp_readerrorlog @ErrorFile, 1, @DBCCStatement, NULL, @SinceDate, NULL, N'ASC'
	set @ErrorFile=@ErrorFile+1
end

select * from #ErrorLog where Logdate>@SinceDate

drop table #ErrorLog
drop table #NumberOfErrorLogFiles

Have fun!

Pieter

DBA Morning Checklist with PBM (Part 3)

A few months ago, I blogged about how I did my morning checklist by using Policy Based Management together with a Central Management Server. This method works really well for me and I’ve implemented it at all my customers for the moment. When the months passed by, I’ve added extra checks to my list and created the necessary policies for it. This blog post gives a short description how my full morning DBA checklist looks like.

All jobs succeeded

Are there any jobs that have failed the last 7 days? I’m using 7 days here because I visit my customers mostly on a weekly bases. You can of course change that period in the condition that is checked in the policy.

Check SQL Server services are running

This policy will check if all SQL Server services are still running. Sounds ridiculous but it happened to me a few times that the SQL Server Agent was stopped without anybody noticing. This check only works from SQL Server 2008 R2 SP1 onwards because I’m using the DMV sys.dm_server_services, which is available since that version.

Data and Log files on separate volumes

For best performance, your data and log files should be stored on separate volumes. In the condition, I’m just checking the @DataAndLogFilesOnSeparateLogicalVolumes parameter in the Database Performance facet.

Data File Growth in MB

The autogrowth setting of your data file should be set MB instead of %. I’m also verifying if the growth setting is at least 128 MB…

Database Auto Close

You should never set the auto close parameter to True on your database. This option will cause performance degradation because of the overhead of opening and closing the database after each connection. The policy will check if there is any database on the server that has auto close set to True.

Database Auto Create Statistics

This policy will check if the Auto Create Statistics database setting is set to True. In some cases, like on some SharePoint databases, you have to ignore this check because the statistics are handled by SharePoint.

Database Auto Shrink

This rule checks whether the auto shrink database option is set to False. This setting should never be set to True because it will lead to physical fragmentation.

Database Auto Update Statistics

This policy will check if the Auto Update Statistics database setting is set to True. In some cases, like on some SharePoint databases, you have to ignore this check because the statistics are handled by SharePoint.

Database Collation

Is the database collation the same as the collation of my master database? The condition that is checked is the @CollationMatchesModelOrMaster parameter in de Database Performance facet.

Database Compatibility Level

Database compatibility level should be equal to the server level. The condition is going to check the @CompatibilityLevel database parameter against the collation of the master database that I retrieve by querying the sys.databases.

Database DBCC CHECKDB last 7 days

Is a DBCC CHECKDB performed the last week on the databases? I’m choosing 7 days here because this is how I schedule the DBCC command at my customers. The condition is checked based on a T-SQL query that is using the DBCC DBINFO() information.

Database is in Normal State

The status of the database should be in a normal state. This policy will indicate a failure on offline, suspect, … status of the database

Database owner

This policy will check if the owner of my database is set to sa.

Database page verify

Database option Page Verify should be set to CHECKSUM.

Free disk space

This policy is going to check if there is any LUN on the server that has less than 10% free disk space. The check itself is done by a powershell command or if powershell is not installed on the server by the fsutil command. The script also supports mount points.

Full or Differential Backup

Is there a full backup taken of my database during the last 24 hours? Or… Is there a differential backup taken of my database during the last 24 hours and a full backup during the last 7 days. Some of the database are too big to take a full backup every day. This check is going to verify is there is a differential backup. I’ve planned a full backup every week, next to the differential backups. You can change the condition of course, according to you backup strategy.

Jobs owned by sa

Are my jobs owned by the sa user? In the condition, I’ve filtered on the category Jobs from MSX, Database Maintenance, Full-Text and Uncategorized. There is no server policy created for this, so I’m using a T-sql script based on the sysjobs table.

Log File Growth in MB

The autogrowth setting of your log file should be set MB instead of %. I’m also verifying if the growth setting is at least 128 MB and not 1 MB in some cases…

Recovery Model = FULL for Production Databases

The policy is going to check if the recovery model of production databases is set to full. By default, I’m always using a full recovery model for production databases unless the application owner tells me that simple should be fine. I had to find a way to specify if a database exists on a production server or not? I’ve used a simple trick by specifying an Extended Property “Environment” on the master database.

In the policy, I’ve specified a server restriction based on the extended property.

Severity Alerts

Were there any severity Alerts in the SQL Server Log with a value larger than 16 during the past 7 days? When your SQL Server Log file is really big, it might happen that this policy returns a timeout.

Transaction Log Backup Taken

Is there a transaction log backup taken of my database during the last 2 hours? Again, you can change this time period depending on the backup strategy that you’re using.

All these checks give a me nice overview how healthy my SQL environment is, in no time!

You can download all my policies here

In case you should have any issues or you want to share some extra policies, just let me know!

Pieter

DBA Checklist with Policy Based Management 2.0

In my last blog post I talked about how easy it is to use Policy Based Management for your daily DBA Checklist.
And it gets even better when you use the PBM in combination with a Central Management Server.
I check my customers servers in no time 😀

However, while using it for a few weeks now, I noticed that there were some improvements possible.
The first improvement was the policy where I check the free disk space of the server.
The initial version was going to check if the amount of free space of the volume that contains the data file was more than 10 GB.
See this screenshot below

 

But what about the disks where the log files are stored? They are not checked.
Or what happens if your disk is less than 10 GB like for example the system databases LUN. The policy will always fail.
I had to find a way that could show the free disk space in percentage for every disk in my server.
I’ve checked all the Facets in the PBM but I couldn’t find one that was useful. If you can find one, please let me know.

I had to come up with a script that could do the trick.
In the old days, you had to build a script that was using the xp_fixeddrives stored procedure. Problem is that it this procedure can’t check mount points.
The best way nowadays is use powershell. But then again, some old servers didn’t had powershell installed and the script failed again.
I found some useful scripts on this site http://www.mssqltips.com/sqlservertip/2444/script-to-get-available-and-free-disk-space-for-sql-server/ , combined and tuned a few of them.

SET NOCOUNT ON
Declare 
@Free_Bytes BIGINT,
@TotalBytes BIGINT,
@Drive Char(1),
@sql varchar(1000),
@svrName varchar(255)
set @svrName = convert(varchar(255),SERVERPROPERTY('MachineName'))
set @sql = 'powershell.exe -c "Get-WmiObject -ComputerName ' + QUOTENAME(@svrName,'''') + ' -Class Win32_Volume -Filter ''DriveType = 3'' | select name,capacity,freespace | foreach{$_.name+''|''+$_.capacity+''%''+$_.freespace+''*''}"'
--creating a temporary tables
CREATE TABLE #output (line varchar(255))
CREATE TABLE #DiskSpace ( Drive Char(1), Total_Space Bigint, Free_Space Bigint,Percentage_Free as (Free_Space*100 / Total_Space ) )
CREATE TABLE #Fixeddrives ( Drive Char(1), Free_Space Bigint)
CREATE TABLE #fsutil ( ID INT IDENTITY , Drivedata Varchar(2000))
--inserting disk name, total space and free space value in to temporary table
insert #output
EXEC xp_cmdshell @sql
--select * from #output
if (select count(1) from #output where line like '%powershell.exe%')=0
BEGIN
	insert into #DiskSpace
	select rtrim(ltrim(SUBSTRING(line,1,CHARINDEX('|',line) -3))) as drivename
		  ,round(cast(rtrim(ltrim(SUBSTRING(line,CHARINDEX('|',line)+1,
		  (CHARINDEX('%',line) -1)-CHARINDEX('|',line)) )) as Float),0) as 'TotalSpace'
		  ,round(cast(rtrim(ltrim(SUBSTRING(line,CHARINDEX('%',line)+1,
		  (CHARINDEX('*',line) -1)-CHARINDEX('%',line)) )) as Float),0) as 'FreeSpace'
	from #output
	where line like '[A-Z][:]%' and LEN(line)>6
	order by drivename
END
ELSE
BEGIN
	Insert into #Fixeddrives exec master.dbo.xp_fixeddrives
	insert into #DiskSpace ( Drive)
	select Drive from #Fixeddrives

	DECLARE Drive_cursor CURSOR FOR
	SELECT Drive from #DiskSpace
	OPEN Drive_cursor
	FETCH NEXT FROM Drive_cursor INTO @Drive

	WHILE @@FETCH_STATUS = 0
	BEGIN

	select @sql = 'insert into #fsutil exec master.dbo.xp_cmdshell ''fsutil volume diskfree '+ @Drive+':'''
	exec(@sql)

	select @Free_Bytes = substring (Drivedata, charIndex (':', Drivedata, 0)+ 1 , 1000)
	from #fsutil where Drivedata like '%Total # of free bytes%'

	select @TotalBytes =substring (Drivedata, charIndex (':', Drivedata, 0)+ 1 , 1000)
	from #fsutil where Drivedata like '%Total # of bytes%'

	Update #DiskSpace set Total_Space = @TotalBytes, Free_Space=@Free_Bytes
	WHERE Drive = @Drive

	TRUNCATE TABLE #fsutil
	FETCH NEXT FROM Drive_cursor INTO @Drive

	END
	CLOSE Drive_cursor
	DEALLOCATE Drive_cursor

END

select * from #DiskSpace
order by Drive

--Drop the temporary tables
drop table #output
drop table #DiskSpace
drop table #Fixeddrives
drop table #fsutil

You can use this script in the condition of your policy by using the ExecutSql option

 

Second Issue was the full backup check. Some big databases, like data warehouses, are too big to perform a full backup every day.
Instead of a full backup, a daily differential backup is taken and a full backup is taken every week.
With that backup strategy my full backup policy was failing, so I also had to cover the differential backups in my condition.

 

You can download the new version of my policies from my SkyDrive and import them into your Central Management Server.
If you should have any suggestion on improving the policies, do not hesitate to contact me!

Pieter

DBA Checklist with Policy Based Management

In my job as a SQL Server Database Consultant, I have my “recurring” customers. This means that every week, I’m at the customer for 1 day to do day-to-day DBA tasks.
Since those customers don’t have a DBA, I have to perform the daily DBA checklist on a weekly basis J… Initially, I’ve started with executing some T-SQL scripts to verify the SQL Servers. But as more servers were installed, my DBA checklist became really time consuming.

So, I had to find a way to perform the same checks in a more efficient way. I’ve started playing with the policy based management feature and I’ve created my “DBA Checklist” policies that I’m checking against all the SQL Servers at once.
Less time consuming and it gives you a nice overview of failed checks.

The first step that you need to do is creating a Central Management Server (CMS) and register all your servers under that CMS.
I’m not going into detail on this but you can find more information on http://msdn.microsoft.com/en-us/library/bb895144.aspx

I’ve created the following DBA Checklist

  • Verify if there were any failed SQL Agent Jobs during the last week
  • Verify if all the SQL Server Services are running
  • Verify if all databases are in a “Normal” state (not suspect, restoring, offline, …)
  • Verify free disk space
  • Verify free space in the data files
  • Verify if a full backup was taken of the database in the last 24 hours
  • Verify if a transaction log backup was taken of the database in the last hour
  • Verify if there were errors in the error log with a severity > 17 for the last 7 days

This checklist is pretty basic, I know, but it gives me a good idea if my SQL Servers are fine or not. Feel free to give me any suggestion of checks that I can add to the list.

You can download the policies from my SkyDrive and import them into your Central Management Server.
Please note that the policies will not work against a SQL Server 2000 or lower.

Connect to your CMS – Go to Management – Policy Management – Policies
Right click Policies and select “Import Policy”

 

Select the xml –files you just have downloaded as the “Files to import” and click OK.

 

If all went fine, you should get a result like the one below

 

Now let’s do a check against all my servers.
Right click on your CMS and select “Evaluate Policies”

As a source, select again the CMS server, because that’s the one that where you have just imported the policies

Select the policies that you want to verify. You can easily find my policies because I’ve put them in the category “DBA Checklist”

Now, just click the Evaluate button. As a result, all the selected policies will be checked against the servers that are registered under your CSM.
When the checks are finished, you should see this kind of result

Below, in the target details area you can see which checks are failed. Click on the “View” link to see more details and correct them.
In the end, you want to see all you’re policy checks green 😀 !

If you should have suggestions or you find any improvements, please let me know so that I can adapt them.

Have fun!