
Three amigos screw up a cluster

Three amigos joined me for a cluster installation the other day. None of them was very friendly, so I decided to call them all out in one place so that others can steer clear.

The opinionated folder

Using mount point folders is a great idea for neatness and to avoid having a gazillion drive letters in play. It is also a great idea to review the best practices for SQL Server clusters on mount point folders, as there are a few gotchas. This blog post lays it all out.

If you fail to set up the correct permissions, funky stuff will ensue. The issue I’ve seen most often is something along the lines of lack of privileges on the SYSVOL\{some-long-GUID}. In that case, it’s time to read the blog post and berate oneself for trying to put stuff in the root folder.

The unwilling mover

The following error stumped me for a while:

“The resource ‘LogDisk (INST1)’ could not be moved from cluster group ‘Available Storage’ to cluster group ‘SQL Server (INST)’ “

For some reason the cluster didn’t feel like moving the cluster disks into the SQL Server group. Unfortunately I didn’t have enough time to really dig into why this occurs, but it does have something to do with running with mounted folders. Fortunately the solution is fairly simple – just create an empty cluster group, call it what you want your SQL Server cluster group to be called, and move the disks into it before you install SQL Server. This is usually very simple, but in this case the GUI didn’t let me move the disks. Basically nothing happened, which made it somewhat difficult to troubleshoot… When searching for the correct syntax for moving disks via PowerShell, I came across this blog post that covers this issue and the PowerShell workaround.
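The gist of the workaround looks something like this – a sketch, with the disk and group names from the error above used as placeholders:

Import-Module FailoverClusters

# Create the empty group that the SQL Server installer will use later
Add-ClusterGroup -Name "SQL Server (INST1)"

# Move the stubborn disk out of Available Storage and into the new group
Move-ClusterResource -Name "LogDisk (INST1)" -Group "SQL Server (INST1)"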

The lost agent (with a slight detour via insufficient privileges)


The SQL Server installation failed thoroughly when it became evident that the cluster name object (CNO) for some reason lacked the privileges to create computer objects. This led to the cluster failing to bring the network name resource online, which in turn aborted the installation. The solution is to make sure the CNO has permission to create computer objects. This blog post goes into excellent detail.

After rectifying the above I decided to salvage the installation, and noticed that something was missing. Like the agent. In order to create that resource, take a look here. With that done, this specific issue was taken care of.
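For the record, the manual fix boils down to something like the PowerShell below – a sketch of my own, not necessarily the exact steps from the linked post, with instance name INST1 and network name SQLNET1 as placeholders:

# Create the agent resource in the SQL Server group
Add-ClusterResource -Name "SQL Server Agent (INST1)" -ResourceType "SQL Server Agent" -Group "SQL Server (INST1)"

# Tell it which virtual server/instance it belongs to
Get-ClusterResource "SQL Server Agent (INST1)" |
    Set-ClusterParameter -Multiple @{ VirtualServerName = "SQLNET1"; InstanceName = "INST1" }

# Make it depend on the SQL Server resource and bring it online
Add-ClusterResourceDependency -Resource "SQL Server Agent (INST1)" -Provider "SQL Server (INST1)"
Start-ClusterResource -Name "SQL Server Agent (INST1)"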


GoldenGate surprises

I’ve been way busier than I expected, and I’ve got somewhat of a bad conscience for not having been able to update the blog. But here goes.

I’ve been tasked with setting up an Oracle GoldenGate solution for a customer. The environment is fairly straightforward, with two servers running Windows 2012R2, Oracle Standard Edition One (yep, licenses bought before the new SE Two licensing) 11.2.0.4 and GoldenGate 12.1.0.1. The idea is to get GoldenGate to handle replication from site A to site B in case of disaster.

GoldenGate is an interesting animal in itself, as it is *very* complex and has, in true Oracle fashion, somewhere along the lines of a gazillion parameters. In this case we’ve opted to set up GoldenGate on each of the two servers, despite the fact that the replication is strictly one-way (from site A to site B).

There are a lot of blogs (especially Gavin Soorma’s) that do an excellent job of explaining exactly how GoldenGate works, so I’ll focus on some of the more interesting findings so far:

Integrated or classic mode

GoldenGate 12c gives you the choice between the classic way of sifting through the log files and shipping the result, or leveraging LogMiner to extract the data. Long story short: integrated extract and replicat support more data types and are faster. WAY faster. I tried banging 2M rows into a table and measuring how quickly the data crawled over to the secondary – it went from minutes to seconds. GoldenGate integrated extract/replicat *is* supported on Oracle Standard Edition (even though the integrated functionality leverages Oracle Streams, an Enterprise-only feature), but some things (like parallelism) are disabled, and Oracle for some reason recommends using classic capture/replicat on Standard Edition and Standard Edition One. Considering the speed increase with integrated, I’ll go with that, thank you very much.
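For reference, switching between the two modes is mostly a matter of how the processes are registered and added. A rough GGSCI sketch of the integrated setup – user, process names and trail paths are placeholders, not our actual configuration:

On the source (GGSCI):
DBLOGIN USERID ggadmin, PASSWORD ********
REGISTER EXTRACT EXT1 DATABASE
ADD EXTRACT EXT1, INTEGRATED TRANLOG, BEGIN NOW
ADD EXTTRAIL ./dirdat/et, EXTRACT EXT1

On the target (GGSCI), assuming the trail is delivered to ./dirdat/rt by a data pump:
DBLOGIN USERID ggadmin, PASSWORD ********
ADD REPLICAT REP1, INTEGRATED, EXTTRAIL ./dirdat/rt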

The value of keys

Picture this: a single heap table with a couple of hundred thousand rows. Inserting the data is quick, and GoldenGate does an excellent job of shuttling the data over to the secondary site B. Now let’s see what happens when we do a delete instead. The cursor comes back in a few seconds on the primary, and after hitting commit I started looking at the secondary. This is where things become interesting.
The CPU went to 50% and basically pegged there – this is due to the fact that I’ve got two CPUs in my testing box and one of them was completely saturated. The funny thing is that, apart from an initial flurry of disk and network activity, nothing seemed to be happening on the server. The rows were still in the table as well. Scratching my head, I took a look at long-running queries using a query by David Mann:

SELECT nvl(ses.username,'ORACLE PROC')||' ('||ses.sid||')' USERNAME,
 SID, 
 MACHINE, 
 REPLACE(SQL.SQL_TEXT,CHR(10),'') STMT, 
 ltrim(to_char(floor(SES.LAST_CALL_ET/3600), '09')) || ':'
 || ltrim(to_char(floor(mod(SES.LAST_CALL_ET, 3600)/60), '09')) || ':'
 || ltrim(to_char(mod(SES.LAST_CALL_ET, 60), '09')) RUNT 
 FROM V$SESSION SES, 
 V$SQLtext_with_newlines SQL 
 where SES.STATUS = 'ACTIVE'
 and SES.USERNAME is not null
 and SES.SQL_ADDRESS = SQL.ADDRESS 
 and SES.SQL_HASH_VALUE = SQL.HASH_VALUE 
 and Ses.AUDSID <> userenv('SESSIONID') 
 order by runt desc, 1,sql.piece;

and found this little gem:

 DELETE /*+ restrict_all_ref_cons */ FROM "DBUSER"."AAA" WHERE "A"=:1 and rownum < 2

Wait, what? This basically means RBAR (row by agonizing row) handling of the delete, which took about four HOURS(!). Lesson learned – ALWAYS make sure to have keys in your tables. And expect deletes on keyless tables to take a lot of time.
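The obvious fix is to give the table a real key – something along these lines, with the table and column names borrowed from the statement above:

ALTER TABLE dbuser.aaa ADD CONSTRAINT aaa_pk PRIMARY KEY (a);

If a proper key can’t be added, GoldenGate can at least be told which column(s) to treat as a key via KEYCOLS in the TABLE and MAP statements, which keeps it from falling back to using every column as an identifier.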

Memory woes

When stress testing the replication I started receiving quite a lot of bothersome memory problems (inability to reallocate, out of virtual memory, header corruptions, etc.). After searching the internet for answers and finding exactly nothing, it turned out that there is a small gotcha with the 64-bit version of GoldenGate – it defaults to 64GB of memory (which means a 128GB virtual memory pool). Since my machine only sported 8GB of memory, the process very much did run into memory problems. By changing the CACHESIZE parameter to a more reasonable number, the problem went away.
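For reference, my understanding is that the cache is tuned via the CACHEMGR parameter in the Extract/Replicat parameter file – something along these lines, where the 2GB value is just an example and should be sized to the actual machine:

CACHEMGR CACHESIZE 2GB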

Back in the saddle – index time

Well, I’m back in the saddle. Four weeks of vacation, of which three resembled some kind of monsoon. The last one was great, and of course the weather became great as soon as I returned to work.

I decided to get myself a new script to show me unused indexes, and started looking at Stack Overflow to see if someone had already taken care of it (someone usually has). Turns out that there was a blog post by John Pasquet of folioTek here, and a very interesting addition to said blog post (and scripts) by user HAL9000 of Stack Overflow here. It just became my go-to script for finding unused indexes. Thanks, HAL9000 and John!

 

Vacation! And a few gotchas…

The blog has been slow lately due to the impending vacation. I’ll toss in a few things I stumbled on during the last week, though:

When dealing with a cluster, chances are that some of the disks just won’t be returned to the fold when deleted from available storage. For some reason the cluster sometimes keeps the reservations on some disks, leading to some rather weird error messages. The solution is to use PowerShell to give said reservations the boot, like this:

Clear-ClusterDiskReservation -disk X -force

Speaking of clusters: finding out which node SQL Server is currently running on can be a bit of a challenge from within SQL Server itself. Try this script that I found over at SQLMatters.com:

WITH ClusterActiveNode AS
(
    SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS NodeName, CAST('Active' AS varchar(10)) AS Active
),
ClusterNodes AS
(
    SELECT NodeName FROM sys.dm_os_cluster_nodes
)
SELECT b.NodeName, ISNULL(a.Active, 'Passive') AS NodeStatus
FROM ClusterNodes AS b
LEFT JOIN ClusterActiveNode AS a ON a.NodeName = b.NodeName;

I’ve also thoroughly enjoyed setting up a SQL Server 2008R2 cluster on Windows 2012R2 with mount points. That’s also riddled with some interesting … features. A couple of good links are an mssqltips.com article here, a Microsoft Connect entry here and a blog post from Jacob Moran from 2009 here.

Found a nice script to set up instant file initialization with PowerShell, written by Ingo Karstein:

#written by Ingo Karstein, http://ikarstein.wordpress.com

#  v1.1, 10/12/2012

# modified by Preston Cooper 4-27-2015 to modify "Perform Volume Maintenance Tasks" and added elevation from http://stackoverflow.com/questions/7690994/powershell-running-a-command-as-administrator

## <--- Configure here
$accountToAdd = "DOMAIN\Username"
$pathToFile1 = "c:\temp\secedit.sdb"
$pathToFile2 = "c:\temp\PerformVolumeMaintenanceTasks.txt"

## ---> End of Config

# Get the ID and security principal of the current user account
$myWindowsID = [System.Security.Principal.WindowsIdentity]::GetCurrent();
$myWindowsPrincipal = New-Object System.Security.Principal.WindowsPrincipal($myWindowsID);

# Get the security principal for the administrator role
$adminRole = [System.Security.Principal.WindowsBuiltInRole]::Administrator;

# Check to see if we are currently running as an administrator
if($myWindowsPrincipal.IsInRole($adminRole))
{
    # We are running as an administrator, so change the title and background colour to indicate this
    $Host.UI.RawUI.WindowTitle = $myInvocation.MyCommand.Definition + "(Elevated)";
    $Host.UI.RawUI.BackgroundColor = "DarkBlue";
    Clear-Host;
}else{
    # We are not running as an administrator, so relaunch as administrator

    # Create a new process object that starts PowerShell
    $newProcess = New-Object System.Diagnostics.ProcessStartInfo "PowerShell";

    # Specify the current script path and name as a parameter, with added scope and support for script paths that contain spaces
    $newProcess.Arguments = "& '" + $script:MyInvocation.MyCommand.Path + "'"

    # Indicate that the process should be elevated
    $newProcess.Verb = "runas";

    # Start the new process
    [System.Diagnostics.Process]::Start($newProcess);

    # Exit from the current, unelevated, process
    Exit;
}

And finally a nice script from Eric Humphrey to set SQL Server trace flags with Powershell:

##############################################################################
##
## Add-SqlServerStartupParameter
##
## by Eric Humphrey (http://www.erichumphrey.com/category/powershell/)
##
##############################################################################

<#

.SYNOPSIS

Adds an entry to the startup parameters list for all instances of SQL Server
on a computer

.EXAMPLE

PS >Add-SqlServerStartupParameter '-T3226'

#>

param(
    ## The parameter you wish to add
    [Parameter(Mandatory = $true)]
    $StartupParameter
)

$hklmRootNode = "HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server"

$props = Get-ItemProperty "$hklmRootNode\Instance Names\SQL"
$instances = $props.psobject.properties | ?{$_.Value -like 'MSSQL*'} | select Value

$instances | %{
    $inst = $_.Value;
    $regKey = "$hklmRootNode\$inst\MSSQLServer\Parameters"
    $props = Get-ItemProperty $regKey
    $params = $props.psobject.properties | ?{$_.Name -like 'SQLArg*'} | select Name, Value
    #$params | ft -AutoSize
    $hasFlag = $false
    foreach ($param in $params) {
        if($param.Value -eq $StartupParameter) {
            $hasFlag = $true
            break;
        }
    }
    if (-not $hasFlag) {
        "Adding $StartupParameter"
        $newRegProp = "SQLArg"+($params.Count)
        Set-ItemProperty -Path $regKey -Name $newRegProp -Value $StartupParameter
    } else {
        "$StartupParameter already set"
    }
}

 

Basic settings for SQL Server 2012/2014, part 4

Part 4 – scripts for the previous parts

Below are scripts for most of the SQL Server specific settings I discussed in the previous three parts. Please note that not quite everything is included (trace flags, instant file initialization, etc.).

 

-- Set memory amount
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'max server memory', 40000;
GO
RECONFIGURE;
GO



-- Turn on optimize for adhoc workloads
EXECUTE sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXECUTE sp_configure 'optimize for ad hoc workloads', 1;
GO
RECONFIGURE;
GO
-- Turn on DAC

EXECUTE sp_configure 'remote admin connections', 1;
GO
RECONFIGURE;
GO

-- Adjusting TEMPDB
ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'tempdev', SIZE = 100MB , FILEGROWTH = 50MB);
GO

-- Script to create TEMPDB files like the one existing
USE [master]

go

DECLARE @cpu_count     INT,
        @file_count    INT,
        @logical_name  SYSNAME,
        @file_name     NVARCHAR(520),
        @physical_name NVARCHAR(520),
        @alter_command NVARCHAR(MAX)

SELECT @physical_name = physical_name
FROM   tempdb.sys.database_files
WHERE  name = 'tempdev'

SELECT @file_count = COUNT(*)
FROM   tempdb.sys.database_files
WHERE  type_desc = 'ROWS'

SELECT @cpu_count = 8

WHILE @file_count < @cpu_count
BEGIN
    SELECT @logical_name = 'tempdev' + CAST(@file_count + 1 AS NVARCHAR)

    SELECT @file_name = REPLACE(@physical_name, 'tempdb.mdf', @logical_name + '.ndf')

    SELECT @alter_command = 'ALTER DATABASE [tempdb] ADD FILE ( NAME =N'''
                            + @logical_name + ''', FILENAME =N'''
                            + @file_name
                            + ''', SIZE = 100MB, FILEGROWTH = 50MB )'

    PRINT @alter_command

    EXEC sp_executesql @alter_command

    SELECT @file_count = @file_count + 1
END;



--Create operator DBATeam
USE [msdb]
GO
EXEC msdb.dbo.sp_add_operator @name=N'DBATeam',
@enabled=1,
@pager_days=0,
@email_address=N'foo@bar.com; gazonk@bar.com';
GO
--Add alerts

USE [msdb]
GO
EXEC msdb.dbo.sp_add_alert @name=N'Severity 016',
@message_id=0,
@severity=16,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 016', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 017',
@message_id=0,
@severity=17,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 017', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 018',
@message_id=0,
@severity=18,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 018', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 019',
@message_id=0,
@severity=19,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 019', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 020',
@message_id=0,
@severity=20,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 020', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 021',
@message_id=0,
@severity=21,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 021', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 022',
@message_id=0,
@severity=22,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 022', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 023',
@message_id=0,
@severity=23,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 023', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 024',
@message_id=0,
@severity=24,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 024', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Severity 025',
@message_id=0,
@severity=25,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000';
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 025', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Error Number 823',
@message_id=823,
@severity=0,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000'
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Error Number 823', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Error Number 824',
@message_id=824,
@severity=0,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000'
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Error Number 824', @operator_name=N'DBATeam', @notification_method = 7;
GO

EXEC msdb.dbo.sp_add_alert @name=N'Error Number 825',
@message_id=825,
@severity=0,
@enabled=1,
@delay_between_responses=60,
@include_event_description_in=1,
@job_id=N'00000000-0000-0000-0000-000000000000'
GO

EXEC msdb.dbo.sp_add_notification @alert_name=N'Error Number 825', @operator_name=N'DBATeam', @notification_method = 7;
GO
--Set failsafe operator
USE [msdb]
GO
EXEC master.dbo.sp_MSsetalertinfo @failsafeoperator=N'DBATeam',
@notificationmethod=1;
GO
-- Set database mail profile
USE [msdb]
GO
EXEC msdb.dbo.sp_set_sqlagent_properties @databasemail_profile=N'MailProfile';
GO
-- Set SQLAgent to use database mail
USE [msdb]
GO
EXEC msdb.dbo.sp_set_sqlagent_properties @email_save_in_sent_folder=1,
@use_databasemail=1;
GO
-- Configure parallelism and backup compression settings.

EXEC sys.sp_configure N'show advanced options', N'1';
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC sys.sp_configure N'cost threshold for parallelism', N'50';
GO
EXEC sys.sp_configure N'max degree of parallelism', N'2';
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC sys.sp_configure N'show advanced options', N'0';
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC sys.sp_configure N'backup compression default', N'1';
GO
-- Adjust MODEL
USE [master]
GO
ALTER DATABASE [model] MODIFY FILE ( NAME = N'modeldev', SIZE = 100MB , FILEGROWTH = 50MB);
GO
ALTER DATABASE [model] MODIFY FILE ( NAME = N'modellog', SIZE = 100MB , FILEGROWTH = 50MB);
GO
-- Example of creating the DBADB database and a user for replication

USE [master]
GO

CREATE DATABASE [DBADB]
ON PRIMARY
( NAME = N'DBADB', FILENAME = N'D:\MSSQL\Data\DBADB.mdf' , SIZE = 102400KB , MAXSIZE = UNLIMITED, FILEGROWTH = 51200KB )
LOG ON
( NAME = N'DBADB_log', FILENAME = N'E:\MSSQL\Log\DBADB_log.ldf' , SIZE = 10240KB , MAXSIZE = 2048GB , FILEGROWTH = 51200KB )
GO

USE [master]
GO
CREATE LOGIN [dba_repl] WITH PASSWORD=N'xxxxxxxx', DEFAULT_DATABASE=[DBADB], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO
USE [DBADB]
GO
CREATE USER [dba_repl] FOR LOGIN [dba_repl]
GO
USE [DBADB]
GO
ALTER ROLE [db_owner] ADD MEMBER [dba_repl]
GO

 

Migrating databases, part 2

Having gone through the truly offline ways of transferring data, let’s take a look at log shipping and database mirroring. These are fairly old techniques – both have been around since at least SQL Server 2005. The concept is very simple: every once in a while, data from the transaction log is sent to the secondary site and applied there. Then the cycle restarts, and after a set time a new shipment of transactions occurs.

In the case of log shipping, this always happens asynchronously. There will always be a lag between the primary and the secondary instance – something to be very much aware of.
Okay, so what’s the fuss with this whole log shipping thing? Well, consider a system with 2TB worth of data. Stopping said system, backing up, transferring the backups and then restoring might be difficult due to time constraints. (If you have the time, go for it, as it is basically bulletproof.) With log shipping, you can take a backup on a running system, transfer said backup when convenient, restore it on the secondary – also when convenient – and then prepare for log shipping.
By setting up log shipping, you shrink the final switchover to a very small window. Say, for instance, that you set your log shipping job to execute every ten minutes – then you only ever have a maximum of ten minutes’ worth of data to back up, transfer and restore on the secondary site, shortening the switchover time considerably. It works fine in all editions of SQL Server. Sounds easy? Well, it is. With a few gotchas.

First and foremost, as with all the techniques described in this and the previous blog post: you have to manually transfer the logins and users. If you don’t, things will be somewhat difficult in a switchover scenario – difficult in the sense that nobody will be able to access the database. Bummer.
Second, there is no automatic failover to the secondary site. You can script a failover, but there is no automatic provision for doing it with SQL Server. If you need automatic failovers, then this is the wrong feature to use.
Third, there is no provision for client transfer. You need to take care of getting the client to the server by yourself, and my top tip here is to use DNS pointers – this way the client won’t need to be updated to point to the new server, as this is taken care of with DNS. Just remember to keep the TTL down…

One thing that makes log shipping vastly superior to, for instance, Data Guard is the fact that it’s quite possible to use log shipping to migrate to a newer version of SQL Server. This can potentially shorten the time for an upgrade by quite a bit, and it simplifies testing as well.

Brent (as usual) has a great FAQ about log shipping: http://www.brentozar.com/archive/2013/03/log-shipping-faq/

In order to shorten the switchover time even more, there is something else called database mirroring. It appeared back in SQL Server 2005 and was marked as deprecated in SQL Server 2012. It still works, but, well, don’t use it for production stuff. Under the hood it’s a bit different from log shipping; where log shipping uses a basic backup-and-copy to get the files to the secondary, mirroring uses TCP endpoints and individual transaction log records. This means that it’s possible to have the log data transferred synchronously to the secondary – in fact, that’s the ONLY way to do it on Standard Edition. With Enterprise Edition you may have an asynchronous mirror, but on Standard you’re stuck with synchronous.

Mirroring has a longer list of requirements than log shipping (TCP endpoints, keys, certificates if used, database name, etc.) to keep track of, but generally speaking it’s more a matter of “different” requirements than “more difficult” ones.
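To give a feel for the moving parts, here is a minimal mirroring sketch. Server names, database name and port are placeholders, and the database must first be restored on the mirror WITH NORECOVERY:

-- On both partners: create a mirroring endpoint
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (ROLE = PARTNER);
GO

-- On the mirror, point to the principal...
ALTER DATABASE [MyDB] SET PARTNER = 'TCP://principal.contoso.com:5022';
GO
-- ...then on the principal, point to the mirror
ALTER DATABASE [MyDB] SET PARTNER = 'TCP://mirror.contoso.com:5022';
GO

Depending on the service accounts in play, you may also need to grant CONNECT on the endpoints or set up certificates.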

Practical tips

There is an old but still very relevant blog post about the differences here: https://nilebride.wordpress.com/2011/07/24/log-shipping-vs-mirroring-vs-replication/

Log shipping step by step:
http://www.mssqltips.com/sqlservertip/2301/step-by-step-sql-server-log-shipping/

Database mirroring step by step:
http://www.mssqltips.com/sqlservertip/2464/configure-sql-server-database-mirroring-using-ssms/

Migrating databases, part 1

Migrating databases

Most of us poor souls who for some reason enjoy this work called “database administration” get to meet “Mr. Migration” from time to time. I’ve lost count of the number of customers in need of moving one or more (most often A LOT) of databases from one server to another. The reasons for this are as many as the databases themselves: licensing issues, new underlying design, performance problem alleviation or frankly anything under the sun. Since the ball usually lands squarely in our court when it comes to trying to please all members of the project – be it the end users who don’t want any downtime, the project management that doesn’t want any risk or the sysadmins who don’t want to do any work ( 😀 ) – I decided to write down my different takes on the issue. Basically there are four ways to transfer data, each with its own specific challenges. This is part 1, covering detach/attach and backup and recovery.

As always when doing I/O in SQL Server, it is well worth mentioning alignment and NTFS allocation size. By default, Windows formats a disk with an NTFS cluster size of 4KB. Since SQL Server generally works with a 64KB block size (8 pages of 8KB each), this means an increased load on the I/O system – up to 16(!) I/O operations per SQL Server block retrieved. This can lead to serious performance degradation before we’re even out of the gate, so the best practice is to set this to 64KB. Trouble is, you have to reformat the drive to do it…
Alignment used to be a much larger problem than it is today, but still rears its ugly head from time to time. Windows 2008 and above (usually) takes care of this by itself, but always double check your alignment.
Checking NTFS allocation size is done by putting the below into an administrator-level command prompt, changing the drive letter:

fsutil fsinfo ntfsinfo e:

and looking for Bytes per Cluster.

Checking the alignment can be done by putting the below into an administrator-level command prompt, looking for DeviceID and StartingOffset:

wmic partition list full

Detach/attach

The simplest migration technique I know of is probably an old-fashioned detach-copy-attach. First of all, let’s get one thing out of the way: due to the way SQL Server is designed, most of the time the logins are not included in the backup/database file copies. (I say most of the time, since SQL Server 2012 and onwards provides for a contained database, where the users don’t depend on server-level logins.) This lack of logins is a problem associated with every type of migration I’ll go through here, so keep it in mind.

Detaching a database means excising it from the master database of the source server, copying the actual database and transaction log files to the destination and plugging them into the new master database. This means no extra time for backing up or restoring the data itself, and in the case of a move, no need for extra disk space. The detach and attach operations take roughly the same time regardless of database size. In my opinion, that’s where the appeal ends. Consider the scenario where the database detaches fine but for some reason won’t attach. Yes, it can happen, even if it is very rare, and no, it’s neither a great time to troubleshoot, nor is it very good for your blood pressure. I’ve been a DBA for quite some time, and I always strive to keep everything redundant. In the detach/attach scenario, I have the potential to saw off the branch of the tree I’m sitting on, and most of the time I don’t enjoy falling unceremoniously to the ground, as it does nothing for my image.
Also consider that the entire files are copied – even if there is 99% free space. Nothing is done to the data/transaction log files, so a top tip is to make sure that you’ve done your housekeeping well before embarking on the detach/attach journey.
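For reference, the whole dance boils down to a couple of statements – a minimal sketch, with an assumed database name and file paths:

USE [master];
GO
EXEC sp_detach_db @dbname = N'MyDB';
GO
-- ...copy MyDB.mdf and MyDB_log.ldf to the destination server, then on the destination:
CREATE DATABASE [MyDB]
ON (FILENAME = N'D:\MSSQL\Data\MyDB.mdf'),
   (FILENAME = N'E:\MSSQL\Log\MyDB_log.ldf')
FOR ATTACH;
GO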
Onward to a more elegant migration path instead!

Backup/recovery

Another solution with very low risk is a simple backup and recovery. The database(s) are backed up on the source system, transferred over to the destination and restored. There are a few issues that might sink this migration path, though – downtime and the need for a lot of disk space come to mind. The added disk space requirement can be somewhat alleviated using SQL Server backup compression, available from SQL Server 2008 Enterprise Edition and SQL Server 2008R2 Standard Edition. It is turned off by default, but can yield surprisingly good results if turned on. Try it out and see what the result is for your specific data.
The downtime issue is a bit more difficult to handle. There will be downtime – from the moment you start the backup, through the transfer to the destination host and the restore. There are ways to shorten the backup/restore time, though. Before starting to fiddle with the knobs, we first need to find out what settings are used by default. This is easily done via two trace flags:

USE master
GO
DBCC TRACEON (3605, -1) 
DBCC TRACEON (3213, -1) 
GO

These flags will log the exact backup settings used to the SQL Server log file. Armed with the information about what settings are used at the moment, let’s see how we can increase the backup speed.

1. Use compression – this shortens the backup time and the restore time. The actual savings depend entirely on what kind of data you have and how compressible it is. But do try compression as it generally rocks.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION
GO

2. Set BUFFERCOUNT – this specifies the total number of I/O buffers that will be used for the backup operation. The total space used by the backup buffers is determined by: buffercount * maxtransfersize. Increasing BUFFERCOUNT can significantly increase your backup performance, but keep in mind that you are using up memory in the SQLSERVR.EXE process. As always, test this out before throwing it in production.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400
GO

3. Set MAXTRANSFERSIZE – this setting specifies the largest unit of transfer used between SQL Server and the backup media.
The possible values are multiples of 64 KB ranging up to 4194304 bytes (4 MB). The default is 1 MB. This is also displayed in the log with the trace flags above.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152
GO

4. Set BLOCKSIZE – this setting specifies the physical block size for the backup files, in bytes. The trouble is that it defaults to 512 bytes for everything that isn’t a tape…

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152, BLOCKSIZE=65536
GO

5. Increase the number of backup files. This does complicate the restore somewhat, but the rewards vastly compensate for this. As long as your disk subsystem can keep up (especially if you’re running SSDs), this setting alone will give you a serious performance boost.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest1.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest2.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest3.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest4.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152, BLOCKSIZE=65536
GO

The only way to increase the transfer speed between the hosts that I know of is to transfer less data – this is where compression comes in. Depending on the data, I’ve seen between 30% and 80% reduction in backup file size. Less data to transfer, and a faster restore thanks to compression.

Combining all the ideas above with a differential backup solution is another interesting way to shorten the actual downtime. Let’s assume, for the sake of the argument, that the database is 1TB and it takes 1 hour to back up, transfer and restore. An idea is to first do a full backup, then transfer and restore it without recovering the database – this is done without downtime, whenever it is convenient for everybody involved. When it is time for the actual migration, stop the application, do a differential backup, transfer that and restore it. Depending on the transaction volume this will most likely be a much smaller backup file, meaning that the entire operation can be done much faster. This is nothing new in any way, but I’m just pointing it out so no one forgets the old-school stuff.
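A sketch of that flow, with assumed names and paths (add MOVE clauses to the restore if the file layout differs on the destination):

-- Ahead of time, on the source:
BACKUP DATABASE [MyDB] TO DISK = N'S:\MSSQL\Backup\MyDB_full.bak' WITH INIT, COMPRESSION;
-- Ahead of time, on the destination (leave the database restoring):
RESTORE DATABASE [MyDB] FROM DISK = N'S:\MSSQL\Backup\MyDB_full.bak' WITH NORECOVERY;
-- At migration time, after stopping the application, on the source:
BACKUP DATABASE [MyDB] TO DISK = N'S:\MSSQL\Backup\MyDB_diff.bak' WITH DIFFERENTIAL, INIT, COMPRESSION;
-- ...transfer the (much smaller) differential and finish on the destination:
RESTORE DATABASE [MyDB] FROM DISK = N'S:\MSSQL\Backup\MyDB_diff.bak' WITH RECOVERY;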

While we’re on the subject of differential backups, what if there was a way to keep the destination host in sync or similar? Well, that’s the subject of the next post!

Basic settings for SQL Server 2012/2014, part 3

Part 3: simplifying the life of the DBA

When I’ve got the environment set up quite the way I like it, I usually set up a repository where I keep all my scripts. I designate one server as the “master” for my purposes, and set up a database called DBADB. In here I put all my standard scripts (like the bunch from Brent Ozar & co., SP_WHOISACTIVE from Adam Machanic, the log table for Ola Hallengren’s maintenance solution and such) to keep track of them, and to have only one place to update.

In my standard setup scripts for new databases, I also create a local DBADB as well as a local user for replication. When this is done, I enable replication between the “master” server and the new server, and presto! I’ve got a) a central repository for scripts and b) all the scripts available at all my servers.

Set up a good index maintenance solution, like either Ola Hallengren’s script suite or Minion Reindex from Sean McCown of MidnightDBA fame. These run circles around maintenance plans and do a much better job of keeping your database server in good working order.

Ola’s scripts here, and the Minion Reindex scripts here.

Create DNS aliases for each and every database connection there is. After making sure that each and every system is accessing the DNS alias instead of the actual server name, I’ve simplified my life tremendously should I ever have to move a database to another server. A good example of this was a customer of mine who had to move a whole lot of databases from an old server to a new one. Since we had the DNS alias solution in place, we only had to stop the databases, move them, change the DNS pointer and restart – we knew for a fact that we did not have to change anything in the client connection strings.
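Creating the alias itself is a one-liner if the DnsServer PowerShell module is available – alias, host and zone below are made up, and the command is run on (or against) a DNS server:

Add-DnsServerResourceRecordCName -Name "sqlprod01" -HostNameAlias "physicalserver42.contoso.com" -ZoneName "contoso.com" -TimeToLive 00:05:00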

Document everything. Yes, this falls squarely under the category “simplifying the life of the DBA”. Good documentation simplifies your life. Every. Single. Time. Keep a good record of what jobs run when on which servers. Document which users have sysadmin privileges and why. Make sure that any deviations from the standard disk setup are meticulously written down. Catalog what install options you’ve chosen for each server. The list can go on and on, but I think you get the point.

Add operators and alerts, and be proactive. There are several problems that can be proactively monitored by SQL Server itself. By creating operators and connecting them to e-mail addresses or pager alerts, it is possible to wake people up in the middle of the night if something goes wrong, as opposed to remaining completely oblivious to the fact that the server is slowly sinking into the depths. With alerts, whenever SQL Server logs an error of a specific severity, the alert is triggered and something interesting happens in the life of the DBA.

Brent Ozar has a script to create the alerts here: http://www.brentozar.com/blitz/configure-sql-server-alerts/ Also remember that when creating the operator, it is quite possible to specify several e-mail addresses if you just separate them with a semicolon.

Set up a failsafe operator – this is basically the go-to guy if no-one else can be found to harass. Take a look at this post on MSDN: https://technet.microsoft.com/en-us/magazine/gg313743.aspx

Change the settings on the model database to some sensible values. Increasing a database file by 1MB every time it becomes full is not only dumb but also very wasteful. The problem becomes even bigger with transaction logs and virtual log files (see this YouTube link for more info – https://youtu.be/lcmYeE-cqQo). The solution is to change the settings for the model database to something sensible (a 50-100MB size increase is a good starting point – adapt to your specific needs!) so that every database that gets created at least has sensible growth settings.

These are the things I do every time and at every client. Nothing is new, nothing is exotic, it’s just hard, consistent work. And despite it being nothing new, it can’t be said often enough…

Basic settings for SQL Server 2012/2014, part 2

Updated: 2016-11-07

Part 2: settings inside SQL Server

When the server is all good and installed, it’s time to go through and set up a bunch of server settings. First of all – the dedicated admin connection (DAC). This is used as a “get out of jail free” card for a DBA if all the schedulers are cranky and won’t accept connections for one reason or another. Using the DAC, it might be possible to sneak in through the “back door” and solve whatever problem is plaguing the system without having to go through the hassle of a reboot/service restart. Turn. It. On. Period.

https://msdn.microsoft.com/en-us/library/ms190468.aspx
https://msdn.microsoft.com/en-us/library/ms189595.aspx

Optimize for ad hoc workloads was introduced way, way back to minimize the impact of ad hoc plans in the plan cache. Simply put, it works by not storing a complete plan in the cache until it is reused at least once. Instead a “stub” is stored, and less memory is consumed by plans that are never reused. Having read the same recommendation from several sources, I always turn this on – BUT – keep an eye out for surprises in the plan cache.

Kimberly L. Tripp lays it all out here: http://www.sqlskills.com/blogs/kimberly/plan-cache-and-optimizing-for-adhoc-workloads/ and her post links to a gazillion other posts.

Memory, then. The default setting for MIN_SERVER_MEMORY is 0 and MAX_SERVER_MEMORY is 2147483647 MB. Yes. The technical term for that amount is “a lot”. It’s a really bad idea to leave this setting as is, as SQL Server is a bit like my late lawnmower – it will eat EVERYTHING in sight. This comes with some consequences for wildlife, geography and neighbours, or in this case – other consumers of memory. There are several ways to come up with a number for MAX_SERVER_MEMORY, but a simple rule of thumb is to leave the operating system 10% of total memory or 4GB, whichever is greater, and give the rest to SQL Server. Suddenly it wasn’t so great to give the server 6GB of RAM, riiight?
NOTE: the previous number is ONLY applicable if you have ONLY one instance on the server and NO other consumers (Fulltext daemon, reporting/analysis/integration services, etc.)

http://www.brentozar.com/archive/2012/11/how-to-set-sql-server-max-memory-for-vmware/
Somewhat older but still very much sound:
https://sqlserverperformance.wordpress.com/2009/10/29/suggested-max-memory-settings-for-sql-server-20052008/

TEMPDB has been a hot topic for contention for many years. I’ve read everything between 2-8 files depending on the number of cores, and then tuning upwards if you run into problems. I make it simple: on a SAN with lots of spindles or SSDs, 8 files and be done with it. Too many files can actually hurt performance if you have too few spindles (but then again – everything hurts with too few spindles).

Paul Randal lays out a lot of information here: http://www.sqlskills.com/blogs/paul/correctly-adding-data-files-tempdb/ and http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-1230-tempdb-should-always-have-one-data-file-per-processor-core/
This KB is also relevant: https://support.microsoft.com/kb/2154845

Lock pages in memory – perhaps the closest thing to a religious argument the community has ever had. I fall into the “on the fence” category of DBAs. I personally don’t use it most of the time, but it’s nice to know that it is there should I need it.

Jonathan Kehayias has a great blog post with both sides of the argument and lots of interesting information: https://www.simple-talk.com/sql/database-administration/great-sql-server-debates-lock-pages-in-memory/

Backup compression is another no-brainer. Or is it? The prevailing opinion is to turn it on, as the servers out there usually have a lot more CPU than I/O resources. That is undoubtedly the case, but keep in mind the environment outside the database. If you’re running a compressing backup solution like TSM, NetBackup or similar, the compression and subsequent deduplication of that system will suffer greatly. In essence: make sure you only compress your backups in one place, and if at all possible do it at the SQL Server.

Paul Randal again has a great post on the subject: http://www.sqlskills.com/blogs/paul/sql-server-2008-backup-compression-cpu-cost/

Maximum degree of parallelism will hurt you out of the box. There is no established consensus, but my opinion is this: most of the OLTP environments out there generally don’t benefit from parallelism – quite the contrary. I’ve found that more often than not the database spends more time reassembling the query results (seen as a CXPACKET wait) than the query would have taken just burning one CPU. That’s why I generally set MAXDOP to 2 for 1-4 CPUs and MAYBE 2-4 for 4-8 CPUs. Then I keep an eye on the cache and tune accordingly. It is beyond me why the default is 0 (as in PARALLELIZE OVER EVERY CPU THERE IS!).

This goes hand in hand with cost threshold for parallelism. It defaults to 5, which is a ridiculously low number and results in SQL Server trying to parallelize EVERYTHING. Combine that with the above MAXDOP of 0 and, well, all hell is bound to break loose. Again, there is no “right” way to do it, but I start out with 50 as a baseline and tune up or down accordingly. In OLTP workloads, expensive queries are generally a bad idea anyway.

Trace flags are usually something to be wary of, but there are a few that are a downright good idea to use.
3226 for instance, suppresses the backup success messages in the SQL Server log. There’s enough information in there already, so a little less stuff to wade through is only a good thing.
http://msdn.microsoft.com/en-us/library/ms188396.aspx

1118 tells the server to avoid mixed extents and give each new object its own 64KB of allocated data.
1117 tells the server to grow each file in a filegroup equally. This can come in handy to avoid hotspots as a result of uneven file growth. Both these flags are documented in http://support.microsoft.com/kb/2154845/en-us

2371 modifies the behavior of statistics auto update – very useful on large tables.
http://blogs.msdn.com/b/saponsqlserver/archive/2011/09/07/changes-to-automatic-update-statistics-in-sql-server-traceflag-2371.aspx

2562 and 2549 pertains to improvements for DBCC CHECKDB.
https://support.microsoft.com/en-us/kb/2634571
4199 enables non-default query processor enhancements and while IT SHOULD BE TESTED, I’ve found it to be quite safe to use.
https://support.microsoft.com/en-us/kb/974006


Basic settings for SQL Server 2012/2014, part 1

Whenever I set up an SQL Server I usually do the same things. I’ve collected a list of best practice settings over the years (and most often put them in a script, as most of my friends are scripts). Throughout a few blog posts I’ll go over the settings I use and discuss a bit about why I’ve chosen to set the parameters the way I do.

Part 1: configuring the server

There are a few hard rules when it comes to setting up the environment for SQL Server. The first two applies to storage:

Always make sure you align your partitions with the underlying stripe size. Windows 2008 and above (usually) takes care of this by itself, but always double check your alignment. Failure to align the partition can lead to significant performance degradation.

Format the partitions where you will store SQL Server related files with an NTFS allocation unit size of 64KB; a formatting example follows after the links below. While SQL Server can do I/O in any size from 512 bytes to 8MB, a larger NTFS allocation size is beneficial in almost all circumstances. (In some cases with Analysis Services, 32KB might yield better results. See the link below for further information.)

More information:
https://technet.microsoft.com/en-us/library/dd758814%28v=sql.100%29.aspx
https://blog.purestorage.com/what-is-sql-servers-io-block-size/
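When (re)formatting a data drive, something along these lines does the trick from an elevated command prompt – drive letter and label are placeholders:

format F: /FS:NTFS /A:64K /V:SQLDATA /Q

Verify the result afterwards with fsutil fsinfo ntfsinfo F: and look for Bytes per Cluster.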

SQL Server loves memory, and in my opinion you can’t give it too much. Many of my clients have a tendency to give it way too little (6-8GB) where double that amount might be a good starting point. This all depends (as always) on the workload and the requirements, but remember – memory transfer is fast, disk transfer is slow – regardless of whether you’re sporting all SSDs.
Jeremiah Peschka has a great blog post that summarises my thinking:

http://www.brentozar.com/archive/2014/05/much-memory-sql-server-need/

Allen McGuire wrote a great set of scripts to harvest and report on SQL Server IO that can be useful:
http://allen-mcguire.blogspot.se/2014/03/harvest-and-report-on-sql-server-io.html

CPU choice is another interesting topic. Remember a few years ago when it was all about the MHz? Well, these days it’s all about cores, and since the licenses are purchased per core, well, that’s not necessarily a good idea. Parallelizing is a difficult concept, and depending on your workload it might be more or less tricky to parallelize said workload. In a worst-case scenario with, say, SharePoint – where you are forced to set Max Degree of Parallelism – a CPU with many slow cores will give you worse performance than a CPU with fewer fast cores.
In general, I prefer few fast cores over many slow ones. Most of my clients’ workloads benefit more from this choice.

Glenn Berry lays it out here: http://sqlperformance.com/2014/01/system-configuration/selecting-a-processor-for-sql-server-2014-1

Instant File Initialization is something that should always be turned on. If it is not, then every time a file is created it needs to be filled with zeroes. Not zeroes like myself, but actual zero bits. This means a huge hit on the storage system whenever a file is created. With instant file initialization, the pointer is created without the need for zeroing the actual data. This means instant file creation and no I/O effects. This will affect restores too…
Sure, the blocks are not zeroed and hence are vulnerable to reading with a hex editor. But as Kevin Boles (TheSQLGuru) put it in a presentation for SQL Server User Group Sweden: “if someone is reading your SQL Server data files with a hex editor, you are already pwnd”.

Check this blog post from the premier field engineer team at Microsoft: http://blogs.msdn.com/b/sql_pfe_blog/archive/2009/12/23/how-and-why-to-enable-instant-file-initialization.aspx

Power settings have a funny way of being ignored. The bad thing is that Windows 2012 has a habit of setting the power plan to “Balanced”, meaning that you won’t get the maximum performance from your server. Set the power plan to “High performance” and keep an eye on it from time to time – this has been known to “magically” change itself.
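Checking and fixing this from an elevated command prompt is quick – the GUID below is the well-known identifier for the built-in High performance plan:

powercfg /getactivescheme
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c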