Migrating databases, part 2

Having gone through the truly offline ways of transferring data, let’s take a look at log shipping and database mirroring. These are fairly old techniques – log shipping has been around since SQL Server 2000 and database mirroring since SQL Server 2005. The concept is very simple: every once in a while, data from the transaction log is sent to the secondary site and applied there. Then the cycle restarts, and after a set time a new shipment of transactions occurs.

In the case of log shipping, this always happens asynchronously. There will always be a lag between the primary and the secondary instance, something to be very much aware of.
Okay, so what’s the fuss with this whole log shipping thing? Well, consider a system with 2 TB worth of data. Stopping said system, backing it up, transferring the backups and then restoring them might be difficult due to time constraints. (If you have the time, go for it, as it is basically bulletproof.) With log shipping, you can take a backup on a running system, transfer said backup when convenient, restore it on the secondary (also when convenient) and then set up log shipping.
By setting up log shipping, you shrink the final switchover to a very small window. Say, for instance, that you set your log shipping job to execute every ten minutes; then you only ever have a maximum of ten minutes’ worth of data to back up, transfer and restore on the secondary site, shortening the switchover time considerably. It works fine in every edition except Express. Sounds easy? Well, it is. With a few gotchas.
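Under the hood, every cycle boils down to a log backup on the primary, a file copy, and a log restore on the secondary. Here is a minimal sketch, assuming a hypothetical database called MigrateMe and a file share both servers can reach; the built-in log shipping jobs do all of this for you, this is just to show the moving parts:

-- On the primary: back up the transaction log to the share
BACKUP LOG [MigrateMe]
TO DISK = N'\\fileshare\logship\MigrateMe_001.trn'
WITH INIT, COMPRESSION;

-- On the secondary: apply the backup, keeping the database in a restoring state
RESTORE LOG [MigrateMe]
FROM DISK = N'\\fileshare\logship\MigrateMe_001.trn'
WITH NORECOVERY;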

First and foremost, as with all the techniques described in this and the previous blog post: you have to manually transfer the logins and users. If you don’t, things will be somewhat difficult in a switchover scenario – difficult in the sense that nobody will be able to access the database. Bummer.
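There are ready-made scripts for carrying logins across (Microsoft’s sp_help_revlogin being the classic, recreating logins with their original SIDs and password hashes). As a small sketch of the cleanup you otherwise end up doing after a switchover, with made-up login and database names:

-- Remap an orphaned database user to the login with the same name on the new server
USE [MigrateMe]
GO
ALTER USER [app_user] WITH LOGIN = [app_user]
GO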
Second, there is no automatic failover to the secondary site. You can script a failover, but there is no automatic provision for doing it in SQL Server. If you need automatic failover, this is the wrong feature to use.
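A scripted failover is, in essence, a tail-log backup on the primary followed by a final restore on the secondary. A hedged sketch with the same made-up names as above:

-- On the primary (if still reachable): back up the tail of the log and leave
-- the old primary in a restoring state so it can become the new secondary
BACKUP LOG [MigrateMe]
TO DISK = N'\\fileshare\logship\MigrateMe_tail.trn'
WITH NORECOVERY;

-- On the secondary: apply the tail, then bring the database online
RESTORE LOG [MigrateMe]
FROM DISK = N'\\fileshare\logship\MigrateMe_tail.trn'
WITH NORECOVERY;

RESTORE DATABASE [MigrateMe] WITH RECOVERY;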
Third, there is no provision for client redirection. You need to take care of getting the clients to the new server yourself, and my top tip here is to use DNS pointers – this way the clients won’t need to be updated to point to the new server, as this is taken care of in DNS. Just remember to keep the TTL down…

One thing that makes log shipping vastly superior to, for instance, Data Guard, is the fact that it’s quite possible to use log shipping to migrate to a newer version of SQL Server. This can potentially shorten an upgrade considerably, and it simplifies testing as well.

Brent (as usual) has a great FAQ about log shipping: http://www.brentozar.com/archive/2013/03/log-shipping-faq/

In order to shorten the switchover time even more, there is something else called database mirroring. It appeared in SQL Server 2005 and was marked for deprecation in SQL Server 2012. It still works, but, well, don’t use it for production stuff. Under the hood it’s a bit different from log shipping; where log shipping uses a basic backup-and-copy to get the files to the secondary, mirroring uses TCP endpoints and sends individual transaction log records. This means that it’s possible to have the log data transferred synchronously to the secondary – in fact, that’s the ONLY way to do it on Standard Edition. With Enterprise Edition you may have an asynchronous mirror, but on Standard you’re stuck with synchronous.

Mirroring has a longer list of requirements than log shipping (TCP endpoints, keys, certificates if used, database name, etc.) to keep track of, but generally speaking, it’s more a case of “different” requirements than “more difficult” ones.
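As a rough sketch of the moving parts (server names, port and database name are made up, and this assumes the instances trust each other via Windows authentication rather than certificates):

-- On both partners: create a database mirroring endpoint (the port is just an example)
CREATE ENDPOINT [Mirroring]
STATE = STARTED
AS TCP (LISTENER_PORT = 5022)
FOR DATABASE_MIRRORING (ROLE = PARTNER)
GO

-- On the mirror (database restored WITH NORECOVERY): point it at the principal
ALTER DATABASE [MigrateMe] SET PARTNER = 'TCP://primary.contoso.com:5022'
GO

-- On the principal: point it at the mirror
-- SAFETY FULL (synchronous) is the default, and the only mode on Standard Edition
ALTER DATABASE [MigrateMe] SET PARTNER = 'TCP://secondary.contoso.com:5022'
GO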

Practical tips

There is an old but still very relevant blog post about the differences here: https://nilebride.wordpress.com/2011/07/24/log-shipping-vs-mirroring-vs-replication/

Log shipping step by step:
http://www.mssqltips.com/sqlservertip/2301/step-by-step-sql-server-log-shipping/

Database mirroring step by step:
http://www.mssqltips.com/sqlservertip/2464/configure-sql-server-database-mirroring-using-ssms/

Migrating databases, part 1

Migrating databases

Most of us poor souls that for some reason enjoy this work called “database administration” get to meet “Mr. Migration” from time to time. I’ve lost count of the number of customers in need of moving one or more (most often A LOT) of databases from one server to another. The reasons for this are as many as the databases themselves: licensing issues, a new underlying design, alleviating performance problems, or frankly anything under the sun. Since the ball usually lands squarely in our court when it comes to trying to please all members of the project – be it the end users that don’t want any downtime, the project management that doesn’t want any risk or the sysadmins that don’t want to do any work ( 😀 ) – I decided to write down my different takes on the issue. Basically there are four ways to transfer data, each with its own specific challenges. This is part 1, covering detach/attach and backup and recovery.

As always when doing I/O in SQL Server, it is well worth mentioning alignment and NTFS allocation unit size. By default, Windows formats a disk with an NTFS allocation unit size of 4 KB. Since SQL Server generally reads and writes in 64 KB extents (8 pages of 8 KB each), this means an increased load on the I/O system – up to 16(!) I/O operations per extent retrieved. This can lead to serious performance degradation even before we’re out of the gate, so the best practice is to set this to 64 KB. Trouble is, you have to reformat the drive to do it…
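If you do get the chance to reformat (or are setting up a new drive for the destination server), the 64 KB allocation unit size can be set from an administrator-level command prompt; the drive letter and volume label below are just placeholders:

format E: /FS:NTFS /A:64K /Q /V:SQLData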
Alignment used to be a much larger problem than it is today, but it still rears its ugly head from time to time. Windows Server 2008 and above (usually) take care of this by themselves, but always double-check your alignment.
Checking NTFS allocation size is done by putting the below into an administrator-level command prompt, changing the drive letter:

fsutil fsinfo ntfsinfo e:

and looking for Bytes per Cluster.

Checking the alignment can be done by putting the below into an administrator-level command prompt, looking for DeviceID and StartingOffset:

wmic partition list full

Detach/attach

The simplest migration technique I know of is probably an old-fashioned detach-copy-attach. First of all, let’s get one thing out of the way: due to the way SQL Server is designed, most of the time the logins are not included in the backups or database file copies. (I say most of the time, since SQL Server 2012 and onwards provide for “contained” databases, which can hold their own users with passwords.) This lack of logins is a problem associated with every type of migration I’ll go through here, so keep it in mind.
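For completeness, a hedged sketch of what the contained database route looks like on SQL Server 2012 and later (database and user names are made up for the example):

-- Allow contained databases on the instance
EXEC sp_configure 'contained database authentication', 1
RECONFIGURE
GO

-- Switch an existing database to partial containment
ALTER DATABASE [MigrateMe] SET CONTAINMENT = PARTIAL
GO

-- Create a user that authenticates at the database level, no server login required
USE [MigrateMe]
GO
CREATE USER [app_user] WITH PASSWORD = N'S0meStr0ngPassword!'
GO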

Detaching a database means excising it from the master database of the source server, copying the actual database and transaction log files to the destination, and plugging them into the new master database. This means no extra time spent backing up or restoring the data itself, and in the case of a move, no need for extra disk space. The detach and attach operations take roughly the same time regardless of database size. In my opinion, that’s where the appeal ends. Consider the scenario where the database detaches fine but for some reason won’t attach. Yes, it can happen, even if it is very rare, and no, it’s neither a great time to troubleshoot nor very good for your blood pressure. I’ve been a DBA for quite some time, and I always strive to keep everything redundant. In the detach/attach scenario, I have the potential to saw off the branch I’m sitting on, and most of the time I don’t enjoy falling unceremoniously to the ground, as this does nothing for my image.
Also consider that the entire files are copied, even if they are 99% free space. Nothing is done to the data or transaction log files themselves, so a top tip is to make sure you’ve done your housekeeping well before embarking on the detach/attach journey.
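For reference, a minimal sketch of the T-SQL involved, assuming a database called MigrateMe with a single data file and log file (names and paths are made up):

-- On the source: detach the database (it disappears from the instance)
USE master
GO
EXEC sp_detach_db @dbname = N'MigrateMe', @skipchecks = N'true'
GO

-- ...copy the .mdf and .ldf files to the destination server...

-- On the destination: attach the copied files
CREATE DATABASE [MigrateMe]
ON (FILENAME = N'D:\MSSQL\Data\MigrateMe.mdf'),
(FILENAME = N'L:\MSSQL\Log\MigrateMe_log.ldf')
FOR ATTACH
GO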
Onward to a more elegant migration path instead!

Backup/recovery

Another solution with very low risk is a simple backup and recovery. The database(s) are backed up on the source system, transferred over to the destination and restored. There are a few issues that might sink this migration path, though; downtime and the need for a lot of extra disk space come to mind. The added disk space can be somewhat alleviated using SQL Server backup compression, available from SQL Server 2008 Enterprise Edition and SQL Server 2008 R2 Standard Edition. It is turned off by default, but can yield surprisingly good results if turned on. Try it out and see what the result is for your specific data.
The downtime issue is a bit more difficult to handle. There will be downtime, from the moment you start the backup, through the transfer to the destination host, until the restore finishes. There are ways to shorten the backup/restore time though, but before starting to fiddle with the knobs, we first need to find out what settings are used by default. This is easily done via two trace flags:

USE master
GO
DBCC TRACEON (3605, -1) -- 3605: send trace output to the SQL Server error log
DBCC TRACEON (3213, -1) -- 3213: log backup/restore configuration (buffer count, transfer size etc.)
GO

These flags will log the exact backup settings used to the SQL Server log file. Armed with the information about what settings are used at the moment, let’s see how we can increase the backup speed.

1. Use compression – this shortens the backup time and the restore time. The actual savings depend entirely on what kind of data you have and how compressible it is. But do try compression as it generally rocks.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION
GO

2. Set BUFFERCOUNT – this specifies the total number of I/O buffers that will be used for the backup operation. The total space used by the backup buffers is determined by: buffercount * maxtransfersize. Increasing BUFFERCOUNT can significantly increase your backup performance, but keep in mind that you are using up memory in the SQLSERVR.EXE process. As always, test this out before throwing it in production.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400
GO

3. Set MAXTRANSFERSIZE – this setting specifies the largest unit of transfer used between SQL Server and the backup media.
The possible values are multiples of 64 KB ranging up to 4194304 bytes (4 MB). The default is 1 MB. This is also displayed in the log with the trace flags above.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152
GO

4. Set BLOCKSIZE – this setting specifies the physical block size for the backup files, in bytes. The trouble is that it defaults to 512 bytes for everything that isn’t a tape…

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152, BLOCKSIZE=65536
GO

5. Increase the number of backup files. This does complicate the restore somewhat, but the rewards vastly compensate for this. As long as your disk subsystem can keep up (especially if you’re running SSDs), this setting alone will give you a serious performance boost.

BACKUP DATABASE [BackupPerfTest] 
TO 
DISK = N'S:\MSSQL\Backup\BackupPerfTest1.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest2.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest3.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest4.bak'
WITH FORMAT, INIT, MEDIANAME = N'BackupTest', NAME = N'BackupTestFull', SKIP, NOREWIND, 
NOUNLOAD, STATS = 1, COMPRESSION, BUFFERCOUNT=400, MAXTRANSFERSIZE=2097152, BLOCKSIZE=65536
GO
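For completeness, the restore side of a striped backup; every file in the set has to be listed. A hedged sketch reusing the same made-up paths (add MOVE clauses if the data and log file paths differ on the destination):

RESTORE DATABASE [BackupPerfTest]
FROM
DISK = N'S:\MSSQL\Backup\BackupPerfTest1.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest2.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest3.bak',
DISK = N'S:\MSSQL\Backup\BackupPerfTest4.bak'
WITH REPLACE, STATS = 10
GO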

The only way I know of to increase the transfer speed between the hosts is to transfer less data, and this is where compression comes in. Depending on the data, I’ve seen between 30% and 80% reduction in backup file size. Less data to transfer, and a faster restore thanks to compression.

Combining all the ideas above with a differential backup solution is another interesting way to shorten the actual downtime. Let’s assume, for the sake of the argument, that the database is 1 TB and it takes 1 hour to back up, transfer and restore. An idea is to first do a full backup, then transfer and restore it without recovering the database; this is done without downtime, whenever it is convenient for everybody involved. When it is time for the actual migration, stop the application, take a differential backup, transfer that and restore it. Depending on the transaction volume, this will most likely be a much smaller backup file, meaning that the entire operation can be done much faster. This is nothing new in any way, but I’m just pointing it out so no one forgets the old school stuff.
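A hedged sketch of that sequence, reusing the made-up database name and paths from the examples above:

-- Ahead of time, on the source: full backup
BACKUP DATABASE [BackupPerfTest]
TO DISK = N'S:\MSSQL\Backup\BackupPerfTest_full.bak'
WITH INIT, COMPRESSION
GO

-- Ahead of time, on the destination: restore the full backup, but do not recover
RESTORE DATABASE [BackupPerfTest]
FROM DISK = N'S:\MSSQL\Backup\BackupPerfTest_full.bak'
WITH NORECOVERY
GO

-- At migration time, on the source (application stopped): differential backup
BACKUP DATABASE [BackupPerfTest]
TO DISK = N'S:\MSSQL\Backup\BackupPerfTest_diff.bak'
WITH DIFFERENTIAL, INIT, COMPRESSION
GO

-- On the destination: apply the differential and bring the database online
RESTORE DATABASE [BackupPerfTest]
FROM DISK = N'S:\MSSQL\Backup\BackupPerfTest_diff.bak'
WITH RECOVERY
GO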

While we’re on the subject of differential backups, what if there was a way to keep the destination host in sync or similar? Well, that’s the subject of the next post!

Better Index Usage DMV

A slightly altered version of Aaron Bertrand’s “Better Index Usage DMV” query, changed to also script out CREATE INDEX statements.

 

SELECT
 d.[object_id],
 s = OBJECT_SCHEMA_NAME(d.[object_id]),
 o = OBJECT_NAME(d.[object_id]),
 d.equality_columns,
 d.inequality_columns,
 d.included_columns,
 s.unique_compiles,
 s.user_seeks, s.last_user_seek,
 s.user_scans, s.last_user_scan,
 'CREATE INDEX [IX_' + OBJECT_NAME(d.OBJECT_ID,d.database_id) + '_'
 + REPLACE(REPLACE(REPLACE(ISNULL(d.equality_columns,''),', ','_'),'[',''),']','') +
 CASE
 WHEN d.equality_columns IS NOT NULL AND d.inequality_columns IS NOT NULL THEN '_'
 ELSE ''
 END
 + REPLACE(REPLACE(REPLACE(ISNULL(d.inequality_columns,''),', ','_'),'[',''),']','')
 + ']'
 + ' ON ' + d.statement
 + ' (' + ISNULL (d.equality_columns,'')
 + CASE WHEN d.equality_columns IS NOT NULL AND d.inequality_columns IS NOT NULL THEN ',' ELSE '' END
 + ISNULL (d.inequality_columns, '')
 + ')'
 + ISNULL (' INCLUDE (' + d.included_columns + ')', '') AS Create_Statement
 INTO #candidates
 FROM sys.dm_db_missing_index_details AS d
 INNER JOIN sys.dm_db_missing_index_groups AS g
 ON d.index_handle = g.index_handle
 INNER JOIN sys.dm_db_missing_index_group_stats AS s
 ON g.index_group_handle = s.group_handle
 WHERE d.database_id = DB_ID()
 AND OBJECTPROPERTY(d.[object_id], 'IsMsShipped') = 0;
 CREATE TABLE #planops
 (
 o INT,
 i INT,
 h VARBINARY(64),
 uc INT,
 Scan_Ops INT,
 Seek_Ops INT,
 Update_Ops INT
 );
DECLARE @sql NVARCHAR(MAX) = N'';
with xmlnamespaces (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
 insert #planops
 select o = coalesce(T1.o, T2.o),
 i = coalesce(T1.i, T2.i),
 h = coalesce(T1.h, T2.h),
 uc = coalesce(T1.uc, T2.uc),
 Scan_Ops = isnull(T1.Scan_Ops, 0),
 Seek_Ops = isnull(T1.Seek_Ops, 0),
 Update_Ops = isnull(T2.Update_Ops, 0)
 from
 (
 select o = i.object_id,
 i = i.index_id,
 h = t.plan_handle,
 uc = t.usecounts,
 Scan_Ops = sum(case when t.LogicalOp in ('Index Scan', 'Clustered Index Scan') then 1 else 0 end),
 Seek_Ops = sum(case when t.LogicalOp in ('Index Seek', 'Clustered Index Seek') then 1 else 0 end)
 from (
 select
 r.n.value('@LogicalOp', 'varchar(100)') as LogicalOp,
 o.n.value('@Index', 'sysname') as IndexName,
 pl.plan_handle,
 pl.usecounts
 from sys.dm_exec_cached_plans as pl
 cross apply sys.dm_exec_query_plan(pl.plan_handle) AS p
 cross apply p.query_plan.nodes('//RelOp') as r(n)
 cross apply r.n.nodes('*/Object') as o(n)
 where p.dbid = db_id()
 and p.query_plan is not null
 ) as t
 inner join sys.indexes as i
 on t.IndexName = quotename(i.name)
 where t.LogicalOp in ('Index Scan', 'Clustered Index Scan', 'Index Seek', 'Clustered Index Seek')
 and exists (select 1 from #candidates as c where c.object_id = i.object_id)
 group by i.object_id,
 i.index_id,
 t.plan_handle,
 t.usecounts
 ) as T1
 full outer join
 (
 select o = i.object_id,
 i = i.index_id,
 h = t.plan_handle,
 uc = t.usecounts,
 Update_Ops = count(*)
 from (
 select
 o.n.value('@Index', 'sysname') as IndexName,
 pl.plan_handle,
 pl.usecounts
 from sys.dm_exec_cached_plans as pl
 cross apply sys.dm_exec_query_plan(pl.plan_handle) AS p
 cross apply p.query_plan.nodes('//Update') as r(n)
 cross apply r.n.nodes('Object') as o(n)
 where p.dbid = db_id()
 and p.query_plan is not null
 ) as t
 inner join sys.indexes as i
 on t.IndexName = quotename(i.name)
 where exists
 (
 select 1 from #candidates as c where c.[object_id] = i.[object_id]
 )
 and i.index_id > 0
 group by i.object_id,
 i.index_id,
 t.plan_handle,
 t.usecounts
 ) as T2
 on T1.o = T2.o and
 T1.i = T2.i and
 T1.h = T2.h and
 T1.uc = T2.uc;
 SELECT [object_id], index_id, user_seeks, user_scans, user_lookups, user_updates
 INTO #indexusage
 FROM sys.dm_db_index_usage_stats AS s
 WHERE database_id = DB_ID()
 AND EXISTS (SELECT 1 FROM #candidates WHERE [object_id] = s.[object_id]);
 ;WITH x AS
 (
 SELECT
 c.[object_id],
 potential_read_ops = SUM(c.user_seeks + c.user_scans),
 [write_ops] = SUM(iu.user_updates),
 [read_ops] = SUM(iu.user_scans + iu.user_seeks + iu.user_lookups),
 [write:read ratio] = CONVERT(DECIMAL(18,2), SUM(iu.user_updates)*1.0 /
 SUM(iu.user_scans + iu.user_seeks + iu.user_lookups)),
 current_plan_count = po.h,
 current_plan_use_count = po.uc
 FROM
 #candidates AS c
 LEFT OUTER JOIN
 #indexusage AS iu
 ON c.[object_id] = iu.[object_id]
 LEFT OUTER JOIN
 (
 SELECT o, h = COUNT(h), uc = SUM(uc)
 FROM #planops GROUP BY o
 ) AS po
 ON c.[object_id] = po.o
 GROUP BY c.[object_id], po.h, po.uc
 )
 SELECT [object] = QUOTENAME(c.s) + '.' + QUOTENAME(c.o),
 c.equality_columns,
 c.inequality_columns,
 c.included_columns,
 x.potential_read_ops,
 x.write_ops,
 x.read_ops,
 x.[write:read ratio],
 x.current_plan_count,
 x.current_plan_use_count,
 c.create_statement
 FROM #candidates AS c
 INNER JOIN x
 ON c.[object_id] = x.[object_id]
 ORDER BY x.[write:read ratio];
/*
 drop table #candidates
 drop table #indexusage
 drop table #planops
 */