Basic settings for SQL Server 2012/2014, part 3

Part 3: simplifying the life of the DBA

When I’ve got the environment set up just the way I like it, I usually set up a repository where I keep all my scripts. I designate one server as the “master” for my purposes, and set up a database called DBADB. In here I put all my standard scripts (like the bunch from Brent Ozar & co., sp_WhoIsActive from Adam Machanic, the log table for Ola Hallengren’s maintenance solution and such) to keep track of them, and to have only one place to update.

In my standard setup scripts for new servers, I also create a local DBADB as well as a local user for replication. When this is done, I enable replication between the “master” server and the new server, and presto! I’ve got a) a central repository for scripts and b) all the scripts available on all my servers.

Set up a good index maintenance solution, like either Ola Hallengren’s script suite or Minion Reindex from Sean McCown of MidnightDBA fame. These run circles around maintenance plans and do a much better job of keeping your database server in good working order.

Ola’s scripts here, and the Minion Reindex scripts here.
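
To give an idea of what running Ola’s solution looks like, here is a minimal sketch of a nightly IndexOptimize call (the parameter names come from his documentation; the thresholds and the DBADB location are my starting points, not gospel):

EXECUTE DBADB.dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 5,   -- reorganize above 5% fragmentation
    @FragmentationLevel2 = 30,  -- rebuild above 30% fragmentation
    @UpdateStatistics = 'ALL',
    @LogToTable = 'Y'           -- logs to the CommandLog table mentioned above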

Create DNS aliases for each and every database connection there is. After making sure that each and every system is accessing the DNS alias instead of the actual server name, I’ve simplified my life tremendously should I ever have to move a database to another server. A good example of when this paid off was a customer of mine who had to move a whole lot of databases from an old server to a new one. Since we had the DNS alias solution in place, we only had to stop the databases, move them, change the DNS pointer and restart – we knew for a fact that we did not have to change anything in the client connection strings.

Document everything. Yes, this falls squarely under the category “simplifying the life of the DBA”. Good documentation simplifies your life. Every. Single. Time. Keep a good record of which jobs run when on which servers. Document which users have sysadmin privileges and why. Make sure that any deviations from the standard disk setup are meticulously written down. Catalog which install options you’ve chosen for each server. The list can go on and on, but I think you get the point.

Add operators and alerts, and be proactive. There are several problems that can be proactively monitored by SQL Server itself. By creating operators and connecting these to e-mail addresses or pager alerts, it is possible to wake people up in the middle of the night when something goes wrong, as opposed to remaining completely oblivious while the server slowly sinks into the depths. With alerts, whenever SQL Server logs an error of a specific severity, the alert is triggered and something interesting happens in the life of the DBA.

Brent Ozar has a script to create the alerts here: http://www.brentozar.com/blitz/configure-sql-server-alerts/ Also remember that when creating the operator, it is quite possible to specify several e-mail addresses if you just separate them with a comma.
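
A minimal sketch of the moving parts (the operator name, severity and address are examples – adapt to your environment):

EXEC msdb.dbo.sp_add_operator
    @name = N'DBA Team',
    @enabled = 1,
    @email_address = N'dba-team@example.com'

EXEC msdb.dbo.sp_add_alert
    @name = N'Severity 19 - Fatal error',
    @severity = 19,
    @delay_between_responses = 60,      -- seconds between repeat notifications
    @include_event_description_in = 1   -- include the error text in the e-mail

EXEC msdb.dbo.sp_add_notification
    @alert_name = N'Severity 19 - Fatal error',
    @operator_name = N'DBA Team',
    @notification_method = 1            -- 1 = e-mail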

Set up a failsafe operator – this is basically the go-to guy if no-one else can be found to harass. Take a look at this post on MSDN: https://technet.microsoft.com/en-us/magazine/gg313743.aspx
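
Setting it can be done in the SQL Server Agent properties in SSMS, or with the procedure SSMS itself calls under the hood – note that sp_MSsetalertinfo is undocumented, so treat this as a sketch rather than a supported API:

EXEC master.dbo.sp_MSsetalertinfo
    @failsafeoperator = N'DBA Team',
    @notificationmethod = 1  -- 1 = e-mail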

Change the settings on the model database to something sensible. Increasing a database file by 1MB every time it becomes full is not only dumb but also very wasteful. The problem becomes even bigger with transaction logs and virtual log files (see this YouTube link for more info – https://youtu.be/lcmYeE-cqQo). The solution is to change the settings for the model database to something sensible (a 50-100MB size increase is a good starting point – adapt to your specific needs!) so that every database that gets created at least has sensible growth settings.
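
Something along these lines does the trick (the sizes are examples in the 50-100MB spirit above – adjust to your environment; modeldev and modellog are the logical file names of model):

ALTER DATABASE model MODIFY FILE (NAME = modeldev, SIZE = 256MB, FILEGROWTH = 64MB)
ALTER DATABASE model MODIFY FILE (NAME = modellog, SIZE = 128MB, FILEGROWTH = 64MB)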

These are the things I do every time and at every client. Nothing is new, nothing is exotic, it’s just hard, consistent work. And despite it being nothing new, it can’t be said often enough…

Double Whammy

So there I was, gearing up for a trip to a customer 90 minutes away, and my audiobook had just finished. Having listened to Matan Yungman at SQL Pass in Copenhagen, I vaguely remembered him saying something about SQL Server Radio, so I decided to check it out. Turns out they’ve been doing it for a while, and the show length (around 60 minutes) suits me just fine. Matan Yungman and Guy Glantser both work for Madeira SQL Server Services, and are well-known speakers in the SQL Server community. The only problems I’ve found are that they’ve only done 19 shows so far (and I spend A LOT of time in my car 😛 ) and that they have some issues with sound and mixing levels. This I’m sure will be fixed over time, as it’s not an easy proposition to create and run a podcast.

So, 10 minutes into my journey I fire up episode 18, where they speak to Sean McCown of MidnightDBA fame. He runs a consulting company with his wife Jen, as well as hosting a weekly SQL Server web show called DBAs@Midnight. The subject of the discussion on SQL Server Radio was index maintenance, and more specifically Sean’s take on it with his script Minion Reindex. Now, I’ve been using Ola Hallengren’s script suite for YEARS and like it very much. It has a few limitations and I wrote some wrapper scripts around it a few years back, but all in all, I find it to be a great solution for most of my clients. I was a bit put off by Sean’s bashing of Ola’s script; personally I feel he was a bit too harsh. It’s one thing to think that something is good but that you could do better; it’s quite another to call it bad and then do something better. Of course, I’m Swedish and Ola is Swedish, so my countryman defense probably kicked into gear. But I digress.
The discussion was exceptionally interesting and I found myself wanting to pull over and download the documentation for the solution then and there. Fast forward to today: I’ve had the time to download the script and play with it a bit, and my first impression is that it might be the coolest thing since sliced bread, as the saying goes. The granularity is a real gem and the included documentation is excellent. I very much recommend that everyone check out Minion Reindex.

So there you have it – an ordinarily boring drive turned into a double whammy with both SQL Server Radio and Minion Reindex. Eagerly looking forward to both Minion Backup and the upcoming Minion CheckDB.


Links:

Ola Hallengren

MidnightDBA

SQL Server Radio



Basic settings for SQL Server 2012/2014, part 2

Updated: 2016-11-07

Part 2: settings inside SQL Server

When the server is all good and installed, it’s time to go through and set up a bunch of server settings. First of all – dedicated admin connections. This is used as a “get out of jail free” card for a DBA if all the schedulers are cranky and won’t accept connections for one reason or another. Using the DAC, it might be possible to sneak in through the “back door” and solve whatever problem is plaguing the system without having to go through the hassle of a reboot/service restart. Turn. It. On. Period.

https://msdn.microsoft.com/en-us/library/ms190468.aspx
https://msdn.microsoft.com/en-us/library/ms189595.aspx
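
Turning it on is a one-liner; enabling remote DAC connections is a good idea on clusters or whenever you administer the box over the network:

EXEC sp_configure 'remote admin connections', 1
RECONFIGURE
-- Then connect with the ADMIN: prefix, e.g. sqlcmd -S ADMIN:MYSERVER -E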

Optimize for ad hoc workloads was introduced way, way back to minimize the impact of ad hoc plans in the plan cache. Simply put, it works by not storing a complete plan in the cache until it is reused at least once. Instead a “stub” is stored, and less memory is consumed by plans that are never reused. Having read the same recommendation from several sources, I always turn this on – BUT – keep an eye out for surprises in the plan cache.

Kimberly L. Tripp lays it all out here: http://www.sqlskills.com/blogs/kimberly/plan-cache-and-optimizing-for-adhoc-workloads/ and her post links to a gazillion other posts.
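
It is an advanced option, so ‘show advanced options’ has to be on first:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE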

Memory, then. The default setting for MIN_SERVER_MEMORY is 0, and for MAX_SERVER_MEMORY it is 2147483647 MB. Yes. The technical term for that amount is “a lot”. It’s a really bad idea to leave this setting as is, as SQL Server is a bit like my late lawnmower – it will eat EVERYTHING in sight. This comes with some consequences for wildlife, geography and neighbours, or in this case – other consumers of memory. There are several ways to come up with a number for MAX_SERVER_MEMORY, but a simple rule of thumb is to reserve 10% of total memory or 4GB, whichever is greater, for the operating system, and give SQL Server the rest. Suddenly it wasn’t so great to give the server 6GB of RAM, riiight?
NOTE: the previous number is ONLY applicable if you have ONLY one instance on the server and NO other consumers (full-text daemon, Reporting/Analysis/Integration Services, etc.)

http://www.brentozar.com/archive/2012/11/how-to-set-sql-server-max-memory-for-vmware/
Somewhat older but still very much sound:
https://sqlserverperformance.wordpress.com/2009/10/29/suggested-max-memory-settings-for-sql-server-20052008/
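
As an example of the rule of thumb: on a 32GB single-instance server, reserve 4GB (the greater of 10% and 4GB) for the OS and give SQL Server the remaining 28GB:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max server memory (MB)', 28672
RECONFIGURE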

TEMPDB has been a hot topic of contention for many years. I’ve read everything from 2 to 8 files depending on the number of cores, and then tune upwards if you run into problems. I make it simple: on a SAN with lots of spindles, or on SSDs: 8 files and be done with it. Too many files can actually hurt performance if you have too few spindles (but then again – everything hurts with too few spindles).

Paul Randal lays out a lot of information here: http://www.sqlskills.com/blogs/paul/correctly-adding-data-files-tempdb/ and http://www.sqlskills.com/blogs/paul/a-sql-server-dba-myth-a-day-1230-tempdb-should-always-have-one-data-file-per-processor-core/
This KB is also relevant: https://support.microsoft.com/kb/2154845
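
A sketch of what evening out tempdb might look like – the logical file names beyond the first, the T: path and the sizes are assumptions, so match them to your actual layout. The important part is that all files end up with identical size and growth settings:

ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB)
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf', SIZE = 1024MB, FILEGROWTH = 256MB)
-- ...and so on up to tempdev8, all with the same size and growth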

Lock pages in memory – perhaps the closest thing to a religious argument the community has ever had. I fall into the “on the fence” category of DBAs. I personally don’t use it most of the time, but it’s nice to know that it is there should I need it.

Jonathan Kehayias has a great blog post with both sides of the argument and lots of interesting information: https://www.simple-talk.com/sql/database-administration/great-sql-server-debates-lock-pages-in-memory/

Backup compression is another no-brainer. Or is it? The prevailing opinion is to turn it on, as the servers out there usually have a lot more CPU than I/O resources. That is undoubtedly the case, but keep in mind the environment outside the database. If you’re running a compressing backup solution like TSM, NetBackup or similar, the compression and subsequent deduplication of that system will suffer greatly. In essence: make sure you compress your backups in only one place, and if at all possible do it at the SQL Server.

Paul Randal again has a great post on the subject: http://www.sqlskills.com/blogs/paul/sql-server-2008-backup-compression-cpu-cost/
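
Turning it on server-wide means every native backup is compressed unless the BACKUP statement explicitly says otherwise:

EXEC sp_configure 'backup compression default', 1
RECONFIGURE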

Maximum degree of parallelism will hurt you out of the box. There is no established consensus, but my opinion is this: most of the OLTP environments out there generally don’t benefit from parallelism – quite the contrary. I’ve found that more often than not, the database spends more time reassembling the query results (seen as CXPACKET waits) than the query would have taken just burning one CPU. That’s why I generally set MAXDOP to 2 for 1-4 CPUs and MAYBE 2-4 for 4-8 CPUs. Then I keep an eye on the cache and tune accordingly. It is beyond me why the default is 0 (as in PARALLELIZE OVER EVERY CPU THERE IS!).

This goes hand in hand with cost threshold for parallelism, which defaults to 5 – a ridiculously low number that results in SQL Server trying to parallelize EVERYTHING. Combine that with the above MAXDOP of 0 and, well, all hell is bound to break loose. Again, there is no “right” way to do it, but I start out with 50 as a baseline and tune up or down accordingly. In OLTP workloads, expensive queries are generally a bad idea anyway.
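
Both are advanced options; my starting points from the reasoning above look like this:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max degree of parallelism', 2
RECONFIGURE
EXEC sp_configure 'cost threshold for parallelism', 50
RECONFIGURE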

Trace flags are usually something to be wary of, but there are a few that are a downright good idea to use.
3226, for instance, suppresses the backup success messages in the SQL Server log. There’s enough information in there already, so a little less stuff to wade through is only a good thing.
http://msdn.microsoft.com/en-us/library/ms188396.aspx

1118 tells the server to avoid mixed extents and give each new object its own 64KB of allocated data.
1117 tells the server to grow each file in a filegroup equally. This can come in handy to avoid hotspots as a result of uneven file growth. Both these flags are documented in http://support.microsoft.com/kb/2154845/en-us

2371 modifies the behavior of statistics auto update – very useful on large tables.
http://blogs.msdn.com/b/saponsqlserver/archive/2011/09/07/changes-to-automatic-update-statistics-in-sql-server-traceflag-2371.aspx

2562 and 2549 pertain to improvements for DBCC CHECKDB.
https://support.microsoft.com/en-us/kb/2634571
4199 enables non-default query processor enhancements, and while IT SHOULD BE TESTED, I’ve found it to be quite safe to use.
https://support.microsoft.com/en-us/kb/974006
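
The flags can be tested at runtime with DBCC TRACEON (the -1 makes them global), but to survive a service restart they should be added as -T startup parameters via SQL Server Configuration Manager:

DBCC TRACEON (3226, 1117, 1118, 2371, -1)
DBCC TRACESTATUS (-1)  -- verify which flags are active globally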



SP_HELP_REVLOGIN

Very useful for transferring logins between servers with their passwords (and SIDs) intact. This is the classic script from Microsoft’s support article KB 918992.

USE master
GO
IF OBJECT_ID ('sp_hexadecimal') IS NOT NULL
  DROP PROCEDURE sp_hexadecimal
GO
CREATE PROCEDURE sp_hexadecimal
    @binvalue varbinary(256),
    @hexvalue varchar (514) OUTPUT
AS
DECLARE @charvalue varchar (514)
DECLARE @i int
DECLARE @length int
DECLARE @hexstring char(16)
SELECT @charvalue = '0x'
SELECT @i = 1
SELECT @length = DATALENGTH (@binvalue)
SELECT @hexstring = '0123456789ABCDEF'
WHILE (@i <= @length)
BEGIN
  DECLARE @tempint int
  DECLARE @firstint int
  DECLARE @secondint int
  SELECT @tempint = CONVERT(int, SUBSTRING(@binvalue,@i,1))
  SELECT @firstint = FLOOR(@tempint/16)
  SELECT @secondint = @tempint - (@firstint*16)
  SELECT @charvalue = @charvalue +
    SUBSTRING(@hexstring, @firstint+1, 1) +
    SUBSTRING(@hexstring, @secondint+1, 1)
  SELECT @i = @i + 1
END

SELECT @hexvalue = @charvalue
GO
 
IF OBJECT_ID ('sp_help_revlogin') IS NOT NULL
  DROP PROCEDURE sp_help_revlogin
GO
CREATE PROCEDURE sp_help_revlogin @login_name sysname = NULL AS
DECLARE @name sysname
DECLARE @type varchar (1)
DECLARE @hasaccess int
DECLARE @denylogin int
DECLARE @is_disabled int
DECLARE @PWD_varbinary  varbinary (256)
DECLARE @PWD_string  varchar (514)
DECLARE @SID_varbinary varbinary (85)
DECLARE @SID_string varchar (514)
DECLARE @tmpstr  varchar (1024)
DECLARE @is_policy_checked varchar (3)
DECLARE @is_expiration_checked varchar (3)

DECLARE @defaultdb sysname
 
IF (@login_name IS NULL)
  DECLARE login_curs CURSOR FOR
      SELECT p.sid, p.name, p.type, p.is_disabled, p.default_database_name, l.hasaccess, l.denylogin
      FROM sys.server_principals p
      LEFT JOIN sys.syslogins l ON ( l.name = p.name )
      WHERE p.type IN ( 'S', 'G', 'U' ) AND p.name <> 'sa'
ELSE
  DECLARE login_curs CURSOR FOR
      SELECT p.sid, p.name, p.type, p.is_disabled, p.default_database_name, l.hasaccess, l.denylogin
      FROM sys.server_principals p
      LEFT JOIN sys.syslogins l ON ( l.name = p.name )
      WHERE p.type IN ( 'S', 'G', 'U' ) AND p.name = @login_name
OPEN login_curs

FETCH NEXT FROM login_curs INTO @SID_varbinary, @name, @type, @is_disabled, @defaultdb, @hasaccess, @denylogin
IF (@@fetch_status = -1)
BEGIN
  PRINT 'No login(s) found.'
  CLOSE login_curs
  DEALLOCATE login_curs
  RETURN -1
END
SET @tmpstr = '/* sp_help_revlogin script '
PRINT @tmpstr
SET @tmpstr = '** Generated ' + CONVERT (varchar, GETDATE()) + ' on ' + @@SERVERNAME + ' */'
PRINT @tmpstr
PRINT ''
WHILE (@@fetch_status <> -1)
BEGIN
  IF (@@fetch_status <> -2)
  BEGIN
    PRINT ''
    SET @tmpstr = '-- Login: ' + @name
    PRINT @tmpstr
    IF (@type IN ( 'G', 'U'))
    BEGIN -- NT authenticated account/group

      SET @tmpstr = 'CREATE LOGIN ' + QUOTENAME( @name ) + ' FROM WINDOWS WITH DEFAULT_DATABASE = [' + @defaultdb + ']'
    END
    ELSE BEGIN -- SQL Server authentication
        -- obtain password and sid
            SET @PWD_varbinary = CAST( LOGINPROPERTY( @name, 'PasswordHash' ) AS varbinary (256) )
        EXEC sp_hexadecimal @PWD_varbinary, @PWD_string OUT
        EXEC sp_hexadecimal @SID_varbinary,@SID_string OUT
 
        -- obtain password policy state
        SELECT @is_policy_checked = CASE is_policy_checked WHEN 1 THEN 'ON' WHEN 0 THEN 'OFF' ELSE NULL END FROM sys.sql_logins WHERE name = @name
        SELECT @is_expiration_checked = CASE is_expiration_checked WHEN 1 THEN 'ON' WHEN 0 THEN 'OFF' ELSE NULL END FROM sys.sql_logins WHERE name = @name
 
            SET @tmpstr = 'CREATE LOGIN ' + QUOTENAME( @name ) + ' WITH PASSWORD = ' + @PWD_string + ' HASHED, SID = ' + @SID_string + ', DEFAULT_DATABASE = [' + @defaultdb + ']'

        IF ( @is_policy_checked IS NOT NULL )
        BEGIN
          SET @tmpstr = @tmpstr + ', CHECK_POLICY = ' + @is_policy_checked
        END
        IF ( @is_expiration_checked IS NOT NULL )
        BEGIN
          SET @tmpstr = @tmpstr + ', CHECK_EXPIRATION = ' + @is_expiration_checked
        END
    END
    IF (@denylogin = 1)
    BEGIN -- login is denied access
      SET @tmpstr = @tmpstr + '; DENY CONNECT SQL TO ' + QUOTENAME( @name )
    END
    ELSE IF (@hasaccess = 0)
    BEGIN -- login exists but does not have access
      SET @tmpstr = @tmpstr + '; REVOKE CONNECT SQL TO ' + QUOTENAME( @name )
    END
    IF (@is_disabled = 1)
    BEGIN -- login is disabled
      SET @tmpstr = @tmpstr + '; ALTER LOGIN ' + QUOTENAME( @name ) + ' DISABLE'
    END
    PRINT @tmpstr
  END

  FETCH NEXT FROM login_curs INTO @SID_varbinary, @name, @type, @is_disabled, @defaultdb, @hasaccess, @denylogin
   END
CLOSE login_curs
DEALLOCATE login_curs
RETURN 0
GO

Delete large number of rows

SELECT 'Starting' -- sets @@ROWCOUNT to 1 so the loop starts
WHILE @@ROWCOUNT <> 0
    DELETE TOP (10000) FROM dbo.MyTable -- example batch size; tune to your transaction log and locking tolerance

Table size information

SELECT 
t.NAME AS TableName, 
i.name as indexName, 
sum(p.rows) as RowCounts, 
sum(a.total_pages) as TotalPages, 
sum(a.used_pages) as UsedPages, 
sum(a.data_pages) as DataPages, 
(sum(a.total_pages) * 8) / 1024 as TotalSpaceMB, 
(sum(a.used_pages) * 8) / 1024 as UsedSpaceMB, 
(sum(a.data_pages) * 8) / 1024 as DataSpaceMB 
FROM sys.tables t 
INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id 
INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id 
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id 
WHERE t.NAME NOT LIKE 'dt%' 
AND i.OBJECT_ID > 255 
AND i.index_id <= 1 
GROUP BY t.NAME, i.object_id, i.index_id, i.name 
ORDER BY object_name(i.object_id)

Erland Sommarskog’s wait_stats_monitor

CREATE PROCEDURE wait_stats_monitor AS
SET NOCOUNT ON
DECLARE @latest datetime,
 @now    datetime
SELECT @now = getdate()
IF object_id('waitstats') IS NULL
 BEGIN
 CREATE TABLE waitstats (
 sample_time       datetime     NOT NULL,
 wait_type         nvarchar(60) NOT NULL,
 sample_length_sec int          NULL,
 delta_tasks_count bigint       NULL,
 delta_wait_time   bigint       NULL,
 delta_signal      bigint       NULL,
 max_wait_time_ms  bigint       NOT NULL,
 tot_tasks_count   bigint       NOT NULL,
 tot_wait_time     bigint       NOT NULL,
 tot_signal        bigint       NOT NULL,
 CONSTRAINT pk_waitstats PRIMARY KEY (wait_type, sample_time)
 )
CREATE INDEX sample_time ON waitstats (sample_time)
 END
IF NOT EXISTS (SELECT * FROM waitstats)
 BEGIN
 INSERT waitstats (sample_time, wait_type, max_wait_time_ms,
 tot_tasks_count, tot_wait_time, tot_signal)
 SELECT @now, wait_type, max_wait_time_ms,
 waiting_tasks_count, wait_time_ms, signal_wait_time_ms
 FROM   sys.dm_os_wait_stats
 END
 ELSE
 BEGIN
 SELECT @latest = MAX(sample_time) FROM waitstats
INSERT waitstats (sample_time, wait_type, sample_length_sec,
 delta_tasks_count,
 delta_wait_time,
 delta_signal,
 max_wait_time_ms, tot_tasks_count, tot_wait_time,
 tot_signal)
 SELECT @now, ws.wait_type, datediff(ss, m.sample_time, @now),
 ws.waiting_tasks_count - m.tot_tasks_count,
 ws.wait_time_ms        - m.tot_wait_time,
 ws.signal_wait_time_ms - m.tot_signal,
 ws.max_wait_time_ms, ws.waiting_tasks_count, ws.wait_time_ms,
 ws.signal_wait_time_ms
 FROM   sys.dm_os_wait_stats ws
 CROSS  APPLY (SELECT TOP 1 *
 FROM   waitstats m
 WHERE  m.wait_type = ws.wait_type
 ORDER  BY sample_time DESC) m
 WHERE  ws.waiting_tasks_count <> m.tot_tasks_count OR
 ws.wait_time_ms        <> m.tot_wait_time   OR
 ws.signal_wait_time_ms <> m.tot_signal
 END

Get statements from waiter list

if exists (select * from sys.objects where object_id = object_id(N'[dbo].[get_statements_from_waiter_list]') and OBJECTPROPERTY(object_id, N'IsProcedure') = 1)
 drop procedure [dbo].[get_statements_from_waiter_list]
 go
create proc get_statements_from_waiter_list (@wait_type nvarchar(60) = NULL)
as
select
     r.wait_type
    ,r.wait_time
    ,SUBSTRING(qt.text, r.statement_start_offset/2 + 1,
        (case when r.statement_end_offset = -1
              then DATALENGTH(qt.text)          -- -1 means "to the end of the batch"
              else r.statement_end_offset
         end - r.statement_start_offset)/2 + 1) as query_text
    ,qt.dbid, dbname = db_name(qt.dbid)
    ,qt.objectid
    ,r.sql_handle
    ,(SELECT query_plan FROM sys.dm_exec_query_plan(r.plan_handle)) as query_plan
from sys.dm_exec_requests r
cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
where r.session_id > 50   -- user sessions only
  and r.wait_type = isnull(upper(@wait_type), r.wait_type)
go
exec get_statements_from_waiter_list

Dude, where’s my disk?

Today I took a look at a client’s 10g Oracle RAC environment where one node apparently had started misbehaving quite some time ago. The scenario was that both nodes were up, Cluster Ready Services was up, nodeapps were up, ASM was up – but the instance was down. Starting from the top with the database alert log, it was apparent that ASM had no available disk groups for this database. Okay, let’s ask ASM how it feels, then. Looking in the ASM log file it was equally obvious that the ASM instance was online but didn’t service any disk groups. That’s odd.
Let’s bring out the whole toolbox and start from the top. It didn’t take very long for the first problem to appear: ASMTOOL couldn’t see any stamped disks. Even worse, running asmtool -list gave me OS Error 21 – disk not ready. The server saw the disks via Disk Manager, but ASM couldn’t see the disk headers. That’s novel. Not very helpful, but novel.
Considering what might interfere with disk headers and disk presentation in general, volume managers and the like come to mind. Knowing that the customer runs NetApp in the organization, I took a look at SnapDrive. SnapDrive should never be installed (or heaven forbid – be running!) on a RAC cluster machine. The node that worked actually HAD SnapDrive – albeit disabled. The node that didn’t work also had SnapDrive – running happily. Time to end that.
A reboot and back to square one. Still couldn’t see the ASM stamps. Time to check the automount setting in DISKPART. It turned out to be disabled, so that was rectified:

DISKPART> automount enable

Automatic mounting of new volumes enabled.

But still no disk groups for me, despite a reboot and a CRS restart. On a whim I took the disks offline and online again. To my surprise, that was the push that awoke the ASM stack. From that point everything worked like a charm; ASM saw the stamps, ASM could be persuaded to mount the disk groups, and then the database started without any issues. Some points to take home:

  • Just because something is ancient, don’t expect it to be removed from use (i.e. old knowledge always comes in handy)
  • When people tell you that they haven’t done anything to the environment – politely smile and expect that they have.
  • Oracle RAC is a finicky beast – always pay attention to the support matrix.
  • RAC is only as highly available as the administrator tasked with supporting it.