Author: Lakshmi Kiran Reddy

The tipping point and covering indexes

So you’re tuning a slow query and you create a new index to eliminate a table or clustered index scan. In your testing you find that it works well for some search arguments but is ignored by the Optimizer for others. You notice that the “ignored” queries tend to return larger record sets. What’s going on here, and how do you persuade the Optimizer to use your index? (Spoiler alert: you don’t!)

Odds are good that you’ve created a non-covering index.  This is simply an index that doesn’t have all of the columns necessary to execute a query.    Quick quiz…

Which (if any) of the queries below are covered by this index?

CREATE INDEX ix_LastName_FirstName__MiddleName
 ON Person.Person (LastName, FirstName) INCLUDE (MiddleName);
  1. SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName = N'Martinez';
  2. SELECT * FROM Person.Person WHERE LastName LIKE N'M%';
  3. SELECT FirstName, LastName FROM Person.Person WHERE LastName = N'Martinez';
  4. SELECT FirstName, LastName FROM Person.Person WHERE FirstName = N'Daniel';

The answer is queries 3 and 4. In each case, the index contains all the data necessary to execute the query. You can see that in the respective query plans.

Note:  Even though we need to scan the nonclustered index to find the Daniels, it’s still a net win over the cost of having to scan the clustered index (107 vs 3821 logical I/Os).

In the absence of a covering index (queries 1 and 2), the Optimizer must choose between index seeks plus key lookups or a clustered index scan.

And its decision is driven primarily by I/O cost, which in turn depends on row counts and index structure.

B-tree indexes (for both clustered and nonclustered indexes) are built of 8K pages arranged in levels.  Each level is an index for the layer immediately below it, and there’s always a single page at the top (or root).  You can find out how deep and wide your indexes are using the sys.dm_db_index_physical_stats() function:

SELECT i.name, s.index_type_desc, s.alloc_unit_type_desc, s.index_depth,
       s.index_level, s.page_count
FROM sys.indexes i
    CROSS APPLY sys.dm_db_index_physical_stats ( DB_ID(), i.object_id, i.index_id, NULL, 'DETAILED' ) s
WHERE i.object_id = OBJECT_ID ( 'Person.Person' )
    AND i.index_id IN ( 1, 2 )
    AND s.alloc_unit_type_desc = N'IN_ROW_DATA';

Be careful using DETAILED mode: it requires that the full index be read. That’s a lot of expensive, slow I/O for big tables and indexes. LIMITED mode returns less information, but it is a safer option for production systems.
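For routine checks on a busy system, a lighter variant of the query above is a reasonable sketch (LIMITED mode returns one row per index rather than one per level, and some of the detail columns come back NULL):

SELECT i.name, s.index_type_desc, s.index_depth, s.page_count
FROM sys.indexes i
    CROSS APPLY sys.dm_db_index_physical_stats ( DB_ID(), i.object_id, i.index_id, NULL, 'LIMITED' ) s
WHERE i.object_id = OBJECT_ID ( 'Person.Person' )
    AND i.index_id IN ( 1, 2 );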

The query returned data on two indexes on the Person table: its clustered index and a nonclustered index. The clustered index is 3 levels deep and 3811 pages wide at the leaf level (level = 0). The nonclustered index is both shallower (2 levels) and narrower (105 pages at the leaf level).

An Index Seek starts at the topmost root page, then traverses any intermediate levels to get to the leaf-level pages where the column data you’re after resides. The cost of each seek is the depth of the index. [Note that deeper indexes (larger tables and/or wider index keys) will have higher seek costs.] Once we get to the leaf level, we might scan additional pages laterally to return a range of data (e.g., all the persons whose last name is Martinez).

An Index Scan just reads through all the leaf level pages searching for data that matches your filter criteria.  The cost is the number of pages at the leaf level.

To execute this query using our nonclustered (non-covering) index…

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName = N'Martinez'; 

…the Optimizer first performs a seek plus range scan on the nonclustered index to gather the first and last names, plus the clustered index keys, for all the folks named Martinez. Then, one at a time, it performs seeks into the clustered index (here referred to as key lookups) to retrieve their titles. The kicker is that each individual key lookup costs 3 logical reads, so to return 173 rows the cost is in the ballpark of 173 * 3 = 519 logical I/Os. And as the row counts increase, the I/O costs increase 3X faster. As we approach 1300 rows, we reach a point where it would be cheaper to simply scan the clustered index once (at a cost of ~3800 logical reads) and be done with it.

And this is why the Optimizer stops using your non-covering index: it’s simply too costly to use. The point where this transition happens is called the tipping point.

You can’t predict exactly where this switch will happen. There’s lots of secret, internal magic around determining tipping points, but per Kimberly Tripp a ballpark figure is when you’re reading around 30% of the pages in the table.

If you’re feeling adventurous you can go hunting for tipping points using an approach like this:

SET STATISTICS IO ON;

-- Query an ever widening range of indexed values
SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'D%';            
/*  556 rows; 1717 logical reads */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[DE]%';            
/*  743 rows; 2291 logical reads */

/* The tipping point is here.  The queries above use key lookups, those below a clustered index scan */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[D-F]%';  
/* 1111 rows; 3821 logical reads */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[D-G]%';  
/* 2465 rows; 3821 logical reads */

Somewhere north of 743 rows the key lookup I/O cost becomes high enough that the Optimizer switches to using a clustered index scan. 

The tipping point determination isn’t always spot on, but it will be close enough most of the time. Generally speaking, you shouldn’t override the Optimizer’s judgement, although you can.

So what if you really want the Optimizer to use your lovely index? You can literally “force” the issue with a Table Hint, but it’s probably not a good idea, as it’s likely to increase your I/O costs, and more I/O means slower queries. Let’s use the FORCESEEK hint on the last couple of queries where the Optimizer chose to use a clustered index scan:

SELECT Title, FirstName, LastName FROM Person.Person WITH (FORCESEEK) WHERE LastName LIKE N'[D-F]%';   
/* 1111 rows; 3420 logical reads */
SELECT Title, FirstName, LastName FROM Person.Person WITH (FORCESEEK) WHERE LastName LIKE N'[D-G]%';   
/* 2465 rows; 7575 logical reads */

In the first instance, the I/O costs are actually a bit lower, but most of the time this won’t be the case. Note that the I/O cost for the 2465-row query is nearly double what it was when the Optimizer was allowed to do its thing.

A better solution here is to create a covering index, which will eliminate both the key lookups and the clustered index scans! Since indexes are costly to create and maintain, focus on creating indexes that support your most important queries well. Let’s create an index that covers that last set of test queries, then rerun them:

CREATE INDEX ix_LastName_FirstName__Title 
ON Person.Person (LastName, FirstName) INCLUDE (Title);

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'D%';            
/*  556 rows; 6 logical reads */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[DE]%';            
/*  743 rows; 7 logical reads */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[D-F]%';  
/* 1111 rows; 9 logical reads */

SELECT Title, FirstName, LastName FROM Person.Person WHERE LastName LIKE N'[D-G]%';  
/* 2465 rows; 16 logical reads */

Now all 4 queries can be executed with index seeks, and the I/O cost plummets, greatly improving performance!

Happy Exploring!

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

How to start SQL Server with Minimal Configuration

Problem Statement:

We recently migrated an on-premises SQL Server instance to SQL Server on an Azure VM. While adjusting configuration settings, we accidentally changed the max server memory value to 128 MB when trying to set it. We were then unable to connect, because SQL Server would not restart due to low memory. This blog post explains the steps we followed to fix the issue.

  1. RDP to the server.
  2. Make sure no other users are RDP’d to the server; if they are, kick them off. You want to be the only administrator on the box.
  3. Open SQL Server Configuration Manager.
  4. Make sure all services are stopped. If they aren’t, right-click and stop them all.
  5. Right-click SQL Server (InstanceName) > Properties > Startup Parameters tab > type -f in “Specify a startup parameter:” > click Add.
  6. This will add -f to the list at the bottom of the dialog.
    INFO: The -f flag starts an instance of SQL Server with minimal configuration. This is useful if the setting of a configuration value (for example, over-committing memory) has prevented the server from starting. Starting SQL Server in minimal configuration mode places SQL Server in single-user mode.

Click OK out of this window. (It will warn you that the change won’t take effect until the next time the SQL services are started; click OK on that box.)

Open PowerShell ISE as administrator and put this code in the scripting pane:

net stop "SQL Server (InstanceName)"
net start "SQL Server (InstanceName)" -m"SQLCMD"

SQLCMD -S .\<InstanceName> -q "sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'max server memory', 4096;
GO
RECONFIGURE;
GO"

If all the SQL services are already stopped, you don’t need to execute the net stop line, but it’s included in case SQLCMD fails to get a connection; with it in the script, you don’t need to stop services from Configuration Manager or services.msc.

Execute the PowerShell code. If it throws errors about not being able to start the instance because it’s already running, or about not being able to connect to the instance during the SQLCMD part, stop the script if it’s running and run it again. I had to run it a few times; that may have been because I was still getting the code syntax right, or because something else was grabbing a connection before SQLCMD could (a service account had an RDP session open until I logged it off). Eventually you should see output indicating that the max memory was set and that you need to run a RECONFIGURE. Since RECONFIGURE is part of the code being executed, it will run; it just won’t output that it ran. All done: max memory is set to 4 GB.

After getting max memory set to 4 GB, the instance is still up in single-user mode and minimal configuration, so all that’s left is to put it back to normal:

  1. Stop the SQL services.
  2. Remove the -f startup parameter.
  3. Start all the SQL services.
  4. Connect with SSMS and set the max memory to what you actually want it to be.

A few additional notes:

  1. The main idea behind why this works is the combination of net start "SQL Server (InstanceName)" -m"SQLCMD" and a single SQLCMD command (connecting to the instance AND running the query in the same PoSH command), in the same PowerShell script, executed as close together as possible. I suspect running them from a CMD prompt as separate commands may not be fast enough.
  2. Net start with -m"SQLCMD" should force the instance to ONLY allow connections from SQLCMD. That means no other application should be able to steal our single admin session before we can connect.
  3. We chose 4096 for max server memory because we only wanted enough memory to start the instance normally; we could adjust to the final max memory after the fact. We didn’t want to go overboard during the emergency. It’s also beneficial because most servers should have at least this much memory available to allocate, so if we use the script on other instances, it’s one less thing to change.
The key takeaway everyone should remember:

It doesn’t matter what issue you are trying to solve: if you EVER need to get into an instance in minimal configuration and single-user mode, PowerShell with net start -m"SQLCMD" and SQLCMD -q, all in the same script executed at the same time, is very likely your best friend. It limits the single user to someone connecting with SQLCMD, and then runs SQLCMD immediately after, so you get the only connection.


How to re-queue an email in SQL Server that was successfully sent the first time?

Let’s say you send an email from inside a SQL instance using something like the following:

EXEC msdb.dbo.sp_send_dbmail  
    @profile_name = 'SQLZEALOTS_Manager',  
    @recipients = 'Lkiran@xxx.com',  
    @body = 'This email Sent',  
    @subject = 'SqLZeaLoTs' ;

And the recipient DOES successfully receive the email.

Now, for one reason or another, they want the original email sent again (not a new execution of msdb.dbo.sp_send_dbmail, but basically re-queueing the original email to be sent again).

Step 1: Execute the following:

SELECT mailitem_id
FROM [msdb].[dbo].[sysmail_mailitems]
WHERE subject = 'SqLZeaLoTs'
ORDER BY sent_date DESC;

Copy the mailitem_id value for the specific email you want to re-queue; in this example, 2738274.

Step 2: Execute the following:

USE msdb;
GO
DECLARE @rc INT
    ,@sendmailxml VARCHAR(MAX)
    ,@mailID INT = 2738274; -- This is the mailitem_id from [msdb].[dbo].[sysmail_mailitems] for the mail you want to resend. 2738274 is just an example.

SET @sendmailxml = '<requests:SendMail xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schemas.microsoft.com/databasemail/requests RequestTypes.xsd" xmlns:requests="http://schemas.microsoft.com/databasemail/requests"><MailItemId>'
    + CONVERT(NVARCHAR(20), @mailID) + N'</MailItemId></requests:SendMail>';

-- Send the send request on queue.
EXEC @rc = sp_SendMailQueues @sendmailxml;
IF @rc <> 0
BEGIN
    RAISERROR(14627, 16, 1, @rc, 'send mail');
END

If the query completes successfully, it has queued the original email again, and the original recipient will receive another email with exactly the same subject and body.

Step 2 has only been confirmed to work on SQL 2008 R2 SP3.

Note: If you want to try this on later SQL Server versions, you should first verify the code in step 2 by following the steps below.

Script out the system procedure msdb.dbo.sp_send_dbmail and look for the section near the bottom that executes sp_SendMailQueues. The most important piece to verify is the “SET @sendmailxml…” statement. The reason is that sp_SendMailQueues sends the email to a queue that is monitored by a broker that calls DatabaseMail.exe. It is entirely possible that different versions of SQL Server ship different versions of DatabaseMail.exe that require the XML to be formatted differently, so you want to make sure the XML is formatted for your version.
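A couple of standard ways to pull the procedure’s source for that check (either works; shown here as a sketch):

USE msdb;
GO
EXEC sp_helptext 'dbo.sp_send_dbmail';
-- or, as a single string:
SELECT OBJECT_DEFINITION ( OBJECT_ID ( 'msdb.dbo.sp_send_dbmail' ) );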

Please try this out in your environment and leave feedback if you have any comments.


Configure BlockedProcessReport in SQL Server

In this blog we will discuss how to configure the blocked process report on a particular database. This is useful for troubleshooting processes that are blocked for more than a specific time threshold. Blocking is common behavior in SQL Server, but if the block duration is longer than expected, we can experience performance issues. There are many ways to identify blocking in SQL Server (which are not going to be discussed here); I am going to cover one method using event notifications.

By default, the “blocked process threshold” is zero, meaning that SQL Server won’t generate blocked process reports. We need to set the “blocked process threshold” to a specific value (in seconds) using sp_configure. For example, if we set it to 15 seconds, the event will fire three times if a session is blocked for 45 seconds.
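A sketch of setting the threshold to 15 seconds (“blocked process threshold” is an advanced option, so “show advanced options” must be enabled first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'blocked process threshold', 15;
RECONFIGURE;
GO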

Let’s see the step-by-step process for configuring the blocked process report.

First, the database should be enabled for Service Broker.

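A sketch of enabling it (the database name is a placeholder; ENABLE_BROKER needs exclusive access to the database, hence the ROLLBACK IMMEDIATE):

ALTER DATABASE [YourDatabase] SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;
GO
-- Verify:
SELECT name, is_broker_enabled FROM sys.databases WHERE name = N'YourDatabase';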

Next, we need to create a QUEUE, SERVICE, ROUTE, and EVENT NOTIFICATION to capture the blocking/deadlock events, as shown in the sketch below.


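A representative sketch of those statements (the service, route, and notification names are illustrative; the queue name matches the activation step later in this post):

-- Run these in the database where you want to capture the reports
CREATE QUEUE BlockedProcessReportQueue;
GO

CREATE SERVICE BlockedProcessReportService
ON QUEUE BlockedProcessReportQueue
( [http://schemas.microsoft.com/SQL/Notifications/PostEventNotification] );
GO

CREATE ROUTE BlockedProcessReportRoute
WITH SERVICE_NAME = 'BlockedProcessReportService', ADDRESS = 'LOCAL';
GO

-- The event notification itself is created at the server level
CREATE EVENT NOTIFICATION BlockedProcessReportNotification
ON SERVER
FOR BLOCKED_PROCESS_REPORT
TO SERVICE 'BlockedProcessReportService', 'current database';
GO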

Please note that in the steps above, the queue, service, and route are created in the database where you want to track blocking beyond the configured threshold, whereas the event notification is created at the server level. You can get the list of already configured event notifications on your server with the query below:

SELECT * FROM sys.server_event_notifications

Now let’s create a table to hold the blocking information:

IF OBJECT_ID('[dbo].[BlockedProcessReports]') IS NULL
    CREATE TABLE [dbo].[BlockedProcessReports]
    (
        blocked_process_id int IDENTITY(1,1) PRIMARY KEY,
        database_name sysname,
        post_time datetime,
        blocked_process_report xml
    );
GO

Let’s create a stored procedure to read the messages received from the Service Broker queue and capture the events in the table we just created:

CREATE PROCEDURE [dbo].[ProcessBlockProcessReports]
WITH EXECUTE AS OWNER
AS
SET NOCOUNT ON;
DECLARE @message_body XML
    ,@message_type INT
    ,@dialog UNIQUEIDENTIFIER;

WHILE (1 = 1)
BEGIN
    BEGIN TRANSACTION;

    -- Receive the next available message from the queue
    WAITFOR (
        RECEIVE TOP(1) -- just handle one message at a time
            @message_type = message_type_id,           -- the type of message received
            @message_body = CAST(message_body AS XML), -- the message contents
            @dialog = conversation_handle              -- the identifier of the dialog this message was received on
        FROM dbo.BlockedProcessReportQueue
    ), TIMEOUT 1000; -- if the queue is empty for one second, give up and go away

    -- If we didn't get anything, bail out
    IF (@@ROWCOUNT = 0)
    BEGIN
        ROLLBACK TRANSACTION;
        BREAK;
    END

    INSERT INTO [dbo].[BlockedProcessReports]
    (
        database_name
        ,post_time
        ,blocked_process_report
    )
    SELECT DB_NAME(@message_body.value('(/EVENT_INSTANCE/DatabaseID)[1]', 'int'))
        ,@message_body.value('(/EVENT_INSTANCE/PostTime)[1]', 'datetime')
        ,@message_body;

    -- Commit the transaction. At any point before this, we could roll
    -- back: the received message would be back on the queue and the response
    -- wouldn't be sent.
    COMMIT TRANSACTION;
END;
GO

The final step is to activate the queue so that it calls the stored procedure:

ALTER QUEUE BlockedProcessReportQueue
WITH ACTIVATION
(
    STATUS = ON,
    PROCEDURE_NAME = [dbo].[ProcessBlockProcessReports],
    MAX_QUEUE_READERS = 1,
    EXECUTE AS OWNER
);
GO

We will test this by creating a blocking scenario manually, then check whether the blocking information shows up in the table we created.

Let’s create an exclusive lock and a shared lock on a table, then look at our table for the blocking info; see the sketch below.
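A minimal sketch of such a scenario (the demo table is hypothetical):

-- Session 1: take an exclusive lock and hold it
CREATE TABLE dbo.BlockingDemo ( id INT PRIMARY KEY, val INT );
INSERT INTO dbo.BlockingDemo VALUES ( 1, 10 );
BEGIN TRANSACTION;
UPDATE dbo.BlockingDemo SET val = 20 WHERE id = 1;
-- ...leave this transaction open...

-- Session 2: this SELECT blocks until session 1 commits or rolls back
SELECT val FROM dbo.BlockingDemo WHERE id = 1;

-- Once the session has been blocked longer than the configured threshold:
SELECT database_name, post_time, blocked_process_report
FROM dbo.BlockedProcessReports;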

As you can see, the queries involved in the blocking are visible in the captured blocked_process_report XML.

Hope you have enjoyed the post. Please share your comments!

Python Dictionaries

A dictionary is a collection of key-value pairs in which the keys must be unique. (In Python versions before 3.7, dictionaries did not preserve insertion order; from 3.7 onward, they retain the order in which items were inserted.)

In this tutorial you will learn the basics of how to use the Python dictionary.

Creating a Dictionary:

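For example (the values here are illustrative):

# Create a dictionary with curly braces, or with the dict() constructor
person = {'name': 'Daniel', 'age': 30, 'city': 'Hyderabad'}
empty = dict()
print(person)   # {'name': 'Daniel', 'age': 30, 'city': 'Hyderabad'}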

Accessing Items:
You can access the items of a dictionary by referring to its key name, inside square brackets:

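For example, continuing with the person dictionary from above:

print(person['name'])               # Daniel
# .get() returns a default instead of raising KeyError for a missing key
print(person.get('salary', 'n/a'))  # n/a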

Updating Dictionary:
You can update a dictionary by adding a new entry or a key-value pair, modifying an existing entry, or deleting an existing entry, as shown in the simple example below:

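For example:

person['age'] = 31            # modify an existing entry
person['country'] = 'India'   # add a new entry
del person['city']            # delete an existing entry
print(person)                 # {'name': 'Daniel', 'age': 31, 'country': 'India'}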

Loop Through a Dictionary:
You can loop through a dictionary using a for loop. When looping through a dictionary directly, you get its keys, but there are methods to return the values (and key-value pairs) as well:

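For example:

for key in person:                  # iterating a dict yields its keys
    print(key, person[key])

for value in person.values():       # just the values
    print(value)

for key, value in person.items():   # key-value pairs
    print(key, value)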

Check if Exists:

You can test the presence of a key using 'in' or 'not in':

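For example:

print('name' in person)          # True
print('salary' not in person)    # True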

Restrictions on Dictionary Keys:
Almost any type of value can be used as a dictionary key in Python. As an example, integer, float, and Boolean objects are used as keys below:

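For example (illustrative values):

mixed = {42: 'integer key', 3.14: 'float key', True: 'boolean key'}
print(mixed[42])      # integer key
print(mixed[3.14])    # float key
print(mixed[True])    # boolean key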

However, there are a couple of restrictions that dictionary keys must abide by.

First, a given key can appear in a dictionary only once; duplicate keys are not allowed. A dictionary maps each key to a corresponding value, so it doesn’t make sense to map a particular key more than once. You can see below that when you assign a value to an already existing dictionary key, it does not add the key a second time, but replaces the existing value:

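For example:

d = {'a': 1, 'b': 2}
d['a'] = 99      # does not add a second 'a'; it replaces the value
print(d)         # {'a': 99, 'b': 2}

Second, a dictionary key must be of a hashable (immutable) type; mutable objects such as lists and dictionaries cannot be used as keys:

# d[[1, 2]] = 'nope'   # would raise TypeError: unhashable type: 'list'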

Restrictions on Dictionary Values:
By contrast, there are no restrictions on dictionary values. Literally none at all. A dictionary value can be any type of object Python supports, including mutable types like lists and dictionaries, and user-defined objects. There is also no restriction against a particular value appearing in a dictionary multiple times.

Hope you have enjoyed the post. Keep reading!