String_Split function in SQL Server 2016

Introduction:

SQL Server 2016 has introduced a built-in table valued function string_split to split a string into rows based on the separator.

Earlier, developers need to write custom/user defined function to do the split work. However, with the new function, its a built-in and can be used directly. The problem with user defined function is the way the function is defined and used.There are multiple ways usually developers write custom functions like scalar/Table valued functions. And its observed in many cases, scalar functions are not scaling as desired and causing serious performance issues.

Usages:

1. Split the variable string


Select * From string_split('first,second,third',',')

2. Split the table rows with the help of CROSS APPLY


Create Table splitTable(Col int, string varchar(100))
Insert into splitTable values(1,'First,Second'),(2,'Primary,Secondary')

Select * From splitTable a
Cross apply string_split(a.string,',') B

Drop Table splitTable

If we analyse the execution plan of the above query, we can see “Table Valued Function” operator is used to implement the function.

Limitations:

1. ONLY a single-character delimiters can be used at a time. If you want to apply multiple delimiter, you can use multiple CROSS APPLY.

2. There is no way to understand the ordinal position/eliminate duplicates/empty strings in the string.

3. This function will work ONLY if the compatibility of your database is 130 or later irrespective of SQL Server version.

I would recommend to use this built-in function for future development and provide your experience as part of learning and sharing.

Advertisements

SQL Server 2016 Database Scoped Configuration to force The Legacy CE

Problem Statement:

Recently, we identified a performance issue with a query once we upgraded the database server to SQL Server 2016. If you look at the query, it uses system catalogs to fetch some data. This query can be optimized or rewritten in better way, however, we are going to discuss about how the same query behaves with different scenario/settings.


SELECT UPPER(ccu.table_name) AS SourceTable
    ,UPPER(ccu.constraint_name) AS SourceConstraint
    ,UPPER(ccu.column_name) AS SourceColumn
    ,LOWER(ccu.TABLE_SCHEMA) AS SourceSchema
    ,UPPER(kcu.table_name) AS TargetTable
    ,UPPER(kcu.column_name) AS TargetColumn
    ,LOWER(kcu.TABLE_SCHEMA) as TargetSchema
    ,LOWER(kcu.constraint_name) as TargetConstraint
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE ccu
    INNER JOIN INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS rc
    ON ccu.CONSTRAINT_NAME = rc.CONSTRAINT_NAME
    INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE kcu
    ON kcu.CONSTRAINT_NAME = rc.UNIQUE_CONSTRAINT_NAME
WHERE ccu.TABLE_NAME not like 'tablename%' and ccu.TABLE_SCHEMA = 'dbo'
ORDER BY ccu.TABLE_SCHEMA, ccu.table_name, ccu.COLUMN_NAME 

The above query runs for 3 minutes in SQL Server 2016 with compatibility 130, but quite faster in SQL Server 2008 R2 that returns in 3 seconds. It is also observed that, when we change the compatibility 100 in SQL Server 2016, the query runs faster as expected.

Analysis:

 
--SQL 2016 (with compatibility 130) Took 3 min 16 sec
(424 row(s) affected)
Table 'sysidxstats'. Scan count 7971, logical reads 38154, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 13063, logical reads 104556, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 3265, logical reads 8681635, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysiscols'. Scan count 2696, logical reads 6120, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


--SQL 2008 R2 (with compatibility 100) Took 0 min 3 sec
(424 row(s) affected)
Table 'sysidxstats'. Scan count 235, logical reads 1280, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 11, logical reads 1314787, physical reads 0, read-ahead reads 6, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 1, logical reads 2659, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysiscols'. Scan count 2, logical reads 410, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If we see the IO statistics, we can clearly see that the number of IO is more with compatibility 130 compared to 100. So it is evident that there are some fundamental changes at optimizer level that causes this difference.

Solutions:

We have multiple ways to solve this difference:

1. Change the compatibility to 100


Alter database DBNAME SET COMPATIBILITY_LEVEL = 100

Considerations:

Once we set lower compatibility, we will not be able to use new features introduced in SQL 2016 like RLS/In Memory OLTP and many others.

2. Use Trace Flag 9481

Trace Flag 9481 forces Legacy CE model used by SQL Server 7.0 – 2012.
There are three ways to use this option for different scopes.

a. Query Scope: Use QUERYTRACEON query hint specified in the OPTION clause for a query.


Select * From Table OPTION ( QUERYTRACEON 9481 )

Considerations:

1. This option is to use trace flags at the query level. The change is ONLY applicable for the query, It does not have any impact on other queries.
2. This option requires sysadmin fixed server role. In production, it is unlikely to have this privillage in most cases for application logins.
3. This requires changes in the procedures/application code.

b. Session Scope: If you dont want to make any modification in query or procedures or there are many places we need to make the changes, then you can use this option at session level to use the legacy cardinality estimator.


DBCC TRACEON (9481);
GO
Select * From Table
GO
DBCC TRACEOFF (9481);

Considerations:

1. It may appear to be a quick fix in first place. But, this will not use the latest CE abilities.
2. If the procedure plan is already cached, then it will still be continuing to use the old one.

c. Global Scope: Trace flags can be set at server level and the changes will impact on every connection to the SQL Server instance.

Global trace flags can be set in two different ways as below:


	1. Using Query
		DBCC TRACEON (9481, -1);
	2. Using Startup parameters
                Go To "SQL Server 2016 Configuration Manager" -> "SQL Server services" -> Right Click on the Service -> Properties -> Startup Parameters - > Add "-T9481".
	

Considerations:

1. When you want to move your databases to use compatibility of 130 (to use new feature), but, you still need some time to test the existing code performance with new CE.
2. You may be restricting the new CE cost based benefits while creating the plans.

3. Use LEGACY_CARDINALITY_ESTIMATION Database Scoped Configuration

This is a new feature added to SQL Server 2016 – Database Scoped Configuration. To stick with the subject, lets cover LEGACY_CARDINALITY_ESTIMATION option in this feature. We have seen QUERYTRACEON option at different scopes like Query/session or Server level. SQL Server 2016 provides another option at DATABASE level using LEGACY_CARDINALITY_ESTIMATION. There are two ways, we can use this option.

a. Using GUI:


OR


ALTER DATABASE SCOPED CONFIGURATION
SET LEGACY_CARDINALITY_ESTIMATION = ON;

b. Using Query HINT: SQL Server 2016 SP1 introduced a new query hint which will influence the query optimizer. This does not require the sysadmin fixed server role unlike QUERYTRACEON, however, it is logically equivalent of the same.


SELECT * FROM Tablename
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

Additional information: Using DMV sys.database_scoped_configurations, we would be able to see the settings with Database scoped configuration.


SELECT configuration_id, name, value, value_for_secondary
FROM sys.database_scoped_configurations OPTION (RECOMPILE);

Hope, this helps you to understand various options to use Legacy CE in SQL Server 2016.
Happy reading and learning!!!

WITH Encryption clause in SQL Server

This is quite an old feature, however, a very useful one in SQL Server.

Applies to: SQL Server 2005 and above.

Why do we need this or its Usages?

“WITH ENCRYPTION” clause can be used for procedures and function in SQL Server.
I personally used this clause in many places, where I wanted to hide my procedure definition. When we need to manage multiple environments and different people have access to environments, its possible, the procedure definition gets changed by others. So, if we need to have to keep the same definition across all environments, we need to get the definition encrypted which would prevent the users to change the definition.

Usually, DBAs will use this option to prevent any changes or hide the logic implemented for certain purposes.Sometimes business critical logic are also encrypted using this option to hide from others.

How to Implement?

The implementation is very simple to provide the clause at the time of creation or alter.


CREATE OR ALTER procedure Test_Encryption
WITH ENCRYPTION
AS
Begin
	
	Select 'This is an encrypted procedure.'
		
End

Let us try to get the definition using sp_helptext, we will be getting a message “The text for object ‘Test_Encryption’ is encrypted.”

If we try to get the definition through SSMS, we will get the below error message.


And you can see a lock symbol to identify the encrypted procedures.

Gotchas:
1. Once the procedure is encrypted, you will not be ale to see the execution plan for the procedure.
2. It is always important to save the definition safe, you will never be able to retrieve the procedure definition once its encrypted.
3. CLR procedures and CLR User Defined Functions cannot be encrypted.
4. Encrypted procedures or functions cannot be replicated.

Python Dictionaries

A dictionary is a set of unordered key, value pairs. In a dictionary, the keys must be unique and they are stored in an unordered manner.

In this tutorial you will learn the basics of how to use the Python dictionary.

Creating a Dictionary:

dict1

Accessing Items:
You can access the items of a dictionary by referring to its key name, inside square brackets:

dict2

Updating Dictionary:
You can update a dictionary by adding a new entry or a key-value pair, modifying an existing entry, or deleting an existing entry as shown below in the simple example −

dict3

Loop Through a Dictionary:
You can loop through a dictionary by using a for loop.When looping through a dictionary, the return value are the keys of the dictionary, but there are methods to return the values as well.

dict4.JPG

Check if Exists:

You can test the presence of a key using ‘in’ or ‘not in’

dict5

Restrictions on Dictionary Keys:
Almost any type of value can be used as a dictionary key in Python. As an example,  integer, float, and Boolean objects are used as keys:

dict6

However, there are a couple restrictions that dictionary keys must abide by.

First, a given key can appear in a dictionary only once. Duplicate keys are not allowed. A dictionary maps each key to a corresponding value, so it doesn’t make sense to map a particular key more than once.
You could see below that when you assign a value to an already existing dictionary key, it does not add the key a second time, but replaces the existing value:

dict7

Restrictions on Dictionary Values:
By contrast, there are no restrictions on dictionary values. Literally none at all. A dictionary value can be any type of object Python supports, including mutable types like lists and dictionaries, and user-defined objects.There is also no restriction against a particular value appearing in a dictionary multiple times.

Hope you have enjoyed the post. Keep reading

How to identify table rows and size in a database in SQL Server

Problem Statement:

At times, we may need to understand the size of the tables in a database for various reasons. I often query these details to understand the growth of the data and clean up data for lower environments.

Code:
Here is a simple script to understand the rows, size of the table in a database.
You can further make changes to filter for any tables/group the results based on the indexes/partition etc.


SELECT  sc.name + '.' + t.NAME AS TableName,  
        SUM(p.[Rows]) NumerOfRecords,
        -- p.partition_number, i.index_id,i.name,
        ( SUM(a.total_pages) * 8 ) / 1024 AS TotalReservedSpaceMB, 
        ( SUM(a.used_pages) * 8 ) / 1024 AS UsedDataSpaceMB,  
        ( SUM(a.data_pages) * 8 ) / 1024 AS FreeUnusedSpaceMB  
FROM    sys.tables t  
        INNER JOIN sys.schemas sc ON sc.schema_id = t.schema_id  
        INNER JOIN sys.indexes i ON t.OBJECT_ID = i.object_id  
        INNER JOIN sys.partitions p ON i.object_id = p.OBJECT_ID  
                                            AND i.index_id = p.index_id  
        INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id  
WHERE   t.type_desc = 'USER_TABLE'  
        -- Replace with valid table name if you want to filter for a table
        --AND t.NAME='YourTableName' 
GROUP BY sc.name + '.' + t.NAME,  
        i.[object_id]
        --,i.index_id, i.name, p.[Rows], p.partition_number
ORDER BY ( SUM(a.total_pages) * 8 ) / 1024 DESC  

Find Max and Min value of dates in a row in SQL Server

Problem Statement:
How to identify the Max and Min date value in a row in SQL Server?

Code:
Here is a solution to identify Max and Min date value with the help of CROSS APPLY in SQL Server.


create Table FindMaxMindate
(
    ID int, 
    Firstdate datetime,
    Seconddate datetime, 
    ThirdDate datetime
)

Insert into FindMaxMindate 
Values
(1,'2018-01-26 19:14:07.587','2018-04-26 19:14:07.587','2018-12-26 19:14:07.587'),
(2,'2018-12-26 19:14:07.587','2018-01-26 19:14:07.587','2018-03-26 19:14:07.587')

Select ID,Min(Dates) MinDate,Max(Dates) MaxDate
From FindMaxMindate A
Cross apply (values (Firstdate),(SecondDate),(ThirdDate))B (Dates)
Group by ID

Drop Table FindMaxMinDate

How to identify Process Name of a Process Id in SQL Server

We had an immediate requirement to identify process names of process ids in SQL Server. The first reaction to the requirement was to refer the program name. However, program name is customized through application, it was not the expected result.

I tried multiple ways to identify a simple solution within SQL Server, but failed to identify a straight forward solution.(You can share if you have any simpler solution.)

With the help of Powershell script, developed a small script to get the process name as below. Hope, this will be helpful for you in the similar situation.

Code:

create Table #TblProcessName(ProcessName varchar(max))

Insert into #TblProcessName
Exec xp_cmdshell 'powershell.exe "Get-Process | Select ProcessName,Id | % {$_.ProcessName + " "- + $_.Id}"' 

;With cte as
(
Select Substring(Processname,0,charindex('-',Processname)) as ProcessName, Substring(Processname,charindex('-',Processname)+1 , Len(processname)) as ProcessID 
From #TblProcessName A
)
Select * From cte A
Inner join sys.sysprocesses B on A.ProcessID = B.hostprocess

DROP Table #TblProcessName

Output: