HASHBYTES for a large string in SQL Server

HASHBYTES function in SQL Server is used to hash the input using the algorithm specified.

This is a very efficient method to compare the string, for an example, lets compare the view definition between two different databases etc.

I do not really want to cover the usage or basic information of HASHBYTES in this post, but, a limitation and how to overcome the same. The first input parameter of HASHBYTES is the algorithm that needs to be used and the second one is the actual character or binary string that needs to be hashed.Now, the limitation is it will not accept the string more than 8000 bytes.

Lets quickly look at the example.


CREATE TABLE dbo.Test1 (c1 nvarchar(MAX));  
Insert into dbo.Test1 Select Replicate('a',4001) 
Select hashbytes('SHA1',c1) From dbo.Test1
Drop table dbo.Test1

The above code will throw an exception “String or binary data would be truncated.” as below:

To overcome the limitation, I have come up with a solution to break down the string into multiple and apply the hashing separately and later combined.

The script is as below:


Create FUNCTION [dbo].[GenerateHASHforLargeValue]
(   
    @TextValue nvarchar(max)
)

RETURNS varbinary(20)

AS
BEGIN

    if @TextValue = null
        return hashbytes('SHA1', 'null')

    Declare @TextLength as integer
	Declare @BinaryValue as varbinary(20)

    Set @TextLength = len(@TextValue)
	Declare @LenCount int  = 3500
    if @TextLength > @LenCount
    Begin
    ;With cte 
    as
    (
	Select substring(@TextValue,1, @LenCount) textval, @LenCount+1 as start, @LenCount Level,
               hashbytes('SHA1', substring(@TextValue,1, @LenCount)) hashval
	Union All 
	Select substring(@TextValue,start,Level), start+Level ,@LenCount  Level, 
               hashbytes('SHA1', substring(@TextValue,start,Level) + convert( varchar(20), hashval )) 
	From cte where Len(substring(@TextValue,start,Level))>0
    ) Select @BinaryValue = (Select Top 1 hashval From cte Order by start desc)
			return @BinaryValue
    End
    else
    Begin
	Set @BinaryValue = hashbytes('SHA1', @TextValue)
	return @BinaryValue
    End
    return null
END

If we apply the above function, then it will generate the hash code without the string truncation issue.

Hope this will help you whenever you may need to generate hash for larger strings!!!

Advertisements

Its always important to look at the License type of your SQL Server

Its always important to look at the License type of your SQL Server!!!

Problem Statement:

We recently had an issue with CPU utilization reaching more than 95% always for database server in one of our performance test environment. Load test environments are resource intensive test, hence it is expected to have high CPU utilization. However, we could observe the number of tests processed and number of transactions are very less spiking the CPU utilization to 95%.

Let me explain a bit more on my environment, We have 4 sockets with 10 physical cores and HT enabled in our test environment. As per the configuration we have total 80 logical CPU available. SQL Server version information as below:


Microsoft SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64) 
               Jun 17 2016 19:14:09 
               Copyright (c) Microsoft Corporation
               Enterprise Edition ((missing))Core based Licensing>((missing)) (64-bit) on Windows NT 6.3  (Build 9600: )

Here we can observe that the version information is missing “Core based Licensing”, that means, the SQL version is not Core based, but CAL based. Let us look at the excerpt from the MSDN article:

“Enterprise Edition with Server + Client Access License (CAL) based licensing (not available for new agreements) is limited to a maximum of 20 cores per SQL Server instance. There are no limits under the Core-based Server Licensing model”

So, though we have 4 sockets with 10 Cores, ONLY 20 cores are VISIBLE for SQL Server. In our environment, this was 20*2(HT enabled) = 40 Logical CPUs are visible.

Ref: https://technet.microsoft.com/en-us/library/cc645993(v=sql.120).aspx

Lets confirm the above with other parameters DMV -sys.dm_os_schedulers.


Select parent_node_id,Count(cpu_id) Total_Schedulers,
 count(Case when Status = 'VISIBLE ONLINE' Then 1 Else null End) Visible_Count,
 count(Case when Status = 'VISIBLE OFFLINE' Then 1 Else null End) NotVisible_Count
From sys.dm_os_schedulers 
where status in ('VISIBLE ONLINE','VISIBLE OFFLINE') 
and parent_node_id not in (64)--DAC
Group by parent_node_id

The result looks like below:


This is clear that SQL server was not able to utilize more than 40 logical CPU in the above environment.

Once we upgraded the license to Core Based License, we were able to use all available CPUs in our environment and observed the database CPU utilization has come down to 65% resulting more number of tests and transactions.

Let me reiterate, Its always important to look at the License type of your SQL Server !!!

Extending sp_helpindex to get more useful information about index in SQL Server

This post is to help people who had tough time to get information about index and its included columns in SQL Server. Whenever I need, I always depend on object explorer and get the definition of the index.So I thought of compiling a snippet as below that can provide these information handy.

I created a procedure called – sp_helpindexExtended as below and the Tablename as parameter(this is an optional parameter).

Procedure Definition


Create Proc sp_helpindexExtended (@TableName sysname = NULL) 
as 
Begin 
	  ;With cte as 
	  (
		 Select Object_name(A.object_id) objectname,A.name index_name,
			Lower(type_desc) + Case When IS_Unique =1 then ', unique' Else '' end +
			Case when is_primary_key =1 Then ', primary key' Else '' End+
			Case when IS_Unique_Constraint = 1 Then ',unique key' Else '' End  index_description,
			Case when is_included_column=0 then c.name  + '('+
            Case When System_type_id = User_Type_id Then Type_Name(System_type_id) Else Type_Name(User_Type_id) End +')'  Else NULL end ColumnName,
            Case when is_included_column=1 then c.name  + '('+
            Case When System_type_id = User_Type_id Then Type_Name(System_type_id) Else Type_Name(User_Type_id) End +')'  Else NULL end IncludedColumnnames
            ,Is_Unique,Is_Primary_Key,IS_Unique_Constraint,Fill_factor,key_ordinal
    From sys.indexes A  
    Inner Join sys.index_columns B On A.object_id = B.object_id And A.index_id = B.index_id 
    Inner Join sys.columns C On c.object_id = B.object_id  And C.column_id  = B.column_id 
	Where A.Object_ID = Case when @TableName is not null Then OBJECT_ID(@TableName) Else A.object_id End
	)	SELECT objectname,index_name,index_description,
       stuff(( SELECT ','+ColumnName AS [text()] FROM cte p2
          WHERE p2.objectname = p1.objectname and p2.index_name = p1.index_name
          ORDER BY key_ordinal asc FOR XML PATH('') ), 1, 1,'') AS index_keys,
		stuff( ( SELECT IsNull(','+IncludedColumnnames ,'')  FROM cte p2
          WHERE p2.objectname = p1.objectname and p2.index_name = p1.index_name
          ORDER BY key_ordinal asc FOR XML PATH('') ), 1, 1,'')  AS included_keys
		  ,fill_factor 
      FROM cte p1 GROUP BY objectname,index_name ,index_description,fill_factor 
	 order by objectname asc
End

Procedure Usage

-- To get for all tables
exec sp_helpindexextended
-- To get only for a table
exec sp_helpindexextended 'tablename'

How to extract ONLY numbers from a string in SQL Server

Here is a script to extract *ONLY* numbers from a string.


DECLARE @Sample TABLE(string VARCHAR(2000));
INSERT INTO @Sample
Values( 'hhjjj12345hhhkk'),('-7655'),('asd5-5dffgdfg105'),('hhjjj12345hhhkk'),('a1b2c3d4ee55fff666gggg7777h'),('Al1')
,('2Bob$'),('!Carl%'),('%D#?:"{}|\][+_)(*&^	%$#@!~`_+?>a*vi()(d&*(&*#?:"{}|\][J+o_n)es(*&^%$#@!~`_+?><,./;:"' ),('    Eddie     ')
;WITH Num1 (n) AS (SELECT 1 UNION ALL SELECT 1),
Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y),
Num3 (n) AS (SELECT 1 FROM Num2 AS X, Num2 AS Y),
Num4 (n) AS (SELECT 1 FROM Num3 AS X, Num3 AS Y),
Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num4),
FinalOutput AS (
SELECT string, (SELECT CASE WHEN SUBSTRING(string, n, 1) NOT LIKE '[^0-9]'
                       THEN SUBSTRING(string, n, 1)
                      ELSE ''
                  END + ''
           FROM Nums
           WHERE n <= LEN(string)
           FOR XML PATH('')) AS stringout
FROM @Sample)
SELECT string, stringout FROM FinalOutput;

How to replace first occurrence of a word in a sentence in SQL Server

Have you ever thought of How REPLACE is working? It just replaces all the occurrences of a word in the sentence.

Lets look at an example.


CREATE TABLE #StringTable ( id INT, string VARCHAR(128) )

INSERT #StringTable VALUES (1,'It was a pleasant morning, and he was the only kid in the garden')
,(1,'There is no match word')
Declare @searchstring varchar(100)='was',@Replacestring varchar(100)='is'

select id,string,
	Replace(string,@searchstring,@replacestring)
	from #StringTable
		cross apply (select (charindex(@searchstring, string))) as Search1(Pos)

Drop table #StringTable

In the above example, there are two occurrences of the word “was” for the first record. And the REPLACE function results in replacing all occurrences.

Suppose, if you want to only replace the first occurrence, How do you do?

First approach with CROSS APPLY and SUBSTRING:


CREATE TABLE #StringTable ( id INT, string VARCHAR(128) )

INSERT #StringTable VALUES (1,'It was a pleasant morning, and he was the only kid in the garden')
,(1,'There is no match word')

Declare @searchstring varchar(100)='was',@Replacestring varchar(100)='is'

select id,string,
	case when Search1.Pos != 0 then 
				substring(string, 0,Search1.Pos )+ @Replacestring +
				substring(string, Search1.Pos+LEN(@searchstring), LEN(string) )
	else string end ChangedText
	from #StringTable
		cross apply (select (charindex(@searchstring, string))) as Search1(Pos)

Drop table #StringTable

Another method with STUFF and PATINDEX:


CREATE TABLE #StringTable ( id INT, string VARCHAR(128) )

INSERT #StringTable VALUES (1,'It was a pleasant morning, and he was the only kid in the garden')
,(1,'There is no match word')

Declare @searchstring varchar(100)='was',@Replacestring varchar(100)='is'

Select id, string, Case when charindex(@searchstring, string) != 0 Then
				Stuff(string, patindex('%'+@searchstring+'%',string),len(@searchstring),@Replacestring) 
				Else string End ChangedText
	from #StringTable

Drop table #StringTable

The identifier ‘source column name’ cannot be bound. Only target columns and columns in the clause scope are allowed in the ‘WHEN NOT MATCHED BY SOURCE’ clause of a MERGE statement.

Problem Statement:

When you want to have a condition in a MERGE with source and target for WHEN NOT MATCHED clause, you may likely to get an error message as below, if you put the condition directly to the MERGE statement.

The identifier ‘source column name’ cannot be bound. Only target columns and columns in the clause scope are allowed in the ‘WHEN NOT MATCHED BY SOURCE’ clause of a MERGE statement.

Lets quickly check an example:


Drop table T1,T2
create Table T1(Col1 int, Col2 int)
Insert into T1 Values(1,1),(1,2),(1,3),(1,4),(3,1)--,(2,1)

create Table T2(Col1 int, Col2 int)
Insert into T2 Values(1,1),(1,2),(1,3),(1,5),(2,1)

Select * From T1

		MERGE T1 TG
			USING T2 SR
			ON TG.Col1= SR.Col1 and TG.Col2 = SR.Col2
			WHEN MATCHED THEN
				UPDATE
				SET TG.Col2 = SR.Col2
			WHEN NOT MATCHED BY TARGET THEN
				INSERT  (Col1, Col2)
				VALUES (Col1, Col2)
			WHEN NOT MATCHED BY SOURCE AND TG.Col1 = SR.COl1 THEN
				DELETE;

Select * From T1

Solution:

Here is a simple way of solving the mentioned issue.


Drop table T1,T2
create Table T1(Col1 int, Col2 int)
Insert into T1 Values(1,1),(1,2),(1,3),(1,4),(3,1)--,(2,1)

create Table T2(Col1 int, Col2 int)
Insert into T2 Values(1,1),(1,2),(1,3),(1,5),(2,1)

Select * From T1

		MERGE T1 TG
			USING T2 SR
			ON TG.Col1= SR.Col1 and TG.Col2 = SR.Col2
			WHEN MATCHED THEN
				UPDATE
				SET TG.Col2 = SR.Col2
			WHEN NOT MATCHED BY TARGET THEN
				INSERT  (Col1, Col2)
				VALUES (Col1, Col2)
			WHEN NOT MATCHED BY SOURCE AND TG.Col1 in (Select Col1 From T2) THEN
				DELETE;

Select * From T1

Temporary table caching in SQL Server

Caching of Temporary table is an interesting topic but not appreciated/recognized one in SQL Server due to several reasons.

First off, Let us see whats caching of temporary tables? – Caching temporary objects like temp tables/table variables will improve the performance of execution by eliminating the
recreation of objects.There are certain operations which will prevent the caching of temporary objects are below.

1. DDL commands once the object is created
2. Recompilation of Procedure associated
3. Dynamic SQL is used to create the objects
4. Named Constraints etc

Today, we are going to see how DDL commands is preventing the caching of temp objects.

Lets Create a Procedure and create a temp object:


--Create a procedure 
create Procedure Test_TempCaching
as
Begin
	create Table #Temp (EmpId int, FirstName varchar(50), LastName Varchar(50))
	create clustered index ix_test on #Temp(EmpId)
	Insert into #Temp Values(1,'SQL','Server'),(2, 'Biz','Talk')
End

We are going to see the temp tables creation rate using the below query:


SELECT cntr_value FROM sys.dm_os_performance_counters WHERE counter_name = 'Temp Tables Creation Rate'

The above will result a cumulative information on the number of times the temp objects created.

For the first time, when we create the procedure and executing the procedure, it will create a plan for the procedure and temp object is being created and cached.
If you look at the procedure, A temp table is created and later the object has been modified with DDL to add an index on the temp table.

Every time, the procedure gets executed, you can see the value gets changed,
Use the below query multiple times, and see the difference.


SELECT cntr_value FROM sys.dm_os_performance_counters WHERE counter_name = 'Temp Tables Creation Rate'
EXEC  Test_TempCaching
SELECT cntr_value FROM sys.dm_os_performance_counters WHERE counter_name = 'Temp Tables Creation Rate'


The above behavior shows that, if there is any DDL applied after the temp object, then the object will become obsolete and evict from the cache(later) and create a new object. You may find a performance issue if the procedure is executing very frequently in your application.

Now, Let us modify the procedure to comine the DDL command inline at the creation of temporary table as below:


ALTER Procedure Test_TempCaching
as
Begin
	create Table #Temp (EmpId int, FirstName varchar(50), LastName Varchar(50), Primary Key(EmpID))
	Insert into #Temp Values(1,'SQL','Server'),(2, 'Biz','Talk')

End

If you execute for the first time, you can execute the value gets changed. This is because, for the first time, the plan from the cache will be deleted once the procedure altered and a new plan will be created for the procedure. But for subsequent execution, the cntr_value never gets changed as the object has already been cached.


SELECT cntr_value FROM sys.dm_os_performance_counters WHERE counter_name = 'Temp Tables Creation Rate'
EXEC  Test_TempCaching
SELECT cntr_value FROM sys.dm_os_performance_counters WHERE counter_name = 'Temp Tables Creation Rate'

Conclusion:
If you see more high temp tables creation rate in your application, you may find the procedures and see if you can take advantage of the above scenario.
There is a caveat – statistics with temp objects(I am not covering here),you may need to verify and confirm the changes thoroughly to make sure you do not have any other performance issues.