HASHBYTES for a large string in SQL Server

HASHBYTES function in SQL Server is used to hash the input using the algorithm specified.

This is a very efficient method to compare the string, for an example, lets compare the view definition between two different databases etc.

I do not really want to cover the usage or basic information of HASHBYTES in this post, but, a limitation and how to overcome the same. The first input parameter of HASHBYTES is the algorithm that needs to be used and the second one is the actual character or binary string that needs to be hashed.Now, the limitation is it will not accept the string more than 8000 bytes.

Lets quickly look at the example.


CREATE TABLE dbo.Test1 (c1 nvarchar(MAX));  
Insert into dbo.Test1 Select Replicate('a',4001) 
Select hashbytes('SHA1',c1) From dbo.Test1
Drop table dbo.Test1

The above code will throw an exception “String or binary data would be truncated.” as below:

To overcome the limitation, I have come up with a solution to break down the string into multiple and apply the hashing separately and later combined.

The script is as below:


Create FUNCTION [dbo].[GenerateHASHforLargeValue]
(   
    @TextValue nvarchar(max)
)

RETURNS varbinary(20)

AS
BEGIN

    if @TextValue = null
        return hashbytes('SHA1', 'null')

    Declare @TextLength as integer
	Declare @BinaryValue as varbinary(20)

    Set @TextLength = len(@TextValue)
	Declare @LenCount int  = 3500
    if @TextLength > @LenCount
    Begin
    ;With cte 
    as
    (
	Select substring(@TextValue,1, @LenCount) textval, @LenCount+1 as start, @LenCount Level,
               hashbytes('SHA1', substring(@TextValue,1, @LenCount)) hashval
	Union All 
	Select substring(@TextValue,start,Level), start+Level ,@LenCount  Level, 
               hashbytes('SHA1', substring(@TextValue,start,Level) + convert( varchar(20), hashval )) 
	From cte where Len(substring(@TextValue,start,Level))>0
    ) Select @BinaryValue = (Select Top 1 hashval From cte Order by start desc)
			return @BinaryValue
    End
    else
    Begin
	Set @BinaryValue = hashbytes('SHA1', @TextValue)
	return @BinaryValue
    End
    return null
END

If we apply the above function, then it will generate the hash code without the string truncation issue.

Hope this will help you whenever you may need to generate hash for larger strings!!!

Advertisements