HASHBYTES function in SQL Server is used to hash the input using the algorithm specified.
This is a very efficient method to compare the string, for an example, lets compare the view definition between two different databases etc.
I do not really want to cover the usage or basic information of HASHBYTES in this post, but, a limitation and how to overcome the same. The first input parameter of HASHBYTES is the algorithm that needs to be used and the second one is the actual character or binary string that needs to be hashed.Now, the limitation is it will not accept the string more than 8000 bytes.
Lets quickly look at the example.
CREATE TABLE dbo.Test1 (c1 nvarchar(MAX));
Insert into dbo.Test1 Select Replicate('a',4001)
Select hashbytes('SHA1',c1) From dbo.Test1
Drop table dbo.Test1
The above code will throw an exception “String or binary data would be truncated.” as below:
To overcome the limitation, I have come up with a solution to break down the string into multiple and apply the hashing separately and later combined.
The script is as below:
Create FUNCTION [dbo].[GenerateHASHforLargeValue]
(
@TextValue nvarchar(max)
)
RETURNS varbinary(20)
AS
BEGIN
if @TextValue = null
return hashbytes('SHA1', 'null')
Declare @TextLength as integer
Declare @BinaryValue as varbinary(20)
Set @TextLength = len(@TextValue)
Declare @LenCount int = 3500
if @TextLength > @LenCount
Begin
;With cte
as
(
Select substring(@TextValue,1, @LenCount) textval, @LenCount+1 as start, @LenCount Level,
hashbytes('SHA1', substring(@TextValue,1, @LenCount)) hashval
Union All
Select substring(@TextValue,start,Level), start+Level ,@LenCount Level,
hashbytes('SHA1', substring(@TextValue,start,Level) + convert( varchar(20), hashval ))
From cte where Len(substring(@TextValue,start,Level))>0
) Select @BinaryValue = (Select Top 1 hashval From cte Order by start desc)
return @BinaryValue
End
else
Begin
Set @BinaryValue = hashbytes('SHA1', @TextValue)
return @BinaryValue
End
return null
END
If we apply the above function, then it will generate the hash code without the string truncation issue.
Hope this will help you whenever you may need to generate hash for larger strings!!!