Author: Latheesh NK

Compress and Decompress in SQL Server 2016

Recently, to enforce a very small level of masking of data, I was thinking of different ways and finally used COMPRESS & DECOMPRESS functions introduced in SQL Server 2016.

Beware, This is a very bad example of implementing as a solution for data encryption! Since, it was just so trivial functionality and I do not really want to spend more time on encryption and do not bother on its performance, also it was not for a full time production application and just for demo purpose, it has been taken as an option.But, again, I would like to reiterate this is a wrong example! Do not apply !

If you are looking for a full fledged solution for encryption/data masking – SQL Server has many options, you can refer the options here.

Now, Lets quickly look into the two functions introduced in SQL Server 2016 – COMPRESS & DECOMPRESS.

COMPRESS function quickly compresses the input using GZIP algorithm. More details here.

DECOMPRESS function otherwise, it decompresses the compressed value using GZIP algorithm. More details here.

Since Microsoft covers a good explanation on the subject, let us not try to re-invent the wheel, instead, let us quickly see some of characteristics and its usages.

Storage and its significance

Storage is important factor, hmm, not really in modern world, but yes since if you want to save cost. This is really matters a lot if you have any plan for cloud migration or something like “you pay for what you use”. So COMPRESS and DECOMPRESS makes more sense in such situations. Let us demonstrate a simple example as below.

I am going to create a table with varchar and nvarchar columns to store values.

drop table if exists SQLZealot_Compress_Test
CREATE TABLE SQLZealot_Compress_Test
(
ID int identity(1,1) NOT NULL Primary Key,
Varchar_Value VARCHAR(MAX),
NVarchar_Value NVARCHAR(MAX)
)

Now, let us insert some sample data as below.

INSERT INTO SQLZealot_Compress_Test(Varchar_Value, NVarchar_Value)
VALUES 
 ( 'This is a Compress test by SQLZealot!', N'This is a Compress test by SQLZealot!!')
,( '1234567890', N'1234567890')
,( REPLICATE('X', 9000), REPLICATE('X' , 4500))
,( REPLICATE(CAST('X' AS VARCHAR(MAX)), 9000), REPLICATE(CAST(N'X' AS NVARCHAR(MAX)), 4500))
,( REPLICATE(CAST('SQLZealot|' AS VARCHAR(MAX)), 9000), REPLICATE(CAST(N'SQLZealot|' AS NVARCHAR(MAX)), 4500))

Now, let us insert some more big sized data as below. I used a sample table from my database, you can use any of your big table in your environment by replacing “tablename”.

Insert into SQLZealot_Compress_Test
Select (Select * From tablename FOR JSON PATH, ROOT('Tables')) varcharval,
		(Select * From tablename FOR JSON PATH, ROOT('Tables')) nvarcharval

Here comes our testing script. We are going to see the length of the values using DATALENGTH as below.

Select ID,
		Varchar_Value,
		DataLength(Varchar_Value) DL_NonCompress_Varchar,
		Datalength(Compress(Varchar_Value)) DL_Compress_Varchar,
		NVarchar_Value,
		DataLength(NVarchar_Value) DL_NonCompress_NVarchar,
		Datalength(Compress(NVarchar_Value)) DL_Compress_NVarchar
From SQLZealot_Compress_Test

Complete Demo Script

drop table if exists SQLZealot_Compress_Test
CREATE TABLE SQLZealot_Compress_Test
(
ID int identity(1,1) NOT NULL Primary Key,
Varchar_Value VARCHAR(MAX),
NVarchar_Value NVARCHAR(MAX)
)

INSERT INTO SQLZealot_Compress_Test(Varchar_Value, NVarchar_Value)
VALUES 
 ( 'This is a Compress test by SQLZealot!', N'This is a Compress test by SQLZealot!!')
,( '1234567890', N'1234567890')
,( REPLICATE('X', 9000), REPLICATE('X' , 4500))
,( REPLICATE(CAST('X' AS VARCHAR(MAX)), 9000), REPLICATE(CAST(N'X' AS NVARCHAR(MAX)), 4500))
,( REPLICATE(CAST('SQLZealot|' AS VARCHAR(MAX)), 9000), REPLICATE(CAST(N'SQLZealot|' AS NVARCHAR(MAX)), 4500))

Insert into SQLZealot_Compress_Test
Select (Select * From OC_TABLEDEFS FOR JSON PATH, ROOT('Tables')) varcharval,
		(Select * From OC_TABLEDEFS FOR JSON PATH, ROOT('Tables')) nvarcharval

Select ID,
		Varchar_Value,
		DataLength(Varchar_Value) DL_NonCompress_Varchar,
		Datalength(Compress(Varchar_Value)) DL_Compress_Varchar,
		NVarchar_Value,
		DataLength(NVarchar_Value) DL_NonCompress_NVarchar,
		Datalength(Compress(NVarchar_Value)) DL_Compress_NVarchar
From SQLZealot_Compress_Test

Screenshot

Observations

1. Do not use COMPRESS & DECOMPRESS as a replacement for encryption/data masking.

2. If table has less size of data, then COMPRESS will not have a benefit, instead there may be a small fraction of overhead of long value of varbinary datatype.

3. If table has big size data, then COMPRESS seems to be a good option.

4. If table has unicode character datatype, the benefit seems to be lesser than character datatype.

5. Need to evaluate the CPU cycles effort (if you do COMPRESS & DECOMPRESS) in SQL Server.

6. It would be a good option to consider doing COMPRESS & DECOMPRESS functions in application layer.

7. If you have audit feature, it is a good option to consider the compressed data as part of audited info instead of actual data (if audited data is not being frequently used).

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

Handy Troubleshooting T-SQL queries for Always On Availability Groups in SQL Server

Today, as part of a database migration from (one server to another server), wanted to check few things related to Always On Availability Groups to make sure all are intact. I am sharing those troubleshooting queries prepared for my verification with this blog post, hope this helps you as well.

How to identify available AGs to the server as Primary replica?

SELECT Groups.[Name] AS [AGname], [Primary_recovery_health_desc],[synchronization_health_desc],
automated_backup_preference_desc,dtc_support,db_failover,is_distributed, count(AGDatabases.database_name) [# Databases]
FROM sys.dm_hadr_availability_group_states States
INNER JOIN master.sys.availability_groups Groups ON States.group_id = Groups.group_id
INNER JOIN sys.availability_databases_cluster AGDatabases ON Groups.group_id = AGDatabases.group_id
WHERE primary_replica = @@Servername
Group by Groups.[Name] , [Primary_recovery_health_desc],[synchronization_health_desc],
automated_backup_preference_desc,dtc_support,db_failover,is_distributed

How to identify available AGs to the server as Secondary replica?

SELECT Groups.[Name] AS [AGname], [secondary_recovery_health_desc],[synchronization_health_desc],
automated_backup_preference_desc,dtc_support,db_failover,is_distributed
FROM sys.dm_hadr_availability_group_states States
INNER JOIN master.sys.availability_groups Groups ON States.group_id = Groups.group_id
WHERE primary_replica != @@Servername

How to identify AG – Databases distribution of a Primary replica?

SELECT
Groups.[Name] AS AGname,
AGD.database_name AS Databasename
FROM sys.dm_hadr_availability_group_states States
INNER JOIN master.sys.availability_groups Groups ON States.group_id = Groups.group_id
INNER JOIN sys.availability_databases_cluster AGD ON Groups.group_id = AGD.group_id
WHERE primary_replica = @@Servername
ORDER BY AGname ASC, Databasename ASC;

How to identify AG – Databases distribution of a Secondary replica?

SELECT
Groups.[Name] AS AGname,
AGD.database_name AS Databasename
FROM sys.dm_hadr_availability_group_states States
INNER JOIN master.sys.availability_groups Groups ON States.group_id = Groups.group_id
INNER JOIN sys.availability_databases_cluster AGD ON Groups.group_id = AGD.group_id
WHERE primary_replica != @@Servername
ORDER BY AGname ASC, Databasename ASC;

A quick health check of your AG

SELECT DISTINCT
Groups.name AS AGname,
AGReplicas.replica_server_name,
AGStates.role_desc,
AGStates.synchronization_health_desc,
ISNULL(AGReplicaStates.suspend_reason_desc,'N/A') AS suspend_reason_desc
FROM sys.availability_groups Groups
INNER JOIN sys.dm_hadr_availability_replica_states as AGStates ON AGStates.group_id = Groups.group_id
INNER JOIN sys.availability_replicas as AGReplicas ON AGStates.replica_id = AGReplicas.replica_id
INNER JOIN sys.dm_hadr_database_replica_states as AGReplicaStates ON AGReplicas.replica_id = AGReplicaStates.replica_id
Order by Groups.name

A quick health check of your AG – DAG


SELECT 
   ag.[name] AS [AG Name], 
   ag.is_distributed, 
   ar.replica_server_name AS [Underlying AG], 
   ars.role_desc AS [Role], 
   ars.synchronization_health_desc AS [Sync Status]
FROM  sys.availability_groups AS ag
INNER JOIN sys.availability_replicas AS ar 
   ON  ag.group_id = ar.group_id        
INNER JOIN sys.dm_hadr_availability_replica_states AS ars       
   ON  ar.replica_id = ars.replica_id

A quick health check of your AG – DB

select DB_NAME(database_id),* from sys.dm_hadr_database_replica_states 
where synchronization_health_desc != 'HEALTHY' 

In addition, SSMS provides a good dashboard to understand the Always On Availability Group and health status as below. I would highly recommend to refer for more details.

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

Service Broker enabled by default while creating a database through T-SQL in SQL Server

Recently, we decided to disable Service Broker for all databases as part of our deployment. While doing a preliminary check, we observed that when we create a database through T-SQL, by default, the database is enabled with Service Broker.

Please find the below code to replicate the case.

Script:

IF NOT EXISTS(SELECT * FROM sys.databases WHERE [name] = 'testdb')
BEGIN
CREATE DATABASE testdb
Select 'Database created...'
END
GO
select is_broker_enabled,name from sys.databases where name = 'testdb'
GO
Drop database if exists  testdb
Select 'Database dropped...'

screenshot:

We also observed that while creating a database through SSMS interface/wizard, the service broker is disabled by default. Out of curiosity, SQL profiler has been set up, captured the statements and identified that when we create a database through SSMS interface, there is an explicit execution of disabling the Service Broker is happening followed by create database statement as below in the screenshot.

Conclusion:

Always DISABLE service broker explicitly ONLY if you do not really need it to be enabled on your database. Please note, there is no overhead when service broker is enabled but not used. On the flip side, disabling service broker can prevent the applications from using the feature of queuing and usage of components like SqlDependency/QueryNotifications that rely on it. Please never try to turn off service broker on your system databases like msdb as it can cause break in functionalities like database mail etc.

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

Fun with SQL – Find out numbers where adjacent digit differs by 1

Today, I have come through a post by one of my good friends and one of the most famous SQL Server experts in India – Madhivanan.

The question was : “Given a number N, write a code to print all positive numbers less than N in which all adjacent digits differ by 1” Here is my attempt to solve this using SQL Server T-SQL
--Provide input value 
declare @number int
set @number=105
 
--Create a temporary table to store numbers
Drop table if exists #temp
create table #temp (number int)

;with ctebuilder as(
select
    row_number () over (order by (select null)) as sno 
from
    sys.sysobjects as t1 cross join sys.sysobjects as t2 ) 
Insert into #temp(number) Select sno From ctebuilder where sno<=@number

;with cte as(
select
    t1.number number, 
    cast(substring(cast(t1.number as varchar(20)),t2.number,1) as int) as number2, 
	row_number () over(partition by t1.number order by (select null) asc) Rn
from #temp as t1 cross join #temp as t2
where t2.number <= len(t1.number) and t1.number <=@number
)
Select number From(
 Select * ,LAG(number2,1) OVER (partition by number
		ORDER BY (select NULL) asc
	) previous_val From cte
	) A group by number 
having count(case when previous_Val - number2 in (1,-1) then 1 Else NULL End )  
= count(number2)-1

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

Database Properties – Difference between GUI and sys.master_files in SQL Server

Few days back, One of my colleagues was asking why there is a discrepancy in GUI and system table for Database Properties -> File wizard. Here is an example similar to what he showed me.

GUI Screenshot

SSMS Query Screenshot

My first response was GUI is a user friendly and Query results is a SQL Server Professional friendly. There will be lots of differences like this between GUI and actual table how it saved.But, those are neither discrepancies nor defects, that is how it is by design.

To know this difference, sys.master_files is a system catalog view that represents properties of each file of your database ( data and log). size is always representing in 8 KB pages. Refer the link for more details for other properties.

So, to make him understand, I changed his query a bit to get both looks equal as below.

Select A.name [Logical Name], A.type_desc [File Type], ISNULL(B.Name, 'Not Applicable') 'Filegroup',
(size*8)/1024 [Size (MB)],
'By ' + Cast((growth*8)/1024 as varchar(max)) + 'MB, ' + 
Case when max_size = -1 then 'Unlimited' Else Cast(Max_size as varchar(max)) End + ' MB' [AutoGrowth/MaxSize],
substring(physical_name,0, Len(Physical_name)-charindex('\',reverse(Physical_name),0)+1) [Path],
Right(physical_name,charindex('\',reverse(Physical_name),0)-1) [File Name]
From sys.master_files A
Left Join sys.filegroups B on A.data_space_id = B.data_space_id
where db_name(database_id) = 'DBATools'

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!