CDC Jobs in SQL Server – Capture and Cleanup

Earlier, we found how do we set up Change Data Capture for a Database. Today, lets us quickly look at how CDC is working internally. Once CDC is configured on a database, we can see there two SQL Agent Jobs are created – Capture and Cleanup.

By now, we know CDC will capture the data modification information of a CDC enabled table and stored in separate placeholder – Changetables. Changetables are created for every table for which CDC is enabled under a schema – cdc. In addition to that, there is another system table called – cdc.change_tables which stores all information about CDC enabled tables(changetables).

Select * From cdc.change_tables

Now, we will see How these changetables are getting populated and maintained by SQL Server. This is been done by two SQL Agent jobs – Capture and Cleanup.

CDC Capture Job

Capture Job is responsible to push the DML changes into change tables.This job is created per database in SQL Server once the database is enabled CDC.Capture Job reads Transaction logs and writes the DML changes to respective change tables (those are CDC tracked) asynchronously.
CDC uses the same technology(sp_replcmds) as of Replication to read the transaction logs. So if we use both replication and CDC on an environment, it uses and shares the same log reader that minimizes any resource contentions at environment. However, the SQL agent job does not really uses sp_replcmds directly, but another procedure sys.sp_MScdc_capture_job which again internally uses sys.sp_cdc_Scan.

sys.sp_cdc_scan uses few configuration parameter from msdb.dbo.cdc_jobs system table for capture jobs as below:

Query:

Select db_name(database_id) database_name, job_type, B.name,
maxtrans,continuous,pollinginterval,retention,threshold from msdb.dbo.cdc_jobs A
inner join msdb.dbo.sysjobs B on A.job_id= B.job_id
Order by job_type asc

To change the settings, you may need to use another procedure sys.sp_cdc_change_job as below:

EXEC sp_cdc_change_job @job_type='capture', @maxtrans = 500, @maxscans = 10, @continuous = 1, @pollinginterval = 5

If you make changes to these settings, a restart of the capture job is required. To stop the capture job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'capture'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'capture'

Let us quickly see the parameters of sp_cdc_change_job @job_type=’capture’ as below:

maxtrans
The maxtrans configuration option set the maximum number of transactions to read from the transaction log in each scan cycle. The default is 500. If this parameter is set to NULL, it is interpreted as no change.

maxscans
Configures how many scans will be done to capture all rows from the transaction log. Each scan reads the number of transaction specified in the maxtrans option. The default is 10. 10 scans times 500 transactions means 5000 rows will be read at the most each time. If this parameter is set to NULL, it is interpreted as no change.

continuous
A bit where 1 means the job will run continuously (the default) or only one time (0). If the job is set to run continuously, maxtrans rows will be read from the log maxscans times. The job will then wait pollinginterval seconds and then read from the log again. If this parameter is set to NULL, it is interpreted as no change. Normally, 1 would be used. One time reads (0) are for testing purposes only and must not be used in a production environment since records in the transaction log will be kept active until CDC captures them.

pollinginterval
The number of seconds between log scan cycles. The default is 5. This parameter is only used if continuous = 1. If this parameter is set to NULL, it is interpreted as no change. In effect, a WAITFOR is issued between reads from the log, and this 5 second wait is the main reason why you may have to wait a few seconds before the change is captured to the change table.

Cleanup Job

Cleanup job is responsible to clean up the records from the changetables. This job is created automatically by SQL Server to minimize the number of records in the changetables, failing this job execution will be resulting to a larger changetable. This job internally invokes a procedure sys.sp_MScdc_cleanup_job in MSDB database with no parameter.The procedure reads the configuration again from the table – dbo.cdc_jobs that uses two columns retention and threshold.

Retention
This value specifies, in minutes, how long rows in the change table are kept. If NULL is used as value, this is interpreted as “no change”. The default is 4320 minutes (or 72 hours / 3 days). The maximum value is 52494800 minutes (100 years).

Threshold
This parameter is used to limit the number of rows that can be deleted in a cleanup delete statement. The default is 5000 rows. If NULL is used as value, this is interpreted as “no change”.

Again, To change the settings, you may need to use another procedure sys.sp_cdc_change_job as below:

EXEC sp_cdc_change_job @job_type='cleanup', @retention = 4320, @threshold = 5000

If you make changes to these settings, a restart of the cleanup job is required. To stop the cleanup job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'cleanup'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'cleanup'

Few points to be noted:

1. You must make sure SQL Agent is up and running all the time
2. cdc_jobs configurations are very important to set correctly.Overestimating/underestimating the configurations will have a deterimental impact on you application performance. You may need to genuinely configure as per your workload, a performance test can be carried out as per your workload to reach out your optimal values
3. Cleanup job is scheduled by default to run at 02:00 AM every day
4. Capture job is scheduled as “Start automatically when SQL Server Agent starts”. As it uses continuous parameter further, you may not need to make any change for “Schedule type”.

See Also:

Could not update the metadata that indicates database is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’
How do we move or restore a database with CDC enabled in SQL Server
Msg 22841, Level 16, State 1, Procedure sp_cdc_vupgrade, Line 323 – While restoring from a cdc enabled database backup
Tuning the Performance of Change Data Capture in SQL Server

Advertisements

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

While configuring Change Data Capture on a database that is restored from different environment, we might end up with an error message as below:

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

To resolve the issue, we need to change the owner of database and then enable CDC for the database as below. Please note that, the database would have restored irrespective of the error. You just have to run the below statement to avoid similar issues in future.

EXEC sp_changedbowner 'sa'
EXEC sys.sp_cdc_enable_db ;

Ref: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-changedbowner-transact-sql?view=sql-server-2017

Decoding hexa error code to windows error message in SQL Server

Problem Statement:
Sometimes, the error messages generated are not friendly to understand the underlined issues. This is going to be an extra effort to google out for most of us with the error code (mostly a hexa value) and identify the corresponding windows error message.

Solution:
A simple solution would be to use NET HELPMSG with the errocode. But if the error code is a hexa value, then we need to identify the corresponding Decimal value to use as a parameter to NET HELPMSG.

Here is a script to identify actual error message from hex error code. This is important to be aware that this approach will only work for Win32 error codes from Microsoft Windows. If the net helpmsg command does not return a result, the cause is likely the error has not originated from Windows, or no longer a valid Win32 process.

DECLARE @hex VARCHAR(64) = '0x80070643'
Set @hex = Right(@hex,4)

DECLARE @rez BIGINT;
SELECT @rez = ISNULL(@rez,0) * 16 + 
CHARINDEX(substring(@hex,n.number+1,1),'0123456789ABCDEF') - 1
FROM MASTER..spt_values n WHERE n.TYPE='P' AND n.number<len(@hex)

/*
	--To use NET HELPMSG, need to enable xp_cmdshell as below:

	EXEC sp_configure 'show advanced options',1
	RECONFIGURE WITH OVERRIDE
	EXEC sp_configure 'xp_cmdshell',1
	RECONFIGURE WITH OVERRIDE
	EXEC sp_configure 'show advanced options',0
	RECONFIGURE WITH OVERRIDE
*/

Declare @Table Table(WindowsErrorMessage nVarchar(MAX))
Declare @s nvarchar(500) = 'NET HELPMSG ' + Cast(@rez as varchar(MAX))
Insert into @Table 
exec xp_cmdshell @s

Select @hex HexaVal,@rez DecimalVal,* From @Table 
where WindowsErrorMessage is not null

How to Replay a SQL Server Profiler trace in SQL Server

It is one of the most interesting topic in SQL Server “How to Replay profiler”. But, I have not seen this feature is used efficiently by many. Today, we are going to see how can we use Replay profiler in SQL Server.

Usages:

1. Can be used to reproduce issues in one environment to another environment(production environment to lower environments)
2. Can be used to run the trace multiple times as part of investigation

Now, Let us quickly see how can we use the Replay option in SQL Server.

1. Collect the Profiler trace with Replay option.

This is a very important step to collect what needs to replay. SQL Profiler provides a default template to choose the Replay option. We need to select the template and run the profiler on the source server/database while we execute the functionality that needs to be applied to target database/server.

2. Save the collected trace to a Table

3. Once saved the Replay trace, you can connect to target server and open the trace collected

File -> Open -> Trace Table -> Connect the server that trace saved

4. Once the trace is opened in the Profiler window, we can run the Replay Trace

Replay -> Start -> Select “Target Server”
You can select Replay Configuration in the window as below:(Not covering all the option here, please explore and configure as required)

I (my personal choice) usually include “Display Execution Time”, but its optional.

Once the configurations are set, we can “Start” the Replay.

If you see the above sample snapshot, we can see that the trace has been run successfully. However, we see errors in the screenshot as when I tried to run the trace in the same server and database.

How do we run this trace in different server and different database?

Its always a case that we cannot expect the database name will be same in all environment. For an example, developer would have kept multiple version of same database like database_dev, database_unittest etc to test various purposes. If we try to run the trace as it is, we will end up with errors. To run on different environment, you need to make few changes as below:

1. Go to Management Studio (SSMS) and select the database where the trace table collected
2. Update the database Name and database id columns of the table to your target database name and database id


That means, the trace data needs to be changed for its database id and database name as per the target environment.Once the trace table data has been modified, we can follow the same procedure mentioned above(3 and 4).

How to identify parallel plans in SQL Server

Few years back, on one of my engagements, identified lots of performance issues with an application. As part of a detailed analysis, it is observed there are lots of queries under performing with wait type “CXPACKET”. In short, CXPACKET wait types are involved in parallel query execution. These wait types indicate that the SPID is waiting on a parallel process to complete or start. There are many reasons your query may undergo a parallel execution like missing index, non optimized sql script etc. But, CXPACKET is not ALWAYS a bad thing.We will see about CXPACKET in detail with another post.

Now, We will see how to identify the queries with parallel execution. Once you identified the queries, you can further examine to identify the actual cause of the issues and optimize.

select p.*,q.*,cp.plan_handle
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_query_plan(cp.plan_handle) p
cross apply sys.dm_exec_sql_text(cp.plan_handle) as q
where
cp.cacheobjtype = 'Compiled Plan' and
p.query_plan.value('declare namespace
p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";
max(//p:RelOp/@Parallel)', 'float') > 0

Please note that the above query gets information from cached plan in SQL Server. Hence, I will recommend to run a load test with a real time workload and use the script to identify the parallel executing queries. It is also important that the cached plans may get flushed due various factors like memory pressure, recompile of procedures, auto stats updates etc. But, this helped me a lot at times!!!

How do we move or restore a database with CDC enabled in SQL Server

This post explains how efficiently we can move a CDC enabled database to different environment.

To move a database from one environment to another environment, Firstly, we need to take back up of the database from the source environment.

BACKUP DATABASE CDCOrigin TO DISK = 'C:\Backup\CDCOrigin.BAK'

Once we move the backup file to the new environment, we have two options:

1. Restore database as always and configure/set up CDC freshly as in this post.

2. Restore database with KEEP_CDC option.

RESTORE DATABASE CDCCopy_CDC From DISK = 'C:\Backup\CDCOrigin.BAK'
WITH
	MOVE 'CDCOrigin' To 'C:\Program Files\Microsoft SQL Server\MSSQL13.SQL2016\MSSQL\DATA\CDCCopy1.mdf',
	MOVE 'CDCOrigin_log' To 'C:\Program Files\Microsoft SQL Server\MSSQL13.SQL2016\MSSQL\DATA\CDCCopy1_log.ldf'
	--The below option would keep the CDC of original database
	, KEEP_CDC	

Once restore is completed successfully, we can quickly verify the configurations with below queries.

Select is_CDC_Enabled ,name from sys.databases where is_CDC_Enabled=1 
USE CDCCopy_CDC
Select name from sys.tables where is_tracked_by_cdc = 1

Wait! still there are few things to be done, configuring jobs for capture and cleanup process. KEEP_CDC option with RESTORE command will keep the CDC enabled and the tables tracked by default. However, we need to create capture and cleanup jobs manually as below.

USE CDCCopy_CDC
EXEC sys.sp_cdc_add_job @job_type = N'capture'; 
EXEC sys.sp_cdc_add_job @job_type = N'cleanup'; 

Caveats:

Do not enable this feature unless there is a good reason. The feature adds more space requirement to store the change data and increases disk activities. Be careful on configuring the capture and clean up jobs. If there are many databases with CDC ( due to non-multi tenant support), CDC on multiple databases will have a performance impact.