CDC Jobs in SQL Server – Capture and Cleanup

Earlier, we saw how to set up Change Data Capture (CDC) for a database. Today, let us quickly look at how CDC works internally. Once CDC is configured on a database, two SQL Agent jobs are created: Capture and Cleanup.

By now, we know that CDC captures the data modifications made to a CDC-enabled table and stores them in a separate placeholder: a change table. One change table is created under the cdc schema for every table on which CDC is enabled. In addition, a system table called cdc.change_tables stores the metadata for all change tables.

Select * From cdc.change_tables
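As a narrower alternative to SELECT *, the sketch below lists each capture instance next to its source table and change table. It assumes it is run in a CDC-enabled database, and the choice of columns is only illustrative.

```sql
-- Run in the CDC-enabled database.
SELECT ct.capture_instance,
       OBJECT_SCHEMA_NAME(ct.source_object_id) AS source_schema,
       OBJECT_NAME(ct.source_object_id)        AS source_table,
       OBJECT_NAME(ct.object_id)               AS change_table,
       ct.supports_net_changes,
       ct.create_date
FROM cdc.change_tables AS ct
ORDER BY ct.capture_instance;
```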

Now, let us see how these change tables are populated and maintained by SQL Server. This is done by the two SQL Agent jobs: Capture and Cleanup.

CDC Capture Job

The Capture job is responsible for pushing DML changes into the change tables. It is created per database, typically when the first table in the database is enabled for CDC. The job reads the transaction log and asynchronously writes the DML changes to the respective change tables (those of CDC-tracked tables).
CDC uses the same mechanism as replication (sp_replcmds) to read the transaction log. If both replication and CDC are used on the same database, they share the same log reader, which minimizes resource contention. However, the SQL Agent job does not call sp_replcmds directly; it calls another procedure, sys.sp_MScdc_capture_job, which in turn uses sys.sp_cdc_scan.
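To see which procedure each CDC job actually calls, the job steps can be read from msdb. This is a minimal sketch that assumes read access to msdb; it joins on job_id rather than relying on any job naming convention.

```sql
-- One row per job step of every CDC job on the instance.
SELECT DB_NAME(c.database_id) AS database_name,
       c.job_type,
       j.name                 AS job_name,
       s.step_id,
       s.step_name,
       s.command              -- a capture job step should reference sys.sp_MScdc_capture_job
FROM msdb.dbo.cdc_jobs AS c
JOIN msdb.dbo.sysjobs AS j
  ON j.job_id = c.job_id
JOIN msdb.dbo.sysjobsteps AS s
  ON s.job_id = j.job_id
ORDER BY c.job_type, s.step_id;
```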

sys.sp_cdc_scan takes a few configuration parameters for the capture job from the msdb.dbo.cdc_jobs system table, which can be listed as below:

Query:

SELECT DB_NAME(A.database_id) AS database_name, A.job_type, B.name,
A.maxtrans, A.maxscans, A.continuous, A.pollinginterval, A.retention, A.threshold
FROM msdb.dbo.cdc_jobs A
INNER JOIN msdb.dbo.sysjobs B ON A.job_id = B.job_id
ORDER BY A.job_type ASC

To change the settings, use the procedure sys.sp_cdc_change_job in the CDC-enabled database as below:

EXEC sys.sp_cdc_change_job @job_type = 'capture', @maxtrans = 500, @maxscans = 10, @continuous = 1, @pollinginterval = 5

If you make changes to these settings, a restart of the capture job is required. To stop the capture job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'capture'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'capture'
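After the restart, the effective settings can be checked with sys.sp_cdc_help_jobs, which reports the configuration of the CDC jobs for the current database. A minimal sketch, assuming it is run in the CDC-enabled database:

```sql
-- Returns one row per CDC job (capture and cleanup) with maxtrans, maxscans,
-- continuous, pollinginterval, retention and threshold.
EXEC sys.sp_cdc_help_jobs;
```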

Let us quickly go through the capture parameters of sp_cdc_change_job (@job_type = 'capture'):

maxtrans
The maxtrans configuration option sets the maximum number of transactions to process from the transaction log in each scan cycle. The default is 500. If this parameter is set to NULL, it is interpreted as no change.

maxscans
Configures the maximum number of scan cycles executed to extract all rows from the transaction log. Each scan processes up to the number of transactions specified in the maxtrans option. The default is 10. 10 scans times 500 transactions means that at most 5000 transactions are processed before the job pauses. If this parameter is set to NULL, it is interpreted as no change.

continuous
A bit flag: 1 (the default) means the job runs continuously, and 0 means it runs only one time. If the job is set to run continuously, up to maxtrans transactions are read from the log in each scan, for up to maxscans scans. The job then waits pollinginterval seconds and reads from the log again. If this parameter is set to NULL, it is interpreted as no change. Normally, 1 would be used. One-time reads (0) are for testing purposes only and should not be used in a production environment, since records in the transaction log are kept active until CDC captures them.

pollinginterval
The number of seconds between log scan cycles. The default is 5. This parameter is used only if continuous = 1. If this parameter is set to NULL, it is interpreted as no change. In effect, a WAITFOR is issued between reads from the log, and this 5-second wait is the main reason why you may have to wait a few seconds before a change shows up in the change table.
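The interplay of these parameters can be inspected with two small queries. The first computes the upper bound of transactions processed before the job waits (maxtrans times maxscans); the second reads recent log scan sessions from the sys.dm_cdc_log_scan_sessions view. This is only a sketch: it assumes the second query runs in the CDC-enabled database with permission to view database state.

```sql
-- Upper bound on transactions processed before the capture job waits
-- pollinginterval seconds (500 * 10 = 5000 with the defaults).
SELECT DB_NAME(database_id) AS database_name,
       maxtrans,
       maxscans,
       pollinginterval,
       maxtrans * maxscans  AS max_transactions_per_pass
FROM msdb.dbo.cdc_jobs
WHERE job_type = N'capture';

-- Most recent log scan sessions; latency is reported in seconds.
SELECT TOP (10)
       session_id,
       start_time,
       end_time,
       tran_count,
       command_count,
       latency
FROM sys.dm_cdc_log_scan_sessions
ORDER BY start_time DESC;
```

If latency stays well above pollinginterval under load, that may be a hint that maxtrans or maxscans is too low for the workload, although other factors can contribute as well.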

Cleanup Job

The Cleanup job is responsible for removing old records from the change tables. SQL Server creates this job automatically to keep the number of records in the change tables under control; if the job fails to run, the change tables keep growing. Internally, the job invokes the procedure sys.sp_MScdc_cleanup_job with no parameters. The procedure again reads its configuration from msdb.dbo.cdc_jobs, using two columns: retention and threshold.

Retention
This value specifies, in minutes, how long rows in the change table are kept. If NULL is used as value, this is interpreted as “no change”. The default is 4320 minutes (or 72 hours / 3 days). The maximum value is 52494800 minutes (100 years).

Threshold
This parameter limits the number of rows that can be deleted by a single delete statement during cleanup. The default is 5000 rows. If NULL is used as the value, this is interpreted as “no change”.
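The two cleanup settings can be read back and converted into friendlier units. The sketch below is illustrative only: approx_cutoff is a rough estimate based on the current server time and is not the low-water mark that the cleanup procedure itself computes.

```sql
SELECT DB_NAME(database_id)                                AS database_name,
       retention                                           AS retention_minutes,
       retention / 1440.0                                  AS retention_days,      -- 4320 / 1440 = 3 with the default
       DATEADD(MINUTE, -CAST(retention AS int), GETDATE()) AS approx_cutoff,       -- rough estimate only
       threshold                                           AS max_rows_per_delete
FROM msdb.dbo.cdc_jobs
WHERE job_type = N'cleanup';
```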

Again, to change the settings, use the procedure sys.sp_cdc_change_job as below:

EXEC sys.sp_cdc_change_job @job_type = 'cleanup', @retention = 4320, @threshold = 5000

If you make changes to these settings, a restart of the cleanup job is required. To stop the cleanup job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'cleanup'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'cleanup'

A few points to note:

1. Make sure SQL Server Agent is up and running at all times, since both jobs depend on it.
2. The cdc_jobs settings are important to get right. Overestimating or underestimating them can have a detrimental impact on your application's performance. Configure them for your actual workload; a performance test against that workload can help you arrive at optimal values.
3. The Cleanup job is scheduled by default to run at 02:00 AM every day.
4. The Capture job is scheduled as “Start automatically when SQL Server Agent starts”. Because it also uses the continuous parameter, you normally do not need to change the “Schedule type”.
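The first, third and fourth points can be checked with a couple of queries. This is a sketch under assumptions: sys.dm_server_services requires VIEW SERVER STATE permission and is not available on every platform, and the schedule query needs read access to msdb.

```sql
-- Is the SQL Server Agent service running, and how is it set to start?
SELECT servicename, status_desc, startup_type_desc
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server Agent%';

-- Schedules attached to the CDC jobs.
-- freq_type 64 = start when SQL Server Agent starts, 4 = daily;
-- active_start_time is an integer in HHMMSS form (20000 = 02:00:00).
SELECT j.name  AS job_name,
       sc.name AS schedule_name,
       sc.freq_type,
       sc.active_start_time
FROM msdb.dbo.cdc_jobs AS c
JOIN msdb.dbo.sysjobs AS j
  ON j.job_id = c.job_id
JOIN msdb.dbo.sysjobschedules AS js
  ON js.job_id = j.job_id
JOIN msdb.dbo.sysschedules AS sc
  ON sc.schedule_id = js.schedule_id
ORDER BY j.name;
```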

