Author: latheeshnk

CONTAINSTABLE is not returning records from full text enabled table in SQL Server

Recently, we have an issue, FULL TEXT search was not working with CONTIANSTABLE. My friend was searching for a word in a document where the table is enabled with FULL TEXT feature, however the search was not bringing any records.

My first thought was there could be a missing join condition that could not get the results. The actual query in the issue was really a big one with lots of tables and filters involved.So the first thing that we did remove all joins or conditions and see if CONTAINSTABLE works good, but that attempt was also failing. It was a clear indication that there is something wrong that it could not return results at FULL TEXT functionality.

For everyone to understand better, Let us go through a sample script to replicate the issue:

1. Create a table for our example

CREATE TABLE [DBO].[DOCUMENTTABLE](
	[DOCUMENTID] [UNIQUEIDENTIFIER] NOT NULL,
	[DOCUMENT_CONTENT] [IMAGE] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

CREATE INDEX IX_DOCUMENTTABLE ON DOCUMETTABLE(DOCUMENTID)

2. Load some data into Table – We uploaded the same file in question which is a .xlsx type

3. Query table with CONTAINSTABLE

Select * From CONTAINSTABLE (DocumentTable, DOCUMENT_CONTENT,'reporting')

Msg 7601, Level 16, State 2, Line 2
Cannot use a CONTAINS or FREETEXT predicate on table or indexed view ‘DocumentTable’ because it is not full-text indexed.

4. Enable FULL TEXT Indexing on the table

5. While executing the query, it is observed again no data returned

Further, we decided to check the document and its content for the search word manually and found the word was present and the document was also the same “.xlsx”. It is to be noted that we saved the data in an IMAGE datatype as shown in the example. With that information, we wanted to know the supporting file types in our full text enabled table with the below query.

select document_type, path from sys.fulltext_document_types

The above query was not returning the document type – “.xlsx” and it clearly indicated that the reason why the document could not search the word as the document type was a non-supporting one. [DOCUMENTTABLE] is a table that enabled FULL TEXT and the datatype of the field “[DOCUMENT_CONTENT]” was IMAGE. Then, my analysis took me a very important information that if we need to index a document type for which a pre-installed iFilter does not exists at OS level, we need to install the appropriate iFilter in the OS and then load those into SQL Server for a successful search functionality.

How to install the iFilter and Load into SQL Server

1. Download the iFilter , the below supports a wide range of document types

Microsoft Office 2010 Filter Packs

(Please note that MSmight retire the above as latest version is released, you may need to check for the latest.)

The following file types are supported by the Office 2010 Filter Pack:

2. Install the downloaded set

3. Load the installed OS resources to SQL Server as below.

exec sp_fulltext_service 'load_os_resources', 1;
exec sp_fulltext_service 'verify_signature', 0;
exec sp_fulltext_service 'Update_languages'
exec sp_fulltext_service 'Restart_all_fdhosts'

Once the above steps are completed, we were able to query the word and the results were returned as expected.

Hope this helps you in a similar situation and it was a good lesson learned to me!

References:
How to register iFilters

Advertisements

“The database is encrypted by database master key, you need to provide valid password when adding to the availability group.”

Recently, one of my colleague has encountered an issue an error while adding a database to availability group as below:

Error Message:
“The database is encrypted by database master key, you need to provide valid password when adding to the availability group.”

Reason:

The error message is self explanatory one as this database does not meet a requirement to add AG as this database is protected with a master key. We may need to provide the master key while configuring the database into AG.Please note that this is not due to TDE, but due to master key.

One observation is that this is recently added as a validation in SQL Server 2016 version. Earlier version, you would be able to add seamlessly.

Solutions:

There are two solutions to add these databases into AG:

1. Use T-SQL to add the database into AG


ALTER AVAILABILITY GROUP AGName ADD DATABASE DBName

2. Provide the password and use GUI with few extra steps

In GUI, this is not so clear for the user that the password can be provided directly and configure AG.

Double click in the Password column
Type the password in the column
Click Refresh (To enable check box for the database)
Check the checkbox for the database.
Click Refresh again (To enable the Next button)
Click Next to progress

Toulouse Chapter

This post has nothing to do with SQL Server, but sharing my experiences at Toulouse in France.

I stayed near by Blagnac airport, Fasthotel, a decent hotel with all amenities and a clam place to live. And another reason, the hotel is just behind my office!!!

It was an official visit to Toulouse for 2 weeks from 12th May 2019 to 25th May 2019. As this was my first visit, I was really excited. At this point, I need to mention about one of my colleague – Pascal Benguigui . He was right there at Toulouse airport waiting for my arrival, have to say a big thanks to him!

Few things I really felt energized at Toulouse:

1. The French start their day with a warm greet – “bonjour” with a hand shake!

2. People are “Working to Live” unlike “Living to Work” 🙂 – Really appreciate the commitment towards the work during office time and family during the rest!

Another person – Alexis Molteni, I have to mention about him as a most kind and down earth, a perfect gentle man! We three had really really good time. We have been there out almost every day visiting many places like Spanish Tapas “Las Tapas Locas“, Lebanian restaurants, “Capitole de Toulouse” , “Castle Carcassonne” and long walks at nights in the street of Toulouse and many… They really made me feel like I am part of them.

I am really missing these two guys while writing this blog, keeping a good faith that we will meet on another day soon. until then Thank you Pascal and Alexis…

We have a lunch team of 7 people including me and all were trying to find a better food for me by trying different dishes and even different restaurants everyday. Thanks to everyone accommodating me as part of your lunch family!!!

Here are few food items I tried…


How do we identify LOB datatype columns in SQL Server

Large OBject (LOB) data types are datatypes that can store large data. These datatypes can hold values larger than 8K row size.Prior to SQL Server 2005, we have only TEXT, NTEXT and IMAGE to hold large data. With SQL 2005, Microsoft has introduced additional three datatypes to hold large values like VARCHAR(MAX),NVARCHAR(MAX) and VARBINARY(MAX).

How to identify LOB columns in your database?

The below query is useful to identify LOB columns. Please note that the query contains a filter not to fetch CDC schema objects. Feel free to add more filters as required.


SELECT T.TABLE_CATALOG DATABASE_NAME,T.TABLE_SCHEMA AS SCHEMA_NAME,
    T.TABLE_NAME, 
    C.COLUMN_NAME,C.DATA_TYPE DATA_TYPE,C.CHARACTER_MAXIMUM_LENGTH MAXIMUM_LENGTH,C.COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS C
    INNER JOIN INFORMATION_SCHEMA.TABLES T
        ON C.TABLE_SCHEMA = T.TABLE_SCHEMA
        AND C.TABLE_NAME = T.TABLE_NAME
WHERE T.TABLE_TYPE = 'BASE TABLE' 
AND ((C.DATA_TYPE IN ('VARCHAR', 'NVARCHAR') AND C.CHARACTER_MAXIMUM_LENGTH = -1)
OR DATA_TYPE IN ('TEXT', 'NTEXT', 'IMAGE', 'VARBINARY', 'XML', 'FILESTREAM'))
AND T.TABLE_SCHEMA NOT IN('CDC') -- EXCEPTION LIST
ORDER BY T.TABLE_SCHEMA, T.TABLE_NAME

How do we identify the length or size of LOB datatype in SQL Server

At times, you may need to understand the length or size of your LOB columns. One example is to set a right value for “max text repl size (B)” to avoid certain issues like this. Usually, we use a function LEN to get the length of the data, however, LEN is not compatible certain datatypes like TEXT,NTEXT and IMAGE.


To resolve, SQL Server has another function – DATALENGTH(). The function returns the number of length or size of the data in bytes. The below query will help to identify the length or size of LOB datatypes like TEXT, NTEXT, IMAGE etc.

--Table creation script
Create Table tablename (columnname image)

--Sample data insert (The data has been trimmed for readability)
Insert into tablename values('0xFAEFCD8FFE000104A46494600010100000100010000FFDB00840009060')

SELECT  DATALENGTH(columnname) as inBytes,
        DATALENGTH(columnname) / 1024.0 as inKb,      
        DATALENGTH(columnname) / 1024.0 / 1024.0 as inMb FROM   tablename

DROP Table tablename


Thats it with this, hoping to see you with another blog post soon!

Length of LOB data to be replicated exceeds configured maximum 65536 in SQL Server

Problem Statement:

Sometimes When we do a DML operation in SQL Server, we end up with the below error –

“Length of LOB data to be replicated exceeds configured maximum 65536. Use the stored procedure sp_configure to increase the configured maximum value for max text repl size option, which defaults to 65536. A configured value of -1 indicates no limit, other that the limit imposed by the data type. The statement has been terminated.”

From the error message, its clear that the operation has been terminated by SQL Server.

Analysis:

I would say, this is a clear error message that means it contains the error information, steps to resolve the exception and its implication. Let us expand the error message a bit more than what its provided for our clear understanding with this blog post.

Details:
Length of LOB data to be replicated exceeds configured maximum 65536.

This means that the length of a LOB data is exceeding the configured value of 65536 which defaults to 65536. There is a configuration “max text repl size option” in SQL Server which can be viewed by sp_configure system procedure or querying sys.configurations system table. By default, the value of the configuration has been set to 65536 in SQL Server.This has an impact on systems configured with replication, Change Data Capture. The configuration “max text repl size (B)” is not an advanced configuration, so you do not need to reconfigure with “show advanced options” to see this setting.

From the above picture, the description of max text repl size (B) configuration is clear that it denotes the maximum size of a text field in replication. However, we need to understand the value is also applicable for LOB datatypes as well. The issue can be raised whenever a LOB data that contains more length than the specified value for this configuration. The operation gets terminated for the same reason.

What do we do when we get this error?

The simple solution is to set the configuration “max text repl size (B)” to “-1”. This will ensure there is no limit other than imposed by the datatype of the column. However, this may have some detrimental impact on replication as the size increases we need to make sure the replication has no other side effects due to the size of the data to be replicated over the network etc.

1. Identify the LOB datatype column in the table
2. Assess the size or length of the data in the table and confirm that the size of the data is exceeding the configured value of “max text repl size (B)”
3. Identify if there are any replication/CDC configured with the table/database
4. If Change data capture does not require the LOB column to be tracked, avoid the column from capture by specifying ONLY required columns in captured_column_list for the table while configuring CDC for the table
5. Assess your data and conclude the maximum size of data it can reached up to as per the business requirement. As an example, a photo field can be restricted with a specified size through application that can be uploaded etc.
6. Set the required or expected size for max text repl size (B) configuration if possible instead of maximum -1

Note: We recently observed similar issue with a database configured AlwaysOn-Availability Group.

CDC Jobs in SQL Server – Capture and Cleanup

Earlier, we found how do we set up Change Data Capture for a Database. Today, lets us quickly look at how CDC is working internally. Once CDC is configured on a database, we can see there two SQL Agent Jobs are created – Capture and Cleanup.

By now, we know CDC will capture the data modification information of a CDC enabled table and stored in separate placeholder – Changetables. Changetables are created for every table for which CDC is enabled under a schema – cdc. In addition to that, there is another system table called – cdc.change_tables which stores all information about CDC enabled tables(changetables).

Select * From cdc.change_tables

Now, we will see How these changetables are getting populated and maintained by SQL Server. This is been done by two SQL Agent jobs – Capture and Cleanup.

CDC Capture Job

Capture Job is responsible to push the DML changes into change tables.This job is created per database in SQL Server once the database is enabled CDC.Capture Job reads Transaction logs and writes the DML changes to respective change tables (those are CDC tracked) asynchronously.
CDC uses the same technology(sp_replcmds) as of Replication to read the transaction logs. So if we use both replication and CDC on an environment, it uses and shares the same log reader that minimizes any resource contentions at environment. However, the SQL agent job does not really uses sp_replcmds directly, but another procedure sys.sp_MScdc_capture_job which again internally uses sys.sp_cdc_Scan.

sys.sp_cdc_scan uses few configuration parameter from msdb.dbo.cdc_jobs system table for capture jobs as below:

Query:

Select db_name(database_id) database_name, job_type, B.name,
maxtrans,continuous,pollinginterval,retention,threshold from msdb.dbo.cdc_jobs A
inner join msdb.dbo.sysjobs B on A.job_id= B.job_id
Order by job_type asc

To change the settings, you may need to use another procedure sys.sp_cdc_change_job as below:

EXEC sp_cdc_change_job @job_type='capture', @maxtrans = 500, @maxscans = 10, @continuous = 1, @pollinginterval = 5

If you make changes to these settings, a restart of the capture job is required. To stop the capture job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'capture'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'capture'

Let us quickly see the parameters of sp_cdc_change_job @job_type=’capture’ as below:

maxtrans
The maxtrans configuration option set the maximum number of transactions to read from the transaction log in each scan cycle. The default is 500. If this parameter is set to NULL, it is interpreted as no change.

maxscans
Configures how many scans will be done to capture all rows from the transaction log. Each scan reads the number of transaction specified in the maxtrans option. The default is 10. 10 scans times 500 transactions means 5000 rows will be read at the most each time. If this parameter is set to NULL, it is interpreted as no change.

continuous
A bit where 1 means the job will run continuously (the default) or only one time (0). If the job is set to run continuously, maxtrans rows will be read from the log maxscans times. The job will then wait pollinginterval seconds and then read from the log again. If this parameter is set to NULL, it is interpreted as no change. Normally, 1 would be used. One time reads (0) are for testing purposes only and must not be used in a production environment since records in the transaction log will be kept active until CDC captures them.

pollinginterval
The number of seconds between log scan cycles. The default is 5. This parameter is only used if continuous = 1. If this parameter is set to NULL, it is interpreted as no change. In effect, a WAITFOR is issued between reads from the log, and this 5 second wait is the main reason why you may have to wait a few seconds before the change is captured to the change table.

Cleanup Job

Cleanup job is responsible to clean up the records from the changetables. This job is created automatically by SQL Server to minimize the number of records in the changetables, failing this job execution will be resulting to a larger changetable. This job internally invokes a procedure sys.sp_MScdc_cleanup_job in MSDB database with no parameter.The procedure reads the configuration again from the table – dbo.cdc_jobs that uses two columns retention and threshold.

Retention
This value specifies, in minutes, how long rows in the change table are kept. If NULL is used as value, this is interpreted as “no change”. The default is 4320 minutes (or 72 hours / 3 days). The maximum value is 52494800 minutes (100 years).

Threshold
This parameter is used to limit the number of rows that can be deleted in a cleanup delete statement. The default is 5000 rows. If NULL is used as value, this is interpreted as “no change”.

Again, To change the settings, you may need to use another procedure sys.sp_cdc_change_job as below:

EXEC sp_cdc_change_job @job_type='cleanup', @retention = 4320, @threshold = 5000

If you make changes to these settings, a restart of the cleanup job is required. To stop the cleanup job, use:

EXEC sys.sp_cdc_stop_job @job_type = 'cleanup'

And to start the job again, use:

EXEC sys.sp_cdc_start_job @job_type = 'cleanup'

Few points to be noted:

1. You must make sure SQL Agent is up and running all the time
2. cdc_jobs configurations are very important to set correctly.Overestimating/underestimating the configurations will have a deterimental impact on you application performance. You may need to genuinely configure as per your workload, a performance test can be carried out as per your workload to reach out your optimal values
3. Cleanup job is scheduled by default to run at 02:00 AM every day
4. Capture job is scheduled as “Start automatically when SQL Server Agent starts”. As it uses continuous parameter further, you may not need to make any change for “Schedule type”.

See Also:

Could not update the metadata that indicates database is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’
How do we move or restore a database with CDC enabled in SQL Server
Msg 22841, Level 16, State 1, Procedure sp_cdc_vupgrade, Line 323 – While restoring from a cdc enabled database backup
Tuning the Performance of Change Data Capture in SQL Server

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

While configuring Change Data Capture on a database that is restored from different environment, we might end up with an error message as below:

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

To resolve the issue, we need to change the owner of database and then enable CDC for the database as below. Please note that, the database would have restored irrespective of the error. You just have to run the below statement to avoid similar issues in future.

EXEC sp_changedbowner 'sa'
EXEC sys.sp_cdc_enable_db ;

Ref: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-changedbowner-transact-sql?view=sql-server-2017