Tag: SQL Server

CONTAINSTABLE is not returning records from full text enabled table in SQL Server

Recently, we have an issue, FULL TEXT search was not working with CONTIANSTABLE. My friend was searching for a word in a document where the table is enabled with FULL TEXT feature, however the search was not bringing any records.

My first thought was there could be a missing join condition that could not get the results. The actual query in the issue was really a big one with lots of tables and filters involved.So the first thing that we did remove all joins or conditions and see if CONTAINSTABLE works good, but that attempt was also failing. It was a clear indication that there is something wrong that it could not return results at FULL TEXT functionality.

For everyone to understand better, Let us go through a sample script to replicate the issue:

1. Create a table for our example

CREATE TABLE [DBO].[DOCUMENTTABLE](
	[DOCUMENTID] [UNIQUEIDENTIFIER] NOT NULL,
	[DOCUMENT_CONTENT] [IMAGE] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

CREATE INDEX IX_DOCUMENTTABLE ON DOCUMETTABLE(DOCUMENTID)

2. Load some data into Table – We uploaded the same file in question which is a .xlsx type

3. Query table with CONTAINSTABLE

Select * From CONTAINSTABLE (DocumentTable, DOCUMENT_CONTENT,'reporting')

Msg 7601, Level 16, State 2, Line 2
Cannot use a CONTAINS or FREETEXT predicate on table or indexed view ‘DocumentTable’ because it is not full-text indexed.

4. Enable FULL TEXT Indexing on the table

5. While executing the query, it is observed again no data returned

Further, we decided to check the document and its content for the search word manually and found the word was present and the document was also the same “.xlsx”. It is to be noted that we saved the data in an IMAGE datatype as shown in the example. With that information, we wanted to know the supporting file types in our full text enabled table with the below query.

select document_type, path from sys.fulltext_document_types

The above query was not returning the document type – “.xlsx” and it clearly indicated that the reason why the document could not search the word as the document type was a non-supporting one. [DOCUMENTTABLE] is a table that enabled FULL TEXT and the datatype of the field “[DOCUMENT_CONTENT]” was IMAGE. Then, my analysis took me a very important information that if we need to index a document type for which a pre-installed iFilter does not exists at OS level, we need to install the appropriate iFilter in the OS and then load those into SQL Server for a successful search functionality.

How to install the iFilter and Load into SQL Server

1. Download the iFilter , the below supports a wide range of document types

Microsoft Office 2010 Filter Packs

(Please note that MSmight retire the above as latest version is released, you may need to check for the latest.)

The following file types are supported by the Office 2010 Filter Pack:

2. Install the downloaded set

3. Load the installed OS resources to SQL Server as below.

exec sp_fulltext_service 'load_os_resources', 1;
exec sp_fulltext_service 'verify_signature', 0;
exec sp_fulltext_service 'Update_languages'
exec sp_fulltext_service 'Restart_all_fdhosts'

Once the above steps are completed, we were able to query the word and the results were returned as expected.

Hope this helps you in a similar situation and it was a good lesson learned to me!

References:
How to register iFilters

Advertisements

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

While configuring Change Data Capture on a database that is restored from different environment, we might end up with an error message as below:

Could not update the metadata that indicates database <> is enabled for Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’. The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.’. Use the action and error to determine the cause of the failure and resubmit the request.

To resolve the issue, we need to change the owner of database and then enable CDC for the database as below. Please note that, the database would have restored irrespective of the error. You just have to run the below statement to avoid similar issues in future.

EXEC sp_changedbowner 'sa'
EXEC sys.sp_cdc_enable_db ;

Ref: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-changedbowner-transact-sql?view=sql-server-2017

Decoding hexa error code to windows error message in SQL Server

Problem Statement:
Sometimes, the error messages generated are not friendly to understand the underlined issues. This is going to be an extra effort to google out for most of us with the error code (mostly a hexa value) and identify the corresponding windows error message.

Solution:
A simple solution would be to use NET HELPMSG with the errocode. But if the error code is a hexa value, then we need to identify the corresponding Decimal value to use as a parameter to NET HELPMSG.

Here is a script to identify actual error message from hex error code. This is important to be aware that this approach will only work for Win32 error codes from Microsoft Windows. If the net helpmsg command does not return a result, the cause is likely the error has not originated from Windows, or no longer a valid Win32 process.

DECLARE @hex VARCHAR(64) = '0x80070643'
Set @hex = Right(@hex,4)

DECLARE @rez BIGINT;
SELECT @rez = ISNULL(@rez,0) * 16 + 
CHARINDEX(substring(@hex,n.number+1,1),'0123456789ABCDEF') - 1
FROM MASTER..spt_values n WHERE n.TYPE='P' AND n.number<len(@hex)

/*
	--To use NET HELPMSG, need to enable xp_cmdshell as below:

	EXEC sp_configure 'show advanced options',1
	RECONFIGURE WITH OVERRIDE
	EXEC sp_configure 'xp_cmdshell',1
	RECONFIGURE WITH OVERRIDE
	EXEC sp_configure 'show advanced options',0
	RECONFIGURE WITH OVERRIDE
*/

Declare @Table Table(WindowsErrorMessage nVarchar(MAX))
Declare @s nvarchar(500) = 'NET HELPMSG ' + Cast(@rez as varchar(MAX))
Insert into @Table 
exec xp_cmdshell @s

Select @hex HexaVal,@rez DecimalVal,* From @Table 
where WindowsErrorMessage is not null

How to identify parallel plans in SQL Server

Few years back, on one of my engagements, identified lots of performance issues with an application. As part of a detailed analysis, it is observed there are lots of queries under performing with wait type “CXPACKET”. In short, CXPACKET wait types are involved in parallel query execution. These wait types indicate that the SPID is waiting on a parallel process to complete or start. There are many reasons your query may undergo a parallel execution like missing index, non optimized sql script etc. But, CXPACKET is not ALWAYS a bad thing.We will see about CXPACKET in detail with another post.

Now, We will see how to identify the queries with parallel execution. Once you identified the queries, you can further examine to identify the actual cause of the issues and optimize.

select p.*,q.*,cp.plan_handle
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_query_plan(cp.plan_handle) p
cross apply sys.dm_exec_sql_text(cp.plan_handle) as q
where
cp.cacheobjtype = 'Compiled Plan' and
p.query_plan.value('declare namespace
p="http://schemas.microsoft.com/sqlserver/2004/07/showplan";
max(//p:RelOp/@Parallel)', 'float') > 0

Please note that the above query gets information from cached plan in SQL Server. Hence, I will recommend to run a load test with a real time workload and use the script to identify the parallel executing queries. It is also important that the cached plans may get flushed due various factors like memory pressure, recompile of procedures, auto stats updates etc. But, this helped me a lot at times!!!

How do we set up Change Data Capture in SQL Server

This may be an old topic for many who worked with SQL Server, but, As I was doing a proof of concept on CDC recently for one of my projects, thought of sharing about CDC for those who is yet to explore this feature in SQL Server.

What is Change Data Capture?

Change Data Capture or CDC in SQL Server is introduced to track the DML changes and records insert/update/delete operations in respective change tables. The source of truth is Transaction Logs in SQL Server. Capture process jobs will be picking up the required table information from the Transaction Log and records in the Change tables. As these jobs are asynchronous, you may see a latency ideally between the commit and population of change tables. However, this has a performance advantages compared to Trigger based solutions. Similarly there is another clean up job to prune the data (older) from the change tables.

Supporting Editions: Enterprise/Developer/Standard.

How do we implement CDC on a database?

Step 1: : Let us create a database and table to see the proof of Concept


--Create a Test Database
Create Database CDCORigin;

--Create a Table 
create Table TestCDC(FirstName Varchar(100),SecondName Varchar(100))

--Insert values into tables
Insert into TestCDC Values('Ron','David'),('Chirs','Martin')

Step 1: : Enable CDC (Need two steps as below)


EXEC sys.sp_cdc_enable_db
EXEC sys.sp_cdc_enable_table  
@source_schema = N'dbo',  
@source_name   = N'TestCDC',  
@role_name     = NULL

EXEC sys.sp_cdc_add_job @job_type = N'capture'; 
EXEC sys.sp_cdc_add_job @job_type = N'cleanup'; 

You need to make sure that the SQLServerAgent is up and running while configuring the CDC, failing on this, it may end up with not creating the respective jobs in the environment.

Step 3: : Verification of CDC configuration

Once, CDC has set up, we can see the CDC specific tables and Changes tables in Object explorer as below.

In addition, we can also verify the data changes using below scripts.


--Check the CDC tables
Select * From [cdc].[dbo_TestCDC_CT]  --No data

Insert into TestCDC Values('Reddick','Pitt')

--Check the CDC tables
Select * From [cdc].[dbo_TestCDC_CT]

Update TestCDC Set SecondName = 'Pi' Where FirstName='Chirs'

--Check the CDC tables
Select * From [cdc].[dbo_TestCDC_CT]

Delete From TestCDC Where FirstName='Ron'

--Check the CDC tables
Select * From [cdc].[dbo_TestCDC_CT]

Precisely, the __$operation will indicate the action on the table as below.

Hope you enjoyed this post, and I am excited to write more about CDC with my real client experience in next posts. Until then, happy and continuous learning!

Internal usage of TEMPDB by SQL Server DB engine

This post is actually a continuation post for TEMPDB – the most important system database in SQL Server which trying to cover one of the important usage of TEMPDB in SQL Server. Some operations in SQL Server uses TEMPDB internally to improve the performance of the operations. It may not be fully aware for the users, but we can understand the usage of tempdb using the below query.

--Query to understand the internal usage of TEMPDB
select
reserved_MB=(unallocated_extent_page_count+
			version_store_reserved_page_count+
			user_object_reserved_page_count+
			internal_object_reserved_page_count+
			mixed_extent_page_count)*8/1024. ,
unallocated_extent_MB =unallocated_extent_page_count*8/1024., 
internal_object_reserved_page_count,
internal_object_reserved_MB =internal_object_reserved_page_count*8/1024.
from sys.dm_db_file_space_usage

Few of operations as below:

1. DBCC CHECKDB/CHECKALLOC

DBCC CHECKDB/CHECKALLOC are using TEMPDB space for creating internal database snapshot to perform the operation. This is to efficiently avoid the locking behavior on database to obtain the consistency check. DBCC CHECKDB and CHECKALLOC has an option to estimate the usage of TEMPDB by providing ESTIMATEONLY option as below. Please note, this is an estimated value, may not be the correct one, however, this can be a good indication to estimate the space required for the operation. This can be used to make sure the TEMPDB drive has enough space to run on DBCC operations for large databases.

2. SORT operations can spill over to TEMPDB

When a query is executed, SQL optimizer will choose the plan already created if prsent or will create a new plan based on cost based algorithms. When optimizer creates a plan, SQL optimizer/relational engine will identify the right operator for the plan to execute the queries and it estimates the memory required to execute the query. This plan will be used for further execution. If the estimation is not happening correctly because of wrong statistics or parameters used while creating the plan is returning less number of records, the estimated number of records will be deviating from the actual number of records while executing the query. In such scenario, the memory granted for the execution may not be sufficient for the execution.

If SORT operator needs more memory to sort the data , other words, the memory granted is not sufficient to do the sorting, it will spill over to TEMPDB. This will have a performance impact for the query execution. The spill over to TEMPDB can be observed in the profiler as SORT warnings(below).

In the above snapshot, we can see Sort warnings in profiler with Event Subclass as single or multiple. Whenever the sort operation spills to tempdb, SQL Server raises the ‘Sort Warnings’ event and it takes single or multiple passes to tempdb.

As mentioned, Sort warnings or spilling to TempDB will have some detrimental impact on the query performance, we need to identify those queries and avoid if possible. Through profiler, we may not understand the query is being caused the Sort Warnings, but, we need to identify from the cached plan or by setting up Extended events to capture the Sort warnings.

Once we identified the query caused Sort Warning, the easiest solution would be re-write the query in a manner to avoid the sorting. I have seen queries with CTE using the ORDER BY even though the order by is not required specifically. Try to use ORDER BY genuinely to avoid performance issues. At the same time, it may not be easy for all cases to avoid the ORDER BY completely. Similar scenario, we may need to evaluate adding a supporting index or modify the existing indexes to avoid, but again, this needs more careful and clear understanding of code and index usage. If you are well aware of the reason for Sort warnings are due to incorrect statistics, you can update the stats or use the solutions to avoid the parameter sniffing(As the objective of this post is not going to explain the parameter sniffing, this post does not cover the topic now.).

3. Worktables/intermediate temp objects due to spooling/hash joins/aggregate Operations

Spools are special operators created by SQL optimizer to improve the performance of a query. A spool operator is not an independent operator,but a supporting operator for another operator like clustered index scan, Table scan or even Constant scan. A spool operator reads and stores intermediate “operated” data into TEMPDB from another operator, there by, increasing the performance of the query.

In the below snapshot, we can observe that the spool operator stores the data from the input, here its nothing but constant scan into a temporary tables to avoid multiple rewinds.

There are totally five types of spool operators – Eager, Lazy, Table, RowCount and Non-Clustered Index spools. All spool operators will store the data into TEMPDB, however, it may be different the way it behaves. Please explore further on the topic to know more details.

Hope you enjoyed this post, please post your feedback/thoughts in the comments.