Python in SQL Server

Python is an interpreted programming language that is widely used for many purposes, such as machine learning, complex computation, and scientific work. I do not want to cover the language itself in detail, as we have the “Stairway to Python” series by Kiran Reddy; I strongly recommend reading that series to learn more about Python. For now, let us cover Python in SQL Server.

Prerequisites

Python support in SQL Server is a great move from Microsoft to compete with other technologies. SQL Server 2017 ships with the Python runtime and libraries built in to support this capability.

Let us quickly see the prerequisites for Python in SQL Server.

1. Install SQL Server 2017 Developer or above edition

As SQL Server 2017 introduced support for executing Python scripts, the first step is to download and install SQL Server 2017 Developer edition (or higher).

2. Select the Machine Learning features that include Python and R while installing SQL Server 2017

It is important to select Machine Learning Services (previously known as R Services in SQL Server 2016) on the “Feature Selection” page while installing SQL Server 2017.

3. Make sure the Launchpad service is set to Automatic start-up

Launchpad is an integral part of Machine Learning Services; it is the service that launches the external runtime to execute Python or R code in SQL Server. The diagram below explains the usage of the Launchpad service.

Note: this is important when multiple Launchpad services exist in the same environment. If one of them has an issue and cannot start, the others may fail to start (manually) as well. To be on the safer side, set the start-up type to Automatic at the initial stage.
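A quick way to check the service from T-SQL is the server services DMV; on SQL Server 2017 installations with Machine Learning Services, the Launchpad service should show up in this list (worth verifying on your build, as the DMV's coverage varies by version).

SELECT servicename, startup_type_desc, status_desc
FROM sys.dm_server_services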

4. Enable “External Scripts Enabled” in the configuration

To run Python code in SQL Server, we use the sp_execute_external_script system stored procedure. To use this procedure, we need to enable the “external scripts enabled” configuration option as below.

sp_configure 'external scripts enabled',1
reconfigure with override

5. Restart the SQL Server service

This is important because “external scripts enabled” is not a dynamic configuration option; an explicit restart of the SQL Server service is needed for the change to take effect.
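After the restart, we can verify that the option is active by checking the value_in_use column:

SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'external scripts enabled'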

A quick Demo

Now that we have set up the environment, let us quickly execute a script.


EXEC sp_execute_external_script @language = N'Python', 
@script = N'print("Python in SQL Server is a revolution!!!")'
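
Going one step further, here is a minimal sketch of passing a result set into the Python script and returning it to SQL Server; the query against sys.databases and the output column names are only an illustration.

EXEC sp_execute_external_script
	@language = N'Python',
	@script = N'OutputDataSet = InputDataSet',
	@input_data_1 = N'SELECT TOP (5) name, database_id FROM sys.databases'
WITH RESULT SETS ((database_name sysname, database_id int))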

That's it for now; I hope you enjoyed the post.


SQL Server Configuration Manager – The remote procedure call failed. [0x800706be]

Problem Statement:

Recently, I encountered the issue below with SQL Server Configuration Manager, which is a very common one. Most of the time, this type of error is related to not having enough permissions to open SQL Server Configuration Manager. However, the case was different for me, as I am an administrator on my system.

Observations:

On my laptop, I have multiple SQL Server versions installed: 2008 R2, 2014, 2016 and 2017. For some reason, SQL Server Configuration Manager (which by default was pointing to the oldest one, the 2008 R2 version) was not able to correctly retrieve the information for all the installed versions, and this was causing the issue.

To resolve the issue:

1. Identify the file location, either “C:\Windows\SysWOW64\” or “C:\Windows\System32\”. The SQL Server Configuration Manager file name is of the form SQLServerManager**.msc, where ** denotes the SQL Server version (see the version mapping after these steps).

2. Open the .msc file for the latest installed version of SQL Server Configuration Manager. This will work as expected.
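For reference, the version number in the file name maps to the SQL Server release roughly as follows (exact paths may vary between installations):

SQLServerManager14.msc  -- SQL Server 2017
SQLServerManager13.msc  -- SQL Server 2016
SQLServerManager12.msc  -- SQL Server 2014
SQLServerManager11.msc  -- SQL Server 2012
SQLServerManager10.msc  -- SQL Server 2008 / 2008 R2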

However, the above is not a complete solution when we open Configuration Manager from the Start menu, because the default shortcut will still point to the older version of SQL Server. To change this behavior, we need to do the following steps.

1. Find “SQL Server Configuration Manager” in the Start menu and right-click on the icon.


2. Right-click on “SQL Server Configuration Manager” and change the target to the latest .msc file. (You need administrator privileges to do so.)


3. Close the Window

From then on, SQL Server Configuration Manager will open successfully and fetch the data as expected.

More about MSDTC

What is MSDTC?

MSDTC is an acronym for Microsoft Distributed Transaction Coordinator, a Windows service that coordinates transactions in a distributed environment. The internals of MSDTC are a black box for most users; however, the service is very important for maintaining transactions between two different systems. A distributed transaction is a transaction that spans multiple machines. It ensures that when operations run across servers and any one of them fails and is cancelled, none of the operations in the transaction is committed. I am not sure I can explain it more simply than that.

Where do we need MSDTC?

Typically, in the context of SQL Server, we need MSDTC for linked server usage. If a transaction is opened to perform operations across servers in a linked server environment, it needs to be promoted to a distributed transaction to maintain the integrity of the data. We need to understand two things clearly here.

1. DTC is needed when two physical machines participate in a transaction. If you are using two instances on the same machine, DTC will not be used. However, if the instances are in a cluster, you need DTC, as you cannot guarantee the instances are running on the same physical node in a cluster environment.

2. DTC is needed irrespective of the type of transaction, whether implicit or explicit. A typical example of an implicit transaction is a trigger. Please find more detail here.
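As an illustration, here is a minimal sketch of an explicit distributed transaction across a linked server. The linked server, database and table names are hypothetical, and MSDTC must be configured on both machines for this to commit.

BEGIN DISTRIBUTED TRANSACTION
	-- Local operation
	UPDATE dbo.LocalTable SET Col1 = 1 WHERE Id = 1
	-- Remote operation through the linked server; this is the part that requires MSDTC
	UPDATE [REMOTESRV].RemoteDB.dbo.RemoteTable SET Col1 = 1 WHERE Id = 1
COMMIT TRANSACTION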

How to troubleshoot MSDTC issues?

Troubleshooting MSDTC issues is a pain point, as you will not have complete information about the DTC component. We can use the DTCPing tool to troubleshoot such issues; it provides much better insights.

The DTCPing tool can be downloaded here.

How to configure and troubleshoot using DTCPing tool?

Test Network Connectivity
1. On Computer A, run DTCPing.exe.

2. On Computer B, run DTCPing.exe.

3. On Computer A, type the NetBIOS name of Computer B, and then click Ping.

4. On Computer B, type the NetBIOS name of Computer A, and then click Ping.

NOTE: The DTCPing log file can be found in the same folder as the DTCPing.exe file. The log file name has the following format: “NetBIOSName” + “ProcessID” + .log

You can see more details in the readme.txt available in the tool kit.

See Also:

How to troubleshoot MSDTC issues with the DTCPing tool

How to identify NetBIOS Name of a computer?

This is not strictly SQL Server related, but I found it difficult to find a way to get the NetBIOS name of a computer, hence sharing it here.

Recently, while troubleshooting an issue with MSDTC, I had to use the DTCPing tool. I do not cover the tool here, but will share more about it in another post. DTCPing expects the NetBIOS name instead of the IP address.

So, here are a few options to identify the NetBIOS name of a computer.


Using T-SQL

SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS')
(Or)
EXEC master..xp_regread 'HKEY_LOCAL_MACHINE', 'SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName\', 'ComputerName'

Using Windows

nbtstat -n  (-n   -- Lists local NetBIOS names.)

That's it for now, but if you know of any other method, post it in the comments section.

Error Message: The operation could not be performed because OLE DB provider “SQLNCLI11” for linked server “” was unable to begin a distributed transaction.

Problem Statement:

We had an issue with a linked server in one of our environments while executing a piece of functionality. The error message is shown below.

OLE DB provider “SQLNCLI11” for linked server “” returned message “The transaction manager has disabled its support for remote/network transactions.”.
Msg 7391, Level 16, State 2, Procedure “VIEWNAME” Line 8 [Batch Start Line 1]
The operation could not be performed because OLE DB provider “SQLNCLI11” for linked server “” was unable to begin a distributed transaction.

Analysis:

While analyzing, we identified that the reason for the error was an INSERT operation on a view. To make it clearer, let us discuss the scenario with an example.

Let us assume we have two servers, ServerA and ServerB, connected through a linked server named “LKSERVER”.

1. Create a table called dbo.TestLinkedServerTargetonPrimary on ServerA

Create Table dbo.TestLinkedServerTargetonPrimary(Col1 int)

2. Create a view called vw_LinkedServerTest in ServerB

CREATE VIEW vw_LinkedServerTest
AS
	SELECT Col1 FROM [ServerA].DatabaseA.dbo.TestLinkedServerTargetonPrimary

3. Create an INSTEAD OF Trigger for INSERT operation on the view created above

CREATE TRIGGER [dbo].TR_LinkedServerTest ON [dbo].vw_LinkedServerTest
INSTEAD OF INSERT
AS
BEGIN
	INSERT INTO [ServerA].DatabaseA.dbo.TestLinkedServerTargetonPrimary (Col1)
	SELECT Col1
	FROM inserted
END

4. Try inserting data into the linked server table through the view.

Insert into vw_LinkedServerTest Select 1

The insert operation fails with the below error message.

OLE DB provider “SQLNCLI11” for linked server “LKSERVER” returned message “The transaction manager has disabled its support for remote/network transactions.”.
Msg 7391, Level 16, State 2, Procedure TR_LinkedServerTest, Line 7 [Batch Start Line 21]
The operation could not be performed because OLE DB provider “SQLNCLI11” for linked server “LKSERVER” was unable to begin a distributed transaction.

The error is due to the transaction created by the INSTEAD OF trigger on the view. Although we do not have any explicit transaction defined, the trigger runs inside an implicit transaction. As the transaction scope spans the linked server, SQL Server tries to promote it to a distributed transaction inside the trigger, and this fails because Network DTC Access has not been enabled.

Let us quickly confirm the cause of the issue by modifying the trigger code to print the transaction count.
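A minimal sketch of such a modification (the PRINT statement is just one way to surface @@TRANCOUNT):

ALTER TRIGGER [dbo].TR_LinkedServerTest ON [dbo].vw_LinkedServerTest
INSTEAD OF INSERT
AS
BEGIN
	-- @@TRANCOUNT is already 1 here: the trigger runs inside the implicit transaction of the INSERT
	PRINT 'Transaction count inside trigger: ' + CAST(@@TRANCOUNT AS varchar(10))

	INSERT INTO [ServerA].DatabaseA.dbo.TestLinkedServerTargetonPrimary (Col1)
	SELECT Col1
	FROM inserted
END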

From the above, we can clearly see that the transaction count increases as the trigger executes.

Solution:

There are two options to solve the issue.

1. Enable Network DTC Access with the appropriate security settings

a. Open “Component Services” on both servers and change the security settings for the Local DTC as below (typically, on the Security tab, enabling Network DTC Access with Allow Inbound and Allow Outbound transactions).

b. Once the settings are changed, try to execute the INSERT query.

One drawback of this solution is the heavy dependency on MSDTC. Unless there is no alternative, I would not suggest using this method.

2. Try to Avoid the Trigger

As already discussed, the INSTEAD OF trigger on the view is causing the issue. If we can drop the trigger on the view, the insert will no longer create a distributed transaction and the query will execute successfully.
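A minimal sketch of that option, assuming the view remains updatable once the trigger is removed:

DROP TRIGGER [dbo].TR_LinkedServerTest
GO

-- The insert now flows through the view straight to the remote table,
-- without the implicit trigger transaction that forced a distributed transaction.
INSERT INTO vw_LinkedServerTest (Col1)
SELECT 1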

I hope this post helps you in similar situations; please share your feedback and comments.

Internal usage of TEMPDB by SQL Server DB engine

This post is a continuation of TEMPDB – the most important system database in SQL Server, and it covers one of the important internal usages of TEMPDB. Some operations in SQL Server use TEMPDB internally to improve their performance. Users may not be fully aware of this, but we can observe the usage of TEMPDB with the query below.

--Query to understand the internal usage of TEMPDB
SELECT
	reserved_MB = (unallocated_extent_page_count +
			version_store_reserved_page_count +
			user_object_reserved_page_count +
			internal_object_reserved_page_count +
			mixed_extent_page_count) * 8 / 1024.,
	unallocated_extent_MB = unallocated_extent_page_count * 8 / 1024.,
	internal_object_reserved_page_count,
	internal_object_reserved_MB = internal_object_reserved_page_count * 8 / 1024.
FROM sys.dm_db_file_space_usage

A few of those operations are described below:

1. DBCC CHECKDB/CHECKALLOC

DBCC CHECKDB and DBCC CHECKALLOC create an internal database snapshot so that the consistency checks can run without holding locks on the database, and they use TEMPDB space for the internal worktables required during the operation. Both commands can estimate their TEMPDB usage via the ESTIMATEONLY option, as below. Please note this is only an estimate and may not be exact; however, it is a good indication of the space required, and it can be used to make sure the TEMPDB drive has enough free space before running DBCC operations against large databases.
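A minimal example (the database name is a placeholder):

DBCC CHECKDB ('YourDatabase') WITH ESTIMATEONLY
DBCC CHECKALLOC ('YourDatabase') WITH ESTIMATEONLY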

2. SORT operations can spill over to TEMPDB

When a query is executed, the SQL Server optimizer reuses an existing plan if one is present, or creates a new plan based on its cost-based algorithms. While creating a plan, the optimizer/relational engine chooses the operators used to execute the query and estimates the memory required for the execution; that plan is then reused for subsequent executions. If the estimate goes wrong, because statistics are incorrect or because the parameter values used when the plan was created return far fewer records than later executions, the estimated number of rows will deviate from the actual number of rows at execution time. In such a scenario, the memory granted for the execution may not be sufficient.

If the SORT operator needs more memory to sort the data, in other words, if the memory granted is not sufficient for the sort, it spills over to TEMPDB. This has a performance impact on the query execution. The spill to TEMPDB can be observed in Profiler as Sort Warnings (below).

In the above snapshot, we can see Sort Warnings in Profiler with the Event Subclass shown as Single or Multiple. Whenever a sort operation spills to TEMPDB, SQL Server raises the ‘Sort Warnings’ event, and the spill takes a single pass or multiple passes over TEMPDB.

As mentioned, Sort Warnings, that is, spills to TEMPDB, have a detrimental impact on query performance, so we need to identify the affected queries and avoid the spills where possible. Profiler alone may not tell us which query caused the Sort Warnings; we need to identify it from the cached plan or by setting up an Extended Events session to capture the warnings.
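A minimal sketch of such a session, assuming SQL Server 2012 or later; the session name and the target file path are arbitrary and should be adjusted for your environment.

CREATE EVENT SESSION [CaptureSortWarnings] ON SERVER
ADD EVENT sqlserver.sort_warning
(
	ACTION (sqlserver.sql_text, sqlserver.database_name)
)
ADD TARGET package0.event_file (SET filename = N'C:\Temp\CaptureSortWarnings.xel')
GO

ALTER EVENT SESSION [CaptureSortWarnings] ON SERVER STATE = START
GO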

Once we have identified the query causing the Sort Warning, the easiest solution is to rewrite the query to avoid the sort. I have seen queries with CTEs using ORDER BY even though the ordering is not actually required; use ORDER BY only where it is genuinely needed to avoid performance issues. At the same time, it may not be possible to avoid ORDER BY completely in every case. Similarly, we may need to evaluate adding a supporting index or modifying existing indexes, but this needs a careful and clear understanding of the code and index usage. If you know the Sort Warnings are due to incorrect statistics, you can update the statistics or apply the usual techniques to avoid parameter sniffing (explaining parameter sniffing is not the objective of this post, so it is not covered here).

3. Worktables/intermediate temp objects due to spooling/hash joins/aggregate Operations

Spools are special operators created by the SQL Server optimizer to improve the performance of a query. A spool operator is not an independent operator, but a supporting operator for another operator such as a Clustered Index Scan, Table Scan or even a Constant Scan. A spool operator reads intermediate data from another operator and stores it in TEMPDB, thereby improving the performance of the query.

In the below snapshot, we can observe that the spool operator stores the data from its input, in this case a Constant Scan, into a temporary table to avoid multiple rewinds.

There are five types of spool operators in total: Eager, Lazy, Table, Row Count and Non-Clustered Index spools. All spool operators store data in TEMPDB; however, they differ in how they behave. Please explore the topic further for more details.
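As a rough way to spot spools in the plan cache, the sketch below searches cached plan XML for a Table Spool operator; note that casting and string-matching the plan XML like this can be expensive on a busy server, so treat it as a diagnostic convenience rather than a routine query.

SELECT TOP (20)
	qs.execution_count,
	qs.total_worker_time,
	qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE N'%PhysicalOp="Table Spool"%'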

Hope you enjoyed this post, please post your feedback/thoughts in the comments.

Table Variables in SQL Server

Table variables are another type of temporary object for storing transient data. Please refer to Temporary Table objects for more details about temporary tables in SQL Server.

Differences between Table Variables and Temporary tables

When to use Temporary Tables over Table Variables, and vice versa

So, now we should be in a good position to decide when to use a table variable and when to use a temporary table. I would like to reiterate that statistics are one of the most important factors. A table variable does not have statistics, whereas a temporary table has statistics maintained on it. As the SQL Server optimizer uses cost-based optimization, statistics are a very important input for identifying the best plan for your query. If your transient data runs to more than roughly 100 rows, I would suggest using temporary tables (with the right indexes) over table variables, so that the statistics help SQL Server identify the best plan for us.


DECLARE @TabVariableTesting TABLE (
 id INT PRIMARY KEY,
 ScrambledData Varchar(7000)
)

INSERT INTO @TabVariableTesting 
SELECT number, Replicate('SQLZEalot',100)
 FROM master..spt_values WHERE NAME IS NULL

SELECT * FROM @TabVariableTesting WHERE id > 75


CREATE TABLE #TempTableTesting (
 id INT PRIMARY KEY,
 somecolumn Varchar(7000)
)

INSERT INTO #TempTableTesting
SELECT number, Replicate('SQLZEalot',100)
 FROM master..spt_values WHERE NAME IS NULL

SELECT * FROM #TempTableTesting WHERE id > 75

DROP TABLE #TempTableTesting

In other words, if you have very little data, I would prefer table variables over temporary tables. If you need to store data inside a user-defined function, table variables are currently the only option (see the sketch below). The choice is not a hard rule; choose what best fits your needs and requirements.
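A minimal sketch of a multi-statement table-valued function that returns its result through a table variable; the function name and logic are purely illustrative.

CREATE FUNCTION dbo.fn_GetNumbers (@MaxValue INT)
RETURNS @Numbers TABLE (Number INT PRIMARY KEY)
AS
BEGIN
	-- Populate the table variable and return it to the caller
	INSERT INTO @Numbers (Number)
	SELECT number
	FROM master..spt_values
	WHERE type = 'P' AND number BETWEEN 1 AND @MaxValue

	RETURN
END
GO

SELECT * FROM dbo.fn_GetNumbers(10)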

Hope you enjoyed this post, please share your feedback and thoughts.