Table Partitioning – SQLZealots

There are two types of partitioning in SQL Server – Vertical & Horizontal. Today, we are quickly going to see these two types in SQL Server and its different use cases. One of the easiest way to explain Vertical partitioning is a partition at Column level otherwise Horizontal Partitioning is at Row level.

Vertical Partitioning

Vertical Partitioning is a process of splitting a really big table into multiple smaller tables for various reasons. Performance improvement is the major benefit of vertical partitioning. If your table is having large number of columns and lots of Large Object columns or very lengthy textual information, then partitioning the table into multiple smaller tables would really help to boost the performance especially if not all of your columns are frequently accessed. A simple pictorial representation as below:

Lets quickly see how vertical partitioning helps in performance. (Note that this is a very simple example for readers to understand the impact of query execution). Few guys will have different arguments, please post those questions in comment, we will discuss in upcoming blog post. But for now, I strictly wanted to limit the scope of this post ONLY to the context of vertical partitioning.

Script:

Drop table if exists Employee_Tab1
Create Table Employee_Tab1
(EmployeeID int,FirstName varchar(10),
MiddleName varchar(10)
,LastName varchar(10)
,DOB Date
)DECLARE @i int
SET @i = 1
BEGIN TRAN
WHILE @i<>100000
BEGIN
INSERT INTO Employee_Tab1 
VALUES(@i,'a' ,'b','c', getdate())
SET @i=@i+1
END
COMMIT TRAN
GO

Set statistics io on
Select EmployeeID,DOB From Employee_Tab1
where dob = cast(getdate() as date)
Set statistics io off

Drop table if exists Employee
Create Table Employee
(EmployeeID int,FirstName varchar(10),
MiddleName varchar(10),
LastName varchar(10),
DOB Date,Address1 nvarchar(max),
Address2 nvarchar(max),
Address3 nvarchar(max))
DECLARE @i int
SET @i = 1
BEGIN TRAN
WHILE @i<>100000
BEGIN
INSERT INTO Employee 
VALUES(@i,'a' ,'b','c',
getdate(),Replicate('a',8000),
Replicate('a',8000),Replicate('a',8000))
SET @i=@i+1
END
COMMIT TRAN
GO
Set statistics io on
Select EmployeeID,DOB From Employee
where dob = cast(getdate() as date)
Set statistics io off

So if you look at the code, we tried to create two tables Employee and Emploee_Tab1 and we inserted almost 100000 records to these tables. You can observe the difference between these two tables – Employee table contains 3 large text columns for address where as the Employee_Tab1 does not have those columns.When we query these tables (note that we are not referring the address columns), we can observe a huge number of logical reads happens with Employee table where as the table employee_tab1 which does not have address columns has to do only less logical reads(532 compared to 2000). This indicates that SQL has to get more data from disk to cache for second query (using employee table) than first query (using vertically partitioned table Employee_Tab1). You may observe the behavior in the below side by side query window screenshot clearly.

Points to ponder:

Use Vertical partitioning

1. If your table is having more columns

2. If your table has lots of BLOB or wide text columns those are not frequently used

3. If your table is a highly transaction table

4. with extreme care if you are restructuring your table since it may have big impact like many objects may need to change.

Horizontal Partitioning

As mentioned earlier, horizontal partitioning is a partition process that carries out at row level. That means, if you a table with (m/b)illions of data, you can partition the table horizontally. This would enable better manageability/maintainability etc. Horizontal partitioning may also have performance benefits, however, we should know how to implement this technique efficiently.

Let us quickly see a pictorial representation of the Horizontal partitioning as below.

Horizontal Partitioning requires understanding of few important terminologies.

1. Partition Key Column
Partition Key column determines on what basis the table has to be partitioned. In our example, the partition key column is AuditDate. It is important to be noted that the choice of partition key column is very important for performance aspects. We will cover this in details in next blog posts.

2. Partition Function
Just partition key column does not partition the table. We need a partition logic to be implemented on key column to partition the table. This logic can be implemented through a function called partition function. We can implement the function using RANGE Right and RANGE Left. Simply illustration of Range as below.

Range left

it is the last value in the left partition.

Range right

it is the first value in the right partition.

Usually, for an year column partition, the last day of the year (31st Dec year) is the range left and first day of the year (1st Jan year) is the range right. If range has not been specified properly, then there will be an overlapping of two years data in the same partition that can cause lots of confusion and difficulties, in fact it may even defeat the purpose of partition altogether.

3. Partition Scheme

The partition scheme maps the logical partitions to physical filegroups. It is possible to map each partition to separate filegroup or all partitions to one filegroup. There are many benefits with a proper mapping of partition and filegroup that we can cover in next blog posts.

Points to ponder:

Use Horizontal partitioning

1. If you know your table and data in detail (I am not joking – 🙂 )

2. If your table has lots of data and you have a definite partition requirement

3. If your table needs an archival of data or efficient movement of data

We will cover more on this topic in coming days, until then stay tuned!

Refer Also:

Stairway to SQL Server Table Partitioning – How do we partition an existing table

I’d like to grow my readership. If you enjoyed this blog post, please share it with your friends!

Create Table AuditData (ID int identity(1,1) Primary Key not null, AuditDate Datetime not null) Insert into AuditData Values(getdate() - (365*5)) ,(getdate() - (365*4)) ,(getdate() - (365*3)) ,(getdate() - (365*2)) ,(getdate() - (365*1)) ,(getdate()) ,(getdate() + (365*5)) ,(getdate() + (365*4)) ,(getdate() + (365*3)) ,(getdate() + (365*2)) ,(getdate() + (365*1)) --Check the data Select * From AuditData

select Object_name(p.object_id) Table_Name ,(Select name From sys.indexes where Object_id = p.object_id and index_id = p.index_id) 'Index_name', partition_number, lv.value leftValue, rv.value rightValue,p.rows, s.name 'Partition_Schema_Name',f.name 'Partition_Function_Name' from sys.partitions p join sys.allocation_units a on p.hobt_id = a.container_id join sys.indexes i on p.object_id = i.object_id Left join sys.partition_schemes s on i.data_space_id = s.data_space_id left join sys.partition_functions f on s.function_id = f.function_id left join sys.partition_range_values rv on f.function_id = rv.function_id and p.partition_number = rv.boundary_id left join sys.partition_range_values lv on f.function_id = lv.function_id and p.partition_number - 1 = lv.boundary_id where p.object_id = object_id('AuditData')

--Create partition function on Datetime Create partition function fn_AuditDate (Datetime) as Range right for values('20150101', '20160101','20170101','20180101', '20190101','20200101') /*Please note this test data is created on year 2020, so there might be difference in the data at the time you refer this blog.*/ --Create partition Schema to associate the function Create partition scheme sc_AuditDate As Partition fn_AuditDate ALL to ([Primary])

ALTER TABLE dbo.AuditData DROP CONSTRAINT [PK__AuditDat__3214EC2760DC1A18] GO ALTER TABLE dbo.AuditData ADD CONSTRAINT [PK__AuditDat__3214EC2760DC1A18] PRIMARY KEY NONCLUSTERED (ID) ON [PRIMARY] GO CREATE CLUSTERED INDEX IX_AuditData_AuditDate_ID ON dbo.AuditData (AuditDate) ON sc_AuditDate(AuditDate) GO

select Object_name(p.object_id) 'Table_Name',(Select name From sys.indexes where Object_id = p.object_id and index_id = p.index_id) 'Index_name', partition_number, lv.value leftValue, rv.value rightValue,p.rows, s.name 'Partition_Schema_Name',f.name 'Partition_Function_Name' from sys.partitions p join sys.allocation_units a on p.hobt_id = a.container_id join sys.indexes i on p.object_id = i.object_id join sys.partition_schemes s on i.data_space_id = s.data_space_id join sys.partition_functions f on s.function_id = f.function_id left join sys.partition_range_values rv on f.function_id = rv.function_id and p.partition_number = rv.boundary_id left join sys.partition_range_values lv on f.function_id = lv.function_id and p.partition_number - 1 = lv.boundary_id where p.object_id = object_id('AuditData')

	Stella on How do we provide xp_cmdshell…
	Caroline G on How to build comma separated s…
	Anonymous on Exception: The module being ex…
	Anonymous on How to recover “Recover…
	Anonymous on Search a value in your databas…

SQLZealots

Category: Table Partitioning

Stairway to SQL Server Table Partitioning – Whats is Table Partitioning in SQL Server

Stairway to SQL Server Table Partitioning – How do we partition an existing table

This site discusses and shares thoughts and experiences in SQL Server.