How much space will table compression save me?

At work we have a reasonably large table with a NVARCHAR(MAX) column. This column stores HTML and other user generated content. This table is on a SQL Server 2014 instance, so cannot take advantage of new compression technologies in SS 2016.

Looking at the estimated benefits via query:

DECLARE @MyTableVar TABLE
(
  object_name NVARCHAR(50),
  schema_name NVARCHAR(50),
  index_id INT,
  partition_number INT,
  size_w_CURRENT_compression_MB INT,
  size_w_REQUESTED_compression_MB INT,
  sample_size_w_current_compression_MB INT,
  sample_size_w_requested_compression_MB INT
  )
  INSERT INTO @MyTableVar
  EXEC sp_estimate_data_compression_savings
  @schema_name =  'dbo'
  , @object_name = 'description_archive'
  , @index_id = NULL
  , @partition_number = NULL
  , @data_compression = 'ROW'
select object_name
  , schema_name
  , index_id
  , partition_number
  , (size_w_CURRENT_compression_MB/1024) as size_w_CURRENT_compression_MB
  , (size_w_REQUESTED_compression_MB/1024) as size_w_REQUESTED_compression_MB
  from @MyTableVar

clip_image001

Well … bummer!  Looks like no savings with ROW compression, and running the same query above with PAGE compression did not look much better. Why? It seems like a NVARCHAR(MAX) column full of HTML would be a great candidate for compression.

The answer lies in the allocation units in use for this table.  Let’s see how the allocation works for our archive table:

SELECT OBJECT_NAME(sp.object_id) AS [ObjectName]
, si.name AS IndexName
, sps.in_row_data_page_count as In_Row
, sps.row_overflow_used_page_count AS Row_Over_Flow
,sps.lob_reserved_page_count AS LOB_Data
FROM sys.dm_db_partition_stats sps
JOIN sys.partitions sp ON (sps.partition_id=sp.partition_id)
JOIN sys.indexes si ON (sp.index_id=si.index_id AND sp.object_id = si.object_id)
WHERE OBJECTPROPERTY(sp.object_id,'IsUserTable') =1
AND OBJECT_NAME(sp.object_id) = 'description_archive'
order by sps.in_row_data_page_count desc

The results show that the majority of this table data is stored in LOB type pages instead of in-row, because the “description” field is composed of values mostly > 8,000 bytes.image

If we had data in our NVARCHAR(MAX) column whose size was <= 8000k we would benefit from PAGE level compression, as the data would be stored IN ROW (provided we didn’t specify something different

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.