Data Warehouse Partition Strategies

III. Partitioning Management
Now, that our table is partitioned, you are wondering, what will happen if there is an insert that has a date >= ‘2009-03-01’? What will happen is that all the data will be placed in the FG_FactTable_200902 because the data is right aligned as specified when we created the Partition Function.

To maintain the data equally distributed between Filegroups, you must plan the future growth of your Database and continue adding Filegroups to the Partition Scheme and continue to add dates to the Partition Function.
To add a File Group to the Partition Scheme:

NEXT USED [FG_FactTable_200903]

To add a new boundary to the Partition Function:
SPLIT RANGE (‘20090301’) 

To show the details about the Partition Function:
SELECT * FROM sys.partition_functions
WHERE name = ‘PFFactTableMonthly’

To show the boundaries of the Partition Function:
SELECT, b.* FROM sys.partition_functions a, sys.partition_range_values b
WHERE a.function_id = b.function_id
and = ‘PFFactTableMonthly’

To show the Allocation Units:
SELECT object_name(object_id) AS name,
    partition_id, partition_number AS pnum, rows,
    allocation_unit_id AS au_id, type_desc as page_type_desc,
    total_pages AS pages
FROM sys.partitions p JOIN sys.allocation_units a
   ON p.partition_id = a.container_id
WHERE object_id=object_id(‘schema.FactTable)

From my point of view the following statement is the best statement available to provide an overview of your partitioned table, because it shows the INDEX_ID, Partition_Number, Filegroup that is associated with the respective Partition_Number, ROWS, PAGES and the date that is associated with the FileGroup.

SELECT OBJECT_NAME(i.object_id) as Object_Name,
  i.index_id AS Index_ID,
         p.partition_number, AS Filegroup_Name,
         CASE boundary_value_on_right
             WHEN 1 THEN ‘less than’
      ELSE ‘less than or equal to’ END as ‘comparison’, value
FROM sys.partitions p JOIN sys.indexes i
      ON p.object_id = i.object_id and p.index_id = i.index_id
       JOIN sys.partition_schemes ps
                ON ps.data_space_id = i.data_space_id
       JOIN sys.partition_functions f
                   ON f.function_id = ps.function_id
       LEFT JOIN sys.partition_range_values rv
ON f.function_id = rv.function_id
                    AND p.partition_number = rv.boundary_id
     JOIN sys.destination_data_spaces dds
             ON dds.partition_scheme_id = ps.data_space_id
                  AND dds.destination_id = p.partition_number
     JOIN sys.filegroups fg
                ON dds.data_space_id = fg.data_space_id
     JOIN (SELECT container_id, sum(total_pages) as total_pages
                     FROM sys.allocation_units
                     GROUP BY container_id) AS au
                ON au.container_id = p.partition_id
WHERE OBJECT_NAME(i.object_id) = ‘FactTable’

Some useful Tips:

  • Always use a rule on DataFiles/FileGroup nomenclature, to make it easier to understand the dependencies.
  • When partitioning tables, don’t randomize the creation of the DataFiles, for example put the DF1 on StorageA, DF2 on StorageB, DF3 on StorageC, DF4 on StorageA,DF4 on StorageB…
  • For best performance align your non-clustered indexes with the partition.
  • After the end of the previous month, rebuild the respective previous partition number.
  • Be creative


Leave a comment

Your email address will not be published.