Creating Your Own Custom Data Collections

Let me review this code with you starting with Section A.  This section executes the stored procedures “sp_syscollector_create_collection_set”.  This stored procedure creates the collection set container.  When you define your custom collection set you identify the name, collection mode, a schedule for when the collection set is to be run, and how long the collected information will be retained in the MDW.  The collection mode is identified by the @collection_mode parameter.   There are two different collection modes, cached and non-cached.  A cached mode collection means the data collected is first stored temporarily in a cache file and then periodically it will be uploaded from the cache file into the MDW.  Whereas non-cached collection means the collection and upload processes happens at the same time.  For my example I’m using a non-cached data collection.    I identify the number of days to keep the information in the MDW with the @days_until_expiration parameter.   In my case I want to retain my row counts for 180 days or around 6 months.   To identify the schedule for when my collection will run I use the @schedule_name parameter.   Here I identify a schedule named “Daily at Midnight”.    This is a shared schedule and will make my collection run once per day at midnight.

Once my data collection set has been defined I then need to identify the type of data collector I will be using.  Since I will be collecting my statistics with a T-SQL script I need to use the “Generic T-SQL Query Collector Type”.  In section B I set a uniqueidentifer data type to the appropriate value for this collector type.
In section C I am defining my data collection item for my T-SQL statement.  To do this I pass some parameters to the “sys.syscollector_create_collection_item” stored procedure.  If you look at this section you will see that @parameters parameter contains my T-SQL within some XML.  Let’s review that XML, since there are some points you need to understand to help you create your own T-SQL data collection items.  The <Query> element contains a <Value> element and a <OutputTable> element.  I used the <Value> element to identify my T-SQL statements.  There are a couple of things you need to know about the <Value> element.  First when using system views like “sys.partitions” you need to make sure they use the proper case, otherwise you will get an error when your data collection executes.  Second is when you specifying conditions that require greater than (>) or less than (<) operators you need to replaces these operator symbols with “&gt;“ for “>” and “&lt;” for “<”.  This is because these symbols are used to define the beginning and ending of an XML element.  If you look at my T-SQL code you will note I used “&lt;” to represent a “<” symbol in my WHERE clause.  To identify the table in the MDW where my statistics will be stored I use the <OutputTable> element.  Here I have identified my table name as “NumofRows”.  The last element I use is the <Databases>.  This element identifies which databases I want my T-SQL to run against when this Data Collection is executed.  Here I have identify two different attributes “UseSystemDatabases” and “UseUserDatabases” and set them both to “true”.  This means I want to run my T-SQL against all databases contained within my instance.   

You might not want to execute your T-SQL Data Collection against all databases in an instance.  If you only what to have a Data Collection process your T-SQL against a single database or a couple of different databases you can use the following syntax for the <Databases> element:

<Databases>
  <Database>USERDB1</Database>
  <Database>USERDB2</Database>
</Databases>

Continues…

Leave a comment

Your email address will not be published.