My MVP Profile

Tuesday, August 29, 2006

Thoughts on Summary Reporting and SCDW Grooming

After installing the Summary Reporting Pack a couple of times recently, I got to thinking about the interdependencies of MOM databases from a disk space perspective, especially in single server environments, and wanted to share a couple random thoughts on the initial reporting data aggregation and grooming processes once you install the Summary Reporting Pack based on some observations of a much-less-than-SQL-guru during recent installations. I suppose this may also have some value in recovery if you should discover your SCDW grooming process has failed for extended periods due to lack of disk space.

 

 

Summary Reporting Notes

 

The Summary Reporting Pack aggregates data points (alerts, performance) into a single point for each day (or week) for a given counter or alert. I won't go into the detail as the Summary Reporting manual and Marcus have addressed this already in fine detail. I want to talk for a moment on managing log utilization and disk space in the process and job scheduling through the initial aggregation process and going forward.

 

At this point I'll restate a couple of somewhat self-apparent but important points:

 

  1. Don't let the Summary Reporting 'BuildAggregations' SQL job do the initial aggregation for you, as it will attempt to do it all at once. FYI – It is scheduled to run nightly at 2am by default.

A couple of items to note:

 

    • The DTS launched by this job appears to execute 4 stored procedures), 3 executed in parallel, which means they're all hammering away at server resources in unison.

If you want a good visual representation of the process, just open the BuildAggregations.dts DTS package launched by the BuildAggregations SQL job through Enterprise Manager (dts package located in the install directory for the Summary Reporting installer package).

 

    • This job will cause your tempdb to grow substantially as the procedures executed create temporary tables during the aggregation process. (several GBs for each week varying by the number of servers, and their associated records in the SCDW fact tables)

 

    • More importantly, it does not appear chunk the operation in any way (that I can see), it's like one great big transaction, so if that BuildAggregations job is allowed to run on it's own initially, it will fail unless you have a vast amount of free disk space for tempdb.

So once you have Summary Reporting installed and configured, follow the manual aggregation instructions in the guide that comes with the pack and manually chip away at that data a week to 10 days at a time until you have all your historical data taken care of.

 

…and if you can't do it the day you install it, consider disabling the BuildAggregations SQL job until you have time to complete this task.

 

  1. When you lower the historical data retention on your 6 fact tables in SystemCenterReporting (SCDW), set your target and then groom the data manually in small doses.

 

To explain that a bit further…

 

If you have the default 395 days of data and you then aggregate that data into the summary data points as provided by the Summary Reporting Pack, you've now succeeded in increasing the total size of your SCDW. So to get your "up to 65% disk savings" as advertised with Summary Reporting, you now need to reduce the number of days you retain the historical data in the 6 fact tables.

 

NOTE: If you perform the following before you've successfully performed the initial aggregation process with Summary Reporting, there will be nothing to summarize. L

 

For example, let's assume you decide that with 395 days of data summarized into thousands of tidy single points of data, you can get by with storing only 6 months of detailed historical data. At this point, you use the standard SQL query to reduce retention in the 6 fact tables to only 7 months (see Marcus for the how HERE.)

 

This means the SCDWGroomJob on its next run is going to attempt to groom everything older than your retention date – in our example, that is 6 months of historical detail data (13 mths – 7mths).

 

So what is the impact of this in terms of disk space during the grooming operation? I wanted to just get some add-hoc data to quantify the impact. Here's a rundown of what I saw and some of the why.

 

The environment:

 
Agent-managed servers: approx 200

Rows of data (per day) in SC_SampledNumericDataFact_Table: approx 1 million rows (based on a dozen or so representative samples). (There were about 144 million rows total when we started). Row counts in the other 5 fact tables were so small in comparison in this environment I didn't factor them in.  

 

The result:

 
By lowering historical data retention 1 day at a time (using the standard published SQL statement), I could then run the SCDWGroomJob in SQL and watch the growth of various databases during the process. What I saw was tempdb grow approximately 2.8GB in the process of grooming 1 days data (1 million rows). Reducing historical data retention by 10 days resulted in tempdb growth of about 28GB.

 

Why this happens:

 

SCDWGroomJob starts by putting an exclusive lock on 'MOM.Datawarehousing.DTSPackageGenerator' to ensure no other grooming operations are taking place.

 

It then creates a pair #temp tables (in tempdb), one to store some class ID relationship information on source and target rows (#tmpRelationshipContraints), and another to hold the ordered list of classids in the delete order (#tmpDeleteList).

 

And in short, the more data out there to be groomed in the data warehouse, the bigger tempdb will grow (no chunking apparent in this job either). So once you get your Summary Reporting aggregations out of the way, it's probably a good idea to perform the data warehouse grooming in the same controlled manner, reducing retention a few days at a time, running the SCDWGroomJob again, rinse, repeat, until you're down to your target retention level.

 

When complete, you could at this point  shrink your SystemCenterReporting database and transaction logs, leaving the prescribed free space (40% in SCDW if you use the standard SCDW reindexing script).

 

That's all for now. Another post forthcoming on optimal job scheduling and database maintenance in the next few days. Props to Marcus Oh and B lake Mengetto for their shared thoughts on the topic as well.

 

Send any thoughts, comments or other retorts as the spirit moves you to do so.

 

Labels:

Comments on "Thoughts on Summary Reporting and SCDW Grooming"

 

post a comment links to this post