USEFUL SITES :
Write for Us
Performance Issues with Sequential Processing
Inspite of usage of partitions it was found that given the time window of 2 hours for processing the dimensions and cubes was not sufficient as the volume of data was fairly large (often 40 million rows for Payments, Sales cubes) due to back dated updates for Payments and Sales based fact tables.It was found that with sequential processing the time taken to process the backdated transaction related cubes was exceeding two hours. Hence the cube processing SLA was not being met. The average data volumes processed across different manufacturing projects in the daily runs for the dimensions and cube and the time taken to process is given in the table below –
Dimension/Cube
Daily Refresh Strategy
Object Type
Processing Time
Data Volume
Item
Full
Dimension
6 seconds
21593
Time
5 seconds
22286
Factory
2 seconds
450
Supplier
780
Payments
Cube
1 hour 05 minutes
37,123,254
Receipts
1 hour 20 minutes
46,187,562
Total time taken for Cube Processing = 2 hours 25 minutes (sequential processing)
The average data volumes processed across different retail projects in the daily runs for the dimensions and cube and the time taken to process is given in the table below –
Product
7 seconds
22978
4 seconds
Outlet
566
963
Waste
1 hour 15 minutes
44,123,254
Sales
1 hour 25 minutes
52,187,562
Total time taken for Cube Processing = 2 hours 40 minutes (sequential processing)The diagram below shows the problems with sequential processing can be resolved by utilizing the Parallel Processing Utility in Analysis Services 2000.Performance Gains from Parallel Processing
The Parallel Processing Utility from Microsoft helps in processing cubes in parallel. It achieves this by using DSO calls and optimum utilization of CPU’s. The PPU can process cubes, partitions and dimensions in Analysis Services databases on local or remote servers. It includes two executable files namely ParallelProcess.exe and ProcessPartition.exe. ParallelProcess.exe is a Microsoft Windows program that enables a user to specify and process multiple cubes, partitions and dimensions on a server. Unlike Analysis Manager this program executes requests in parallel. ParallelProcess.exe calls the auxiliary executable file, ProcessPartition.exe to perform the actual processing of cubes, partitions and dimensions.
The basic rationale of parallel processing is explained by understanding the cube processing holistically. Processing partitions is a two-stage operation. In the first stage, data is read from the relational database into AS, grouped into segments and saved to disk. CPU usage in this stage is largely dependent on how fast data arrives from the RDBMS. Generally less than one CPU is occupied at this time. The second stage requires reading the data back from disk and computing aggregations and indexes for the data. This operation is performed in multiple threads, and can be very CPU intensive. Multiple processors may be fully utilized.
The PPU was integrated with the existing DTS jobs for the cubes and based on benchmark testing in the given environment it was observed that 2 partitions could be run in parallel and the processing time came down by around 25 %. This was a substantial gain and helped in meeting the cube processing SLA. The table below shows the data volumes and execution time for the dimensions and cubes with parallel processing of the cubes in manufacturing and retail domains -
47 minutes
57 minutes
Total Time to Process the cubes – 1 hour 44 minutes (parallel processing of 2 partitions).Gain in Time from use of PPU – 41 minutesPercentage Gain in Processing from Use of AS PPU– (41/145 ) * 100 = 28%
50 minutes
1 hour 10 minutes
Total Time to Process the cubes – 2 hours (parallel processing of 2 partitions).Gain in Time from use of PPU – 40 minutesPercentage Gain in Processing from Use of AS PPU– (40/160 ) * 100 = 25%
Hence a substantial performance benefit was found with the use of Parallel Processing Utility (PPU). The gain percentage will however vary depending on available hardware and data volumes. With the given environment it is recommended to process two partitions in parallel without causing memory issues or excessive load on CPU’s.
Recommendations about the use of PPU
The recommendations for parallel processing stem from the basic tenet that “All partitions CANNOT be processed in parallel”. This basically means that increasing parallelism does not imply performance gains.
The recommendations based on CPU sizes are as follows:
# of CPUs Optimum # of parallel executions--------- ----------------------------------------------------4 2-3 8 4-5 16 4-6
Considering the above it is recommended that:
Don’ts for the usage of the Parallel Processing Utility