<< MDX Studio 0.4.15 with support of SSAS 2012 and 2014 | Извлечение данных из 1С путем прямых запросов >>

MDX: Block Evaluation and Surrogate Attributes

by Admin20. July 2014 19:19

In this article we'll show you one interesting approach which we discovered recently: we call it "surrogate attributes". It speeds up the performance of many calculated measures, especially those which run through dimension members of the lowest granularity level.

Let us look at the example:

SUM(

FILTER(

NonEmpty(

existing [Customer].[Customer].[Customer].Members

, [Measures].[Internet Sales Amount]

)

, [Measures].[Internet Order Quantity] >= 2

)

, [Measures].[Internet Order Count]

)

You may find that this measure is artificial. Yes, it is: I've done this measure on Adventure Works database just to demonstrate the approach. The goal is to calculate something by running on low-level objects.

In this example we need to find out how many orders made those customers whose quantity of ordered products is more than 2.

If we build a report like this

we'll get 7 seconds execution.

By analyzing the request in MDX Studio and looking at the performance counters, we found this:

Time : 7 sec 607 ms
Calc covers : 0
Cells calculated : 43641
Sonar subcubes : 4103
NON EMPTYs : 995
Autoexists : 6
EXISTINGs : 2422
SE queries : 2318
Flat cache insert : 0
Cache hits : 2649
Cache misses : 2098
Cache inserts : 982
Cache lookups : 4747
Memory Usage KB : 16884

43641 calculations, 4103 sonars, 2318 storage engine queries. If we look at some interesting performance counters in the case if we run a similar MDX request on a real customer's database, we notice this:

Unfortunately, there is no way to tell to the MSAS "please, read all the measuregroup into memory because you will need it WHOLE later." There is no such MDX or XMLA command.

So, what shall we do in this case? The answer is next: we need to improve our formula AND to add one trick to the dimension.

The construct "SUM(FILTER(set, condition), measure)" is a very ineffective one, and it should be replaced with "SUM(set, IIF(condition, measure, null))" to allow block evaluation working.

So, if we change the formula like this

SUM(

existing [Customer].[Customer].[Customer].Members,

IIF(

[Measures].[Internet Sales Amount] > 0

AND

[Measures].[Internet Order Quantity] >= 2,

[Measures].[Internet Order Count],

null

)

we will get much better performance:

Time : 179 ms
Calc covers : 0
Cells calculated : 250
Sonar subcubes : 72
NON EMPTYs : 1
Autoexists : 0
EXISTINGs : 71
SE queries : 159
Flat cache insert : 0
Cache hits : 285
Cache misses : 0
Cache inserts : 0
Cache lookups : 285
Memory Usage KB : 0

We can stop out optimization efforts, of course. But this is not enough: we run 159 storage engine queries instead of one, and I still don't like it.

Why this happens? Just because for every non-empty cell the set

existing [Customer].[Customer].[Customer].Members

is different. But how can we make so that this set is the same for every cell?

This can be done with the help of a trick: let us add a new attribute to the Customers dimension with the same key as the key attribute and with no specified name:

Process the cube, and rewrite the formula like this:

SUM(

[Customer].[Customer Surr].[Customer Surr].Members,

IIF(

[Measures].[Internet Sales Amount] > 0

AND

[Measures].[Internet Order Quantity] >= 2,

[Measures].[Internet Order Count],

null

)

Now look at the result:

Time : 25 ms
Calc covers : 0
Cells calculated : 250
Sonar subcubes : 1
NON EMPTYs : 1
Autoexists : 0
EXISTINGs : 0
SE queries : 3
Flat cache insert : 0
Cache hits : 3
Cache misses : 0
Cache inserts : 0
Cache lookups : 3
Memory Usage KB : 0

Isn't it impressive?

We've removed "existing" keyword because we did not need it any longer: the surrogate attribute evaluates to null in all those non-existing coordinates. So, if we look at this cell

It will sum to

(British Columbila, Customer1_Surrogate, measure) +

(British Columbila, Customer2_ Surrogate, measure) +

. . . + (enumerated all customers) +

(British Columbila, CustomerN_ Surrogate, measure)

But for all those customers who do not belong to British Columbia, such tuples will automatically evaluate to null, because attributes "State-Province" and "Customer Surr" are INDEPENDENT on each other.

If you used the real dimension key (instead of surrogates) in the formula, then IT WOULD NOT EVALUATE TO NULL, because real key "vanishes" the British Columbia (because the "State-Province" is dependent on the real key). That is why real key attribute doesn't suit us.

Conclusion

We optimized everything from 7.6 seconds to 179 milliseconds – in 42.4 times, and we COULD STOP on this, but we did not. Instead, we made one more optimization – from 179 milliseconds to 25 milliseconds, that is in 7.16 more times.

On this example with Adventure Works this may seem as "not important". But believe me, on real huge databases this plays a big role.

From now we try to make surrogate attributes every time in every dimension, and we always consider using them instead of real keys in all calculations where we have enumeration of the values.

P.S.

You can get acquainted with our BI suite "Business Analysis Tool" at our website www.bitimpulse.com

2edb1268-2305-4be8-a47f-928b2412f6a0|1|5.0|96d5b379-7e1d-4dac-a6ba-1e50db561b04

Advanced AMO BrowserThis tiny SSAS tool allows to make several things: 1) check for unprocessed partitions; 2) generate...New Features of BAT Version 4.100New release allows you to resize width of columns and height of rows in the tables, updated user man...Date and Time Dimension TemplateIn this post we will show how we do date and time dimensions and how it can help dynamic date select...

Add comment

Name* Required Please choose another name

E-mail*RequiredPlease enter a valid e-mail

Country

5+5 =

b i u quote

Comment
Preview

Notify me when new comments are added

MDX: Block Evaluation and Surrogate Attributes

Conclusion

P.S.

Related posts

Add comment

About company

Calendar

Mo	Tu	We	Th	Fr	Sa	Su
25	26	27	28	29	30	31
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	1	2	3	4	5

Mo	Tu	We	Th	Fr	Sa	Su
25	26	27	28	29	30	31
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	1	2	3	4	5

MDX: Block Evaluation and Surrogate Attributes

Conclusion

P.S.

Related posts

Add comment

About company

Calendar

Category list

Newsletter

Thank you

Mo	Tu	We	Th	Fr	Sa	Su
25	26	27	28	29	30	31
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	1	2	3	4	5