Solved: Aggregate caused more database activity

Read only

746

SAP Managed Tags
BW (SAP Business Warehouse)

We've created an aggregate and have tested some queries on it. Most queries run better. But one query selects 360,000 database records w/o aggregate and selects 537,000 database records with the aggregate. It also takes more time on the database.

I thought that in the worst case scenario an aggregate would return the same results. So I was surprised to see this result. Does anyone know of a scenario where the aggregate could actually perform worse then the cube? Any ideas of where I could look next?

Thanks,

Chris

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Accepted Solutions (1)

Is your Aggregate compressed or uncompressed ?

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Both the cube & the aggregate are fully compressed.

Ravi,

Your question made me look at database size to make sure I answered correctly. My aggregate has 311978 records in it (1.84 mil in cube.) Yet if I read the staticis correctly there is 537095 database reads. That doesn't make sense to me.

I must be misundertstanding the statics.

a. Is field QDBSEL in table RSDDSTAT the number of rows from the cube/aggregate that had to be read inorder to retreive the correct results.

b. Does the value in QDBSEL (537095 in this case) represent cube/aggregate records read. Or can it represent records read from dim tables and master data tables?

Chris

Have you verified your row counts by using SE16? It actually runs a query and gets true current row counts in your aggregate and cubes. If you are looking at something else you might be getting counts based on the last time statistics were collected which could be recent or quite some time ago depending on exactly how your shop is handling statistics collection.

You can collect the fresh statistics a couple of ways, but the easiest is using the Performance tab.

The QDBSEL should be the rows read from the fact table. How does it compare to the QDBTRANS, the rows transferred to the OLAP processor?

As to how you could be having record counts greater against the aggregate than against the base cube, I can't explain that. I have never seen aggregate reads higher than off the base cube scenario. I'm assuming you are running the query thru RSRT twice, once with the aggregate selection turned off and once with it on.

Have you looked at an explain plan of the two executions? Are they the same except for the reference to the base and aggreagte fact tables.

What database are you using?

It is entirely possible to have a query against an aggregate deliver poorer performance than the base cube.

The main reason would be if you do not have much of a degree of summarization between the base cube and the aggregate.

Another reason that I can think of is if you have multiple aggregates and a query that has muliple restricted key figures resulting in muliple "split parts" running against several of the aggregates vs a single execution against the base cube, or a sinlge aggregate. Doubt that's the case here.

Are you working with inventory or some other application tha requires markers in the aggregate?

Chris

I use ST03N transaction in Expert mode to know the number of records transferred from Database.

You can try that.

Thanks

If you know any SAP employee, please request to forward the internal Note:422264 which contains information about fields of table RSDDSTAT

It took me a little while to figure out ST03N, but I like the information that it gives. Thanks. It confirmed my problem that the aggregate takes more time and reads more records then going straight to the cube.

Chris

Reply to the various questions posted by Pizzaman.

Based on your advice I've verified row counts in SE16. I get the same result. (Cube 1.84 mil & aggregate .31 mil rows.)

Aggregate:

QDBSEL = 537095

QDBTRANS = 21904

Cube:

QDBSEL = 360626

QDBTRANS = 15052

Yes, I'm using RSRT twice, once with the aggregate selection turned off and once with it on.

I've not been able to look at the explain plan. Any help on how to do this? I've tried to use RSRT option to show the SQL but I get an error message. I need to figure out why.

We are running on DB2 UDB.

We only have one aggregate at the time. But we do have 12 restricted key figures in the query.

No markers in the aggregate.

Thanks,

Chris

Since you're running DB2, I can't guarantee how if/how well the screens I mention will match since I'm in an Oracle environment. But we'll try....

You can use ST04 to get to the SQL Stmst in that have run recently. SM50 might show you the SQL while it's running, but it might truncate what it displays. ST04 is a good tuning tool to use to examine high resource using queries. You can filter on the fact/aggregate tables to narrow the stmts you'll see.

With respect to RSRT, I check the box called Show Run Schedule, NOT the Show SQL stmt. The Show SQL displays the generic SQL BW generates that must then be translated to the target DB - which is what you'll see in the Show Run Schedule.

So if you select Display Aggregate and the Show Run Schedule, what I think you'll see is what SAP BW refers to as "Split Parts" of the query, and you'll see several 1, 2, 3.... to 12 , or more since you have so many RKFs. Each Split Part will show you whether it was able to use the aggregate, or if it chose the base cube.

Somewhere in the bowels of the BW beast when a query is submitted, the system looks at a query and decides if the query can be satisfied by running against the aggregate rather than against the base cube. To do this it has to examine the query to see whether everything you are looking for is in the aggregate, if not, it will run against the base cube.

Seems simple enough, but (big BUT) what happens if you have multiple RKFs in your query? BW is going to examine each RKF and a generate a query for all the ones it can satisfy from the aggregate and and another separate query for all the RFKs it must get from the base cube. That's what I think is going on in your case. So now you have two queries running. This can get even more interesting if you have multiple aggregates, in which case, depending on the RKFs, it might use multiple aggregates and the base cube.

So now you have multiple queries ( from what I have seen, they run sequentially rather than in parallel) and they are all reading dimension, master data, as well as fact/aggregate tbls. These multiple result sets must be merged by the OLAP processor. This is how you can end up with a query that uses aggregates that can run longer than the query against the base cube and have to perform more DB selects.

Pizzaman,

Thank you. You've answered my question and also provided me with a lot to chew on.

I ran the RSRT with option Display Aggregate. And noticed that there was a 1 & 2. 1 was on the agg and 2 on the cube. I was able to compare them both and determine the missing value. I added the value to the aggregate and tried again. This time the aggregate performs better.

I'm not crazy about this query split. That means that if I don't create my aggregates correctly, I could cause more problems then I help.

ST04 was a little overwhelming, but I can see where this will be a valuable tool once I get use to it.

Thanks,

Chris

I've never seen the split part stuff discussed any where. Aggregates are always touted as the thing to do and people go off blindly creating them.

When I first started trying to tune queries against a cube, I created 5 (what I thought were carefully crafted)aggregates (some were parent/child) and thought I had deserved the Nobel prize. Queries were running faster, but no where near the improvement I had expected. As I started to look into the mess like you have been doing, I found that some of the user queries were invoking all 5 of the aggregates because of the different RKFs we had.

So I finally scaled back to just one slightly larger aggregate and got rid of the others and got another very nice performance gain. Simpler really can be better sometimes.

So aggregates really can be a little more involved than they are made out to be.

Glad to help. Have fun with this stuff.......

Answers (0)

Ask a Question

Top Q&A Solution Author

User

Count

By Category

Related Content

Activity Groups

Industry Groups

Influence and Feedback Groups

Interest Groups

Location Groups

Customer Only Groups

Forums

Related Resources

Products

Learning and Support

About

My Account

My Account

Aggregate caused more database activity

Know the answer?

Need more details?

Accepted Solutions (1)

Accepted Solutions (1)

Answers (0)