cancel
Showing results for 
Search instead for 
Did you mean: 

Bad Performance of Merge Join

mark_gearhart1
Explorer
0 Kudos

We are on ASE 15.0.3/EBF 21284 ESD#4.3 working on a application with over 3000 stored procedures.

Our server optimization goal is allrows_mix.

The Merge-Join is giving us problems. When a query uses Merge-Join, it usually take an order of magnitude longer to run than if we force it

to use the other types of joins - nested-loop, n-ary-nested-loop, or hash-join.

The query plan shows sorting on worktables leading into the merge-join.

I know I can disable it with "set merge_join off", or "set plan optgoal allrows_oltp", but I'd rather not if I can fix the problem instead.

Question: Are there configuration options that would help merge-join?

I've done variations of this:

sp_configure "number of sort buffers", 32000

I've also done variations of this in the proc:

set parallel_degree 5

set scan_parallel_degree 4

When I run the following command, I see sort buffer starvation:

1> sp_monitorconfig "sort buffers"

2> go

Usage information at date and time: Apr 24 2014  2:31PM.

Name                            Num_free    Num_active  Pct_act   Max_Used    Reuse_cnt

-------------------------                -----------            -----------      -------         -----------         -----------

number of sort buffers                0               82045   100.00           82045                  0

(1 row affected)

(return status = 0)

Maybe there are other configuration option to help merge-joins? Any ideas?

Thanks.

mark_gearhart1
Explorer
0 Kudos

Was hoping for some specific information on configuring for better merge-join performance.

Anyone else?

simon_ogden
Participant
0 Kudos

Hi,

A merge join is a merge join, its performance will depend on:

Tempdb perofmance if large sorts are taking place to provide sorted data to the merge operator (and cache availability).

Index cluster ratios if data is already sorted (and also data page/row cluster ratio if values are needed at the data level).

Contention for tempdb with other spids.

The ease of which the join key(s) comparison can be made (not tunable as will depend on the data types and data).

There's nothing you can do to directly tune a merge join, the merge operator just does what it can do in the fastest time it can on the cpu, remembering it may take a while to provide the sorted result set anyway.

As Mark says, the chances are it is doing multiple large sorts to facilitate the merge join as merge joins were favoured far too heavily on earlier 15x versions (and still are really).

You could make sure the optimizer level is ase_current which might help them not be selected.

You may also find you have some missing statistics (set option show_missing_stats on) which might be contributing to the selection of a merge join.

All in all though, the aim would be to get it to NOT select a merge join if it is proving inefficient. Making tempdb quicker and/or rebuilding indices are the only other real way you'll get the exisitng plan to run quicker.

Cheers,

Simon

simon_ogden
Participant
0 Kudos

I looked at your original post again. Don't worry about what looks like sort buffer starvation. That metric is basically usless via sp_monitorconfig.

'number of sort buffers' is a limit per sort operation, so if you have X spids all using 'number of sort buffers', you can actually get way more being used than the configured value for 'numbers of sort buffers'

Mark_A_Parsons
Active Participant
0 Kudos

You've stated that the merge join is a magnitude slower than nested loop joins and hash joins.

Why waste time trying to get a merge join to run 'faster' if you've got other options that already run faster?

mark_gearhart1
Explorer
0 Kudos

In our company, we would regression test 3000 procs if we changed the server-wide optimization goal. Alternately, we would work through each of the 3000 procs one by one, figuring out if it makes sense to use "set plan optgoal allrows_oltp". So, "wasting time", I suppose, is not a factual response if you have all the facts.

Thanks for your input anyway.

Mark_A_Parsons
Active Participant
0 Kudos

Well, I've worked for a couple (financial/banking) clients that had 1000's of procs (and standalone queries) and had allrows_mix set at the dataserver level.

One client, after months of testing and retesting and tweaking countless queries to work 'properly' (99% of them having to be forced to use allrows_oltp), and still only having made a dent in the stored proc inventory, I was asked to help with the tuning efforts.  To get through the migration (to ASE 15.x) I suggested getting back to comparable12.5 level performance by switching to allrows_oltp.  We completed the (re)testing of all procs in less time than had already been spent trying to force every query of every proc to run under allrows_mix. Then we manually tweaked the (relatively) few procs that could benefit from merge joins (though in most cases we opted for hash joins since it eliminated most of the worktable overhead of merge joins).

Another client insisted on using allrows_mix and manually having to tweak large quantities of of their queries (stored procs and standalone queries).  Yeah, they jumped through the hoops of trying to make merge joins run faster, failed to make any noticeable improvement, and continued on with forcing allrows_mix. They gave responses similar to yours when I suggested using allrows_oltp as their default setting soooo, it took awhile to work through their query inventory.

The first client understood the benefit of the KISS principle, and saved time by switching to allrows_oltp, which in turn free'd up my time to work on the more interesting/challenging performance issues (eg, improving EOD processing so they wouldn't overrun their SLAs).  The second client eventually got their inventory 'tuned' but it took a looong time to get to the finish line with allrows_mix (thankfully I work on an hourly basis).

Former Member
0 Kudos

regressing back to allrows_oltp setting to solve your performance problems should not be encouraged

as it devoids you from the powers of merge and hash join schemes,bad performance or the reason of wrong join scheme chosen should be investigated more accurately rather than blaming the join operators...because with the advent of newer versions of Sybase any inherent optimization problems are fast diminishing ...if any.

Mark_A_Parsons
Active Participant
0 Kudos

Well, I'm gonna have to emphatically disagree with your comment ...

"regressing back to allrows_oltp setting to solve your performance problems should not be encouraged"

For *EVERY* client I've worked with on migrating from ASE 12.5.x to ASE 15.x ... they all had the same objective ... get through the migration as quickly as possible and do not degrade the performance of our database queries.  Unfortunately for every client I've worked with ... ASE 15.x, and the default of allrows_mix, did just the opposite, ie, migrations took much longer than expected/planned due primarily to huge performance degradation across their SQL inventory.

For most of my clients merge joins were rarely, if ever, used in ASE 12.5.x.  And since hash joins never existed, that leaves us with using nested loop joins in ASE 15.x in an attempt to stay as close to ASE 12.5.x in terms of performance.

NOTE: No, I don't consider compatibility mode as a solution as this requires you go through 2 migrations ... once to compatibility mode ... and eventually once to get off of compatibility mode.

Now, can merge joins improve the performance of *some* queries?  Absolutely, but in practice ... especially with the first 4-5 years of ASE 15.x releases ... merge joins caused more headaches and performance degradation than they were worth.  I've seen too many clients spend huge amounts of time trying to re-write code to work with merge joins, often failing and having to 'regress back' to nested loop joins in the end.

Unfortunately a) Sybase delivered ASE 15.x with allows_mix as the default and b) most companies didn't have enough migration experience to understand the pitfalls of trying to run all of their queries under the default of allrows_mix.  This meant that many companies were left having to 'regress back' to alternative solutions (eg, allrows_oltp, compat mode, don't migrate, move to another RDBMS) to address the performance degradation introduced with ASE 15.x and the default setting of allrows_mix.

Former Member
0 Kudos

I do agree that going back to oltp setting would have enabled you to solve your apparent performance problems in a faster manner.

However at the same time it seemingly suggests that when choice of both merge and nl joins are thrown open to the optimizer it is wrongly selecting the merge join option over nl joins at least at some places.

It would be interesting to know what conditions are driving this selection and will it happen even if we provide enough/uptodate statistics information to the optimizer because missing stats may be one of the front running causes.

Accepted Solutions (0)

Answers (3)

Answers (3)

Former Member
0 Kudos

No joins are bad they are good for different cases, In your case you mentioned that nested loop joins are faster then your stats must be out of date. Try using set statistics plancost on and see if there is diff between actual and estimated rows.

Thanks

Former Member
0 Kudos

Did you check the estimated rows etc for the join condition? It is quite possible that the estimated rows are very high whereas actual rows are very less and that is why optimizer selected merge join but it was not efficient as compared to nested loop join as the actual rows were really less. You can use set statistics plancost on to check this.

if estimated and actual rows are quite similar than merge join will provide much better performance.

mark_gearhart1
Explorer
0 Kudos

Interesting. I was just reading over Jeff Tallman's 2014 presentation "Troubleshooting and Tuning ASE Statement Cache Query Performance", presented at the ISUG-TECH conference a couple weeks ago, and saw this at the bottom of one of the pages:


Sort Merge Joins = bad bad bad (unless 2 #temp tables)

Likely means an index missing or similar

So, it may be that no matter which version of ASE, the Merge Join is something to avoid if possible.

Mark_A_Parsons
Active Participant
0 Kudos

There's nothing wrong with the merge join *IF* your tables/indexes can support it.  In fact, merge joins can run circles around nested loop joins but again, this only happens *IF* your tables/indexes can support them properly (eg, no need for worktables).

While I can't speak for Jeff I think you're confusing 'merge join' with 'sort merge joins' ... 2 different animals.  'sort merge joins' are typically bad because of the overhead of building, populating and sorting worktables.  Eliminate the need for the worktable and a 'merge join' can be quite effective under the right conditions.