cancel
Showing results for 
Search instead for 
Did you mean: 

Is there something like an "intra-batch" parallelism?

VolkerBarth
Contributor
2,706

SQL Anywhere has long offered "inter-query parallelism" to allow the simultaneous execution of different requests, and has introduced "intra-query parallelism" in v10 to allow the parallel execution of parts of a single query.

But what about several statements within one code block or batch - may they be run simultaneously, too, or is each statement executed serially?

Say, I have a small code block like the following - would SQL Anywhere try to do the calculations in parallel (as they are based on different tables):

begin
   declare nCount int = 0;
   set nCount = nCount + (select count(*) from Products);
   set nCount = nCount + (select count(*) from Customers);
   set nCount = nCount + (select count(*) from SalesOrderItems);
   select nCount;
end;

(Yes, when using the SQL Anywhere demo database, a parallel execution would certainly be not worthwhile here, but I hope you get the idea...)

What about code blocks with external functions - might they run in parallel? (AFAIK, the engine uses separate worker threads to handle them.)

Accepted Solutions (0)

Answers (2)

Answers (2)

Breck_Carter
Participant

It may be a moot point, but the example you show might not benefit from "intra-batch parallelism" because it already exploits "intra-query-parallelism" to the maximum extent.

Here's the plan for one of your queries, using Foxhound's very own Table From Hell (a few days' worth of connection history for a target server with 700+ connections):

select count(*) from rroad_group_2_property_pivot;

count()
4605632


Re Is there something like an intrabatch parallelism

Here's part of the plan (SUBQ 3) after a manual rewrite to encourage "more parallelism"... SQL Anywhere implements each of the subqueries separately, and each of them uses intra-query parallelism to the full extent:

SELECT (select count(*) from rroad_sample_set)
     + (select count(*) from rroad_group_1_property_pivot)
     + (select count(*) from rroad_group_2_property_pivot) AS nCount;

nCount
4637562


Re Is there something like an intrabatch parallelism

VolkerBarth
Contributor
0 Kudos

Yes, that's expected behaviour - the question remains "If you can rewrite that, can the optimizer do that, too?"

But apparently that's a very difficult question as the engine would need to evaluate whether a change in the order of execution of different statements would make a difference (because they are not independent of another) or might change the transaction semantics... - rocket science, methinks.

Breck_Carter
Participant

Short WAG: "You're SOL, Earth Man!" 🙂

Long WAG...

I think "inter-query parallelism" is the simultaneous execution of separate requests made by separate connections, not just "separate requests".

Presumably, "intra-batch parallelism" might be considered an extension of "intra-query parallelism" since by definition a batch (or block) is sent to and processed by the server as a single unit.

However, just-in-time optimization of each statement is a hallmark of SQL Anywhere. The challenge of "parallelizing sequential programs" has been a topic of advanced research among human beings for decades... throw in just-in-time optimization, and "parallelizing SQL Anywhere batches" becomes a topic for alien super-beings.


Back here on Earth, the docs have some interesting topics...

Parallelism in the SQL Anywhere server

SQL Anywhere threading

Plan caching

"Normally, the optimizer selects an execution plan for a query every time the query is executed."

Query optimization and execution

"Optimization is essential in generating a suitable access plan for a query. Once each query is parsed, the optimizer analyzes it and decides on an access plan that computes the result using as few resources as possible. Optimization begins just before execution. If you are using cursors in your application, optimization commences when the cursor is opened. Unlike many other commercial database systems, SQL Anywhere usually optimizes each statement just before executing it. Because SQL Anywhere performs just-in-time optimization of each statement, the optimizer has access to the values of host and stored procedure variables, which allows for better selectivity estimation analysis. In addition, just-in-time optimization allows the optimizer to adjust its choices based on the statistics saved after previous query executions."

Parallelism in client applications (probably off-topic, but interesting nonetheless)

SQLCA management for multithreaded or reentrant code

"... if you use a single connection, you are restricted to one active request per connection. In a multithreaded application, you should not use the same connection to the database on each thread unless you use a semaphore to control access."

Multiple SQLCAs

"Each thread must have its own SQLCA."

"All operations on a given database connection must use the same SQLCA that was used when the connection was established."

alt text

VolkerBarth
Contributor
0 Kudos

So you got a snapshot of such an "alien super-being"? - Sorry, Ian, I'm relating to the one on wheels...

separate requests made by separate connections

Thanks for pointing that out: AFAIK SA does not currently execute separate requests from the same connection simultaneously - and obviously it would make a difference for clients if that would be changed and they might have to switch to a somewhat asynchronous mode to accept results.

As to "intra-batch parallelism": I should note I'm not asking for such a feature, I'm just interested whether this is already planned or thought about given that intra-query parallelism is supported, and that parallelizing serial code is generally a topic in the PL landscape and partly supported by compilers/libraries.