on 2022 Jun 25 6:12 AM
Version 17 (build 2000) introduced the ability to load tables in parallel and create indexes in parallel (by using the new BEGIN PARALLEL WORK statement).
Unfortunately, parallel creation of foreign keys is currently not implemented.
I understand that this must be a complicated algorithm, because when creating a foreign key, in addition to the master/detail tables, many other tables are locked ('Schema':'Exclusive'&'Shared').
I would like to ask two questions.
1) Is it possible to get a list of such locks in advance, for example, using a query to system tables?
2) Can we use a query on the system tables to understand whether there will be errors when creating two foreign keys in parallel (for example, using a trigger event)?
Request clarification before answering.
Creating foreign keys means using a CREATE TABLE or ALTER TABLE statement. As you are dealing with load table, I assume the FKs are meant for tables that are immediately filled with existing data (like when doing a rebuild or importing external data).
In such cases, you usually will declare foreign keys AFTER the data has been loaded for performance reasons, as has beend discussed here, so you are going to use ALTER TABLE to add foreign keys.
In my understanding, several ALTER TABLE statements are serialized by design, because the docs tell:
A checkpoint is carried out at the beginning of the ALTER TABLE operation, and further checkpoints are suspended until the ALTER operation completes.
If my understanding is correct, your second question will be moot as you won't be able to use two ALTER TABLE statements in parallel.
However, as Ivan has explained in the mentioned FAQ, the RI check when declaring a foreign key is already using parallelized queries, so even if you probably cannot build foreign keys in parallel, the individual keys seem to be created using parallelism.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Just to add a wild guess:
SQL Anywhere automatically creates (logical) indexes for any primary or foreign key. As several logical indexes can be based on the same physical index (as long as the index column(s), column order and ordering are identical), you might use BEGIN PARALLEL WORK to create fitting physical indexes for your foreign keys in parallel, so ALTER TABLE does not have to build a physical index itself but can rely on the already built ones. And you could drop those "help indexes" lateron...
Note, I have not tried that suggestion, and I'cant tell whether the overall performance will increase.
Sorry for the delay in reply.
"you usually will declare foreign keys AFTER the data has been loaded for performance reasons..."
I proceed as follows: before unloading, I drop that part of the foreign keys (about 20%), which, when reload, consumes 80% of the time.
After the reload is complete, I create the foreign keys that were removed in the first step using a separate script.
"A checkpoint is carried out at the beginning of the ALTER TABLE operation"
I conducted an experiment in single-user mode with the creation of only one foreign key (*) on a real database with a size of 2G.
When creating a foreign key, I get 55 records in sa_locks() (table_type= 'BASE', lock_class = 'Schema', lock_duration = 'Transaction'). Of these: 41 times lock_type = 'Exclusive' and 14 times lock_type = 'Shared'.
In my opinion, all this is difficult to explain using a checkpoint.
"the individual keys seem to be created using parallelism"
During the usual reload of foreign keys, monitoring showed the load of only 1 processor out of 36.
(*) ALTER TABLE "dba"."treat"
ADD FOREIGN KEY "fk_treat_#_patients_accounts" ("account_id" ASC)
REFERENCES "dba"."patients_accounts" ("id")
ON UPDATE CASCADE;
In my opinion, all this is difficult to explain using a checkpoint.
That's a misunderstanding: My point was not that a checkpoint would be used INSTEAD OF locks but that ALTER TABLE does an initial checkpoint and prevents other statements to do a checkpoint until it finishes, so it's even more "exclusive" that an exclusive lock on one or two tables...
During the usual reload of foreign keys, monitoring showed the load of only 1 processor out of 36.
FWIW, have you used sa_conn_info() to notice intra-query parallelism?
Again, AFAIK that's a misunderstanding: A checkpoint does not use locks.
See this FAQ for the steps during a checkpoint.
In my understanding, the locks will be due to the ALTER TABLE statement itself preventing other transactions from using using the "about to be altered table" (probably including the referenced tables)... - and additionally the ALTER TBALE itself will be blocked as long as other transactions have a shared schema lock on that table because they still access the table...
User | Count |
---|---|
52 | |
8 | |
5 | |
5 | |
5 | |
5 | |
5 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.