
Here is the overall process flow, with more detailed steps under each phase:
A standard set of preparation activities takes place before the initial runs of Data Migration, including:
mysql.optional.tabledefs=CHARSET=utf8 COLLATE=utf8mb3_bin
This caused initial runs to fail due to existing duplicates in the source database and created an additional cleanup requirement on the customer side. Another solution might have been reprovisioning the database as case sensitive, which was not an elected option.
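Such duplicates can be located up front on the source database. Here is a sketch in MySQL syntax; the products table and p_code column are illustrative only - adapt them to whatever tables DB Sync flags:

```sql
-- Hypothetical example: list values that collide under case-insensitive
-- comparison and would clash as duplicates after migration.
-- Table and column names are illustrative only.
SELECT LOWER(p_code)    AS collision_key,
       COUNT(*)         AS cnt,
       GROUP_CONCAT(PK) AS affected_pks
FROM products
GROUP BY LOWER(p_code)
HAVING COUNT(*) > 1;
```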
The set of mandatory and optional properties is well described in the documentation; here is our set, with comments:
# enable execution in cluster
migration.data.export.enabled=true
# Set as required
migration.data.failonerror.enabled=false
# Required driver, jar should be provided via external-dependencies.xml
migration.ds.source.db.driver=com.mysql.jdbc.Driver
migration.ds.source.db.username=<USER_NAME>
migration.ds.source.db.password=<PASSWORD>
migration.ds.source.db.schema=<SOURCE_SCHEMA_NAME>
# The following properties are only needed if the customer customized the type system name on the source system! Please see the documentation
migration.ds.source.db.typesystemname=<TYPE_SYSTEM_NAME>
migration.ds.source.db.typesystemsuffix=<SUFFIX>
# DB Sync does not update ydeployments system type entries, hence you must migrate data to same-suffix tables. That only affects CX Commerce type system tables:
# atomictypes
# attributedescriptors
# collectiontypes
# composedtypes
# enumerationvalues
# maptypes
migration.ds.target.db.typesystemsuffix=<SUFFIX>
migration.ds.source.db.url=jdbc:mysql://<NATTED_IP_CONFIGURED_IN_CLOUD_PORTAL>:3306/<SOURCE_SCHEMA_NAME>?&rewriteBatchedStatements=true&useConfigs=maxPerformance-8-0&characterEncoding=utf8&useSSL=false&includeThreadNamesAsStatementComment=false&includeThreadDumpInDeadlockExceptions=false&includeInnodbStatusInDeadlockExceptions=false
migration.data.export.enabled=false
It was necessary to identify the SUFFIX used by db.type.system.name (TYPE_SYSTEM_NAME) and configure migration.ds.source.db.typesystemsuffix and migration.ds.target.db.typesystemsuffix accordingly. To identify the SUFFIX for a specific db.type.system.name:
-- composedtypes is one of the type system tables and has the suffix appended to its name
select RIGHT(TableName, 1) from ydeployments where TypeSystemName = '<TYPE_SYSTEM_NAME>' and TableName like 'composedtypes%'
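Note that RIGHT(TableName, 1) returns only the last character. If the suffix might be longer than one digit, a variant that strips the base table name instead is safer:

```sql
-- Returns the full suffix regardless of its length
select REPLACE(TableName, 'composedtypes', '') as Suffix
from ydeployments
where TypeSystemName = '<TYPE_SYSTEM_NAME>'
  and TableName like 'composedtypes%'
```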
Let's get going! We need a working backgroundprocessing service with /hac to be able to run Data Migration using DB Sync.
As we want to safeguard our Data Migration and avoid time-consuming operations, we recommend the Staged Copy migration strategy for the Validation and adjustments phase. This is why we need to properly initialize the target system (ccv2) by creating the default Application Tenant and a special, temporary Migration Tenant.
⚠️Please keep in mind that Application Tenant and Migration Tenant are working definitions introduced only for this Blog, not present in the official documentation!
This is easy - just run standard Initialization!
This step is a bit more complicated. We need to create a second set of database tables, used by the system only for Data Migration!
This can be done by setting the following property in Cloud Portal > hcs_admin service - once saved, do not apply this change (as shown in the screenshot):
db.tableprefix=dbsync_
# Since we want DB Sync to migrate data to the tables associated with the Application Tenant, not the Migration Tenant, we need to set the following property
migration.ds.target.db.tableprefix=
The final step is to run Initialization again, this time for Migration Tenant!
No surprises here - the goal of this phase is to eventually achieve a successful Data Migration by iteratively executing it and fixing data issues. Another goal is to adjust settings in case of performance problems.
⚠️This phase was primarily run on anonymized data, hence the adjustments applied were not complete. It is a good idea to assume the worst-case scenario and plan additional time to fix potential issues in tables affected by prior anonymization.
If you need that extra level of control, there is an interesting Staged Copy Recreate migration strategy variation you might want to take. The basis of this variation is to avoid schema adjustment and instead go for full DDL recreation. Here is a high-level description:
DECLARE @TableName varchar(255)

DECLARE CursorToDelete CURSOR FOR
  SELECT TABLE_NAME
  FROM INFORMATION_SCHEMA.TABLES
  WHERE TABLE_TYPE = 'BASE TABLE'
    AND TABLE_NAME NOT LIKE 'init__%' -- Replace with your prefix
    AND TABLE_NAME NOT LIKE 'MIGRATIONTOOLKIT%'
    AND TABLE_NAME NOT IN ('JGROUPSPING', 'CCV2_TYPESYSTEM_MIGRATIONS')

OPEN CursorToDelete

FETCH NEXT FROM CursorToDelete INTO @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
  EXEC('DROP TABLE [' + @TableName + ']')
  FETCH NEXT FROM CursorToDelete INTO @TableName
END

CLOSE CursorToDelete
DEALLOCATE CursorToDelete
In our case, since (by customer decision) the collation was changed to case sensitive, the majority of encountered issues involved fixing duplicates. These issues occurred practically only in customer business data - hence the Customer was primarily responsible for fixing them on the source side:
One of the specifics of this Data Migration was that we had the possibility of running Data Migration over and over again, with the Customer adjusting data on the source on the fly. This will not always be the case, with the following issues blocking such an approach:
It is possible to mitigate these restrictions by making sure all the data is copied first and then addressing the issues on the ccv2 side!
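If you take that copy-first route, the same kind of collision check can be run against the target database, this time in T-SQL. As before, the table and column names are illustrative only:

```sql
-- Hypothetical example: find rows already copied to the target that
-- collide under a case-insensitive collation.
-- Table and column names are illustrative only.
SELECT UPPER(p_code) AS collision_key, COUNT(*) AS cnt
FROM products
GROUP BY UPPER(p_code)
HAVING COUNT(*) > 1;
```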
Once we had addressed all the issues, we got a successful Data Migration execution:
To perform Update Running System (URS) on Customer data, we need to switch to the dataset associated with the Application Tenant:
db.tableprefix=
Since we have no direct access to the database, in case of issues during URS our only chance to fix them is to switch back to the Migration Tenant!
There is no single optimal set of properties and configuration that fits all projects; here is just an example that ran stably on a p1 environment. The final settings depend on:
Take a look at this documentation. For extreme cases there are techniques for ensuring stability or shortening duration, but as this article is based on a fairly standard case study we won't describe them in detail. High-level recommendations include:
migration.data.reader.batchsize.{table}=30000
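Beyond the per-table batch size, the knobs we found most relevant are the reader/writer worker counts and the number of tables copied in parallel. Here is a sketch of such a set, assuming the property names documented for SAP Commerce DB Sync; the values are illustrative starting points, not recommendations:

```properties
# Illustrative values only - tune against your own environment
migration.data.reader.batchsize=1000
migration.data.maxparalleltablecopy=2
migration.data.workers.reader.maxtasks=3
migration.data.workers.writer.maxtasks=10
```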
In this phase, Data Migration is executed as part of the Cutover, preferably with the production database as the source system. This way we make sure that all potential production data issues are solved before the actual go-live.
Nothing specific to add here. The Cutover Plan, tested during the dry-run executions specific to our project, can be found below in the Flight Log.
After a successful go-live, don't forget to:
Here is the actual step-by-step guide we prepared for the Dryrun/Cutover Data Migration for one of our Customers. For this, the Direct Migration strategy was used!
task.engine.loadonstartup=false
migration.ds.target.db.typesystemsuffix=[suffix used in on-prem active ts]
migration.ds.source.db.typesystemsuffix=[suffix used in on-prem active ts]
migration.ds.source.db.typesystemname=[active ts name]
drop table typesystemprops1;
exec sp_rename 'dbo.typesystemprops', 'typesystemprops1';
task.engine.loadonstartup=false