cancel
Showing results for 
Search instead for 
Did you mean: 

Full Promotion Using LCMCLI - Very Slow - Not Utilizing RAM

omacoder
Active Contributor
0 Kudos
450

I've read the pattern book over and over but the pattern book is not giving me the answers to solve this problem. I have a clustered environment (6 servers). One of them is strictly a LCM node

The first property file I'm testing is the top 1-2000 static, relationships, SI_PARENT_FOLDER_CUID, SI_OWNER, SI_PATH FROM CI_INFOOBJECTS,CI_APPOBJECTS,CI_SYSTEMOBJECTS WHERE DESCENDENTS("SI_NAME='Folder Hierarchy'", "SI_ID in (23)") AND SI_INSTANCE=0 ORDER BY SI_NAME ASC,SI_ID ASC

It runs for over 3 hours for 2000 reports... the RAM stays at 6.8GB in use out of 48GB total.

I have a dedicated LCM node which runs these services:

  1. CMS
  2. APS.Promotion (-Xms6g -Xmx8g)
  3. AJS.Promotion (Xms20g,Xmx30g)
  4. FRS.In
  5. FRS.Out

LCM_CLI.bat parameter is -Xmx8g

This should leave 2gb left for OS/SIA/CMS/FRS

Properties File (CMS):

  1. LCM_CMS (MSP-BILCM01:6400)
  2. SOURCE_CMS (MSP-BILCM01:6400)

I've created an LCMCLI user group and server group. I've put the 5 servers above in this server group. I've put the Administrator account in this user group. I've set the default settings on the LCMCLI user group to strictly use the servers in group LCMCLI for scheduling.

Also, when I go into CMC \ Sessions, I see Administrator has sessions open with client = Lifecycle management, however the CMS that these sessions are using is not the CMS that I have defined in the properties file for the LCM and SOURCE CMS.

Where am I going wrong? How do I make use of all of this RAM to speed this puppy up?

omacoder
Active Contributor
0 Kudos

I should mention this is a full promotion - so exclude dependencies = true

Accepted Solutions (1)

Accepted Solutions (1)

Joe_Peters
Active Contributor
0 Kudos

I don't have an answer for you, but some thoughts:

You can't control which CMS will own a session, or which one will process a CMS query. (You also can't fully disable a CMS but you knew that already). Specifying a particular server and port in the config file only identifies the cluster; the CMSes will decide for themselves which one will own your session.

There are a lot of processes involved in a migration: the local LCMCLI process, the source CMS, LCM CMS (if different) and several others. The question is, which part is causing the delay? CMS queries have a default timeout period of 9 minutes; your process is running for 3 hours so I don't think the delay is in the CMS query processing. (you could confirm this by running it in Query Builder).

You have dedicated LCM services, but it's possible to set up a fully-dedicated LCM system. This would be an entirely separate BOE cluster that does nothing but host LCM and its dependent services. No guarantee this would help, but it would reduce any contention with the LCM and other services.

I believe the LCM APS will generate intermediate files in the FRS as its importing objects from the source CMS. So there will always be hard disk transfer constraints regardless of how much RAM you have.

One workaround I can think of is to do one root folder at a time, rather than everything as you have it now.

omacoder
Active Contributor
0 Kudos

Thanks Joe- appreciate your thoughts.

Makes sense on CMS-- while I've dedicated LCMCLI servers, it sounds like I should have built an entirely different BI system outside of the cluster to take advantage of segregated services.

As mentioned I have split out the queries into about 35 different queries, but it does include the use of the TOP clause and the order by clause. However, these queries run fast in Query Builder, so contention isn't here.

Not sure if I can describe the contention well as still learning how this is working, but I just kicked it off again to capture what I'm visually seeing:

  1. 2:11PM cmd executed
  2. 2:15PM stack trace shows that the CMS query is done and has gathered the 2000 objects (I'm only doing 2000 per batch). It writes out the list of the reports to stack trace. I would assume at this point the CMS query is done, so the contention should NOT be the CMS query.
  3. 2:11PM derby db created in C:\Users\BIService\AppData\Local\Temp\2 and *.dat files are writing in seg0 folder
  4. 2:16PM it appears to stop writing files to derby seg0
  5. 2:16PM it starts creating 0KB lcmbiar files in temp dir
  6. 2:24PM it writes more *.dat files to derby seg0
  7. 2:24PM it stops creating lcmbiar files in temp dir and now they have all changed from 0KB to ~400000kb
  8. 2:27PM it wrote 4 more small *.dat files in derby seg0
  9. 2:27PM it created it's first 0KB *lcmbiar file in the final destination as specified in my properties export location
  10. At this point... it sits and I don't see any file movement occurring -- no clue what is happening on the server but this is where it sits forever. Never during this time have I seen the CPU or RAM spike (on the LCM node). Ram is still only utilizing 6GB. It must be creating more intermediate files somewhere, possibly on another one of the nodes in the cluster. I enable LCMCLI high tracing at this point, and below is what the trace shows it's doing. This is what takes forever. How do I improve this step?
|3CDB0F6527A24DEDB4ADDE9C3A2775265dd|2019 10 02 14:46:47.773|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |{|0|2|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a35e|||||||||||START OUTGOING CALL Outgoing: FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270172:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265de|2019 10 02 14:46:47.789|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |}|2|0|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a35e|||||||||||END OUTGOING CALL Outgoing: SPENT [00.016] FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270172:1] TO [fileserver_BICMS02.FRS.In.getContentImmediate#localhost:4528:4804.355503:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265df|2019 10 02 14:46:47.789|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |}|2|1|0|1|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270172:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a35e|||||||||||StreamingDownloadFile:nextChunk: 00.016
|3CDB0F6527A24DEDB4ADDE9C3A2775265e0|2019 10 02 14:46:47.960|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |{|0|1|0|1|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270173:1|-|-|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270173:1||||||||||||StreamingDownloadFile:hasNextChunk
|3CDB0F6527A24DEDB4ADDE9C3A2775265e1|2019 10 02 14:46:47.960|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |}|0|1|0|1|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270173:1|-|-|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270173:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a35f|||||||||||StreamingDownloadFile:hasNextChunk: 00.000
|3CDB0F6527A24DEDB4ADDE9C3A2775265e2|2019 10 02 14:46:47.960|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |{|0|1|0|1|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1||||||||||||StreamingDownloadFile:nextChunk
|3CDB0F6527A24DEDB4ADDE9C3A2775265e3|2019 10 02 14:46:47.960|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |{|0|2|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a360|||||||||||START OUTGOING CALL Outgoing: FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270174:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265e4|2019 10 02 14:46:47.976|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |}|2|0|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a360|||||||||||END OUTGOING CALL Outgoing: SPENT [00.016] FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270174:1] TO [fileserver_BICMS02.FRS.In.getContentImmediate#localhost:4528:6596.355504:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265e5|2019 10 02 14:46:47.976|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |}|2|1|0|1|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270174:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a360|||||||||||StreamingDownloadFile:nextChunk: 00.016
|3CDB0F6527A24DEDB4ADDE9C3A2775265e6|2019 10 02 14:46:48.164|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |{|0|1|0|1|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270175:1|-|-|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270175:1||||||||||||StreamingDownloadFile:hasNextChunk
|3CDB0F6527A24DEDB4ADDE9C3A2775265e7|2019 10 02 14:46:48.164|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |}|0|1|0|1|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270175:1|-|-|BIPSDK.StreamingDownloadFile:hasNextChunk|msp-bilum02:8576:1.270175:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a361|||||||||||StreamingDownloadFile:hasNextChunk: 00.000
|3CDB0F6527A24DEDB4ADDE9C3A2775265e8|2019 10 02 14:46:48.164|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |{|0|1|0|1|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1||||||||||||StreamingDownloadFile:nextChunk
|3CDB0F6527A24DEDB4ADDE9C3A2775265e9|2019 10 02 14:46:48.164|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |{|0|2|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a362|||||||||||START OUTGOING CALL Outgoing: FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270176:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265ea|2019 10 02 14:46:48.164|-0500|Information| |==| | |LCMCLI| 8576|   1|main            |}|2|0|0|2|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a362|||||||||||END OUTGOING CALL Outgoing: SPENT [00.000] FROM [BIPSDK.StreamingDownloadFile:nextChunk#msp-bilum02:8576:1.270176:1] TO [fileserver_BICMS02.FRS.In.getContentImmediate#localhost:4528:2876.355505:1]
|3CDB0F6527A24DEDB4ADDE9C3A2775265eb|2019 10 02 14:46:48.164|-0500|Debug| |<<| | |LCMCLI| 8576|   1|main            |}|2|1|0|1|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|-|-|BIPSDK.StreamingDownloadFile:nextChunk|msp-bilum02:8576:1.270176:1|Cr9GyQVUvUPwhlqZ6M.8RNE4a362|||||||||||StreamingDownloadFile:nextChunk: 00.000
Joe_Peters
Active Contributor

Ok, I did a test in my sandbox environment because I was curious about the various phases. I noticed the same steps you did -- creation of 0-byte lcmbiar in temp, creation of Derby, etc.

I just happened to have CMS query tracing turned on on that server, so took a look at the trace after the final lcmbiar was created. I found a lot of queries being generated, seemingly from LCMCLI. Many of them looked like the following:

SELECT TOP 1 SI_ID, SI_CUID FROM CI_INFOOBJECTS,CI_SYSTEMOBJECTS,CI_APPOBJECTS WHERE SI_HIDDEN_OBJECT IN (0,1) AND SI_ID IN(9252)

and

SELECT TOP 1000 static, relationships from ci_infoobjects, ci_appobjects, ci_systemobjs where (SI_KIND='DependencyRule') and (SI_ID in (4110))

and

SELECT TOP 1000 static, relationships, SI_PARENT_FOLDER_CUID, SI_OWNER from ci_infoobjects, ci_appobjects, ci_systemobjs where ((SI_HIDDEN_OBJECT IN (0,1))) and (SI_ID in (5554,5141,5230))

So, I'm guessing the delay you're experiencing is due to LCMCLI submitting these secondary queries to the CMS. A lot of things can cause slowdown in CMS queries, but I'd suggest checking for contention on your CMS repository database.

omacoder
Active Contributor
0 Kudos

Joe, I'm accepting your answer.

I built out a separate node outside of the primary cluster. This, for whatever reason, is running much faster

Of course the environment is not the same (number of users logged in, time of day, etc) when I ran using the dedicated mode, but the timing is significantly improved.

ALSO- the LCM cluster is running 4.2 SP07 Patch 500 which the primary cluster is on 4.2 SP06.

Examples:

qryA) Using primary cluster: 1hr 6 min

qryA) Using separate cluster: 10 min

qryB) Using primary cluster: 3hr 39min

qryB) Using separate cluster: 31 min

Joe_Peters
Active Contributor
0 Kudos

Wow - that's quite a difference. Just curious -- what did you use for the CMS database on your new cluster? The default SQL Anywhere db, or the same one as your primary cluster?

Answers (4)

Answers (4)

shiva_vishnubatla
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Brain,

i couldnt get much further just based on this behavior and that log entires. do you have a comprehensive AJS trace showing what its up to for at least a good chunk of time so that we can see what its up to ?

If you have a lot of them, or cant attach here, I understand, we can follow up offline and review them or get them reviewed on our end to check if its wasting cycles somewhere.

thanks,shiva

omacoder
Active Contributor
0 Kudos

Thanks Shiva- I will work on getting more traces today- but I'm sure you seen above the thread with Joe Peters. He ran a test and found that there are a lot of CMS queries running during the time where I seem to think the system is 'stuck'. What are these queries doing?

I'm still afraid that LCMCLI is calculating dependencies during this time, even though I have the switch in the properties file to NOT calculate dependencies.

shiva_vishnubatla
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Brian, can you please update your version info. I'll try to get some eyes on this thread.

omacoder
Active Contributor
0 Kudos

Thanks Shiva.

Source system: 4.2 SP06 Patch 500 in our local data center.

Target system: 4.2 SP07 Patch 500 running on Microsoft Azure.

shiva_vishnubatla
Product and Topic Expert
Product and Topic Expert
0 Kudos

The only processes expected to take up any ram are lcm_cli and ajs. You do have plenty of ram assigned already.

you can try separating export and import:
you can also eliminate destination from the equation by trying to generate a lcmbiar instead of live to live - this may give you some idea if maybe pushing to target is where its slower.

Are you including security? if so, aAre the users enterprise or AD/ldap involved? If
AD/ldap is involved, you maybe able to speed it up by disabling AD plugin on source and/or target - this will remove unnecessary round trip to AD.

omacoder
Active Contributor
0 Kudos

Hi Shiva- I'm following the pattern book for a full system promotion, exporting to biar, first users/groups, then dependencies, and then reports. But the reports are broken down into many separate queries utilizing order by and top 1-2000. I have a total of 35 properties files that I would like to execute. But at ~3 hours per query, this is 5 days of downtime JUST TO EXPORT! Not gonna work!

Per pattern book full promotion, yes, I'm including security and excluding dependencies. Except, LCM CLI still appears to be excluding dependencies because if I remove the ClearCase service from the APS.Promotion, LCM_CLI blows up saying it failed checking for dependencies. Once I add this service back in, then it runs without error.

There are AD users. Again, per the pattern book, AD is disabled at TARGET. But, I'm not even to the point yet where I'm at target.

denis_konovalov
Active Contributor
0 Kudos

RAM doesn't speed up relationship calculations.
Not to mention that they are running on source CMS most likely, not on your dedicated node.

The ram you give to LCM is in use when it starts making lcmbiars for actual moving of content.