on 2014 Feb 04 6:55 PM
Hey all, we have recently migrated all our SAP systems from one datacenter to a new one, and also moved our systems from rack servers to VMware / Blades. Since then, we have been hit with some major performance bottlenecks.
The issue we seem to be seeing is with the database...
Here is the situation:
1. When we run a query or update in ECC Production (one that used to take seconds in the old datacenter on the old server), it now ends up taking e.g. 6 min to run.
2. However, when we run the same query in QA, the report runs instantly (within seconds).
Both Prod and QAS are copies, both running on the same VM, same memory, etc. However, from what we see, production takes incredibly long to do what it used to do in seconds, while QAS seems to run fine.
During these periods, we also see a lot of activity on the disks in production, and none in QAS. This leads us to believe that in production the Oracle database is doing sequential reads nonstop, while in QA the data gets buffered as it should.
Does anyone know:
1. Whether this makes any sense?
2. Which parameters we should look at that might cause this type of behavior? Specifically, which parameters would cause a system to go to disk every time for data (vs. getting it from the buffers)?
Any help resolving this is greatly appreciated.
Richard
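One rough way to put a number on "going to disk vs. getting it from the buffers" is the classic buffer cache hit ratio, computed from the V$SYSSTAT counters `physical reads`, `db block gets` and `consistent gets`. The sketch below only does the arithmetic; the function name and the sample counter values are made up for illustration, and a single ratio is a coarse indicator at best, not a diagnosis.

```python
def buffer_cache_hit_ratio(physical_reads, db_block_gets, consistent_gets):
    """Fraction of logical reads that did not require a physical read.

    The three inputs are assumed to be the V$SYSSTAT counters of the same
    names, sampled at the same moment (ideally as deltas over an interval).
    Returns None when there were no logical reads at all.
    """
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None
    return 1.0 - physical_reads / logical_reads


# Hypothetical counter values, not measured on the systems in this thread.
ratio = buffer_cache_hit_ratio(
    physical_reads=100, db_block_gets=400, consistent_gets=600
)
print(f"buffer cache hit ratio: {ratio:.2%}")
```

Comparing the ratio for the same interval on PRD and QAS would at least show whether the "no buffering in production" impression holds up in the counters.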
Hello Richard,
After the migration, have you updated the statistics (note 588668)?
brconnect -u / -c -f stats -t oradict_stats
brconnect -u / -c -f stats -t system_stats
brconnect -u / -c -f stats -t all
Are the parameters correctly set (note 1171650)?
Are you using the latest patchset (note 1503709)?
Regards,
Eduardo Rezende
Hi Sanjoy,
To add another branch to this questioning.
What are your key ORACLE parameters set on QA and Prod?
I.e. SGA params, PGA params, Shared Pool, db_cache_size.
Perhaps it is easiest to run "show sga" on both systems and paste the output here, along with the OS specs for RAM/CPU assigned to the virtualised instance.
Regards, Jamie
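Once the parameter values from both systems are collected, a small diff makes any mismatch obvious. This is a minimal sketch, assuming the values have already been copied into two dictionaries by hand; the helper name `diff_parameters` and all numbers are hypothetical and not taken from the systems discussed here.

```python
def diff_parameters(qa, prod):
    """Return {name: (qa_value, prod_value)} for every parameter that differs.

    A parameter that is missing on one side is reported with None on that side.
    """
    names = sorted(set(qa) | set(prod))
    return {
        name: (qa.get(name), prod.get(name))
        for name in names
        if qa.get(name) != prod.get(name)
    }


# Hypothetical values for illustration only.
qa_params = {
    "sga_target": "8G",
    "pga_aggregate_target": "2G",
    "shared_pool_size": "1G",
    "db_cache_size": "4G",
}
prod_params = {
    "sga_target": "8G",
    "pga_aggregate_target": "2G",
    "shared_pool_size": "1G",
    "db_cache_size": "512M",
}

for name, (qa_value, prod_value) in diff_parameters(qa_params, prod_params).items():
    print(f"{name}: QA={qa_value} PRD={prod_value}")
```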
Sanjoy Rath wrote:
During these periods, we also see a lot of activity on the disks in production, and none in QAS. This leads us to believe that in production the Oracle database is doing sequential reads nonstop, while in QA the data gets buffered as it should.
Are these periods during which you run a query or update? What OS do you use? What storage type do you use?
Hi Richard,
the first important question is: How did you migrate your ECC production system from native hardware to VMware? Did you use R3load or a backup/restore based approach? This information is tremendously important, as only a bit-for-bit copy approach rules out side effects like changed structures (e.g. clustering factor, etc.) or a changed set of statistics (with all of its consequences) harming your performance after migration.
Please also do not compare QA and PRD if they are not 100% identical (and by 100% I mean from a structure, statistics and data point of view) - even a slight change can make a huge difference.
The additional problem in your case is that if the root cause is related to the virtual infrastructure, you may have a problem getting support for it (if it is not an already known issue).
I have written a blog post about a systematic and very constructive performance troubleshooting approach, that you may want to follow for your issue:
Once again - please do not compare QA and PRD (in many cases it is a waste of time, e.g. like here) - just focus on your issue in PRD, drill it down to the root cause and fix it.
Regards
Stefan
P.S.: You may want to check out my services, if this problem is mission critical and you need prompt assistance. We also go the extra mile in such virtualized environments, if you need a work around while the vendors are fighting their support battle
Hi Stefan, and thank you for your help..
Our migration strategy was as follows:
- Installed vanilla SAP systems (same version as the source)
- Set up Oracle Data Guard and created a standby database in the new datacenter
- Kept both databases in sync until cutover
- Then cut over on our go-live date
Of course we had to do a ton of things in between (e.g. copy over the j2ee structure, global, profiles, etc.), but that is the high-level overview of what we did.
Anyhow, I will do a full read of your blog.. just started and it looks awesome...
The thing is, however, that we are only looking at QA because it is an identical copy of prod, yet it works great and prod does not. It's an odd issue, and if they are copies, they should work the same. Also, this is currently our only clue.
Thank you for all your help Stefan.
R
Hi Sanjoy,
let's summarize your provided information shortly:
If all of these points are expressed correctly, then the assumption that the root cause of this issue lies in the underlying core layer (or the integration with it) becomes more and more likely. Please generate a raw SQL trace for that SQL and profile it (check my blog again for that). If you are using SAP Kernel 7.20 or higher, you can also trace by username, which makes it more comfortable. After profiling, you know exactly where your time is spent and can drill down further (e.g. on the I/O layer or wherever) by stack tracing or system call tracing.
> its an odd issue ... and if they are copies, they should work the same ... As well, this is currently our only clue.
Not necessarily, but this would be too hard to explain via the SCN forum.
Regards
Stefan
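To illustrate what "profiling" a raw SQL trace means in its simplest form: the sketch below adds up the elapsed time of the `WAIT` lines in an extended SQL trace per wait event and sorts the events by total time. It is not the method from the blog post, only a rough stand-in; it assumes the usual `WAIT #<cursor>: nam='<event>' ela= <microseconds> ...` line layout, ignores CPU time reported on PARSE/EXEC/FETCH lines, and the sample trace lines are invented.

```python
import re
from collections import defaultdict

# Matches e.g.: WAIT #140: nam='db file sequential read' ela= 500 file#=4 ...
WAIT_LINE = re.compile(r"^WAIT #\d+: nam='(?P<event>[^']+)' ela=\s*(?P<ela>\d+)")


def profile_waits(lines):
    """Return [(event, count, total_ela)] sorted by total elapsed time, descending."""
    totals = defaultdict(lambda: [0, 0])  # event -> [count, total ela]
    for line in lines:
        match = WAIT_LINE.match(line)
        if match:
            entry = totals[match.group("event")]
            entry[0] += 1
            entry[1] += int(match.group("ela"))
    rows = [(event, count, ela) for event, (count, ela) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Invented sample lines, only to show the shape of the input.
sample_trace = [
    "WAIT #140: nam='db file sequential read' ela= 500 file#=4 block#=10 blocks=1 obj#=1 tim=1",
    "WAIT #140: nam='db file sequential read' ela= 700 file#=4 block#=11 blocks=1 obj#=1 tim=2",
    "WAIT #140: nam='SQL*Net message from client' ela= 100 driver id=1 #bytes=1 p3=0 obj#=1 tim=3",
    "FETCH #140:c=0,e=1300,p=2,cr=5,cu=0,mis=0,r=1,dep=0,og=1,plh=0,tim=4",
]

for event, count, ela in profile_waits(sample_trace):
    print(f"{event}: {count} waits, {ela} microseconds")
```

If single-block reads dominate such a profile in PRD but not in QAS, the next step would be to look at the latency of those reads rather than at their number alone.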
Hello,
does the query use the same CBO execution plan?
BR
Andreas
Sorry, just some additional info: We are running SAP ECC6 on Oracle 11.2.0.2.0
Thank you all