Since we made the tool freely available, we have received a lot of positive feedback. Most comments on the Web are about its good performance, and we are of course quite happy about this response. Interestingly, though, performance was just the first step, and we weren’t aware of the other questions when we started to develop the tool.

Once we had achieved reasonably good performance, we hoped to be ready to analyze gigabyte-sized heap dumps from production systems, but we found out that we still couldn’t isolate the memory problems. We had just created a much faster but otherwise still ordinary heap walker, yet another one. Heap walking wasn’t doing us any good on heap dumps with tens or hundreds of millions of objects. It was then that we realized that we needed to solve not only the problem of poor performance but also the problem of missing functionality for spotting the problems. To illustrate this, I’d like to discuss with you a very common and practical memory problem which haunted us during the first versions: What to do about millions of Strings?

Now, heap walking means that you can walk the references to and from an object. Doing that for millions of Strings is difficult. Where are the aggregation patterns behind the Strings? You need to walk a pretty big sample, let’s say a few thousand Strings, to guess which components are responsible for those Strings. Worse, some data structures like a LinkedList can make it practically impossible to find the aggregation patterns, because you have to walk quite long reference chains until you reach the LinkedList and finally an object of interest. The following screenshot illustrates this.

image

So we came to the conclusion that we should do the mass heap walking for the user. Instead of only offering the inspection of a single object, the tool should let the user see the references for a whole set of objects. Grouping the references by class improved the readability. This helped to some extent, as the next screenshot shows.

image
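To make the idea of mass heap walking a bit more concrete, here is a minimal sketch of grouping inbound references by the class of the referrer. It is not the tool's implementation: the heap model (integer object ids, a `classOf` map, an `outbound` reference map) and the names `ReferrersByClass` and `referrersByClass` are assumptions made up for this illustration.

```java
import java.util.*;

class ReferrersByClass {
    // Hypothetical, simplified heap model:
    //   classOf  : object id -> class name
    //   outbound : object id -> ids of the objects it references
    // Counts the references that point into 'targets', grouped by referrer class.
    static Map<String, Integer> referrersByClass(
            Map<Integer, String> classOf,
            Map<Integer, List<Integer>> outbound,
            Set<Integer> targets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> e : outbound.entrySet()) {
            String referrerClass = classOf.get(e.getKey());
            for (int to : e.getValue()) {
                if (targets.contains(to)) {
                    counts.merge(referrerClass, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Tiny made-up heap: three Strings held by two HashMap nodes and one LinkedList node.
    static Map<String, Integer> demo() {
        Map<Integer, String> classOf = Map.of(
                1, "java.util.HashMap$Node",
                2, "java.util.HashMap$Node",
                3, "java.util.LinkedList$Node",
                4, "java.lang.String",
                5, "java.lang.String",
                6, "java.lang.String");
        Map<Integer, List<Integer>> outbound = Map.of(
                1, List.of(4),
                2, List.of(5),
                3, List.of(6));
        return referrersByClass(classOf, outbound, Set.of(4, 5, 6));
    }

    public static void main(String[] args) {
        // prints {java.util.HashMap$Node=2, java.util.LinkedList$Node=1}
        System.out.println(demo());
    }
}
```

Note that the result only says which classes hold the Strings, not which instances. Two HashMap nodes from two different maps look exactly the same as two nodes from one map.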

However, more often than you would like, this approach leads to a new problem, as you can see in the screenshot above. If your millions of Strings are referenced somehow through a single HashMap, you will lose track of this single HashMap if it is hidden among many other, less densely populated HashMaps.

We had some ideas for how to solve that problem. However, by the time we ran into it we had learned about dominators and exploited their properties to get the retained size for any object instantly. An object dominates another if every path from the GC roots to the second object passes through the first. You want to know how much memory and which objects will be collected by the GC if a single object is no longer referenced? The dominated objects tell you this, so our tool displays that information beside each and every object. Now, instead of going down the dominators to find the objects which will be collected, you can go up the dominators to find the single object which keeps another one alive. If you do that for your Strings one by one, you know for each String which other single object keeps it alive. Group that by class and you have what you see below: You have narrowed down the problem to a degree where you can start calling people.

image
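For readers who like to see the mechanics, the following is a small sketch of dominators and retained sizes on a toy object graph. It uses a simple iterative set-based algorithm chosen for brevity, which is an assumption for this example and certainly not how a tool handles hundreds of millions of objects. Node 0 stands for a virtual GC root, every node is assumed to be reachable from it, and the class name `DominatorSketch`, the graph, and the shallow sizes are all made up.

```java
import java.util.*;

class DominatorSketch {
    // dom[v] = all nodes that lie on every path from node 0 (virtual GC root) to v.
    // Assumes every node is reachable from node 0.
    static BitSet[] dominatorSets(int[][] succ) {
        int n = succ.length;
        List<List<Integer>> pred = new ArrayList<>();
        for (int i = 0; i < n; i++) pred.add(new ArrayList<>());
        for (int from = 0; from < n; from++)
            for (int to : succ[from]) pred.get(to).add(from);

        BitSet[] dom = new BitSet[n];
        for (int v = 0; v < n; v++) {
            dom[v] = new BitSet(n);
            if (v == 0) dom[v].set(0); else dom[v].set(0, n);
        }
        boolean changed = true;
        while (changed) {                       // iterate until a fixed point is reached
            changed = false;
            for (int v = 1; v < n; v++) {
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : pred.get(v)) next.and(dom[p]);  // common dominators of all referrers
                next.set(v);                                   // every node dominates itself
                if (!next.equals(dom[v])) { dom[v] = next; changed = true; }
            }
        }
        return dom;
    }

    // The immediate dominator is the closest strict dominator: its own
    // dominator set is exactly dom[v] without v. Returns -1 for the root.
    static int immediateDominator(BitSet[] dom, int v) {
        if (v == 0) return -1;
        int want = dom[v].cardinality() - 1;
        for (int d = dom[v].nextSetBit(0); d >= 0; d = dom[v].nextSetBit(d + 1))
            if (d != v && dom[d].cardinality() == want) return d;
        throw new IllegalStateException("no immediate dominator found for node " + v);
    }

    // Retained size of d = sum of the shallow sizes of all nodes dominated by d (including d).
    static long[] retainedSizes(BitSet[] dom, long[] shallow) {
        long[] retained = new long[shallow.length];
        for (int v = 0; v < shallow.length; v++)
            for (int d = dom[v].nextSetBit(0); d >= 0; d = dom[v].nextSetBit(d + 1))
                retained[d] += shallow[v];
        return retained;
    }

    public static void main(String[] args) {
        // 0 root, 1 Cache, 2 HashMap, 3 String, 4 String, 5 Session, 6 String (shared)
        int[][] succ = {{1, 5}, {2}, {3, 4, 6}, {}, {}, {6}, {}};
        long[] shallow = {0, 16, 48, 24, 24, 32, 24};
        BitSet[] dom = dominatorSets(succ);
        System.out.println(immediateDominator(dom, 3));  // prints 2
        System.out.println(immediateDominator(dom, 6));  // prints 0
        // prints [168, 112, 96, 24, 24, 32, 24]
        System.out.println(Arrays.toString(retainedSizes(dom, shallow)));
    }
}
```

String 6 is referenced both through the HashMap and by the Session object, so neither of them alone keeps it alive. Its immediate dominator is the root, and it counts toward the retained size of neither holder.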

The operation for this view is called Immediate Dominators, and you can specify an exclude pattern to filter out obvious or useless information, e.g. that char[] arrays are stored in Strings and that some Strings are stored in collections. You don’t care about that, but you do care about classes from customers or from your own packages, and that’s what this operation shows you.
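The grouping with an exclude pattern can be sketched roughly as follows. This is only an illustration of the idea and not the tool's actual code: the sketch assumes that an excluded dominator is skipped by moving one step further up the dominator chain, and the class `ImmediateDominatorsByClass`, the `idom` array (node 0 is a virtual GC root with `idom[0] == -1`), and the `com.example` class names are hypothetical.

```java
import java.util.*;
import java.util.regex.Pattern;

class ImmediateDominatorsByClass {
    // For every object of 'targetClass', walk up the dominator chain until a
    // dominator is found whose class does not match 'exclude' (or the root is
    // reached), then count the targets per dominator class.
    static Map<String, Integer> group(int[] idom, String[] classOf,
                                      String targetClass, Pattern exclude) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int v = 0; v < idom.length; v++) {
            if (!classOf[v].equals(targetClass)) continue;
            int d = idom[v];
            while (d > 0 && exclude.matcher(classOf[d]).matches()) {
                d = idom[d];               // skip the uninteresting dominator
            }
            counts.merge(classOf[d], 1, Integer::sum);
        }
        return counts;
    }

    // Made-up example: two Strings dominated by a HashMap inside a Cache,
    // one String dominated directly by the GC roots.
    static Map<String, Integer> demo() {
        int[] idom = {-1, 0, 1, 2, 2, 0, 0};
        String[] classOf = {
                "<GC roots>", "com.example.Cache", "java.util.HashMap",
                "java.lang.String", "java.lang.String",
                "com.example.Session", "java.lang.String"};
        return group(idom, classOf, "java.lang.String", Pattern.compile("java\\..*"));
    }

    public static void main(String[] args) {
        // prints {<GC roots>=1, com.example.Cache=2}
        System.out.println(demo());
    }
}
```

With the `java\..*` pattern, the HashMap drops out of the result and the two Strings are attributed to the hypothetical `com.example.Cache` instead, which is the kind of class you can take to its owner.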

I hope this simple but realistic example illustrates what makes the SAP Memory Analyzer so helpful. Good performance is important, but it is just the first step. What you need is not just a fast tool, but a tool which gives you the answers you are looking for. For us, the SAP Memory Analyzer does exactly this.
