Introduction
In my blog post "Finding Memory Leaks with SAP Memory Analyzer" I described a general method for finding memory leaks with the SAP Memory Analyzer tool. This time I will describe a very concrete pattern for “wasting” heap in Java.
In many of the real-life heap dumps I investigate as part of my daily work, I notice a huge number of collections which have been instantiated but never used. It is no surprise that the standard collection classes are among the most frequently used classes. It is also clear that some of them (e.g. ArrayList, HashMap) reserve a certain capacity for their elements in advance, which makes them a bit more expensive. But I was surprised how frequently I have seen hundreds of megabytes wasted in millions of such collections which were never touched after their creation. These empty collections may add a significant and unnecessary memory overhead, but it is difficult to spot them without the proper "equipment". Below I will describe how this can be done with the help of SAP Memory Analyzer.
A simple example
Consider the class MyValueStorage. It defines three fields, each of which is immediately assigned a newly instantiated ArrayList. However, only one of them (standardValues) is always needed. One of the others (specialValues) is used in 10% of the instances of MyValueStorage, and the last one (erroneousValues) is used only in some exceptional cases.
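The class described above might look roughly like the following sketch. Only the class name and the three field names come from the text; the element type (String) and the helper methods are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction: element type and methods are assumed,
// only the class name and the field names are taken from the text.
class MyValueStorage {
    // Always needed.
    private final List<String> standardValues = new ArrayList<String>();
    // Needed by roughly 10% of the instances, but allocated for all of them.
    private final List<String> specialValues = new ArrayList<String>();
    // Needed only in exceptional cases, but allocated for all of them.
    private final List<String> erroneousValues = new ArrayList<String>();

    void addStandardValue(String value) { standardValues.add(value); }
    void addSpecialValue(String value) { specialValues.add(value); }
    void addErroneousValue(String value) { erroneousValues.add(value); }

    int standardValueCount() { return standardValues.size(); }
    int specialValueCount() { return specialValues.size(); }
    int erroneousValueCount() { return erroneousValues.size(); }
}
```

Every instance pays for all three lists at construction time, whether or not it ever stores a special or erroneous value.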
An empty ArrayList, created with the default capacity of 10, costs 80 bytes on a 32-bit system and 144 bytes on a 64-bit system.
If the class MyValueStorage is commonly used in our application and there are 500,000 instances of it in the heap, then 80 MB on a 32-bit system (144 MB on a 64-bit system) will be reserved for the specialValues and erroneousValues lists. But only about 5% of these lists are ever needed. Therefore, it may be a good idea to initialize these two fields lazily (keep them null until they are actually used). The price we pay is a few “if” statements to avoid running into NullPointerExceptions.
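The numbers above are simple multiplication. A small sketch of the arithmetic, using hypothetical names and the per-list costs quoted in the text:

```java
// Hypothetical helper that reproduces the estimate from the text.
class EmptyListCost {
    // Bytes reserved by rarely used lists across all instances.
    static long reservedBytes(long instances, int listsPerInstance, int bytesPerEmptyList) {
        return instances * listsPerInstance * bytesPerEmptyList;
    }

    public static void main(String[] args) {
        long instances = 500000L;
        // Two rarely used lists per instance: specialValues and erroneousValues.
        System.out.println(reservedBytes(instances, 2, 80));   // prints 80000000 (32-bit)
        System.out.println(reservedBytes(instances, 2, 144));  // prints 144000000 (64-bit)
    }
}
```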
On the other hand, if there are only a few instances of these objects, the change is not really worth it. Therefore, before undertaking an optimization of this kind, one has to identify the places where a significant amount of memory can be saved.
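A minimal sketch of the lazy initialization mentioned earlier could look like this. The class name LazyValueStorage, the String element type and the accessor methods are assumptions for illustration, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical lazy variant of the storage class; not thread-safe.
class LazyValueStorage {
    // Always needed, so it is still created eagerly.
    private final List<String> standardValues = new ArrayList<String>();
    // Rarely needed: stay null until the first value is added.
    private List<String> specialValues;
    private List<String> erroneousValues;

    void addStandardValue(String value) { standardValues.add(value); }

    void addSpecialValue(String value) {
        if (specialValues == null) {
            specialValues = new ArrayList<String>();
        }
        specialValues.add(value);
    }

    void addErroneousValue(String value) {
        if (erroneousValues == null) {
            erroneousValues = new ArrayList<String>();
        }
        erroneousValues.add(value);
    }

    // Readers get a shared immutable empty list instead of null,
    // so callers do not need their own null checks.
    List<String> getSpecialValues() {
        if (specialValues == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(specialValues);
    }

    boolean hasSpecialValuesAllocated() { return specialValues != null; }
}
```

The null checks live in one place (the add and get methods), so the rest of the application never sees a null list.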
How to find unused collections
To search for collections which are empty and have never been used you could do the following:
- Let the application run as expected, and get a heap dump from the Java process
- Use OQL (Object Query Language) in Memory Analyzer to find the collections whose size is 0 and whose modification count is also 0, i.e. collections that contained no elements at the moment the snapshot was taken and had never been modified before. To start using OQL, press the "OQL" button in the toolbar.
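There are several ways to obtain the heap dump for the first step. One possibility, assuming a HotSpot-based JVM, is to trigger it from inside the application through the HotSpotDiagnosticMXBean. The class name HeapDumper and the default file name below are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper; works only on HotSpot-based JVMs.
class HeapDumper {
    // Writes an HPROF heap dump to filePath. The file must not exist yet,
    // and recent JDKs expect the name to end in ".hprof".
    static void dumpHeap(String filePath) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // true = dump only live (reachable) objects.
            bean.dumpHeap(filePath, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        dumpHeap(args.length > 0 ? args[0] : "heap.hprof");
    }
}
```

The resulting file can then be opened in Memory Analyzer like any other heap dump.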
Check the help pages for a detailed description of the OQL syntax. For the moment we need only a few concrete queries. For finding empty and unmodified ArrayLists, HashMaps and Hashtables, they look like this:
select * from java.util.ArrayList where size=0 and modCount=0
select * from java.util.HashMap where size=0 and modCount=0
select * from java.util.Hashtable where count=0 and modCount=0
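The modCount condition is what separates "never used" from "currently empty". The following sketch illustrates the difference for ArrayList. The subclass ModCountProbe is a hypothetical helper that only exists to read the protected modCount field inherited from java.util.AbstractList.

```java
import java.util.ArrayList;

// Hypothetical helper: exposes the protected modCount field.
class ModCountProbe<E> extends ArrayList<E> {
    private static final long serialVersionUID = 1L;

    int modifications() { return modCount; }
}

class ModCountDemo {
    public static void main(String[] args) {
        // Created and never touched: matched by the query above.
        ModCountProbe<String> untouched = new ModCountProbe<String>();

        // Used and emptied again: size is 0, but modCount is not.
        ModCountProbe<String> emptied = new ModCountProbe<String>();
        emptied.add("value");
        emptied.remove(0);

        System.out.println(untouched.size() + " / " + untouched.modifications()); // prints 0 / 0
        System.out.println(emptied.size() + " / " + (emptied.modifications() > 0)); // prints 0 / true
    }
}
```

A list like the second one was actually needed at some point, so it is not a candidate for this optimization even though it is empty in the snapshot.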
How much do the empty collections cost?
The result of an OQL query is an object list containing all objects matching the criteria. To figure out how much memory they retain altogether, switch from the object list view to a histogram view.
Then calculate the retained size for all of the collections (use the "Calculate Retained Size" context menu). For our example it is about 75 MB.
Who is keeping the empty collections in memory?
Now that we have identified that there are a lot of unused empty ArrayLists, let's try to find out who is keeping them. For me, the easiest way to check who is responsible for any set of objects I find in the heap is to use the “Immediate Dominators” feature from the context menu.
A more detailed explanation of this feature, and of some other ways to investigate why objects are kept in the heap, is given in the blog post "SAP Memory Analyzer – Just a fast tool or is there more to it?".
This is the output of "Immediate Dominators" for the example:
There you can see that 499,500 instances of the MyValueStorage class ("Objects" column) are keeping 949,500 of the unused ArrayLists ("Dom. Objects" column). Once you have this information, it is easy to find the actual references between the dominator objects and the empty collections.
Try it yourself! Get a heap dump from your application and check whether there are classes which keep a lot of unused collections. Try at least with HashMaps and ArrayLists. Maybe you can find some spots where an optimization is worth the effort.