Friday, July 10, 2015

The Memory Analyzer Toolkit

About the Memory Analyzer

What is the Memory Analyzer?

The Eclipse Memory Analyzer, also known as the Memory Analyzer Toolkit (MAT), is a tool for in-depth memory analysis of Java heaps.  The beauty of this tool is the comprehensive views and canned queries that can be used to fully understand what is happening on the heap, which in turn lets you develop with better control over memory management.


Where can I get the Memory Analyzer?

The SAP Memory Analyzer is now a fully supported Eclipse project, known as MAT (Memory Analyzer Toolkit), and can be downloaded either as a plug-in for Eclipse or as a stand-alone application built on the Eclipse RCP framework.  Both can be downloaded from: http://www.eclipse.org/mat/.

Are there any tutorials on the Memory Analyzer?

  • There is a great 10-minute screencast on SAP's site, which appears to still be valid.  You have to register to view it, but registration is free and I think it's worth it.  The screencast is available here: https://www.sdn.sap.com/irj/scn/wiki?path=/display/Java/Java+Memory+Analysis (on the right side of the page you will see a text block regarding the screencast).
  • If you would rather have a more in-depth live review of the application and its functions, the developers held a 45+ minute webcast, hosted by Eclipse Live, which is available here: http://live.eclipse.org/node/520 or as a link on the Eclipse MAT project home page (http://www.eclipse.org/mat/).

Analyzing a Heap

Preparation

1. Take a heap dump using one of the methods listed below:
  • JConsole
    • Under the MBeans tab, expand the com.sun.management node, select the HotSpotDiagnostic MBean, and use the dumpHeap operation, passing in the fully qualified path and filename where the dump should be created (a programmatic sketch of this same MBean call follows this list).  NOTE: The file name must end in .hprof, which is the standard file type for a heap dump.
  • Using JMAP
    • Navigate to <JAVA_HOME>/bin
    • Find the process ID for the Java process that you want to take a heap dump of.
    • Run the following:
      • jmap -dump:format=b,file=app.hprof <PID>
2. Archive the file
  • The heap dump can be quite large, so you may want to compress it when transferring it across the network to your local system for analysis.  If the heap dump was taken on your local system, this step can be skipped.
3. Start the MAT
  • Start the MAT utility either stand-alone, or open/start your Eclipse IDE if installed as a plug-in
4. Open the heap dump
  • Open the heap dump via the File menu (File > Open Heap Dump)
  • MAT will index and parse the heap and then build an initial view of what it believes are the biggest leak offenders.
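
For reference, here is a minimal sketch of triggering the same dump programmatically, using the com.sun.management HotSpotDiagnostic MBean that JConsole exposes (the output file name app.hprof is just an example):

    import java.lang.management.ManagementFactory;

    import com.sun.management.HotSpotDiagnosticMXBean;

    public class HeapDumper {
        public static void main(String[] args) throws Exception {
            // Proxy to the same MBean that backs JConsole's dumpHeap operation
            HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            // live = true dumps only objects still reachable from GC roots
            diag.dumpHeap("app.hprof", true);
        }
    }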

Analysis

Available Views
  • Histogram
    • This view is essentially the heap data aggregated at the class level
  • Dominator tree
    • This is a view of the actual objects and can be recursively expanded to show which objects are being retained by their parents (i.e., what would be freed if the parent went away); see the sketch after this list.
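
To make "retained" concrete, here is a hypothetical sketch (class and field names invented): the Session object below is the only path to its cache list, so the dominator tree charges the full size of the cached results to the session.

    import java.util.ArrayList;
    import java.util.List;

    public class Session {
        // The session is the only reference path to this list from a GC root,
        // so the dominator tree attributes the list's size to the session:
        // collect the session, and the cached results are freed with it.
        private final List<byte[]> cachedResults = new ArrayList<>();

        public void cache(byte[] result) {
            cachedResults.add(result);  // grows the session's retained heap
        }
    }
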
Analysis Process
  1. Take a look at the leak chart that MAT displays, which records the biggest offenders (top consumers)
  2. Open up the dominator tree and look for these objects.  Hint: they will most likely be at the top
  3. Expand the top offender and take a look at what is being retained to cause its offending footprint size
  4. Next, review the logic around why this object is in memory and being retained.  Is it necessary, or can it be done differently to avoid the problem?  Maybe the object is holding more data than required, causing the large heap footprint (a sketch of a common culprit follows this list).
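
As a hypothetical illustration of what step 4 often turns up, consider a long-lived map that is only ever added to; the dominator tree shows it as a single object retaining a huge set (the names below are invented):

    import java.util.HashMap;
    import java.util.Map;

    public class ResultCache {
        // A static map lives for the life of the VM, so every entry it holds
        // is retained indefinitely: the map dominates all of its values.
        private static final Map<String, byte[]> CACHE = new HashMap<>();

        public static byte[] fetch(String key) {
            // Entries are added but never evicted, so the retained footprint
            // only grows and shows up as one large node in the dominator tree.
            return CACHE.computeIfAbsent(key, ResultCache::loadFromDb);
        }

        private static byte[] loadFromDb(String key) {
            return new byte[1024 * 1024];  // stand-in for a real DB fetch
        }
    }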

Case Study - Client X

The issue

The Client X sites were not able to handle more than 100 sessions per VM, and it was noted that their VMs were constantly hitting the memory ceiling setting of 1.5 GB in Production.  Observations of the sites running in Production also showed that there were no abnormal traffic patterns (no bots, email campaigns, etc.) that could account for the high memory consumption on a regular basis.  This indicated potential issues with how the VMs were utilizing memory.  As a result, heap snapshots were taken and the above process was followed for analysis.

Heap analysis - What was found? 

When the dominator tree was reviewed, it was clear that some sessions had extremely large footprints (20 MB) while others were around 1 MB.  The question was: why?  Upon drilling down into the dominator tree by following the large footprint, it became obvious that the problem was caching of the returned search results.

The Resolution

Now that we knew the search results cache was an issue, we turned our eyes to the code.  In particular, in the SiteSearchHelper object we found that the cache being built was storing the entire category item object and all of its references.  This can be extremely heavy given a particular site's setup and leads to large footprints for a single user session.  While we determined that we needed the cache, since this data is used across requests to avoid extra DB hits and for navigation, we also determined that we did not need to maintain all of the data about a category item, just its PK.  We can then fetch the data on an as-needed basis, which allows us to remove this heavy data set from the session.  A before/after sketch of this change follows.
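
As a rough sketch of that change (CategoryItem and its methods are invented stand-ins for the real domain classes, not the actual Client X code):

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical stand-in for the real category item domain class
    class CategoryItem {
        private final long pk;
        CategoryItem(long pk) { this.pk = pk; }
        long getPk() { return pk; }
    }

    public class SiteSearchHelper {
        // Before: the cache held full CategoryItem objects, so the session
        // retained every item plus everything those items referenced.
        // After: cache only the primary keys and re-fetch on demand.
        private final List<Long> resultPkCache = new ArrayList<>();

        public void cacheResult(CategoryItem item) {
            resultPkCache.add(item.getPk());  // a long instead of a whole object graph
        }

        public CategoryItem lookup(long pk) {
            return new CategoryItem(pk);  // stand-in for the as-needed DB fetch
        }
    }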

Follow up

Once the largest problem was resolved, we repeated the process several times, drilling down on the session and finding several other places that could also be improved.  These were areas where we asked why we were storing something on the session, or why it was being stored that way.  Each of these issues was resolved in the same manner as before.

Different Aggregate Views

Instead of finding the heaviest single reference footprint on the session, perhaps I want a more weighted approach: finding which objects, if summed up, would be the heavy hitters so that they can be reviewed.  In this case, the sum of many relatively small violators may be greater than the sum of the heavy violators.  Example: while search can take 20 MB per session, only 5% of sessions will contain it; another item that takes 5 MB per session and is on 95% of sessions could be a bigger issue that was dwarfed by the search issue.
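
Working the numbers from that example (per 100 sessions) shows why the weighted view matters; this is a back-of-the-envelope sketch, not measured data:

    public class FootprintMath {
        public static void main(String[] args) {
            int sessions = 100;
            // Heavy but rare: 5% of sessions at 20 MB each
            double searchTotalMb = sessions * 0.05 * 20;  // = 100 MB
            // Light but everywhere: 95% of sessions at 5 MB each
            double otherTotalMb = sessions * 0.95 * 5;    // = 475 MB
            System.out.printf("search: %.0f MB, other: %.0f MB%n",
                    searchTotalMb, otherTotalMb);
        }
    }
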
This is where the histogram comes into play.

  • Within the dominator tree, take a look at the session objects by expanding the SessionManager object
  • Now let's take a look at what is on the session from a histogram perspective.
    • Right-click on the LruCache object, which holds all of the individual session items, and choose the option to view a histogram of the retained set.
    • Within the new view, you can run a quick calculation of the retained heap by clicking the calculator on the toolbar and choosing the minimum retained heap size calculation.
    • Now sort the histogram by retained heap size.  This gives you statistics for each class.  In particular, we can look at CommerceSession and SiteSearchCache to compare what these two classes do to our session, both in terms of their total footprint and the actual number of objects on the session.
      • When I look at this from a baseline test done for Client X, I can see the following:
        • The search cache has 13 objects that are consuming ~115 MB
        • CommerceSession has 288 objects that are consuming ~80 MB
      • This data tells me that while the CommerceSession is heavy, it is dwarfed in comparison to the search cache.  Note that search appears in only ~4.5% of the total sessions (13 / 288).
