Bits and a Pint: Performance Issues I've observed over the years

Over the years I've seen many performance issues, and most of them come down to a few fundamentals: tune the heap, cache wherever it makes sense, avoid excessive chatter (logging, network, etc.), avoid single-threaded bottlenecks, and be mindful of memory consumption. Below are several examples of actual issues.

Caching

If you have an operation that requires heavy or repetitive processing on a data set that rarely changes, then the results of that operation should be cached. I've seen many cases where developers think about caching only as an afterthought, when it should be one of the first things in mind. When designing the cache, also design for data changes and decide how you plan to keep the cache updated: a cache that doesn't accurately reflect the state of the system does no good.
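
As a rough illustration of the point above, here is a minimal sketch of a cache that computes a value once per key and exposes explicit invalidation for when the data changes. The class and method names are hypothetical, and it assumes Java 8 or later for `computeIfAbsent` and lambdas.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical read-through cache: computes a value once per key and
// serves later reads from memory until the key is explicitly invalidated.
class ReadThroughCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the expensive operation being cached

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, running the loader only on a miss.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Call when the underlying data for this key changes.
    void invalidate(K key) {
        entries.remove(key);
    }

    // Call when a bulk change makes every entry suspect.
    void invalidateAll() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}
```

Whatever code path updates the underlying data would also call `invalidate`, so the update strategy is part of the design rather than something bolted on later.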

Yes, there is such a thing as over-caching. Generally it comes from adding so much data to the cache that the JVM runs out of memory - especially if the cache is local to the current session. As a general rule, session-based caches should hold only lightweight keys and rebuild the relevant data from backing caches that contain the full objects. Make sure you balance your caching with your heap sizing and GC options.
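
One way to picture "light keys in the session, full objects in a backing cache" is the sketch below. All names are hypothetical, and the bounded cache is a plain `LinkedHashMap` in access order, which is only one of several ways to cap memory use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shared cache that holds full objects but never more than
// maxEntries of them, evicting the least recently used entry first.
class BoundedObjectCache<K, V> {
    private final Map<K, V> entries;

    BoundedObjectCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry eldest;
        // removeEldestEntry evicts it once the cap is exceeded.
        this.entries = Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    void put(K key, V value) { entries.put(key, value); }
    V get(K key) { return entries.get(key); }
    int size() { return entries.size(); }
}

// What the session stores: identifiers only, never the full objects.
class SessionView {
    private final List<Long> productIds = new ArrayList<>();

    void remember(long productId) { productIds.add(productId); }

    // Rebuilds the full objects on demand from the shared cache.
    // Evicted entries come back as null and would need reloading.
    <V> List<V> resolve(BoundedObjectCache<Long, V> shared) {
        List<V> result = new ArrayList<>();
        for (Long id : productIds) {
            result.add(shared.get(id));
        }
        return result;
    }
}
```

The session's footprint here is a handful of `Long` values per user, while the memory cost of the full objects is paid once and capped by `maxEntries`.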

Incorrect heap sizing

Quite often folks do not size the heap correctly - especially when it comes to the individual components of the heap: the PermGen, survivor and tenured spaces.
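
Before changing any sizes it helps to see what the JVM is actually using. The sketch below lists each memory pool through the standard `java.lang.management` API. Pool names depend on the JVM version and collector: PermGen, for instance, only shows up on HotSpot JVMs before Java 8, and later versions report Metaspace instead. The class name is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class HeapPoolReport {
    // Builds one line per memory pool: name, type, used and max bytes.
    // A max of -1 means the JVM reports no defined limit for that pool.
    static String describe() {
        StringBuilder out = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            out.append(String.format("%-30s %-16s used=%,d max=%,d%n",
                pool.getName(), pool.getType(), usage.getUsed(), usage.getMax()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running this under realistic load, rather than at startup, gives a better picture of which space is actually under pressure.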

Incorrect GC options

Every Java application uses the heap in its own unique way and will require tuning of the garbage collector to get the proper results.
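
Tuning needs a measurement to tune against. As a small sketch (class name hypothetical), the standard `GarbageCollectorMXBean` API can report how many collections have run and how much time they took, which gives a rough baseline before and after changing GC options.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcReport {
    // Total milliseconds the JVM reports having spent in collections.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d timeMs=%d%n",
                gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.2f%%%n",
            uptime == 0 ? 0.0 : 100.0 * totalCollectionMillis() / uptime);
    }
}
```

These figures are coarse. GC logs give far more detail, but a counter like this is cheap enough to expose in a running application.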

Overuse of Introspection

Many introspection calls end up in synchronized methods - e.g. Introspector.getBeanInfo - and if you have a high volume of threads they will eventually start blocking. For issues like this you are best off maintaining your own internal, highly concurrent cache. See the source code for the Introspector here to get an idea of the issue: http://javasourcecode.org/html/open-source/jdk/jdk-6u23/java.beans/Introspector.java.html
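
A minimal sketch of such an internal cache, assuming Java 8 or later and using a hypothetical class name: repeated lookups for the same class are served from a `ConcurrentHashMap` and never reach `Introspector.getBeanInfo` again.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class BeanInfoCache {
    private static final ConcurrentMap<Class<?>, BeanInfo> CACHE = new ConcurrentHashMap<>();

    // Returns the BeanInfo for a class, introspecting it at most once.
    static BeanInfo get(Class<?> type) {
        BeanInfo cached = CACHE.get(type); // non-blocking read on the hot path
        if (cached != null) {
            return cached;
        }
        return CACHE.computeIfAbsent(type, t -> {
            try {
                return Introspector.getBeanInfo(t);
            } catch (IntrospectionException e) {
                throw new IllegalStateException("Cannot introspect " + t.getName(), e);
            }
        });
    }
}
```

One caveat with this design: the map holds strong references to `Class` objects, so in an application server with redeployable class loaders it would need to be cleared on undeploy to avoid pinning old classes in memory.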

MVC

Make sure that the model object in the MVC pattern lives only on the request and not on the session. Adding it to the session just inflates memory usage for what should be "throw-away" data. Just like with caching, make sure you are holding the data at the right level to avoid potential long-term memory issues.

Hibernate

Hibernate requires a good cache to be efficient. Moreover, one needs to consider query caching and whether and when it will help. Lazy loading is also a must to consider when designing the data schema. Finally, I've found that Hibernate is a "pig" on per-request memory because its caching mechanism dehydrates and rehydrates objects. If you have the potential for large object graphs that would produce a high number of short-lived objects, it might be best to shy away from Hibernate as an ORM and use one that caches full objects and then hands out pointers to them - especially if you don't expect the objects to change frequently (like a read-only product catalog). Another note is that Hibernate can be slow when used for batch operations, and the "generated" SQL is hard to adjust even though the path it takes is not always optimal.

Tags and Tag Pooling

When using tags on a page, tag pooling becomes especially useful, as it keeps the application server from hitting the context's backing hashmap to get the necessary data when using a tag's TLD. With this in mind, it is critical to ensure that tags are thread safe so that they can be pooled and essentially cached.

Incorrect Logging

Logging levels should be well established to avoid things like thread contention on the logger, log file overload, excessive I/O, etc. I've seen incorrect log levels get set and accidentally make their way into production. When coding, ensure that all logging is done at the correct level and wrapped appropriately - i.e. any log.debug() statement gets wrapped in a log.isDebugEnabled() check, as this avoids allocating message strings on the heap when not in a DEBUG state.
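
The guard pattern can be sketched as follows. To stay self-contained, this uses `java.util.logging` as a stand-in, where `isLoggable(Level.FINE)` plays the role of `isDebugEnabled()`. The class, the counter and the `expensiveDump` helper are hypothetical.

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedLogging {
    static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());
    static int dumpsBuilt = 0; // counts how often the expensive message is built

    // Stand-in for an expensive message, e.g. serializing a large object.
    static String expensiveDump(int[] values) {
        dumpsBuilt++;
        return Arrays.toString(values);
    }

    static void process(int[] values) {
        // Unguarded, the string would be built even when FINE is disabled:
        //   LOG.fine("values=" + expensiveDump(values));
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("values=" + expensiveDump(values));
        }
    }
}
```

With the level at INFO, `process` never calls `expensiveDump`, so no message string is allocated. Only once the level is lowered to FINE does the message get built.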

Pooling

It's important to utilize pooling effectively. Too many times a pool is not sized correctly, be it too small or too large. You might ask, how can a pool be too large, and what are the issues with that? One example I can recall is a thread pool that was set so large that if all the threads were in use, the JVM would more than likely run out of memory, either by running out of heap space or by being unable to allocate native memory. Another thing to consider on a large pool is the reap interval: if set too high you run the risk of leaving many idle threads consuming resources (memory), but if set too low you'll run into bottlenecks due to the overhead of creating new threads.
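
For thread pools specifically, the standard `ThreadPoolExecutor` exposes each of these knobs. The sketch below uses a hypothetical factory method, and the right values for any of the parameters would have to come from load testing. The keep-alive time corresponds to the reap interval for idle threads above the core size.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {
    // Builds a pool whose thread count, backlog and idle timeout are all capped.
    static ThreadPoolExecutor create(int core, int max, int queueCapacity, long idleSeconds) {
        return new ThreadPoolExecutor(
            core,                                  // threads kept alive even when idle
            max,                                   // hard upper bound on threads
            idleSeconds, TimeUnit.SECONDS,         // idle time before surplus threads are reaped
            new ArrayBlockingQueue<Runnable>(queueCapacity), // bounded backlog of waiting tasks
            new ThreadPoolExecutor.CallerRunsPolicy());      // saturated: submitter runs the task
    }
}
```

With a bounded queue and `CallerRunsPolicy`, a saturated pool slows its submitters down instead of growing without limit, which is one way to avoid the out-of-memory scenario described above.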

DB Interaction

Many times I've observed SQL calls that are repeated frequently during the same request. In cases like this one has to ask whether the result should be cached - see the Caching section above. If it's a repetitive call and the data isn't changing, then chances are the call should be cached, and it becomes a question of at what level - request, session or application. You should also always watch out for heavy SQL queries that take a long time to execute and see if there is a way to redesign either the query or the code to be more efficient; it may even require changes to the data model, such as adding an index or moving the data into a view or materialized view.
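
At the request level, the idea can be as simple as the sketch below: a small memo object created at the start of a request and discarded at the end, keyed by the SQL text and its bind parameters. The class is hypothetical, and the `Supplier` stands in for whatever data-access code actually runs the query.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-request memo: one instance per request, not shared
// between threads, and garbage collected when the request ends.
class RequestQueryCache {
    private final Map<List<Object>, Object> results = new HashMap<>();
    private int hits = 0;

    // Runs the query only the first time this SQL and parameter combination is seen.
    @SuppressWarnings("unchecked")
    <T> T query(String sql, Object[] params, Supplier<T> runQuery) {
        List<Object> key = Arrays.asList(sql, Arrays.asList(params.clone()));
        if (results.containsKey(key)) {
            hits++;
            return (T) results.get(key);
        }
        T value = runQuery.get(); // the real database call would go here
        results.put(key, value);
        return value;
    }

    int hits() {
        return hits;
    }
}
```

Because the memo dies with the request, there is no invalidation to manage. That makes it a low-risk first step before moving anything to a session-level or application-level cache.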

Monday, July 22, 2013
