GeekLondon.com Help icon Syndication Feed icon 

Hibernate performance problems caused by over-eager toString() implementations

On my last contract we had a little performance problem. I found out about this when a colleague was discussing how best to go about cacheing a catalogue entity that he was reading from the database. This rang some warning bells and I asked a few questions. The catalogue entity had a few lazy one-to-many (Set) associations with other entities, but the rest of its properties were simple strings. Apparently this was taking over a minute to load.

The underlying query (edited to remove Hibernate mangling) should have been something like this:

SELECT ID, NAME FROM CATALOGUE WHERE ID = 42

There were a few thousand catalogues in that table, but on a developer database with no other processes hitting it, there's no way that that query could legitimately take more than a couple of seconds even assuming the worst case scenario for setting up a connection. And given the performance we were generally seeing, I would have expected something more like a fraction of a second for that query. Hmmm.

We enabled the hibernate.show_sql property and saw something very much more complex. The query was looking something like this:

SELECT ID, NAME FROM CATALOGUE WHERE ID = 42

SELECT ID, NAME FROM ALBUM WHERE CATALOGUE = 42

SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 1
SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 2
SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 3
SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 4
...
SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 2984
SELECT ID, NAME, CONTENT_ID FROM TRACK WHERE ALBUM = 2985

SELECT ID, PATH FROM CONTENT WHERE ID = 39920
...

This is heavily abbreviated. As well as the Catalogue, Album, Track, and Content entities there were half a dozen others. Every single entity in the graph below the original catalogue entity was being pulled out of the database. Not surprisingly this took quite a while to achieve. If anything I was impressed that it was performing so well, but that wouldn't last under realistic load - time to figure out what was going on.

The most obvious explanation for this scenario, of course, is that the mappings weren't being lazy loaded. However, the default behaviour in Hibernate 3 is lazy loading. This is what we were using, and except for a few special cases this is what we'd left things at.

I'll leave out most the painful details, including a great deal of head scratching, and (I'm ashamed to say) a certain amount of voodoo programming where we just tried playing with the lazy loading related attributes on the mapping files. What should have been obvious to me as soon as I saw that the entire graph was being loaded and that the default lazy attribute was in place was that something must be walking the graph to force the load. And indeed I eventually found it in a completely innocuous looking line of code:

public Catalogue findCatalogueByID(final Integer catalogueId) {
  final Catalogue cat = (Catalogue)getHibernateTemplate().get(catalogueId);
  log.info("Loaded catalogue: " + cat);
  return cat;
}

There's nothing intrinsically wrong with that, of course. Logging an entity is a perfectly reasonable thing to do. The problem was that all of the entities had been generated to derive from a base class that had a toString() method that looked like this:

import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.lang.builder.ToStringStyle;

public class BaseEntity {
   public String toString() {
      return ToStringBuilder.reflectionToString(this, ToStringStyle.MULTI_LINE_STYLE);
   }
   ...
}

The problem here, if you haven't spotted it, is that the toString() method will use reflection to convert every attribute into a String. Not only the normal property attributes, but also the associations. It iterates over any Collection attributes, and prints those too - which in turn forces the lazy collection implementation to materialize its contents so that they can be printed.

I think that this approach to string building is probably a pretty bad idea in code generally. Unless you absolutely trust your developers to put appropriate condition blocks around their logging code to prevent the related overhead when in less verbose logging levels, you could get some minor performance problems. Worse, when it's enabled you risk making your logs so verbose that they become unreadable. It's a judgement call, but I think that it's probably a bad idea. But for Hibernate code it's guaranteed to be a bad idea if used on entities that have lazy associations or properties.

For Hibernate entities, I think that you should try and restrict yourself to printing the ID and any property fields that are not permitted to be null, This is enough for most debugging purposes, since it allows you to identify the commonest constraint violations and identify the appropriate row in the database. Anything more than that and the debugger is your best bet.

Even assuming that you've avoided the mistake of causing toString() to materialize the complete graph of entities, watch out for logging code of this type:

log.info("Loaded catalogue: " + cat.getId() + " containing " + cat.getAlbums().size() + " albums.");

That one's a little more conspicuous. Unless you have flagged the association with lazy="extra" you will be materializing the albums association when you call size() on its collection.

Posted at Mar 20, 2007 9:19:15 PM, and last updated Apr 14, 2007 10:59:37 PM