grokking hard

code smarter, not harder

Why there is no SoftHashMap?

Posted at — 2013-Dec-09

The above question bumped into me when I was developing a simple cache using java.util.Map. In the search for the answer, I came across a thread on JavaRanch that had the same question as mine. And there was an answer which had widen my knowledge in soft/weark references in Java.

I quoted the answer here in case the link is inaccessible.

In the few cases, where I used weak references (or soft references), I used the classes, (and queues) directly. Never used the WeakHashMap class. So, never thought about the possibility of the SoftHashMap class. IMO, the WeakHashMap class was written for internal (as support) reasons, and then exposed because others may need it. Maybe that is why SoftHashMap was never written.

In thinking about it a bit more, I don’t think that it is possible to write a SoftHashMap class. Remember that a SoftHashMap isn’t just a WeakHashMap that is using SoftReferences. It has to also not “distrurb” those soft references. Let me explain….

To implement a WeakHashMap, just use WeakReferences on the keys. This means, of course, that you need to do an additional deference to add/remove the key/value pairs. You will also need to attach a ReferenceQueue so that you can detect that a key has been GC’ed – which the WeakHashMap will use to automatically delete the key-value pair.

There is a minor flaw though – particularly if more than one key lands in a bucket. The WeakHashMap needs to deference the key (ie. make it a hard reference temporarily) so that it can use the hashCode() and equals() methods for the operation. This means, that in theory, it is possible for a key (that isn’t referenced by the application) to survive a GC cycle – if by chance theWeakHashMap is doing this operation….. This is probably not a big deal, as the chances are ridiculously low, and it can be picked up on the next GC cycle.

This becomes a major flaw with the SoftHashMap. Remember that SoftReference objects may or may not be GC’ed even if there are no hard references. It is at the whim of the JVM, which decides based on the amount of memory that it needs. The idea is, if there are lots of memory being freed, there is no reason to completely delete a cache that could be useful to the application.

The issue is, the SoftReference determines which instance to GC based on the access – it keeps timestamps of which object has been used and how recent (Least Recently Used). This means, that the temporary conversion of the soft reference to a hard reference (to do the equals()) will affect the LRU. A key that hasn’t been accessed by the application in a very long time, may survive a GC purely because it shares a bucket with a more commonly used key. And unlike the WeakHashMap issue, the chances are not low, and it may not be picked up in the next GC cycle.

Basically, you have a SoftHashMap that has soft references, but isn’t really a soft hash map.

Henry Wong

The link: http://www.coderanch.com/t/487535/Performance/java/WeakHashMap-SoftHashMap