Saturday, April 6, 2013

Discuss internal's of a Concurrent Hashmap (CHM) in Java 7


In Java 1.7, A ConcurrentHashMap is a hashmap supporting full concurrency of retrieval via volatile reads of segments and tables without locking, and adjustable expected concurrency for updates. All the operations in this class are thread-safe, although the retrieval operations does not depend on locking mechanism (non-blocking). And there is not any support for locking the entire table, in a way that prevents all access. The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default is16), which is used as a hint for internal sizing.
Fig 1. Internals Implementation diagram for HashMap




ConcurrentHashMap is similar in implementation to that of HashMap, with resizable array of hash buckets, each consisting of List of HashEntry elements. Instead of a single collection lock, ConcurrentHashMap uses a fixed pool of locks that form a partition over the collection of buckets.

Here is the code snippet showing HashEntry class

static final class HashEntry {
        final int hash;
        final K key;
        volatile V value;
        volatile HashEntry next;
...
 
HashEntry class takes advantage of final and volatile variables to reflect the changes to other threads without acquiring the expensive lock for read operations.
The table inside ConcurrentHashMap is divided among Segments (which extends Reentrant Lock), each of which itself is a concurrently readable hash table. Each segment requires uses single lock to consistently update its elements flushing all the changes to main memory.

put() method holds the bucket lock for the duration of its execution and doesn't necessarily block other threads from calling get() operations on the map. It firstly searches the appropriate hash chain for the given key and if found, then it simply updates the volatile value field. Otherwise it creates a new HashEntry object and inserts it at the head of the list.
Iterator returned by the ConcurrentHashMap is fail-safe but weakly consistent. keySet().iterator() returns the iterator for the set of hash keys backed by the original map. The iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
Re-sizing happens dynamically inside the map whenever required in order to maintain an upper bound on hash collision. Increase in number of buckets leads to rehashing the existing values. This is achieved by recursively acquiring lock over each bucket and then rehashing the elements from each bucket to new larger hash table.


Question : Is this possible for 2 threads to update the ConcurrentHashMap at the same moment ?

Answer : Yes, its possible to have 2 parallel threads writing to the CHM at the same time, infact in the default implementation of CHM, atmost 16 threads can write in parallel. But in worst case if the two objects lie in the same segment, then parallel write would not be possible. On the other-hand reads are not blocking in nature, so any number of threads can read the data at the same time, also reads can overlap with writes. 

Question : Can multiple threads read from a given Hashtable concurrently ?

Answer : No, get() method of hash table is synchronized (even for synchronized HashMap). So only one thread can get value from it at any given point in time. Full concurrency for reads is possible only in ConcurrentHashMap via the use of volatile.

7 comments:

  1. Nice work. Keep them coming.
    One question though, I had read that the default segment in CHM is 16, but you say 32. Is this 32 in Java 7?

    ReplyDelete
    Replies
    1. you are correct Ashoka, Java 7 also have default segment size of 16. I have modified the contents for the same.

      Delete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. How may threads can update the ConcurrentHashMap at the same moment and why ?

    ReplyDelete
    Replies
    1. DEFAULT_CONCURRENCY_LEVEL = 16, unless we specify some other value. Writes to same segment can not overlap.

      Delete
  4. I have a doubt. I do not know whether this is still active or not . My question is , is it possible to read and write in the same segment in a CHM and what will happen in this case.

    ReplyDelete
  5. Reads in CHM are non-blocking. That means it will not require any lock, so any write operation (put, remove, etc) from a different thread in same segment will not affect read. So we can conclude that read and update may overlap but read will reflect the results of the most recently completed update.

    On the other hand CHM update to same segment by multiple threads are blocking in nature and can not overlap.

    ReplyDelete

Your comment will be published after review from moderator