Thursday, January 21, 2016

Discuss Internals of Hashmap in Java

Fig1. - Pictorial view for HashMap Internals.
Important Facts about HashMap
  1. HashMap is a container of pairs
  2. HashMap is list of Buckets arranged by hashcode of key
  3. When a bucket gets too large, hashmap implementation will change that bucket implementation to something like TreeMap, simply by checking number of elements in each bucket. This allows faster lookup when there are too many hash collisions
  4. Get, Put and Remove method offers Big O (1) time complexity (constant runtime), assuming the hash function disperses the elements properly among the buckets
  5. Load factor (default value of 0.75) decides when hashmap will regrow, when capacity reaches of the hashmap is 75% (0.75 is load factor) full, then hashmap will double its size
  6. Size of hashmap is always expressed in 2n to allows faster hashcode calculations using bitwise operator instead of modulus operator. Even if you provide a different capacity at the time of hashmap construction, it will automatically be adjusted to nearest 2n
  7. Capacity and Size are two different things - size is actual number of elements present in hashmap, capacity is size of its internal table at any given point in time. Capacity grows when size becomes equal to capacity x load factor
  8. As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).
  9. Bucket uses TreeNode instead of normal Node if it contains elements more than specified in TREEIFY_THRESHOLD, which is defaulted to 8
  10. The iterators returned by all of this class’s “collection view methods” are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
HashMap is defined as Array of Nodes as shown below

transient Node<K,V>[] table;

1 comment:

  1. Great overview of Hashmap. I didn't know that it automatically gets converted to Treemap for buckets greater than eight.

    ReplyDelete

Your comment will be published after review from moderator