Skip to main content

How HashMap works internally in java

How Hashmap works Internally in Java

In this article, we are going to see how HashMap internally works in java. Also, we will have a look at what Java 8 made changes on the internal working of Hashmap to make it faster.

What is Hashmap and Hashing in java?

How Hashmap works Internally in Java is majorly dependent upon the Hashing Principle. So, Before going to learn how HashMap works internally in java, lets first understand what is HashMap and hashing.

HashMap :

A HashMap is a map used to store mappings of key-value pairs. Also, it works on the Principle of Hashing. To know more about the HashMap, visit this article: HashMap in Java

Hashing Principle :

Simply, Hashing is a method used to produce an integer value from an object and this integer value known as the hash value. In HashMap, the key object is used for Hashing.

Internal Struture of the HashMap in java

For internal working of HashMap, HashMap maintains an array of bucket, each bucket is a linked-list and linked list is a list of nodes wherein each node contains key-value pair.
Note : After Java 8, a bucket can be a linked-list or a binary tree depending upon the Threshold. We will discuss this later in the post.

Array of Bucket or Bucket Table :

The array of bucket is also known as the Bucket Table of the HashMap in Java. HashMap uses hashing method to calculate the index of array or which Bucket is going to be used. Internally Bucket array defined as
Internal Representation of Bucket Array
// Each Node of the Array is a Bucket
Node<K,V>[] table;

Bucket :

A bucket is a linked list of nodes where each node is an instance of class Node<K,V>. The key of the node is used to generate the hash value and this hash value is used to find the bucket from Bucket Table.
Note: When two or more keys generate the same hash, then it called a Hash Collision. In that case, the new node adds to the linked list. We will discuss more this later in the post that how HashMap works internally in java to handle Hash Collision.

Node :

Each node of the Linked List is an instance of class Node<K,V> and actually store key-value pair that we are storing in the HashMap. This class is an static inner class of HashMap class and it implements the Map.Entry<K,V> interface. The representation of this Node as :
Inner Static Node class of HashMap
static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
}
  • final int hash : hash value of the key
  • final K key : Key of the node
  • V value : Value of the node
  • Node<K,V> next : Pointer to the next node present in the bucket or linked-list

How Hashmap Calculate the Index of Bucket in java?

To understand how HashMap works internally in Java, we must know about how the HashMap calculates the index of the bucket. Till now, we know the internal structure of HashMap, that HashMap maintains an array of the bucket. But when we store or retrieve any key-value pair, HashMap calculates the index of the bucket for each and every operation.
The Key object is used to calculate the index of the bucket. By using this key, hash value is calculated using hash(key) private method of HashMap.
Note: hash(key) method is a private method of HashMap that returns the hash value of the key, also if the hash value is too large then converts it into a smaller hash value.
But what will happen, if hash value of Key Object returns the integer that is greater than the size of the array i.e., hash(key) > n, then ArrayOutOfBoundsException could be raised. To handle this situation, HashMap reduces the hash value between 0 and n-1 using an expression :
Index Calculating Expresssion
index = hash(key) & (n-1)
Now, this index value is generated is used by HashMap to find bucket location and can never generate any Exception as the index value always from 0 to n-1.

Role of hashCode() and equals() in HashMap

The hashcode() and equals() have a major role in how HashMap works internally in java because each and every operation provided by the HashMap uses these methods for producing results.

hashCode() :

The hashCode() method puts a major impact on how HashMap works internally in java as it is used to find the bucket location from the Bucket Table.
For any given key, the hash(key) private method of HashMap used to calculate the hash value. Internally, the hash(key) method uses hashCode() method of the key object to get the hash value.
Note :
a) If the key is null then the hash value returned by the hash(key) private method of HashMap will be 0.
b) If the key.hashCode() is too large then hash(key) private method of HashMap converts it into smaller number.

equals() :

The equals() method also puts an impact on how HashMap works internally in java as it is used to check whether the key is present or not in the Bucket after the bucket is determined from the Bucket Table or Bucket Array.
Or In other words, When the bucket location is calculated then each node of the bucket is traversed and compared with the key object to check whether the key is present or not in the Bucket.

So, we can say hashCode() is used to find which bucket and equals() is used for key uniqueness.

How put() method of Hashmap works internally in java?

The put() method is a method of HashMap used to store key-value pair.
Syntax of put() method
hashmap.put(key, value);

Working of put() method :

1. The key is used to calculate the hash value by calling private hash(key) method, internally hash(key) method call hashCode() method of key.
hash = hash(key);
Note : HashMap can store one null value as the key. In that case, the hash value returned by the hash(key) method will be 0 and 0th bucket location will be used.
2. The above hash is reduced from 0 to n-1 to calculate the index of bucket (where n is the size of an array of the bucket).
index = hash & (n-1);
3. A new instance of Node<K,V> class is created.
Node<K,V> newNode = new Node(hash , key , value , null); // null for next pointer
4. The index of the bucket is used to fetch the bucket, then the new node is added to the fetched bucket.

How Hash Collision is resolved?

When the hashCode() method of two or more key generate the same value, then hash collision will occur. To resolve this issue, HashMap uses the concept of linked-list.
  • When a new key and the key already in the HashMap generate the same hash value : then HashMap indexing the same bucket (bucket that already contains nodes in the form of linked-list). Then, the new node added at the last of the linked-list. The next pointer of Node class is used to create Linked-List.
  • When a new value object is added with an existing key: logically, HashMap replaces the old value with new value related to the key. To do this, index of the bucket will be calculated (which will be the same for the key), then the indexed bucket or linked-list is traversed and key of the each node(k) is compared with the key by equals() method i.e., key.equals(k). When the equals() method returns true then the value of that node is replaced with the new value.

Time Complexity of put() method

HashMap store key-value pair in constant time which is O(1) as it indexing the bucket and add the node. But in worst case, it can be O(n) when all node returns same hashCode and added into the same bucket then traversal cost of n nodes will be O(n) but after the changes made by java 8 it can be maximum of O(log n).

How get() method of HashMap works internally in java?

The get() method is a method of HashMap used to retrieve value by providing the key.
Syntax of get() method
value = hashmap.get(key);

Working of get() method :

1. The key is used to calculate the hash value by calling private hash(key) method, internally hash(key) method call hashCode() method of key.
hash = hash(key);
Note : HashMap can store one null value as the key. In that case, the hash value returned by the hash(key) method will be 0 and 0th bucket location will be used.
2. The above hash is reduced from 0 to n-1 to calculate the index of bucket (where n is the size of array of bucket).
index = hash & (n-1);
3. If the bucket is null, then null will be returned.
             Else the first node of the indexed bucket is fetched and key of that node(k) is compared with the key (provided in the get(key) method of HashMap), comparison is done by equals() method i.e., key.equals(k).
             If the equals() method returned true then value of that node is returned otherwise next pointer of the node is used to fetched the next node in the list and apply the same procedure to that node. This process goes on till equals() returned true or next pointer is null. In case of next pointer is null then null is returned (which represents key is not present in the bucket or we can say not present in the HashMap).

Time Complexity of get() method

Similar to put() method, HashMap retrieves value by key in constant time which is O(1) as it indexing the bucket and add the node. But in worst case, it can be O(n) when all node returns same hashCode and added into the same bucket then traversal cost of n nodes will be O(n) but after the changes made by java 8 it can be maximum of O(log n).

Load Factor and Initial Capacity of HashMap in java

Load Factor and Initial Capacity are two important factors that govern how HashMap works internally in java.
Initial Capacity is a measure used internally by HashMap that represents the number of bucket or size of the bucket array at the time of creation of HashMap. By default the initial capacity is 24 = 16.
Load Factor is a factor that is used by HashMap internally to decide when the size of Bucket array needs to be increased. It is 0.75 by default i.e., when the number of nodes in the HashMap is more than 75% of Total Capacity then the HashMap grows its bucket array size.
Note : The capacity of HashMap always doubled each time when HashMap needs to increased its bucket array size i.e., 24 = 16 (initially), 25 = 32, 26 = 64 etc.

When to increase the size of bucket array in the HashMap?

As we know, both load factor and available capacity together is used by HashMap to decide when to increase the size of bucket array.
Load Factor : 0.75
Initial Capacity : 16 (Available Capacity initially)
Load Factor * Available Capacity = 0.75 * 16 = 12
So, at the time when 12+1 = 13th key-value pair is added to the HashMap, then HashMap grow its bucket array size i.e, 16*2 = 32.
Now Available capacity : 32
----------------------------------
Next Time when
Load Factor * Available Capacity = 0.75 * 32 = 24 i.e., 25th key-value pair is added to the HashMap, then HashMap grow its bucket array size i.e, 32*2 = 64 and so on.

Concept of Rehashing :

When HashMap grows its bucket array size, then Rehashing is done. Re-Hashing is a process where bucket index is calculated for each node again w.r.t. the updated size of the bucket array.

Changes made by Java 8 in internal working of HashMap

How HashMap works internally in java 8 is a little bit different from prior versions of java. HashMap in java 8, maintains a value called TREEIFY_THRESHOLD, it is an Integer Constant and currently the value of TREEIFY_THRESHOLD is 8. It is used as whenever in any bucket the number of nodes becomes more than this Threshold value then the data structure of that bucket is convert from linked-list to balanced tree. This change is done to make put-get operation faster as the linked-list provides O(n) time complexity while the time required to traverse the balance tree is O(log n). So that's how java 8 made HashMap much faster than its old version.

Related Article


So, this is all about how HashMap works internally in Java. In this post, we learn what is hashing, the internal structure of hashmap, how HashMap works internally in java to store and retrieve key-value pair and the changes made by java 8.
This is important to understand the internal working of HashMap as in most of the interview questions are related to HashMap, also HashMap is used in our day-to-day programming so with the better understanding we can use the HashMap efficiently.

Comments

  1. Nice blog on how hashmap works internally in java.Really a good source for beginers to start and explore this deep concept.

    ReplyDelete
  2. The way you explained is tremendous. Great Article

    ReplyDelete

Post a Comment

Popular posts from this blog

ArrayList in Java

ArrayList in Java In this article, we are going to discuss ArrayList in Java. ArrayList is a class from the Java Collection Framework and widely in our programs. We use ArrayList as a substitute for the array. So, it is important to know ArrayList in detail. We are going to see what is ArrayList in Java, how we can create the instance of ArrayList and the methods provided by the ArrayList in Java.

If you already familiar to ArrayList, you may learn common programs related to ArrayList.
Table of ContentsWhat is ArrayList in Java?Hierarchy of the ArrayList in java?How to create objects of ArrayList in Java?How to create an ArrayList of Custom Objects?How to iterate ArrayList in Java?How to get the elements by index from ArrayList in Java?How to check whether an elements is stored in the ArrayList in Java?How to remove elements from the ArrayList in Java?Few more methods of ArrayList in Java?
What is ArrayList in Java? ArrayList in Java is a List in which we can store and retrieve elemen…

HashMap in Java

HashMap in Java In this article, we are going to learn about the most important class of Java known as HashMap. HashMap in java is the most used map interface and it is used in our regular or day-to-day programming practices.
What is HashMap in java? Lets first start with basic definition of HashMap,
HashMap is a map which store key-value pair, where each key is mapped with a value.
HashMap defined in java.util package and implements map interface which provides functionalities to perform basic operation like store, retrieve, size of the map etc. It is a part of Java Collection Framework. Time complexity of HashMap in Java? : Time complexity to store and retrieve data from the HashMap is O(1) in the Best Case. But it can be O(n) in the worst case and after the changes made in Java 8 the worst case time complexity can be O(log n) atmost. Internal working of HashMap in java HashMap maintains an array of the buckets, where each bucket is a linked-list and the linked list is a list of…