源代码系列01——HashMap源码分析

vlambda
2021-05-04

源代码系列01——HashMap源码分析

从本篇文章开始，将正式进入源代码系列。源代码系列，将会首先进行源代码的分享，之后从面试官的角度出发，分享面试中出现的高频问题及其答案。

在进行Java技术面试时，如果聊到集合，那么HashMap几乎是绕不开的话题。本篇就从HashMap开始，分享HashMap的源代码。面试中HashMap的高频问题将在下篇进行分享。

1 HashMap的层次结构

HashMap的层级结构如图所示：

基于HashMap的层级结构，可以得到以下信息：

HashMap实现了Cloneable、Serializable接口，因此不必担心序列化相关的问题
HashMap继承了AbstractMap抽象类，AbstractMap则具有下述的方法

AbstractMap类的方法中，只有entrySet方法是抽象方法。其它方法中，有些方法（如size、get方法）提供了具体实现，有些方法（如put方法）会直接抛出异常，也就是需要继承类去重写。

AbstractMap的size方法如下所示：

public int size() { return entrySet().size();}

AbstractMap的put方法如下所示：

public V put(K key, V value) {    // 此处直接抛出异常，需要继承类重写    throw new UnsupportedOperationException();}

2 HashMap的静态变量及成员变量

HashMap主要（非全部）的静态变量如下所示：

// 默认的初始容量，默认的数组长度，也就是16static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;// 最大容量static final int MAXIMUM_CAPACITY = 1 << 30;// 默认的装载因子static final float DEFAULT_LOAD_FACTOR = 0.75f;// 默认转换为红黑树的链表长度static final int TREEIFY_THRESHOLD = 8;// 默认取消红黑树的链表长度static final int UNTREEIFY_THRESHOLD = 6;// 变更为红黑树的最小数组长度，是否能够变更为红黑树的第二影响因素static final int MIN_TREEIFY_CAPACITY = 64;

HashMap主要（非全部）的成员变量如下所示：

// 数组，哈希桶transient Node<K,V>[] table;// HashMap中元素数量transient int size;// HashMap被修改的次数，用于简单实现并发判断等transient int modCount;// 下次扩容的阈值int threshold;// 装载因子final float loadFactor;

可以看到，table、size、modCount都使用了transient修饰。

3 HashMap中元素的表达方式

HashMap中的单个元素使用Node来进行表达，Node的定义如下所示：

static class Node<K,V> implements Map.Entry<K,V> { // hash值 final int hash; // key final K key; // value V value; // 下个节点的指针 Node<K,V> next;}

Node中，hash、key都使用了final进行修饰，这也是为什么hash、key不可修改的原因。同时，Node中还有一个next指针，用来指向下一个元素。

4 HashMap的hash方法

jdk1.8中，HashMap的hash方法基于key的hashCode生成，并利用了hashCode的高16位，尽量减少哈希碰撞。

static final int hash(Object key) { int h; // 计算hash时，也充分考虑高16位 return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);}

5 HashMap的初始化

HashMap初始化的代码如下所示：

public HashMap(int initialCapacity, float loadFactor) { // 初始容量不能小于0 if (initialCapacity < 0) throw new IllegalArgumentException("Illegal initial capacity: " +                                           initialCapacity); // 初始容量不能大于最大容量 if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; // 装载因子必须合法 if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Illegal load factor: " +                                           loadFactor); this.loadFactor = loadFactor;    // 计算扩容的阈值，此时threshold为16 this.threshold = tableSizeFor(initialCapacity);}

在计算扩容阈值时，jdk1.8保证阈值是2的次幂，如下所示：

static final int tableSizeFor(int cap) { int n = cap - 1; n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16; return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;}

此外，在进行HashMap初始化时，不会初始化成员变量table。也就是说，HashMap初始化后，table依然为null。

6 向HashMap中添加元素

向HashMap中添加元素时，会面临多种情况。本文将对以下情况详细进行介绍。

首次向HashMap中添加元素
正常向HashMap中添加元素，无哈希冲突，无扩容，无红黑树转换
正常向HashMap中添加元素，哈希冲突，无扩容，无红黑树转换
添加元素后引发扩容
添加元素后引发红黑树转换

为了后续演示方便，本文构造DemoMapKey这个类，使之作为HashMap中元素的key。主要是为了重写hashCode方法，使其更容易出现哈希冲突。

DemoMapKey的定义如下所示：

/** * 测试类，省略setter、getter、equals方法 */public class DemoMapKey implements Serializable {
 /** * 编码，主要使用code生成hashCode */ private int code;
 /** * name */ private String name;
 public DemoMapKey(int code, String name) { this.code = code; this.name = name; }     @Override public int hashCode() {        // 直接与32取余，便于实现哈希冲突 return code % 32; }
}

测试demo如下所示：

public class HashMapTest {
 public static void main(String[] args) { int count = 17; HashMap<DemoMapKey, Integer> testMap = new HashMap<>(16);
 for (int i = 1; i <= count; i++) { DemoMapKey keyEntry = new DemoMapKey(i, "test"); // 第13个元素引发扩容 if (i == 13) {                System.out.println("添加第13个元素，引发扩容"); }
 testMap.put(keyEntry, i);        } }
}

6.1、首次向HashMap中添加元素

首次向HashMap中添加元素时，代码执行路径如下所示：

public V put(K key, V value) { return putVal(hash(key), key, value, false, true);}
// hash：key的hash值// key：添加元素的key// value：添加元素的value// onlyIfAbsent：是否只有key不存在时添加，put时默认为false// evict：表是否在创建模式，false表示处于创建模式，put时默认为truefinal V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {    // 一些局部变量    // tab：指向table，p：哈希桶内的元素    // n：数组长度，i：插入元素应该在数组中的位置 Node<K,V>[] tab; Node<K,V> p; int n, i;    // 1、table为空，会进行扩容，实际是初始化 if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length;    // 2、计算元素应该在的位置，数组当前位置无元素，故将此元素直接放在哈希桶中 if ((p = tab[i = (n - 1) & hash]) == null)        tab[i] = newNode(hash, key, value, null); }    // 3、修改次数+1 ++modCount; // 4、元素数量+1，但是不需要扩容 if (++size > threshold) resize(); afterNodeInsertion(evict); return null;}
final Node<K,V>[] resize() {    // 指向旧的table，首次添加元素时为null Node<K,V>[] oldTab = table; // 旧数组的容量，首次添加元素时为0 int oldCap = (oldTab == null) ? 0 : oldTab.length;    // 旧的阈值，当前值为16 int oldThr = threshold;    // 新数组容量，新的阈值 int newCap, newThr = 0; // oldCap为0，因此不会走此处的逻辑 if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; } // 5、因为oldThr为16，所以会执行到这里 else if (oldThr > 0) // 新数组容量设置为16 newCap = oldThr; else { newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); }    // 6、计算新的阈值 if (newThr == 0) { float ft = (float)newCap * loadFactor; // 新的阈值为16*0.75=12 newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); }    // 7、threshold变更为12 threshold = newThr;    // 8、新数组初始化 @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];    // 9、将table指向新的数组    table = newTab; return newTab;}
Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) { return new Node<>(hash, key, value, next);}

6.2、添加元素，无哈希冲突，不需要扩容及红黑树转换

若再次向HashMap中添加元素，会执行上述的2、3、4步，并且不会再执行扩容过程。在此不再赘述。

6.3、添加元素，发生哈希冲突，不需要扩容及红黑树转换

public V put(K key, V value) { return putVal(hash(key), key, value, false, true);}
// hash：key的hash值// key：添加元素的key// value：添加元素的value// onlyIfAbsent：是否只有key不存在时添加，put时默认为false// evict：表是否在创建模式，false表示处于创建模式，put时默认为truefinal V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {    // 一些局部变量    // tab：指向table，p：哈希桶内的元素    // n：数组长度，i：插入元素应该在数组中的位置 Node<K,V>[] tab; Node<K,V> p; int n, i;    // table不为空，因此不会走此处的逻辑 if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length;    // 无哈希冲突 if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); } else {        // e：HashMap中已存在的元素 Node<K,V> e; K k;        // key完全相同，e指向当前元素 if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) e = p;        // 红黑树的处理逻辑 else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; ++binCount) {                // 1、将新增的元素添加到链表的尾部，采用尾插法                // 元素添加后，e依然为null if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) // 红黑树转换 treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) break; p = e; } } if (e != null) { V oldValue = e.value; if (!onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } }    // 2、修改次数+1 ++modCount;    // 3、元素数量+1，但是不需要扩容 if (++size > threshold) resize(); afterNodeInsertion(evict); return null;}

6.4、添加元素后引发扩容

当添加到第13个元素时，元素数量将大于扩容阈值，进而引起HashMap的扩容。代码执行路径如下所示：

public V put(K key, V value) { return putVal(hash(key), key, value, false, true);}
// hash：key的hash值// key：添加元素的key// value：添加元素的value// onlyIfAbsent：是否只有key不存在时添加，put时默认为false// evict：表是否在创建模式，false表示处于创建模式，put时默认为truefinal V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {    // 一些局部变量    // tab：指向table，p：哈希桶内的元素    // n：数组长度，i：插入元素应该在数组中的位置    Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length;    // 1、计算元素所在的数组位置，当前数组位置无元素，故将此元素直接放在哈希桶中 if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); }    // 2、修改次数+1 ++modCount;    // 3、元素数量+1，此时size为13，大于threshold，也就是大于12，因此要进行扩容 if (++size > threshold) resize(); afterNodeInsertion(evict); return null;}
final Node<K,V>[] resize() {    // 指向旧的table，此时不为null Node<K,V>[] oldTab = table; // 旧数组的容量，当前值为16 int oldCap = (oldTab == null) ? 0 : oldTab.length;    // 旧的阈值，当前值为12 int oldThr = threshold; // 新数组容量，新的阈值 int newCap, newThr = 0;    // 4、oldCap为16，大于0，满足当前条件 if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; }        // 5、newCap在oldCap基础上左移，也就是翻倍，16 -> 32，新的阈值也翻倍，变为24 else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1;    }    // 6、threshold变更为24 threshold = newThr;    // 7、新数组初始化 @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];    // 8、将table指向新的数组 table = newTab; // 9、进行元素的迁移 if (oldTab != null) { for (int j = 0; j < oldCap; ++j) { // 链表的当前元素 Node<K,V> e; // 哈希桶非空时才执行 if ((e = oldTab[j]) != null) { oldTab[j] = null;                // 只有哈希桶内一个元素，直接迁移                // 迁移的位置为key的hash、数组长度-1进行与运算 if (e.next == null) newTab[e.hash & (newCap - 1)] = e;                // 红黑树的迁移 else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);                // 链表的迁移 else {                    // 迁移后，低位的头尾指针，迁移时元素在数组中的位置不变 Node<K,V> loHead = null, loTail = null;                    // 迁移后，高位的头尾指针，迁移时元素在数组中的位置发生变化 Node<K,V> hiHead = null, hiTail = null; // 当前链表的下一个元素 Node<K,V> next; do { next = e.next;                        // 元素key的hash、原始容量进行与运算，结果为0，表示位置不变                        // 比如，原始容量为16，换算为二进制为10000                        // 如果元素key的hash值小于16，则元素key的hash、原始容量进行与运算，结果一定为0                        // 如果元素key的hash值大于16，则可能出现在低位、也可能出现在高位，具体要看元素key的hash值                        // 比如，17（10001）在高位，32（100000）在低位 if ((e.hash & oldCap) == 0) {                            // 低位的尾指针为空，证明还未构建低位的头尾指针 if (loTail == null)                                // 设置低位的尾指针 loHead = e; else                                // next设置 loTail.next = e; // 低位尾指针的移动 loTail = e; } else {                            // 高位的尾指针为空，证明还未构建高位的头尾指针 if (hiTail == null)                                // 设置高位的尾指针 hiHead = e; else // next设置 hiTail.next = e;                            // 高位尾指针的移动 hiTail = e; } } while ((e = next) != null);                    // 新数组j位置元素 if (loTail != null) { loTail.next = null; newTab[j] = loHead; }                    // 新数组（j + 旧数组容量）位置元素 if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab;}

6.5、添加元素后引发红黑树转换

为了模拟hash冲突，对测试demo进行变更。变更后的demo如下所示：

public class HashMapTest {
 public static void main(String[] args) { int count = 17; HashMap<DemoMapKey, Integer> testMap = new HashMap<>(16);
 for (int i = 1; i <= count; i++) { DemoMapKey keyEntry = new DemoMapKey(i * 32, "test"); // 第9个元素，引发扩容 if (i == 9) {                System.out.println("添加第9个元素，引发扩容"); }
            // 第11个元素，真正进行红黑树的转换 if (i == 11) {                System.out.println("添加第11个元素，引发红黑树转换"); }
 testMap.put(keyEntry, i); } }
}

当添加第9个元素时，原本应该进行红黑树转换。但此时数组的长度并未达到64，因此实际执行的是扩容。当添加第11个元素时，才会进行红黑树的转换。代码执行路径如下所示：

public V put(K key, V value) { return putVal(hash(key), key, value, false, true);}
// hash：key的hash值// key：添加元素的key// value：添加元素的value// onlyIfAbsent：是否只有key不存在时添加，put时默认为false// evict：表是否在创建模式，false表示处于创建模式，put时默认为truefinal V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {    // 一些局部变量    // tab：指向table，p：哈希桶内的元素    // n：数组长度，i：插入元素应该在数组中的位置 Node<K,V>[] tab; Node<K,V> p; int n, i; // table不为空，因此不会走此处的逻辑 if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; // 无哈希冲突 if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); } else { // e：HashMap中已存在的元素 Node<K,V> e; K k; // key完全相同，e指向当前元素 if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) e = p; // 红黑树的处理逻辑        else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; ++binCount) { if ((e = p.next) == null) { p.next = newNode(hash, key, value, null);                    // 1、binCount达到7，也就意味着链表的长度达到8（不算哈希桶内元素） if (binCount >= TREEIFY_THRESHOLD - 1) // 红黑树转换 treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) break; p = e; } } if (e != null) { V oldValue = e.value; if (!onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } }    // 2、修改次数+1 ++modCount;    // 3、元素数量+1，但是不需要扩容 if (++size > threshold) resize(); afterNodeInsertion(evict); return null;}
final void treeifyBin(Node<K,V>[] tab, int hash) { int n, index; Node<K,V> e;    // tab为空，或者数组长度未达到64，只进行扩容 if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY) resize(); // 满足上述条件，才会进行红黑树的转换 else if ((e = tab[index = (n - 1) & hash]) != null) { TreeNode<K,V> hd = null, tl = null; do { TreeNode<K,V> p = replacementTreeNode(e, null); if (tl == null) hd = p; else { p.prev = tl; tl.next = p; } tl = p; } while ((e = e.next) != null); if ((tab[index] = hd) != null) hd.treeify(tab); }}

7 元素的删除

HashMap中，元素的删除相对比较简单，在此不再过多赘述。

public V remove(Object key) {    Node<K,V> e;    return (e = removeNode(hash(key), key, null, false, true)) == null ?        null : e.value;}
final Node<K,V> removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable) {    // tab：指向table，n：table的长度    // index：元素应该在数组中的位置 Node<K,V>[] tab; Node<K,V> p; int n, index; if ((tab = table) != null && (n = tab.length) > 0 && (p = tab[index = (n - 1) & hash]) != null) {        // node：通过key获取到的元素        // e：p的后置元素 Node<K,V> node = null, e; K k; V v;        // 哈希桶内元素判断 if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) node = p; else if ((e = p.next) != null) { // 在红黑树中查找元素 if (p instanceof TreeNode) node = ((TreeNode<K,V>)p).getTreeNode(hash, key); // 在链表中查找元素 else { do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) { node = e; break; } p = e; } while ((e = e.next) != null); } }        // 元素存在 if (node != null && (!matchValue || (v = node.value) == value || (value != null && value.equals(v)))) {            // 红黑树的处理，红黑树的自平衡、取消 if (node instanceof TreeNode) ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);            // 删除的元素位于哈希桶内 else if (node == p) tab[index] = node.next; // 链表元素的删除 else p.next = node.next; ++modCount; --size; afterNodeRemoval(node); return node; } } return null;}

8 元素的获取

HashMap中，元素的获取相对比较简单，在此不再过多赘述。

public V get(Object key) { Node<K,V> e;    // 能够获取到元素就返回其值value，否则返回null return (e = getNode(hash(key), key)) == null ? null : e.value;}
final Node<K,V> getNode(int hash, Object key) {    // tab：指向table，n：数组长度    // first：哈希桶内元素 Node<K,V>[] tab; Node<K,V> first, e; int n; K k; if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) {        // 查看哈希桶内元素，是否为需要获取的元素 if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k)))) return first; if ((e = first.next) != null) { // 在红黑树中查找元素 if (first instanceof TreeNode) return ((TreeNode<K,V>)first).getTreeNode(hash, key);            // 在链表中查找元素            do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } while ((e = e.next) != null); } } return null;}

9 结语

jdk1.8中，HashMap的结构由数组+链表，变更为数组+链表+红黑树。其源代码中，有些地方的设计非常精妙，值得研发同学仔细阅读分析。

下一篇将分享HashMap的高频面试题及其答案，敬请期待。

最后，祝各位读者朋友工作顺利、心想事成。

vlambda博客
学习文章列表