揭秘Map的Key视图及Entry视图设计原理

无标签

jdk源码分析

发布日期: 2020-03-31

问题描述：相信大家在阅读 HashMap 源代码的时候都会有一个疑问，HashMap 定义了 keySet，entrySet 用来存储 HashMap 的 key 及 entry 视图，但是翻遍了源代码就是没发现这两个成员变量是在哪里同步 HashMap 的 key 和 entry 的，下面就以 keySet 为例来分析Key视图的设计原理

1. HashMap 中 keySet 的声明及获取代码

// key 集合成员变量声明
transient Set<K>        keySet;

// 获取 key 集合视图
public Set<K> keySet() {
    Set<K> ks = keySet;
    if (ks == null) {
        // 创建了 KeySet 对象，可能在构造器中同步了 HashMap 的所有 key
        ks = new KeySet();
        keySet = ks;
    }
    return ks;
}

直接调用 keySet() 方法就可以返回当前 hashmap 的 key 视图，但是没有看到忘 keyset 中存储 hashmap 所有 key 的代码，于是只能看下 KeySet 中是否有同步数据的代码。

2.KeySet 定义

// key 视图定义
final class KeySet extends AbstractSet<K> {
    public final int size()                 { return size; }
    public final void clear()               { HashMap.this.clear(); }

    // 构造器中也没有同步数据的代码，看迭代器中能不能发现什么
    public final Iterator<K> iterator()     { return new KeyIterator(); }
    public final boolean contains(Object o) { return containsKey(o); }
    public final boolean remove(Object key) {
        return removeNode(hash(key), key, null, false, true) != null;
    }
    public final Spliterator<K> spliterator() {
        return new KeySpliterator<>(HashMap.this, 0, -1, 0, 0);
    }

    // forEach 方法中似乎看到了一点线索，看到遍历的时候是直接使用外部类（即 hashmap）存储数据的数组
    public final void forEach(Consumer<? super K> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        // 这里将 hashmap 的 table 赋值给了 tab，证明这里遍历的其实就是 hashmap
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next)
                    action.accept(e.key);
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
}

KeySet 构造器中也没有同步数据相关代码，意外看到 forEach() 中其实就是遍历 hashmap，那么是否迭代器也是在遍历 hashmap 呢。（增强 for 循环遍历集合类型对象时其实使用的就是迭代器）

3.KeyIterator 定义

final class KeyIterator extends HashIterator
    implements Iterator<K> {
    public final K next() { return nextNode().key; }
}

调用 nextNode() 实现遍历，继续看 nextNode() 代码

final Node<K,V> nextNode() {
        Node<K,V>[] t;
        // next 为 KeyIterator 的父类 HashIterator 中的成员变量
        Node<K,V> e = next;

        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        // 这里其实就是返回父类中的 next
        return e;
}

返回的是父类中的 next 成员变量，继续看父类 HashIterator 中的代码

abstract class HashIterator {
    Node<K,V> next;        // next entry to return
    Node<K,V> current;     // current entry
    int expectedModCount;  // for fast-fail
    int index;             // current slot

    HashIterator() {
        expectedModCount = modCount;
        // 使用 hashmap 的 table
        Node<K,V>[] t = table;
        current = next = null;
        index = 0;
        // hashmap 非空
        if (t != null && size > 0) { // advance to first entry
            // 找到 hashmap 的 table 中第一个元素的位置，并将该位置存储的 entry 赋值给 next 进行初始化
            do {} while (index < t.length && (next = t[index++]) == null);
        }
    }

    public final boolean hasNext() {
        return next != null;
    }

    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }

    public final void remove() {
        Node<K,V> p = current;
        if (p == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        current = null;
        K key = p.key;
        removeNode(hash(key), key, null, false, false);
        expectedModCount = modCount;
    }
}

从 HashIterator 构造器中可以看出 next 的初始值即为 hashmap 的 table 中存储的第一个元素，这就意味着使用 KeyIterator 实例是可以遍历 hashmap 的，到这里应该就真相大白了。

总结

hashmap 的 keySet 与 entrySet 在遍历时其实使用的就是 hashmap 的 table，所以 keySet() 与 entrySet() 方法返回的对象叫做视图，一个对象的视图意味着只能用作遍历，查询而不能用于修改，如果 keySet 与 entrySet 本身需要存储数据的话，那就意味着可以通过某些方式修改hashmap存储的值，比如反射，当然完全可以通过拷贝的方式存储 hashmap 的数据副本来达到不影响 hashmap 的目的，但是假如这样做了，hashmap 占用的内存是不是要翻倍了呢，所以 jdk 一些 java 库里面的代码写的是真的经典！！！

写在最后，做了一个实践

static class MySet<E> extends AbstractSet<E> {

    private Iterator<E> iterator;

    public MySet(Iterator<E> iterator) {
        this.iterator = iterator;
    }

    @Override
    public Iterator<E> iterator() {
        return iterator;
    }

    @Override
    public int size() {
        return 0;
    }
}

public static void main(String[] args) {
    List<String> list = new ArrayList<>();
    list.add("1");
    list.add("2");
    list.add("3");

    MySet<String> mySet = new MySet<>(list.iterator());

    System.out.println(mySet);
}

截屏2020-03-31下午5.24.34
在 AbstractCollection 中重写的 toString() 也是使用的迭代器，所以，这里直接可以打印到控制台。