Is there any reason to have 8 on TREEIFY_THRESHOLD in Java Hashmap?(Java Hashmap 中有什么理由在 TREEIFY_THRESHOLD 上有 8 个吗?)
问题描述
从 Java 8 开始,hashMap 稍作修改,如果同一存储桶上有超过 8 个 (TREEIFY_THRESHOLD=8) 项,则 hashMap 具有平衡树而不是链表.选择 8 有什么理由吗?
From Java 8, the hashMap modified slightly to have balanced tree instead of linkedlist if more than 8 (TREEIFY_THRESHOLD=8) items on same bucket. is there any reason choosing 8?
如果是 9 会影响性能吗?
would it impact the performance in case it is 9?
推荐答案
使用平衡树而不是链表是一种权衡.在列表的情况下,必须执行线性扫描以在存储桶中执行查找,而树允许日志时间访问.当列表很小时,查找速度很快,并且使用树实际上并没有提供任何好处,而大约 8 个左右的元素在列表中查找的成本变得足够显着,以至于树提供了加速.
The use of a balanced tree instead of a linked-list is a tradeoff. In the case of a list, a linear scan must be performed to perform a lookup in a bucket, while the tree allows for log-time access. When the list is small, the lookup is fast and using a tree doesn't actually provide a benefit while around 8 or so elements the cost of a lookup in the list becomes significant enough that the tree provides a speed-up.
我怀疑树的使用是针对密钥哈希被灾难性破坏(例如许多密钥冲突)的例外情况;虽然线性查找会导致性能严重下降,但使用树可以缓解这种情况性能有所损失,如果键可直接比较.
I suspect that the use of a tree is intended for the exceptional case where the key hash is catastrophically broken (e.g. many keys collide); while a linear lookup will cause performance to degrade severely the use of a tree mitigates this performance loss somewhat, if the keys are directly comparable.
因此,8 个条目的确切阈值可能不是非常重要:假设良好的密钥分布,树箱的机会是 0.00000006,因此在这种情况下显然很少使用树箱.当哈希算法灾难性地失败时,存储桶中的键数无论如何都远大于 8.
Therefore, the exact threshold of 8 entries may not be terribly significant: the chance of a tree bin is 0.00000006 assuming good key distribution, so tree bins are obviously used very rarely in such a case. When the hash algorithm is failing catastrophically, then the number of keys in the bucket is far greater than 8 anyway.
这会带来空间损失,因为树节点必须包含额外的引用:四个对树节点的引用和一个布尔值除了 LinkedHashMap.Entry
(见 它的来源).
This comes at a space penalty since the tree-node must include additional references: four references to tree nodes and a boolean in addition to the fields of a LinkedHashMap.Entry
(see its source).
来自 HashMap类源码中的注释:
因为 TreeNode 的大小大约是常规节点的两倍,我们仅当 bin 包含足够的节点以保证使用时才使用它们(参见 TREEIFY_THRESHOLD).当它们变得太小时(由于删除或调整大小)它们被转换回普通垃圾箱.在使用分布良好的用户哈希码,树箱是很少使用.理想情况下,在随机哈希码下,箱中的节点遵循泊松分布(http://en.wikipedia.org/wiki/Poisson_distribution)默认调整大小的平均参数约为 0.50.75 的阈值,尽管有很大的差异,因为调整粒度.忽略方差,预期列表大小 k 的出现次数为 (exp(-0.5) * pow(0.5, k)/阶乘(k)).
Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due to removal or resizing) they are converted back to plain bins. In usages with well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, the expected occurrences of list size k are (exp(-0.5) * pow(0.5, k) / factorial(k)).
这篇关于Java Hashmap 中有什么理由在 TREEIFY_THRESHOLD 上有 8 个吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:Java Hashmap 中有什么理由在 TREEIFY_THRESHOLD 上有 8


基础教程推荐
- 如何对 Java Hashmap 中的值求和 2022-01-01
- 存储 20 位数字的数据类型 2022-01-01
- RabbitMQ:消息保持“未确认"; 2022-01-01
- REST Web 服务返回 415 - 不支持的媒体类型 2022-01-01
- 使用堆栈算法进行括号/括号匹配 2022-01-01
- 修改 void 函数的输入参数,然后读取 2022-01-01
- 无法复制:“比较方法违反了它的一般约定!" 2022-01-01
- 问题http://apache.org/xml/features/xinclude测试日志4j 2 2022-01-01
- Struts2 URL 无法访问 2022-01-01
- Spring AOP错误无法懒惰地为此建议构建thisJoinPoin 2022-09-13