Erlang 新数据类型Map的定位和性能
原创文章,转载请注明: 转载自系统技术非业余研究
本文链接地址: Erlang 新数据类型Map的定位和性能
Erlang R17最大的语言层面的变化莫过是引入 Map数据结构,参见:Erlang R17新特性浅评 还有 这里。
Map相关的细节在EEP 43上,参见 这里。
定位:
A record replacement is just that, a replacement. It’s like asking the question, “What do we have?” instead of “What can we get?” The instant rebuttal would be “What do we need?” I say Maps.
满足的约束:
The new data-type shall have semantics, syntax and operations that:
> provides an association set from key terms to value terms which can be constructed, accessed and updated using language syntax
> can be uniquely distinguished from every other data-type in the language
> has no compile-time dependency for constructing, accessing or updating contents of maps nor for passing maps between modules, processes or over Erlang distribution
> can be used in matching expressions in the language
> has a one-to-one association between printing and parsing the data-type
> has a well defined order between terms of the type and other Erlang types
> has at most O(log N) time complexity in insert and lookup operations, where N is the number of key-value associations.
思遥同学很贴心的写了一篇maps的分析,参看 Erlang 的新数据结构 map 浅析
从源码来分析的话:
/* erl_map.h */ |
typedef struct map_s { |
Eterm thing_word; |
Uint size; |
Eterm keys; /* tuple */ |
} map_t; |
/* map node |
* |
* ----------- |
* Eterm THING |
* Uint size |
* Eterm Keys -> {K1, K2, K3, ..., Kn} where n = size |
* ---- |
* Eterm V1 |
* ... |
* Eterm Vn, where n = size |
* ----------- |
*/ |
* maps:get/2 |
* return value if key *matches* a key in the map |
* exception bad_key if none matches |
*/ |
/*erl_map.c */ |
int erts_maps_get(Eterm key, Eterm map, Eterm *value) { |
Eterm *ks,*vs; |
map_t *mp; |
Uint n,i; |
mp = (map_t*)map_val(map); |
n = map_get_size(mp); |
if (n == 0) |
return 0; |
ks = map_get_keys(mp); |
vs = map_get_values(mp); |
if (is_immed(key)) { |
for ( i = 0; i < n; i++) { |
if (ks[i] == key) { |
*value = vs[i]; |
return 1; |
} |
} |
} |
for ( i = 0; i < n; i++) { |
if (EQ(ks[i], key)) { |
*value = vs[i]; |
return 1; |
} |
} |
return 0; |
} |
int erts_validate_and_sort_map(map_t* mp) |
{ |
... |
} |
我们知道map的数据结构非常简单:记录个数,keys, values. keys是以tuple有序存放的。所以插入和移除的代价都很高,查询也是遍历方式进行的。
map数据结构的复杂性和进程字典(put,get)相比都简单很多, 所以map第一天设计就不是为了大量存放kv数据的目的,它的目的是提供一种更好的record的替代品,在细节上改进:支持atom以外的key, 语法和语义更自然。
小结: 如果要性能,请使用ets, dict等数据结构;如果要替代record用的更顺手,用map。
祝玩的开心!
Post Footer automatically generated by wp-posturl plugin for wordpress.
现在的实现还有点糙,EEP里面提到的是insert和lookup最坏O(log n),目前的实现是线性搜索。其实key都是有序的,改成二分搜索就可以实现proposed的复杂度了。
Yu Feng Reply:
March 12th, 2014 at 8:59 pm
确实
map比record是灵活多了,原来record生成的代码是用element()和setelement模拟的,字段位置固定,一旦有不是故意的字段次序变动,所有使用这个record的模块都要重新编译。而且新字段不能动态添加也是个问题。
蛋疼的是lists里面的keyxxx方法全是针对record的,在考虑把record替换成maps的时候想到lists那些方法可能都要替换或者重写,有点犹豫