系统技术非业余研究Erlang 新数据类型Map的定位和性能

Home > Erlang探索, 源码分析 > Erlang 新数据类型Map的定位和性能

Erlang 新数据类型Map的定位和性能

March 12th, 2014 Yu Feng

原创文章，转载请注明： 转载自系统技术非业余研究

Erlang R17最大的语言层面的变化莫过是引入 Map数据结构，参见：Erlang R17新特性浅评还有这里。

Map相关的细节在EEP 43上，参见这里。

定位：

A record replacement is just that, a replacement. It’s like asking the question, “What do we have?” instead of “What can we get?” The instant rebuttal would be “What do we need?” I say Maps.

满足的约束：

The new data-type shall have semantics, syntax and operations that:

> provides an association set from key terms to value terms which can be constructed, accessed and updated using language syntax
> can be uniquely distinguished from every other data-type in the language
> has no compile-time dependency for constructing, accessing or updating contents of maps nor for passing maps between modules, processes or over Erlang distribution
> can be used in matching expressions in the language
> has a one-to-one association between printing and parsing the data-type
> has a well defined order between terms of the type and other Erlang types
> has at most O(log N) time complexity in insert and lookup operations, where N is the number of key-value associations.

思遥同学很贴心的写了一篇maps的分析，参看 Erlang 的新数据结构 map 浅析

从源码来分析的话：

/* erl_map.h */
typedef struct map_s {
    Eterm thing_word;
    Uint  size;
    Eterm keys;      /* tuple */
} map_t;
/* map node                                                                                                                                                                                    
 *                                                                                                                                                                                             
 * -----------                                                                                                                                                                                 
 * Eterm   THING                                                                                                                                                                               
 * Uint    size                                                                                                                                                                                
 * Eterm   Keys -> {K1, K2, K3, ..., Kn} where n = size                                                                                                                                        
 * ----                                                                                                                                                                                        
 * Eterm   V1                                                                                                                                                                                  
 * ...                                                                                                                                                                                         
 * Eterm   Vn, where n = size                                                                                                                                                                  
 * -----------                                                                                                                                                                                 
 */

* maps:get/2                                                                                                                                                                                  
 * return value if key *matches* a key in the map                                                                                                                                              
 * exception bad_key if none matches                                                                                                                                                           
 */

/*erl_map.c */
int erts_maps_get(Eterm key, Eterm map, Eterm *value) {
    Eterm *ks,*vs;
    map_t *mp;
    Uint n,i;

    mp  = (map_t*)map_val(map);
    n   = map_get_size(mp);

    if (n == 0)
        return 0;

    ks  = map_get_keys(mp);
    vs  = map_get_values(mp);

    if (is_immed(key)) {
        for( i = 0; i < n; i++) {
            if (ks[i] == key) {
                *value = vs[i];
		return 1;
            }
	}
    }
   for( i = 0; i < n; i++) {
        if (EQ(ks[i], key)) {
            *value = vs[i];
            return 1;
        }
    }
    return 0;
}

int erts_validate_and_sort_map(map_t* mp)
{
 ...
}

我们知道map的数据结构非常简单：记录个数，keys, values. keys是以tuple有序存放的。所以插入和移除的代价都很高，查询也是遍历方式进行的。

map数据结构的复杂性和进程字典(put,get)相比都简单很多, 所以map第一天设计就不是为了大量存放kv数据的目的，它的目的是提供一种更好的record的替代品，在细节上改进：支持atom以外的key, 语法和语义更自然。

小结：如果要性能，请使用ets, dict等数据结构;如果要替代record用的更顺手,用map。

祝玩的开心！

Post Footer automatically generated by wp-posturl plugin for wordpress.

Categories: Erlang探索, 源码分析 Tags: eep43, map

Comments (3)

zhengsiyao

March 12th, 2014 at 18:42 | #1

Reply | Quote

现在的实现还有点糙，EEP里面提到的是insert和lookup最坏O(log n)，目前的实现是线性搜索。其实key都是有序的，改成二分搜索就可以实现proposed的复杂度了。

Yu Feng Reply:
March 12th, 2014 at 8:59 pm
确实
Xu Yifeng

March 13th, 2014 at 11:07 | #2

Reply | Quote

map比record是灵活多了，原来record生成的代码是用element()和setelement模拟的，字段位置固定，一旦有不是故意的字段次序变动，所有使用这个record的模块都要重新编译。而且新字段不能动态添加也是个问题。
dcy

November 4th, 2014 at 15:15 | #3

Reply | Quote

蛋疼的是lists里面的keyxxx方法全是针对record的，在考虑把record替换成maps的时候想到lists那些方法可能都要替换或者重写，有点犹豫

Comments are closed.

基于LLVM的高性能Erlang(Hipe)尝鲜 cowboy-高性能简洁的erlang版web框架

系统技术非业余研究

Erlang 新数据类型Map的定位和性能

buy me a coffee.

Recent Posts

Recent Comments

Categories

Blogroll

Archives

Meta

系统技术非业余研究

Erlang 新数据类型Map的定位和性能

Share this:

buy me a coffee.

Recent Posts

Recent Comments

Categories

Blogroll

Archives

Meta