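Yesterday @淘宝雕梁 recommended a lock-free memory allocator. After a quick look at the website, it turns out the company, Lockless, has two main products:
Lockless MPI and Lockless Memory Allocator. I am more interested in the memory allocator, because the allocator has a big impact on high-performance servers, especially servers like MySQL. According to its documentation, the performance gain looks fairly significant.
Let's focus on the Lockless Memory Allocator: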
The Lockless Memory Allocator is downloadable under the GPL 3.0 License.
Features highlighted on the official site:
Multithread Optimized
The Lockless memory allocator uses lock-free techniques to minimize latency and memory contention. This provides optimal scalability as the number of threads in your application increases. Per-thread data is used to reduce bus communication overhead. This results in thread-local allocations and frees not requiring any synchronization overhead in most cases.
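The official site's performance comparison against several mainstream allocators:
The detailed benchmarks are here, and the results look quite impressive.
The code can be downloaded from http://locklessinc.com/downloads/. It supports 32-bit and 64-bit Linux, and the installation documentation is here.
Let's give it a quick try: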
$ wget http://locklessinc.com/downloads/lockless_allocator_src.tgz
$ tar xzf lockless_allocator_src.tgz
$ cd lockless_allocator
$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.5 20110214 (Red Hat 4.4.5-6) (GCC)
$ make
/bin/sh -ec 'gcc -MM ll_alloc.c | sed -n "H;$ {g;s@.*:\(.*\)@ll_alloc.c := \$\(wildcard\1\)\nll_alloc.o ll_alloc.c.d: $\(ll_alloc.c\)@;p}" > ll_alloc.c.d'
cc ll_alloc.c -fomit-frame-pointer -Wcast-qual -Wmissing-format-attribute -Wlogical-op -Wstrict-aliasing -Wsign-compare -Wdeclaration-after-statement -Wnested-externs -Wdisabled-optimization -Winline -Wundef -Wimplicit -Wunused -Wfloat-equal -Winit-self -Wformat=2 -Wswitch -Wsequence-point -Wparentheses -Wimplicit -Wchar-subscripts -Wredundant-decls -Wstrict-prototypes -Wbad-function-cast -Wpointer-arith -Wwrite-strings -Wno-long-long -Wmissing-declarations -Wmissing-prototypes -Wextra -Wall -pedantic -ggdb3 -std=gnu99 -O3 -fPIC -pthread -c -o libllalloc.o
strip -g libllalloc.o
ar rcs libllalloc.a libllalloc.o
ranlib libllalloc.a
cc ll_alloc.c -fomit-frame-pointer -Wcast-qual -Wmissing-format-attribute -Wlogical-op -Wstrict-aliasing -Wsign-compare -Wdeclaration-after-statement -Wnested-externs -Wdisabled-optimization -Winline -Wundef -Wimplicit -Wunused -Wfloat-equal -Winit-self -Wformat=2 -Wswitch -Wsequence-point -Wparentheses -Wimplicit -Wchar-subscripts -Wredundant-decls -Wstrict-prototypes -Wbad-function-cast -Wpointer-arith -Wwrite-strings -Wno-long-long -Wmissing-declarations -Wmissing-prototypes -Wextra -Wall -pedantic -ggdb3 -std=gnu99 -O3 -shared -fpic -Wl,-soname,libllalloc.so.1.3 -Wl,-z,interpose -o libllalloc.so.1.3
strip libllalloc.so.1.3
$ ls libllalloc.*
libllalloc.a libllalloc.o libllalloc.so.1.3
$ LD_PRELOAD=./libllalloc.so.1.3 erl
Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:16:16] [rq:16] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.5 (abort with ^G)
1>
# In another terminal, confirm that libllalloc.so.1.3 is in use
$ lsof -c beam.smp
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
beam.smp 8458 chuba txt REG 8,5 2344032 3775338 /usr/local/lib/erlang/erts-5.8.5/bin/beam.smp
...
beam.smp 8458 chuba mem REG 8,6 40384 195304 /home/chuba/lockless_allocator/libllalloc.so.1.3
...
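One caveat: the build needs a fairly recent gcc. gcc version 4.1.2 20080704 (Red Hat 4.1.2-46) fails to compile it.
Previously tcmalloc did not work with erl, because Erlang uses the low 4 bits of its internal pointers for its own purposes. If an allocator does not guarantee 16-byte alignment, things break.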
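Looking at the code, the implementation is quite simple and the code quality is only average. I don't know how it actually performs yet, so I'll find a real case to benchmark later!
To be continued!
Have fun!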