Erlang's Built-in Database Takes On 70 Million QPS

November 26th, 2010

Original article. If you repost it, please credit the source: 系统技术非业余研究

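At EUC 2010, Rickard gave a talk that explained in detail how effective the read-write lock optimizations in R14B are, and backed it with benchmarks. See http://www.erlang.org/~rickard/euc-2010/

The optimization works very well: the new read-write lock is several times faster than the one built into NPTL. I wanted to try it for myself.

I ran my tests on a Dell R815; its hardware configuration is listed below.
The script used to collect the system information can be downloaded here.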

# summary.sh
      Date | 2010-11-25 14:18:18 UTC (local TZ: CST +0800)
    Hostname | =i
      Uptime |  9:02,  7 users,  load average: 10.27, 15.74, 11.78
      System | Dell Inc.; PowerEdge R815; vNot Specified (<OUT OF SPEC>)
 Service Tag | 55SSW2X
     Release | Red Hat Enterprise Linux Server release 5.4 (Tikanga)
      Kernel | 2.6.18-164.el5
Architecture | CPU = 64-bit, OS = 64-bit
   Threading | NPTL 2.5
    Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-44).
     SELinux | Disabled
# Processor ##################################################
  Processors | physical = 4, cores = 48, virtual = 48, hyperthreading = no
      Speeds | 48x1900.026
      Models | 48xAMD Opteron(tm) Processor 6168
      Caches | 48x512 KB
# Memory #####################################################
       Total | 62.92G
        Free | 57.33G
        Used | physical = 5.60G, swap = 420.00k, virtual = 5.60G
     Buffers | 100.19M
      Caches | 186.92M
        Used | 5.20G
  Swappiness | vm.swappiness = 0
 DirtyPolicy | vm.dirty_ratio = 40, vm.dirty_background_ratio = 10
...

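For full details, see the Dell R815 machine configuration.

My test works as follows:

First, I hacked Rickard's etsb.erl (the code can be downloaded here). The main improvements are:
1. The data set is prepared concurrently.
2. Progress is reported every 5 seconds.
3. The size of the data set is given as a parameter.
4. ETS ownership transfer is used to reduce the number of times the data set has to be prepared.

I also wrote the following stap script to verify that the results are correct: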

# cat dig.stp
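global ops;  # statistics aggregate: one sample per probe hit

# Fire on every beam.smp function whose name matches the first script argument (@1)
probe process("/usr/local/lib/erlang/erts-5.8.2/bin/beam.smp").function(@1)
{
    ops <<< 1;
}

# Once per second, print the number of hits and reset the aggregate
probe timer.s(1)
{
    printf("ops=%d\n", @count(ops));
    delete ops;
}

probe begin { println(":)") }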

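The verification script is used like this (the pattern is quoted so that the shell does not expand it):

# # count inserts
# stap dig.stp 'ets_insert*'

# # count lookups
# stap dig.stp 'ets_lookup*'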

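Using the test program: it takes 5 parameters, Name, LoopsStr, ReadOnlyStr, EtsOpts and TblSizeStr.
Name: the name of the test case.
LoopsStr: the number of loops, that is, the number of table operations each process performs on one ETS table under one concurrency setting. Total ops = Loops * 1000.
ReadOnlyStr: true tests lookups only; false tests a mix of operations.
EtsOpts: one of rc, wc, wrc, meaning read_concurrency, write_concurrency, and read_concurrency | write_concurrency respectively.
TblSizeStr: the number of rows in the data set.

We run under numactl --interleave=all to avoid the swap problem on NUMA machines. The cost is that memory access becomes about 30% slower.

# # prefer RAM over swap
# /sbin/sysctl vm.swappiness=0 
# # compiling to native code with HiPE makes it a bit faster
# erlc +native etsb.erl 
# # 10 million rows, 10,000 loops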
# rm -f test* && numactl --interleave=all erl -noshell +h 9999 +S 48:48 +rg 64 -sct db -run etsb go test 10000 false rwc 10000000 -s erlang halt   
init workers:48, seg:208333, rem:16
:ets current size[2771137], ips[0.5542274]
ets current size[4913060], ips[0.4283846]
ets current size[6976923], ips[0.4127726]
ets current size[9090054], ips[0.4226262]
) size[10000000], time[22.095488s]
test-wrc-0 48 0.135391
total 10000000 op, 0.135391 s, avg 73.86015318595771M QPS
test-wrc-0 48 0.129994
total 10000000 op, 0.129994 s, avg 76.92662738280228M QPS
test-wrc-0 48 0.120818
total 10000000 op, 0.120818 s, avg 82.76912380605539M QPS
test-wrc-0 48 0.124831
total 10000000 op, 0.124831 s, avg 80.10830643029375M QPS
test-wrc-0 48 0.119286
total 10000000 op, 0.119286 s, avg 83.8321345338095M QPS
test-wrc-0-avg 48 0.126064
test-wrc-0-sw1 48 0.126555
total 10000000 op, 0.126555 s, avg 79.01702816957054M QPS
test-wrc-0-sw1 48 0.130929
total 10000000 op, 0.130929 s, avg 76.37727317859299M QPS
test-wrc-0-sw1 48 0.133867
total 10000000 op, 0.133867 s, avg 74.70100921063444M QPS
test-wrc-0-sw1 48 0.135101
total 10000000 op, 0.135101 s, avg 74.01869712289324M QPS
test-wrc-0-sw1 48 0.123784
total 10000000 op, 0.123784 s, avg 80.78588509015705M QPS
test-wrc-0-sw1-avg 48 0.1300472
test-wrc-1 48 0.844015
total 10000000 op, 0.844015 s, avg 11.848130661184932M QPS
test-wrc-1 48 0.843837
total 10000000 op, 0.843837 s, avg 11.85062992023341M QPS
test-wrc-1 48 0.834215
total 10000000 op, 0.834215 s, avg 11.987317418171575M QPS
test-wrc-1 48 0.833802
total 10000000 op, 0.833802 s, avg 11.993254993391716M QPS
test-wrc-1 48 0.833611
total 10000000 op, 0.833611 s, avg 11.996002931823117M QPS
test-wrc-1-avg 48 0.837896
test-wrc-50 48 4.182473
total 10000000 op, 4.182473 s, avg 2.390929959380491M QPS
test-wrc-50 48 4.218778
total 10000000 op, 4.218778 s, avg 2.3703546382388456M QPS
test-wrc-50 48 4.14823
total 10000000 op, 4.14823 s, avg 2.410666718094223M QPS
test-wrc-50 48 4.166377
total 10000000 op, 4.166377 s, avg 2.4001668596000796M QPS
test-wrc-50 48 4.148239
total 10000000 op, 4.148239 s, avg 2.4106614879229475M QPS
test-wrc-50-avg 48 4.1728194
test-wrc-100 48 17.192856
total 10000000 op, 17.192856 s, avg 0.5816369310602032M QPS
test-wrc-100 48 17.555575
total 10000000 op, 17.555575 s, avg 0.5696196222567474M QPS
test-wrc-100 48 17.733847
total 10000000 op, 17.733847 s, avg 0.563893440605414M QPS
test-wrc-100 48 17.617372
total 10000000 op, 17.617372 s, avg 0.5676215499110764M QPS
test-wrc-100 48 17.650338
total 10000000 op, 17.650338 s, avg 0.5665613882295059M QPS
test-wrc-100-avg 48 17.5499976

Roughly 400,000 rows per second can be inserted into the database.
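
The insert rate can be cross-checked against the progress lines in the log. The sketch below assumes, based on the numbers rather than on the etsb.erl source, that a progress line is printed every 5 seconds and that `ips` is millions of inserts per second over that interval. The function names are illustrative only.

```python
# Sketch: recompute the insert rate from the load-phase log lines.
# Assumption (inferred from the log, not taken from etsb.erl): one progress
# line every 5 seconds, and `ips` means millions of inserts per second.

REPORT_INTERVAL_S = 5  # progress interval described in the post


def interval_rate_m(prev_size, cur_size, interval_s=REPORT_INTERVAL_S):
    """Millions of rows inserted per second between two progress lines."""
    return (cur_size - prev_size) / interval_s / 1e6


def overall_rate(rows, seconds):
    """Average rows inserted per second over the whole load phase."""
    return rows / seconds


if __name__ == "__main__":
    # Table sizes reported during the 10-million-row load (starting from 0).
    sizes = [0, 2771137, 4913060, 6976923, 9090054]
    for prev, cur in zip(sizes, sizes[1:]):
        print(f"ips ~ {interval_rate_m(prev, cur):.7f}")

    # Final "size[...], time[...]" lines of the two runs.
    print(f"10M rows:  {overall_rate(10_000_000, 22.095488):.0f} rows/s")
    print(f"100M rows: {overall_rate(100_000_000, 278.634797):.0f} rows/s")
```

Under these assumptions the per-interval values match the `ips` column, and the whole-run averages land in the same few-hundred-thousand-rows-per-second range as the figure quoted above.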

With 10 million records, read-only lookups reach between 73M and 84M QPS. With a lookup/update ratio of 999:1 we get about 11M QPS, with 950:50 about 2.37M QPS, and with 900:100 about 0.5M QPS.
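
The QPS figures follow directly from the result lines: operations divided by elapsed seconds, with total operations equal to Loops * 1000 as described for LoopsStr. Here is a minimal sketch that recomputes them; the regular expression and helper names are hypothetical and only assume the `total N op, T s` layout seen in the log.

```python
import re

# Sketch: recompute throughput from "total <ops> op, <seconds> s" result lines.
# The line layout is taken from the log above; the helper names are made up.
LINE_RE = re.compile(r"total (\d+) op, ([0-9.]+) s")


def total_ops(loops):
    """Total operations per run, per the post: Loops * 1000."""
    return loops * 1000


def mqps(line):
    """Millions of operations per second for one result line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"not a result line: {line!r}")
    ops, seconds = int(match.group(1)), float(match.group(2))
    return ops / seconds / 1e6


if __name__ == "__main__":
    print(total_ops(10000))  # operations per run for "go test 10000 ..."
    samples = [
        "total 10000000 op, 0.135391 s",   # test-wrc-0   (read-only)
        "total 10000000 op, 0.844015 s",   # test-wrc-1   (999:1)
        "total 10000000 op, 4.182473 s",   # test-wrc-50  (950:50)
        "total 10000000 op, 17.192856 s",  # test-wrc-100 (900:100)
    ]
    for line in samples:
        print(f"{mqps(line):.2f}M QPS")
```

Reading the numeric suffix of each test name as the number of updates per 1000 operations is an interpretation based on the ratios quoted in the text.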

# # 100 million rows, 10,000 loops
# rm -f test* && numactl --interleave=all erl -noshell +h 9999 +S 48:48 +rg 64 -sct db -run etsb go test 10000 false rwc 100000000 -s erlang halt   
init workers:48, seg:2083333, rem:16
:ets current size[2605562], ips[0.5211124]
ets current size[4719691], ips[0.4228258]
ets current size[6920927], ips[0.4402472]
ets current size[9056821], ips[0.4271788]
ets current size[11189006], ips[0.426437]
ets current size[13343972], ips[0.4309932]
ets current size[15489180], ips[0.4290416]
ets current size[17593478], ips[0.4208596]
ets current size[19714165], ips[0.4241374]
ets current size[21802050], ips[0.417577]
ets current size[23892756], ips[0.4181412]
ets current size[25959468], ips[0.4133424]
ets current size[28035023], ips[0.415111]
ets current size[30034566], ips[0.3999086]
ets current size[32016364], ips[0.3963596]
ets current size[34038414], ips[0.40441]
ets current size[36036807], ips[0.3996786]
ets current size[38107167], ips[0.414072]
ets current size[40110323], ips[0.4006312]
ets current size[42151673], ips[0.40827]
ets current size[44144094], ips[0.3984842]
ets current size[46204692], ips[0.4121196]
ets current size[48207547], ips[0.400571]
ets current size[50182314], ips[0.3949534]
ets current size[52155995], ips[0.3947362]
ets current size[54102271], ips[0.3892552]
ets current size[56043591], ips[0.388264]
ets current size[58025689], ips[0.3964196]
ets current size[60064420], ips[0.4077462]
ets current size[62085518], ips[0.4042196]
ets current size[64092824], ips[0.4014612]
ets current size[66088791], ips[0.3991934]
ets current size[68023640], ips[0.3869698]
ets current size[69915172], ips[0.3783064]
ets current size[71821355], ips[0.3812366]
ets current size[73834553], ips[0.4026396]
ets current size[75835866], ips[0.4002626]
ets current size[77804632], ips[0.3937532]
ets current size[79834676], ips[0.4060088]
ets current size[81835580], ips[0.4001808]
ets current size[83878663], ips[0.4086166]
ets current size[85899564], ips[0.4041802]
ets current size[87846268], ips[0.3893408]
ets current size[89850390], ips[0.4008244]
ets current size[91819539], ips[0.3938298]
ets current size[93832815], ips[0.4026552]
ets current size[95777563], ips[0.3889496]
ets current size[97739853], ips[0.392458]
ets current size[99876702], ips[0.4273698]
) size[100000000], time[278.634797s]
test-wrc-0 48 0.134338
total 10000000 op, 0.134338 s, avg 74.43910137116825M QPS
test-wrc-0 48 0.125968
total 10000000 op, 0.125968 s, avg 79.38524069604979M QPS
test-wrc-0 48 0.125977
total 10000000 op, 0.125977 s, avg 79.37956928645706M QPS
test-wrc-0 48 0.125679
total 10000000 op, 0.125679 s, avg 79.56778777679644M QPS
test-wrc-0 48 0.130515
total 10000000 op, 0.130515 s, avg 76.61954564609432M QPS
test-wrc-0-avg 48 0.12849539999999998
test-wrc-0-sw1 48 0.131326
total 10000000 op, 0.131326 s, avg 76.14638380823294M QPS
test-wrc-0-sw1 48 0.133842
total 10000000 op, 0.133842 s, avg 74.7149624183739M QPS
test-wrc-0-sw1 48 0.132196
total 10000000 op, 0.132196 s, avg 75.64525401676299M QPS
test-wrc-0-sw1 48 0.130818
total 10000000 op, 0.130818 s, avg 76.44207983610819M QPS
test-wrc-0-sw1 48 0.127537
total 10000000 op, 0.127537 s, avg 78.40861867536479M QPS
test-wrc-0-sw1-avg 48 0.13114379999999998
test-wrc-1 48 0.847669
total 10000000 op, 0.847669 s, avg 11.79705757789892M QPS
test-wrc-1 48 0.849755
total 10000000 op, 0.849755 s, avg 11.768097863501833M QPS
test-wrc-1 48 0.84315
total 10000000 op, 0.84315 s, avg 11.860285832888573M QPS
test-wrc-1 48 0.842553
total 10000000 op, 0.842553 s, avg 11.868689566116316M QPS
test-wrc-1 48 0.854927
total 10000000 op, 0.854927 s, avg 11.69690511587539M QPS
test-wrc-1-avg 48 0.8476108
test-wrc-50 48 4.22419
total 10000000 op, 4.22419 s, avg 2.3673177579606977M QPS
test-wrc-50 48 4.229486
total 10000000 op, 4.229486 s, avg 2.3643534935450785M QPS
test-wrc-50 48 4.212307
total 10000000 op, 4.212307 s, avg 2.3739960074135147M QPS
test-wrc-50 48 4.217103
total 10000000 op, 4.217103 s, avg 2.371296124377327M QPS
test-wrc-50 48 4.213192
total 10000000 op, 4.213192 s, avg 2.3734973388347838M QPS
test-wrc-50-avg 48 4.219255599999999
test-wrc-100 48 17.439039
total 10000000 op, 17.439039 s, avg 0.5734260930318466M QPS
test-wrc-100 48 17.89559
total 10000000 op, 17.89559 s, avg 0.5587968879483717M QPS
test-wrc-100 48 16.806215
total 10000000 op, 16.806215 s, avg 0.5950179740054498M QPS
test-wrc-100 48 16.832135
total 10000000 op, 16.832135 s, avg 0.5941016989229233M QPS
test-wrc-100 48 16.999419
total 10000000 op, 16.999419 s, avg 0.5882553986109761M QPS
test-wrc-100-avg 48 17.1944796

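Performance is about the same as in the previous run.

top output while the data was being prepared: [screenshot]

top output while the lookups were running: [screenshot]

Conclusion:
With 100 million records, Erlang's built-in database reaches 70 million QPS on the 48-core Dell R815; with 800 million records it reaches 50 million. Memory is king!!!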

  1. Jer
    November 26th, 2010 at 15:42 | #1

    Thanks for the hard work on these tests!
    The conclusion is convincing.
    Next comes thinking about the business scenarios.

  2. November 26th, 2010 at 16:47 | #2

    test-wrc-0 48 2.063125
    total 100000000 op, 2.063125 s, avg 48.47016055740685M QPS

    With 800 million rows it is around 50 million QPS.

  3. November 26th, 2010 at 16:47 | #3

    CPU: AMD64 family10, speed 1900.03 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
    samples    %        image name         app name           symbol name
    232026642  22.0811  beam.smp           beam.smp           db_put_hash
    176727298  16.8185  vmlinux            vmlinux            .text.default_idle
    106396349  10.1253  beam.smp           beam.smp           ethr_rwmutex_rlock
    61758613   5.8773   vmlinux            vmlinux            schedule
    44327529   4.2185   beam.smp           beam.smp           ethr_event_swait
    25606555   2.4369   vmlinux            vmlinux            scheduler_tick
    20536080   1.9543   libc-2.5.so        libc-2.5.so        sched_yield
    19836117   1.8877   beam.smp           beam.smp           db_get_hash
    16357228   1.5567   beam.smp           beam.smp           copy_struct
    13705245   1.3043   beam.smp           beam.smp           ethr_rwmutex_rwlock
    11853368   1.1280   beam.smp           beam.smp           ethr_atomic_read
    11645193   1.1082   vmlinux            vmlinux            smp_local_timer_interrupt
    10770201   1.0250   vmlinux            vmlinux            system_call
    8670359    0.8251   vmlinux            vmlinux            audit_syscall_exit
    8128466    0.7736   vmlinux            vmlinux            smp_apic_timer_interrupt
    7850433    0.7471   vmlinux            vmlinux            .text.find_busiest_group
    7657765    0.7288   libpthread-2.5.so  libpthread-2.5.so  pthread_mutex_lock
    7047641    0.6707   vmlinux            vmlinux            audit_syscall_entry
    6497677    0.6184   libpthread-2.5.so  libpthread-2.5.so  pthread_getspecific
    6452996    0.6141   beam.smp           beam.smp           bf_link_free_block
    6387939    0.6079   beam.smp           beam.smp           ets_insert_2
    6180141    0.5881   beam.smp           beam.smp           ethr_rwmutex_runlock
