Archive

Archive for the ‘Erlang探索’ Category

非阻塞connect的一个细节

May 18th, 2010 Yu Feng 3 comments

昨天听zhuzhaoyuan说的一个connect细节. 通常我们connect的时候都是非阻塞的, 在connect调用后把句柄挂到poll去, 等poll通知可写的时候, 我们就认为connect成功了. 但是在linux平台下实际上不一定成功, 具体的要用socket get_opt来检查下出错码来决定.

以下是从man 2 connnect摘抄的:

EINPROGRESS
The socket(2,7,n) is non-blocking and the connection cannot be com-
pleted immediately. It is possible to select(2,7,2 select_tut)(2) or poll(2) for
completion by selecting the socket(2,7,n) for writing. After select(2,7,2 select_tut)
indicates writability, use getsockopt(2) to read(2,n,1 builtins) the SO_ERROR
option at level SOL_SOCKET to determine whether connect com-
pleted successfully (SO_ERROR is zero) or unsuccessfully
(SO_ERROR is one of the usual error(8,n) codes listed here, explain-
ing the reason for the failure).

我们看下erlang的inet_drv是如何处理的.
./erts/emulator/drivers/common/inet_drv.c:

        {
            int error = 0;      /* Has to be initiated, we check it */
            unsigned int sz = sizeof(error); /* even if we get -1 */
            int code = sock_getopt(desc->inet.s, SOL_SOCKET, SO_ERROR,
                                   (void *)&error, &sz);

            if ((code < 0) || error) {
                desc->inet.state = TCP_STATE_BOUND;  /* restore state */
                ret = async_error(INETP(desc), error);
                goto done;
            }
        }

结论: 细节决定品质.

How to Build a Debug Enabled Erlang RunTime System

May 7th, 2010 Yu Feng 1 comment

很多朋友在问如何调试Erlang的驱动代码等等,其实otp源码下的INSTALL.md写的很清楚, 摘抄下:
How to Build a Debug Enabled Erlang RunTime System
————————————————–

After completing all the normal building steps described above a debug
enabled runtime system can be built. To do this you have to change
directory to `$ERL_TOP/erts/emulator`.
注:一定要注意这句话, 假设你现在在otp源码目录下,正常编译好了, export ERL_TOP=`pwd` 然后进入erts/emulator目录下

In this directory execute:

$ make debug FLAVOR=$FLAVOR

where `$FLAVOR` is either `plain` or `smp`. The flavor options will
produce a beam.debug and beam.smp.debug executable respectively. The
files are installed along side with the normal (opt) versions `beam.smp`
and `beam`.

To start the debug enabled runtime system execute:

$ $ERL_TOP/bin/cerl -debug

The debug enabled runtime system features lock violation checking,
assert checking and various sanity checks to help a developer ensure
correctness. Some of these features can be enabled on a normal beam
using appropriate configure options.

There are other types of runtime systems that can be built as well
using the similar steps just described.

$ make $TYPE FLAVOR=$FLAVOR

where `$TYPE` is `opt`, `gcov`, `gprof`, `debug`, `valgrind`, or `lcnt`.
These different beam types are useful for debugging and profiling
purposes.

祝玩的开心!

Categories: Erlang探索 Tags: , ,

erts_modified_timings是如何起作用的

May 3rd, 2010 Yu Feng No comments

我们先看下man erl

+T Level:

Enables modified timing and sets the modified timing level. Currently valid range is 0-9. The timing of the runtime system will
change. A high level usually means a greater change than a low level. Changing the timing can be very useful for finding timing
related bugs.

Currently, modified timing affects the following:

Process spawning:
A process calling spawn, spawn_link, spawn_monitor, or spawn_opt will be scheduled out immediately after completing the
call. When higher modified timing levels are used, the caller will also sleep for a while after being scheduled out.

Context reductions:
The amount of reductions a process is a allowed to use before being scheduled out is increased or reduced.

Input reductions:
The amount of reductions performed before checking I/O is increased or reduced.

NOTE: Performance will suffer when modified timing is enabled. This flag is only intended for testing and debugging. Also note
that return_to and return_from trace messages will be lost when tracing on the spawn BIFs. This flag may be removed or changed
at any time without prior notice.

我们可以知道这个选项是用于修改VM的进程和IO调度的时间,以及延迟spawn的执行时间, 使得和时间相关的问题,容易得到暴露,达到发现问题的目的。 但是Level是如何对于参数的呢?下面这个表格能回答问题了.
erts/emulator/beam/erl_init.c

 132ErtsModifiedTimings erts_modified_timings[] = {
 133    /* 0 */     {make_small(0), CONTEXT_REDS, INPUT_REDUCTIONS},
 134    /* 1 */     {make_small(0), 2*CONTEXT_REDS, 2*INPUT_REDUCTIONS},
 135    /* 2 */     {make_small(0), CONTEXT_REDS/2, INPUT_REDUCTIONS/2},
 136    /* 3 */     {make_small(0), 3*CONTEXT_REDS, 3*INPUT_REDUCTIONS},
 137    /* 4 */     {make_small(0), CONTEXT_REDS/3, 3*INPUT_REDUCTIONS},
 138    /* 5 */     {make_small(0), 4*CONTEXT_REDS, INPUT_REDUCTIONS/2},
 139    /* 6 */     {make_small(1), CONTEXT_REDS/4, 2*INPUT_REDUCTIONS},
 140    /* 7 */     {make_small(1), 5*CONTEXT_REDS, INPUT_REDUCTIONS/3},
 141    /* 8 */     {make_small(10), CONTEXT_REDS/5, 3*INPUT_REDUCTIONS},
 142    /* 9 */     {make_small(10), 6*CONTEXT_REDS, INPUT_REDUCTIONS/4}
 143};
 

注意延时是以毫秒为单位的。

比如说: erl +T 8 那么spawn延时10ms, IO处理时间加大到3倍, 有利于快速处理IO事件。

Categories: Erlang探索 Tags: , ,

erts_ use_sender_punish未公开的特性

May 3rd, 2010 Yu Feng No comments

我们知道erlang的VM调度是根据reds的,每个进程初始的时候分配2000个reds, 一旦这个reds用完了,进程就被挂起,放到队列去排队,等待下一次调度。OTP R13B04下一个进程给另外一个进程发送消息,是需要扣除发送者一定的reds, 这样看起来更公平。因为古语说杀敌一千, 自损八百。 但是Erlang有个未公开的选项来避开这种情况:erl +snsp
我们来看代码:
emulator/beam/erl_init.c

1131 else if (sys_strcmp("nsp", sub_param) == 0)
1132                erts_use_sender_punish = 0;

emulator/beam/bif.c

  /* send to local process */
	erts_send_message(p, rp, &rp_locks, msg, 0);
        if (!erts_use_sender_punish)
            res = 0;
        else {
#ifdef ERTS_SMP
            res = rp->msg_inq.len*4;
            if (ERTS_PROC_LOCK_MAIN & rp_locks)
                res += rp->msg.len*4;
#else
            res = rp->msg.len*4;
#endif
        }

正常情况下 扣除的reds是 接收者队列长度的4倍。

结论: 我们如果需要批量发送的场合,可以使用这个选项。

Categories: Erlang探索 Tags: ,

Inside Erlang VM(你需要知道的VM原理)

April 26th, 2010 Yu Feng 4 comments

公司培训用的文档, 对于Erlang的VM会有个大体的认识, 方便设计和使用Erlang.
点解下载pdf格式的文档

Categories: Erlang探索 Tags: , ,

如何找出异常所在的行(新思路)

April 21st, 2010 Yu Feng 3 comments

在Erlang-china的邮件列表上看到这样的问题:

我的服务经常发生这样的错误,举例:
Error in process <0.33.0> with exit value: {badarg,[{erlang,'++',[undefined,[{"37"}]]},{groups,doWork,1},
{groups,doWork,1},{groups,manage_clients,1}]}

大意明白,但问题是我使用匹配机制时没考虑到多个函数”doWork/1″出错无法定位到其中一个,这该如何是好?
Erlang是否会像其它语言一样提示某一行出错?

这个问题确实很常见, Erlang的运行期没有给出出错的具体行数, 这给我们定位问题带来了很大的麻烦.

有先驱给出了这样的解决方案 http://mryufeng.javaeye.com/blog/368507 但是这个模块已经很老了, 过时不维护了.

这里我给出另外一个方案, 利用erlang现有的模块来实现的: cover + dbg

cover的工作原理可以参考这篇文章 http://mryufeng.javaeye.com/blog/482204.

原理就是cover编译过的模块会在每行执行前, 先执行ets:update_counter(cover_internal_data_table,{bump,Mod,Fun,1,1,Line},1) 来更新模块某行的执行次数.
那么我们只要截取 ets:update_counter这个动作, 我们就知道改模块最后的执行行, 也就是异常所在的行.

Ok, 原理介绍完毕, 上菜.

[root@centos ~]# cat line.erl

-module(line).
-export([dbg/1]).
-include_lib("stdlib/include/ms_transform.hrl").

dbg(Mod)->
    cover:compile(Mod),
    dbg:tracer(),
    dbg:p(all, [call]),
    dbg:tpl(ets,
            update_counter,
            dbg:fun2ms(fun([_,{bump,Mod,_,_,_,_},1]) ->
                               return_trace()
                       end)),
    ok.

[root@centos ~]# cat hello.erl

-module(hello).
-export([start/0]).

start()->
    a=a,
    A=2,
    C=3,
    A=C-1,
    C=A+1,
    io:format("hello world~n",[]),
    test(C),
    ok.

test(C)->
    A=4,
    A=C,  % Error is on this line.
    ok.

我们可以看到这个hello模块会在hello:test发生异常, A=C这个地方是具体位置. 现在让我们找到行号:

[root@centos ~]# erl
Erlang R13B02 (erts-5.7.3) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.3  (abort with ^G)
1> line:dbg(hello).
ok
2> hello:start().
hello world
** exception error: no match of right hand side value 3
     in function  hello:test/1
     in call from hello:start/0
4> (<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,5},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,6},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,7},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,8},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,9},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,10},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,start,0,1,11},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,test,1,1,16},1)
(<0.34.0>) returned from ets:update_counter/3 -> 1
(<0.34.0>) call ets:update_counter(cover_internal_data_table,{bump,hello,test,1,1,17},1)   %这里我们看到出错的行号
(<0.34.0>) returned from ets:update_counter/3 -> 1
3>
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
       (v)ersion (k)ill (D)b-tables (d)istribution

我们可以看到最后一次执行hello模块的行数是17.

Bingo!

Erlang源码汇编格式

April 16th, 2010 Yu Feng No comments

我们在编码的时候, 通常会好奇, 这时候需要观察erl源码生成的VM opcode. Erlang的VM是register based的VM, 生产的opcode很容易理解.
生成汇编格式有2种方式:
1. 从源码生成抽象码. erlc +”‘S’” mod.erl, 生成mod.S
2. 从beam生成Opcode. 未公开的功能. erts_debug:df 参数M或者 M, F, 生成mod.dis

来吧,实践下:

root@ubuntu:~/exam# ls
eg.erl
root@ubuntu:~/exam# erlc +"'S'" eg.erl
root@ubuntu:~/exam# erlc eg.erl
root@ubuntu:~/exam# erl
Erlang R14A (erts-5.8) [source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false] [lock-counting]

Eshell V5.8  (abort with ^G)
1>  erts_debug:df(eg).
ok
2>
User switch command
 --> q
root@ubuntu:~/exam# ls eg*
eg.beam  eg.dis  eg.erl  eg.S

我们得到了eg.S, eg.dis这2个反汇编的结果. 我们再来参观下.
先看源码:
root@ubuntu:~/exam# cat eg.erl

-module(eg).
-import(lists).
-import(lists,[sum/1]).
-compile(export_all).

kilo_byte() ->
    kilo_byte(10, [42]).
kilo_byte(0, Acc) ->
    Acc;
kilo_byte(N, Acc) ->
    kilo_byte(N-1, [Acc|Acc]).

loop()->
    sum(lists:seq(1,100)),
    loop().

中间汇编码, 供transform进行处理和编译器进一步生成opcode.
root@ubuntu:~/exam# cat eg.S

{module, eg}.  %% version = 0

{exports, [{kilo_byte,0},
           {kilo_byte,2},
           {loop,0},
           {module_info,0},
           {module_info,1}]}.

{attributes, []}.

{labels, 12}.

{function, kilo_byte, 0, 2}.
  {label,1}.
    {func_info,{atom,eg},{atom,kilo_byte},0}.
  {label,2}.
    {move,{literal,"*"},{x,1}}.
    {move,{integer,10},{x,0}}.
    {call_only,2,{f,4}}.

{function, kilo_byte, 2, 4}.
  {label,3}.
    {func_info,{atom,eg},{atom,kilo_byte},2}.
  {label,4}.
    {test,is_eq_exact,{f,5},[{x,0},{integer,0}]}.
    {move,{x,1},{x,0}}.
    return.
  {label,5}.
    {gc_bif,'-',{f,0},2,[{x,0},{integer,1}],{x,0}}.
    {test_heap,2,2}.
    {put_list,{x,1},{x,1},{x,1}}.
    {call_only,2,{f,4}}.

{function, loop, 0, 7}.
  {label,6}.
    {func_info,{atom,eg},{atom,loop},0}.
  {label,7}.
    {allocate,0,0}.
    {move,{integer,100},{x,1}}.
    {move,{integer,1},{x,0}}.
    {call_ext,2,{extfunc,lists,seq,2}}.
    {call_ext,1,{extfunc,lists,sum,1}}.
    {call_last,0,{f,7},0}.

{function, module_info, 0, 9}.
  {label,8}.
    {func_info,{atom,eg},{atom,module_info},0}.
  {label,9}.
    {move,{atom,eg},{x,0}}.
    {call_ext_only,1,{extfunc,erlang,get_module_info,1}}.

{function, module_info, 1, 11}.
  {label,10}.
    {func_info,{atom,eg},{atom,module_info},1}.
  {label,11}.
    {move,{x,0},{x,1}}.
    {move,{atom,eg},{x,0}}.
    {call_ext_only,2,{extfunc,erlang,get_module_info,2}}.

VM opcode形式, VM就是来解释运行这些code的

root@ubuntu:~/exam# cat eg.dis
B5146074: i_func_info_IaaI 0 eg kilo_byte 0
B5146088: move_cx "*" x(1)
B5146094: i_move_call_only_fcr eg:kilo_byte/2 10 x(0) 

B51460A0: i_func_info_IaaI 0 eg kilo_byte 2
B51460B4: i_is_eq_immed_frc f(B51460C8) x(0) 0
B51460C0: move_return_xr x(1) x(0)
B51460C8: i_fetch_rc x(0) 1
B51460D0: i_minus_jId j(00000000) 2 x(0)
B51460E0: test_heap_II 2 2
B51460EC: put_list_xxx x(1) x(1) x(1)
B51460F4: i_call_only_f eg:kilo_byte/2 

B51460FC: i_func_info_IaaI 0 eg loop 0
B5146110: allocate_tt 0 0
B5146118: move_cx 100 x(1)
B5146124: i_move_call_ext_cre 1 x(0) lists:seq/2
B5146130: i_call_ext_e lists:sum/1
B5146138: i_call_last_fP eg:loop/0 0 

B5146144: i_func_info_IaaI 0 eg module_info 0
B5146158: move_cr eg x(0)
B5146160: allocate_tt 0 1
B5146168: call_bif1_e erlang:get_module_info/1
B5146170: deallocate_return_P 0 

B5146178: i_func_info_IaaI 0 eg module_info 1
B514618C: move_rx x(0) x(1)
B5146194: move_cr eg x(0)
B514619C: allocate_tt 0 2
B51461A4: call_bif2_e erlang:get_module_info/2
B51461AC: deallocate_return_P 0

收工!