Clarifying the "Extremely Idiotic Erlang supervisor Bug"

Original article. If you repost it, please credit: reposted from 系统技术非业余研究

On 2008-05-26 the well-known Trustno1 published this post, http://www.iteye.com/topic/197097, complaining about an "extremely idiotic" bug in the Erlang supervisor.

Today @淘李福 brought the matter up again:

I dug up an old post: http://www.iteye.com/topic/197097
We are on R14 now and the code is still the same. I wonder whether we misunderstood it: shutdown counts as a normal exit.

Comments on that post are closed, so I am clarifying here: this is not a bug!

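A few days ago I reread the code of init.erl. The behavior is a deliberate design: when the system runs init:stop, it gives the kernel processes, including the supervisor tree, a chance to exit cleanly.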

Let's look at the code behind init:stop:
erts/preloaded/src/init.erl

...
%%% -------------------------------------------------
%%% Stop the system.
%%% Reason is: restart | reboot | stop
%%% According to reason terminate emulator or restart
%%% system using the same init process again.
%%% -------------------------------------------------

stop(Reason,State) ->
    BootPid = State#state.bootpid,
    {_,Progress} = State#state.status,
    State1 = State#state{status = {stopping, Progress}},
    clear_system(BootPid,State1),
    do_stop(Reason,State1).

do_stop(restart,#state{start = Start, flags = Flags, args = Args}) ->
    boot(Start,Flags,Args);
do_stop(reboot,_) ->
    halt();
do_stop(stop,State) ->
    stop_heart(State),
    halt();
do_stop({stop,Status},State) ->
    stop_heart(State),
    halt(Status).

clear_system(BootPid,State) ->
    Heart = get_heart(State#state.kernel),
    shutdown_pids(Heart,BootPid,State),
    unload(Heart).
stop_heart(State) ->
    case get_heart(State#state.kernel) of
        false ->
            ok;
        Pid ->
            %% As heart survives a restart the Parent of heart is init.
            BootPid = self(),
            %% ignore timeout
            shutdown_kernel_pid(Pid, BootPid, self(), State)
    end.

shutdown_pids(Heart,BootPid,State) ->
    Timer = shutdown_timer(State#state.flags),
    catch shutdown(State#state.kernel,BootPid,Timer,State),
    kill_all_pids(Heart), % Even the shutdown timer.
    kill_all_ports(Heart),
    flush_timout(Timer).

get_heart([{heart,Pid}|_Kernel]) -> Pid;
get_heart([_|Kernel])           -> get_heart(Kernel);
get_heart(_)                    -> false.

shutdown([{heart,_Pid}|Kernel],BootPid,Timer,State) ->
    shutdown(Kernel, BootPid, Timer, State);
shutdown([{_Name,Pid}|Kernel],BootPid,Timer,State) ->
    shutdown_kernel_pid(Pid, BootPid, Timer, State),
    shutdown(Kernel,BootPid,Timer,State);
shutdown(_,_,_,_) ->
    true.


%%
%% A kernel pid must handle the special case message
%% {'EXIT',Parent,Reason} and terminate upon it!
%%
shutdown_kernel_pid(Pid, BootPid, Timer, State) ->
    Pid ! {'EXIT',BootPid,shutdown},
    shutdown_loop(Pid, Timer, State, []).

...

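As shutdown_pids shows, the system first asks each kernel process to terminate by sending it the plain message Pid ! {'EXIT',BootPid,shutdown} (this is what shutdown_kernel_pid does), and only afterwards uses exit(Pid,kill) to kill all the remaining, non-kernel processes.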
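To make that contract concrete, here is a minimal sketch of a process that behaves the way init expects a kernel process to behave. The module name kernel_pid_demo and the functions loop/1 and run/0 are made up for illustration and are not part of OTP. The test process plays the role of init and sends the same plain message that shutdown_kernel_pid sends.

```erlang
-module(kernel_pid_demo).
-export([run/0]).

%% Hypothetical "kernel-style" process: it understands the special
%% {'EXIT', Parent, Reason} message and terminates when it arrives.
loop(Parent) ->
    receive
        {'EXIT', Parent, Reason} ->
            %% Any cleanup would go here, before terminating.
            exit(Reason);
        _Other ->
            loop(Parent)
    end.

run() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
                                   process_flag(trap_exit, true),
                                   loop(Parent)
                               end),
    %% Same shape as what init sends: a plain message, not an exit signal.
    Pid ! {'EXIT', Parent, shutdown},
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.
```

Calling kernel_pid_demo:run() returns the reason the process terminated with, which here is shutdown. A process that ignored the message would simply be killed later together with the non-kernel processes.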

This sentence is the key point: A kernel pid must handle the special case message and terminate upon it!
So what is a kernel process?

Take a look at bin/start.script:

...
{kernelProcess,heart,{heart,start,[]}},
     {kernelProcess,error_logger,{error_logger,start_link,[]}},
     {kernelProcess,application_controller,
         {application_controller,start,
             [{application,kernel,
...

Every process tagged kernelProcess is a kernel process, most notably application_controller!

With that in mind, these two clauses in supervisor.erl make perfect sense:

do_restart(_, shutdown, Child, State) ->   
    NState = state_del_child(Child, State),   
    {ok, NState};   
do_restart(transient, Reason, Child, State) ->   
    report_error(child_terminated, Reason, Child, State#state.name),   
    restart(Child, State);  
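The effect of these clauses can be observed with a small experiment. This is only a sketch: the module shutdown_demo, its worker, and the 100 ms sleeps are assumptions made for illustration. It uses a transient child, which should not be restarted after a shutdown exit either under the R14 code quoted above or under the documented behavior of current OTP releases.

```erlang
-module(shutdown_demo).
-behaviour(supervisor).
-export([run/0, init/1, start_worker/0]).

%% Hypothetical worker: waits to be told which reason to exit with.
start_worker() ->
    Pid = spawn_link(fun() ->
                         receive {exit_with, Reason} -> exit(Reason) end
                     end),
    {ok, Pid}.

%% One transient child under a one_for_one supervisor.
init([]) ->
    Child = {worker, {?MODULE, start_worker, []},
             transient, 1000, worker, [?MODULE]},
    {ok, {{one_for_one, 5, 10}, [Child]}}.

worker_pid(Sup) ->
    [{worker, Pid, _Type, _Mods}] = supervisor:which_children(Sup),
    Pid.

run() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    Pid1 = worker_pid(Sup),
    Pid1 ! {exit_with, crash},        % abnormal exit: child is restarted
    timer:sleep(100),                 % crude wait for the supervisor to react
    Pid2 = worker_pid(Sup),
    Pid2 ! {exit_with, shutdown},     % shutdown exit: child is not restarted
    timer:sleep(100),
    Pid3 = worker_pid(Sup),
    %% The supervisor is left running; this is only a demo.
    {is_pid(Pid2) andalso Pid2 =/= Pid1, Pid3}.
```

Under those assumptions shutdown_demo:run() should return {true, undefined}: the crash produced a fresh pid, while after the shutdown exit the supervisor keeps the child spec but records no running process. The crash also causes an error report to be logged, which is expected.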

Takeaway: don't be quick to doubt other people's work, especially a system that has been running for more than 20 years!

Have fun!

  1. langxianzhe
    July 4th, 2011 at 22:09 | #1

    We should still doubt things, otherwise we would never get to read your excellent explanations.

    Yu Feng Reply:

    What I meant is that we should first suspect that we ourselves have misunderstood. :(
