Clarifying "An extremely idiotic bug in Erlang supervisor"
Original article. If you repost it, please credit: reposted from 系统技术非业余研究
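On 2008-05-26 the well-known Trustno1 published this post, http://www.iteye.com/topic/197097, complaining about an extremely idiotic bug in the Erlang supervisor.
Today @淘李福 brought the matter up again:
I dug up an old post: http://www.iteye.com/topic/197097
We are on R14 now and the code is still the same. I wonder whether we simply misunderstood it: shutdown counts as a normal exit.
Comments on that post are closed, so I will clarify here: this is not a bug!
A few days ago I reread the code of init.erl. The behavior is a deliberate design: when the system runs init:stop, it gives the kernel processes, including the supervisor tree, a chance to exit cleanly.
Let's look at the code behind init:stop: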
erts/preloaded/src/init.erl
...
%%% -------------------------------------------------
%%% Stop the system.
%%% Reason is: restart | reboot | stop
%%% According to reason terminate emulator or restart
%%% system using the same init process again.
%%% -------------------------------------------------

stop(Reason,State) ->
    BootPid = State#state.bootpid,
    {_,Progress} = State#state.status,
    State1 = State#state{status = {stopping, Progress}},
    clear_system(BootPid,State1),
    do_stop(Reason,State1).

do_stop(restart,#state{start = Start, flags = Flags, args = Args}) ->
    boot(Start,Flags,Args);
do_stop(reboot,_) ->
    halt();
do_stop(stop,State) ->
    stop_heart(State),
    halt();
do_stop({stop,Status},State) ->
    stop_heart(State),
    halt(Status).

clear_system(BootPid,State) ->
    Heart = get_heart(State#state.kernel),
    shutdown_pids(Heart,BootPid,State),
    unload(Heart).

stop_heart(State) ->
    case get_heart(State#state.kernel) of
        false ->
            ok;
        Pid ->
            %% As heart survives a restart the Parent of heart is init.
            BootPid = self(),
            %% ignore timeout
            shutdown_kernel_pid(Pid, BootPid, self(), State)
    end.

shutdown_pids(Heart,BootPid,State) ->
    Timer = shutdown_timer(State#state.flags),
    catch shutdown(State#state.kernel,BootPid,Timer,State),
    kill_all_pids(Heart), % Even the shutdown timer.
    kill_all_ports(Heart),
    flush_timout(Timer).

get_heart([{heart,Pid}|_Kernel]) -> Pid;
get_heart([_|Kernel]) -> get_heart(Kernel);
get_heart(_) -> false.

shutdown([{heart,_Pid}|Kernel],BootPid,Timer,State) ->
    shutdown(Kernel, BootPid, Timer, State);
shutdown([{_Name,Pid}|Kernel],BootPid,Timer,State) ->
    shutdown_kernel_pid(Pid, BootPid, Timer, State),
    shutdown(Kernel,BootPid,Timer,State);
shutdown(_,_,_,_) ->
    true.

%%
%% A kernel pid must handle the special case message
%% {'EXIT',Parent,Reason} and terminate upon it!
%%
shutdown_kernel_pid(Pid, BootPid, Timer, State) ->
    Pid ! {'EXIT',BootPid,shutdown},
    shutdown_loop(Pid, Timer, State, []).
...
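As shutdown_pids shows, the system first asks each kernel process to terminate by sending it Pid ! {'EXIT',BootPid,shutdown} (this is what shutdown_kernel_pid does), and only afterwards kills whatever is still alive with exit(Pid,kill) in kill_all_pids.
This sentence is the key point: A kernel pid must handle the special case message and terminate upon it!
So what is a kernel process?
Take a look at bin/start.script: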
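To make that contract concrete, here is a minimal sketch of a process that honors it. The module name kernel_like_demo and the functions loop/1 and run/0 are hypothetical and are not part of OTP; only the message shape {'EXIT',Parent,shutdown} is taken from shutdown_kernel_pid above. Note that init sends an ordinary message, not an exit signal, so the receiver has to match on it explicitly.

```erlang
-module(kernel_like_demo).
-export([run/0]).

%% Hypothetical stand-in for a kernel process: it keeps looping until
%% its parent sends the plain message {'EXIT', Parent, Reason}.
loop(Parent) ->
    receive
        {'EXIT', Parent, Reason} ->
            %% A real process would clean up here before terminating.
            exit(Reason);
        _Other ->
            loop(Parent)
    end.

%% Plays the role of init: sends the shutdown message and waits for
%% the process to go down. Returns the exit reason, or timeout.
run() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() -> loop(Parent) end),
    %% Same shape as in shutdown_kernel_pid/4: a message, not a signal.
    Pid ! {'EXIT', Parent, shutdown},
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.
```

Calling kernel_like_demo:run() should return the atom shutdown: the process terminated on its own, with the reason it was handed, instead of being killed.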
...
{kernelProcess,heart,{heart,start,[]}},
{kernelProcess,error_logger,{error_logger,start_link,[]}},
{kernelProcess,application_controller,
    {application_controller,start,
        [{application,kernel,
...
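Every process tagged kernelProcess is a kernel process, and the important one here is application_controller!
With that, we can make good sense of these two clauses in supervisor.erl: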
do_restart(_, shutdown, Child, State) ->
    NState = state_del_child(Child, State),
    {ok, NState};
do_restart(transient, Reason, Child, State) ->
    report_error(child_terminated, Reason, Child, State#state.name),
    restart(Child, State);
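The effect of these clauses can be observed with a small experiment. The sketch below is my own illustration, not code from OTP: the module transient_demo, its worker, and the helpers child_pid/1 and wait_change/3 are hypothetical. It starts one transient child, lets it exit with reason shutdown, and then lets a fresh child exit with the abnormal reason crash.

```erlang
-module(transient_demo).
-behaviour(supervisor).
-export([run/0, init/1, start_child/0]).

%% Hypothetical worker: waits for {stop, Reason} and exits with it.
start_child() ->
    Pid = spawn_link(fun() -> receive {stop, R} -> exit(R) end end),
    {ok, Pid}.

%% One transient child under a one_for_one supervisor.
init([]) ->
    Child = {worker, {?MODULE, start_child, []},
             transient, 1000, worker, [?MODULE]},
    {ok, {{one_for_one, 5, 10}, [Child]}}.

%% Pid field the supervisor currently reports for the child.
child_pid(Sup) ->
    [{worker, Pid, _Type, _Mods}] = supervisor:which_children(Sup),
    Pid.

%% Poll until the supervisor reports something other than Old.
wait_change(_Sup, _Old, 0) ->
    timeout;
wait_change(Sup, Old, N) ->
    case child_pid(Sup) of
        Old -> timer:sleep(10), wait_change(Sup, Old, N - 1);
        New -> New
    end.

run() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    Pid1 = child_pid(Sup),
    Pid1 ! {stop, shutdown},              % exit with reason shutdown
    AfterShutdown = wait_change(Sup, Pid1, 100),
    {ok, Pid2} = supervisor:restart_child(Sup, worker),
    Pid2 ! {stop, crash},                 % abnormal exit
    AfterCrash = wait_change(Sup, Pid2, 100),
    ok = gen_server:stop(Sup),
    {AfterShutdown, is_pid(AfterCrash)}.
```

On a recent OTP release I would expect transient_demo:run() to return {undefined, true}: after the shutdown exit the child spec stays but no process is restarted, while the crash is reported and the child is restarted under a new pid. gen_server:stop/1 is used only to tear the demo down and needs OTP 18 or later.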
Summary: don't be quick to doubt others, especially a system that has been running for more than 20 years!
Have fun!
We should still doubt, otherwise we would never get to read your excellent explanation.
Yu Feng Reply:
July 4th, 2011 at 10:10 pm
What I meant is that we should first suspect that we ourselves misunderstood, :(