Archive for August, 2013

Thoughts on Server Time Correction

August 30th, 2013 3 comments

Original article. When reposting, please credit: reposted from 系统技术非业余研究

Link to this article: Thoughts on Server Time Correction

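Most network service servers rely heavily on time: state machines, timeouts, and timestamps of every kind. If the machine's wall clock time suddenly jumps, most servers that have no special handling for it will fall over. Well-built server programs such as erlang and nginx have a time correction mechanism. I won't ramble on here; I'll simply quote Erlang's time correction documentation, which is very well written: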

2 Time and time correction in Erlang

Time is vital to an Erlang program and, more importantly, correct time is vital to an Erlang program. As Erlang is a language with soft real time properties and we have the possibility to express time in our programs, the Virtual Machine and the language have to be very careful about what is considered a correct point in time and in how time functions behave.

In the beginning, Erlang was constructed assuming that the wall clock time in the system showed a monotonic time moving forward at exactly the same pace as the definition of time. That more or less meant that an atomic clock (or better) was expected to be attached to your hardware and that the hardware was then expected to be locked away from any human (or unearthly) tinkering for all eternity. While this might be a compelling thought, it’s simply never the case.

A “normal” modern computer cannot keep time. Not on its own and not unless you actually have a chip level atomic clock wired to it. Time, as perceived by your computer, will normally need to be corrected. Hence the NTP protocol that together with the ntpd process will do its best to keep your computer’s time in sync with the “real” time in the universe. Between NTP corrections, usually a less potent time-keeper than an atomic clock is used.

But NTP is not fail safe. The NTP server can be unavailable, the ntp.conf can be wrongly configured or your computer may from time to time be disconnected from the internet. Furthermore you can have a user (or even system administrator) on your system that thinks the right way to handle daylight saving time is to adjust the clock one hour two times a year (a tip, that is not the right way to do it…). To further complicate things, this user fetched your software from the internet and has never ever thought about what’s the correct time as perceived by a computer. The user simply does not care about keeping the wall clock in sync with the rest of the universe. The user expects your program to have omnipotent knowledge about the time.

Most programmers also expect time to be reliable, at least until they realize that the wall clock time on their workstation is off by a minute. Then they simply set it to the correct time, maybe or maybe not in a smooth way. Most probably not in a smooth way.

The number of problems that arise when you expect the wall clock time on the system to always be correct may be immense. Therefore Erlang introduced the “corrected estimate of time”, or the “time correction” many years ago. The time correction relies on the fact that most operating systems have some kind of monotonic clock, either a real time extension or some built in “tick counter” that is independent of the wall clock settings. This counter may have microsecond resolution or much less, but generally it has a drift that is not to be ignored.

So we have this monotonic ticking and we have the wall clock time. Two unreliable times that together can give us an estimate of an actual wall clock time that does not jump around and that monotonically moves forward. If the tick counter has a high resolution, this is fairly easy to do; if the counter has a low resolution, it’s more expensive, but still doable down to frequencies of 50-60 Hz (of the tick counter).

So the corrected time is the nearest approximation of an atomic clock that is available on the computer. We want it to have the following properties:

Monotonic: The clock should not move backwards.
Intervals should be near the truth: We want the actual time (as measured by an atomic clock or an astronomer) that passes between two time stamps, T1 and T2, to be as near to T2 - T1 as possible.
Tight coupling to the wall clock: We want a timer that is to be fired when the wall clock reaches a time in the future, to fire as near to that point in time as possible.

To meet all the criteria, we have to utilize both times in such a way that Erlang’s “corrected time” moves slightly slower or slightly faster than the wall clock to get in sync with it. The word “slightly” means a maximum of 1% difference to the wall clock time, meaning that a sudden change in the wall clock of one minute takes 100 minutes to fix, by letting all “corrected time” move 1% slower or faster.
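To make the 1% figure concrete, here is a minimal sketch of the arithmetic only. The module and function names are made up for illustration, and this is not how the VM implements time correction; it just restates the proportion described above.

```erlang
-module(slew_sketch).
-export([catch_up_seconds/1, catch_up_seconds/2]).

%% Hypothetical helper: seconds needed to absorb a wall clock jump when
%% corrected time may run at most 1% faster or slower than the wall clock.
catch_up_seconds(JumpSeconds) ->
    catch_up_seconds(JumpSeconds, 1).

%% SlewPercent is the maximum allowed rate difference, in whole percent.
catch_up_seconds(JumpSeconds, SlewPercent)
  when is_integer(JumpSeconds), is_integer(SlewPercent), SlewPercent > 0 ->
    abs(JumpSeconds) * 100 div SlewPercent.
```

slew_sketch:catch_up_seconds(60) returns 6000 seconds, which is the 100 minutes quoted in the text for a one-minute jump.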

Needless to say, correcting for a faulty handling of daylight saving time may be disturbing to a user comparing wall clock time to for example calendar:now_to_local_time(erlang:now()). But calendar:now_to_local_time/1 is not supposed to be used for presenting wall clock time to the user.

Time correction is not perfect, but it saves you from the havoc of clocks jumping around, which would make timers in your program fire far too late or far too early and could bring your whole system to its knees (or worse) just because someone detected a small error in the wall clock time of the server where your program runs. So while it might be confusing, it is still a really good feature of Erlang and you should not throw it away using time functions which may give you higher benchmark results, not unless you really know what you’re doing.

2.1 What does time correction mean in my system?

Time correction means that Erlang estimates a time from current and previous settings of the wall clock, and it uses a fairly exact tick counter to detect when the wall clock time has jumped for some reason, slowly adjusting to the new value.

In practice, this means that the difference between two calls to time corrected functions, like erlang:now(), might differ up to one percent from the corresponding calls to non time corrected functions (like os:timestamp()). Furthermore, if comparing calendar:local_time/0 to calendar:now_to_local_time(erlang:now()), you might temporarily see a difference, depending on how well kept your system is.

It is important to understand that it is (to the program) always unknown if it is the wall clock time that moves in the wrong pace or the Erlang corrected time. The only way to determine that, is to have an external source of universally correct time. If some such source is available, the wall clock time can be kept nearly perfect at all times, and no significant difference will be detected between erlang:now/0’s pace and the wall clock’s.

Still, the time correction will mean that your system keeps its real time characteristics very well, even when the wall clock is unreliable.

2.2 Where does Erlang use corrected time?

For all functionality where real time characteristics are desirable, time correction is used. This basically means:

erlang:now/0: The infamous erlang:now/0 function uses time correction so that differences between two “now-timestamps” will correspond to other timeouts in the system. erlang:now/0 also holds other properties, discussed later.
receive … after: Timeouts on receive use time correction to determine a stable timeout interval.
The timer module: As the timer module uses other built in functions which deliver corrected time, the timer module itself works with corrected time.
erlang:start_timer/3 and erlang:send_after/3: The timer BIFs work with corrected time, so that they will not fire prematurely or too late due to changes in the wall clock time.

All other functionality in the system where erlang:now/0 or any other time corrected functionality is used, will of course automatically benefit from it, as long as it’s not “optimized” to use some other time stamp function (like os:timestamp/0).

Modules like calendar and functions like erlang:localtime/0 use the wall clock time as it is currently set on the system. They will not use corrected time. However, if you use a now-value and convert it to local time, you will get a corrected local time value, which may or may not be what you want. Typically older code tends to use erlang:now/0 as a wall clock time, which is usually correct (at least when testing), but might surprise you when compared to other times in the system.

2.3 What is erlang:now/0 really?

erlang:now/0 is a function designed to serve multiple purposes (or a multi-headed beast if you’re a VM designer). It is expected to hold the following properties:

Monotonic: erlang:now() never jumps backwards; it always moves forward.
Interval correct: The interval between two erlang:now() calls is expected to correspond to the correct time in real life (as defined by an atomic clock, or better).
Absolute correctness: The erlang:now/0 value should be possible to convert to an absolute and correct date-time, corresponding to the real world date and time (the wall clock).
System correspondence: The erlang:now/0 value converted to a date-time is expected to correspond to times given by other programs on the system (or by functions like os:timestamp/0).
Unique: No two calls to erlang:now/0 on one Erlang node should return the same value.

All these requirements are possible to uphold at the same time if (and only if):

The wall clock time of the system is perfect: The system (Operating System) time needs to be perfectly in sync with the actual time as defined by an atomic clock or a better time source. A good installation using NTP, and that is up to date before Erlang starts, will have properties that for most users and programs will be near indistinguishable from the perfect time. Note that any larger corrections to the time done by hand, or after Erlang has started, will partly (or temporarily) invalidate some of the properties, as the time is no longer perfect.
Less than one call per microsecond to erlang:now/0 is done: This means that at any microsecond interval in time, there can be no more than one call to erlang:now/0 in the system. However, for the system not to lose its properties completely, it’s enough that it on average is no more than one call per microsecond (in one Erlang node).

The uniqueness property of erlang:now/0 is the most limiting property. It means that erlang:now() maintains a global state and that there is a hard-to-check property of the system that needs to be maintained. For most applications this is still not a problem, but a future system might very well manage to violate the frequency limit on the calls globally. The uniqueness property is also quite useless, as there are globally unique references that provide a much better unique value to programs. However the property will need to be maintained unless a really subtle backward compatibility issue is to be introduced.
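The uniqueness and monotonicity properties are easy to poke at from code. The sketch below is only an illustration (the module and function names are mine): it checks that a batch of erlang:now() values is strictly increasing, and that make_ref/0 yields distinct references, which is the alternative the text points to. Note that erlang:now/0 is deprecated in later OTP releases, hence the compile option to silence the warning.

```erlang
-module(unique_sketch).
-compile(nowarn_deprecated_function).
-export([strictly_increasing/1, unique_refs/1]).

%% Collect N now-timestamps and check that they are already sorted and
%% free of duplicates, i.e. strictly increasing.
strictly_increasing(N) ->
    Stamps = [erlang:now() || _ <- lists:seq(1, N)],
    Stamps =:= lists:usort(Stamps).

%% Collect N references and check that no two of them are equal.
unique_refs(N) ->
    Refs = [make_ref() || _ <- lists:seq(1, N)],
    length(lists:usort(Refs)) =:= N.
```

If all you need is a unique value, make_ref/0 avoids the global state that erlang:now/0 has to maintain.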

2.4 Should I use erlang:now/0 or os:timestamp/0

The simple answer is to use erlang:now/0 for everything where you want to keep real time characteristics, but use os:timestamp/0 for things like logs, user communication and debugging (typically timer:tc uses os:timestamp/0, as it is a test tool, not a real world application API). The benefit of using os:timestamp/0 is that it’s faster and does not involve any global state (unless the operating system has one). The downside is that it will be vulnerable to wall clock time changes.
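As a rough sketch of that advice, the two helpers below measure how long a fun takes, once with erlang:now/0 and once with os:timestamp/0. The module and function names are hypothetical. Keep in mind that this reflects the API described in the quoted documentation; erlang:now/0 is deprecated in later OTP releases, which offer erlang:monotonic_time/0 for interval measurement.

```erlang
-module(elapsed_sketch).
-compile(nowarn_deprecated_function).
-export([elapsed_corrected_us/1, elapsed_os_us/1]).

%% Interval measured with erlang:now/0, so it follows Erlang's own
%% notion of time rather than the raw wall clock.
elapsed_corrected_us(Fun) ->
    T1 = erlang:now(),
    _ = Fun(),
    T2 = erlang:now(),
    timer:now_diff(T2, T1).

%% Interval measured with os:timestamp/0: cheaper, no global state,
%% but a wall clock change during Fun() distorts the result.
elapsed_os_us(Fun) ->
    T1 = os:timestamp(),
    _ = Fun(),
    T2 = os:timestamp(),
    timer:now_diff(T2, T1).
```

Both return microseconds; on a well-kept machine the two numbers should be close, and they diverge only when the wall clock is adjusted while the fun runs.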

2.5 Turning off time correction

If, for some reason, time correction causes trouble and you are absolutely confident that the wall clock on the system is nearly perfect, you can turn off time correction completely by giving the +c option to erl. The probability of this being a good idea is very low.

Have fun!

Categories: Erlang探索 (Erlang Exploration)

Application Configuration Files and Hot Upgrades

August 29th, 2013 No comments


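As we have said before, Erlang organizes programs, data, configuration, and related information in units of applications (apps), bundling them together into a single whole, a design very much like that of a Unix system. So where is an app's configuration stored?

Configuration can be supplied in three ways (actually four):
1. The env key in the .app file, usually MyApplication.app; see the documentation for details.
2. A .config file, usually sys.config; see the documentation for details.
3. The command line: erl -ApplName Par1 Val1 … ParN ValN; see the documentation for details.

The important parts are excerpted below.
Method 1: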

7.8 Configuring an Application

An application can be configured using configuration parameters. These are a list of {Par, Val} tuples specified by a key env in the .app file.

{application, ch_app,
 [{description, "Channel allocator"},
  {vsn, "1"},
  {modules, [ch_app, ch_sup, ch3]},
  {registered, [ch3]},
  {applications, [kernel, stdlib, sasl]},
  {mod, {ch_app,[]}},
  {env, [{file, "/usr/local/log"}]}
 ]}.
Par should be an atom, Val is any term. The application can retrieve the value of a configuration parameter by calling application:get_env(App, Par) or a number of similar functions, see application(3)

Method 2:

A configuration file contains values for configuration parameters for the applications in the system. The erl command line argument -config Name tells the system to use data in the system configuration file Name.config.

Configuration parameter values in the configuration file will override the values in the application resource files (see app(4)). The values in the configuration file can be overridden by command line flags (see erl(1)).

The value of a configuration parameter is retrieved by calling application:get_env/1,2.

Method 3:

The values in the .app file, as well as the values in a system configuration file, can be overridden directly from the command line:

% erl -ApplName Par1 Val1 … ParN ValN

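All three methods make it easy to set an application's configuration. Because an application depends on many other applications, there ends up being a lot of configuration, so I recommend the sys.config approach, which is also the standard way rebar organizes configuration files.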
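Here is a small sketch of reading such a parameter from code. It reuses the ch_app and file names from the quoted .app example; the module name is made up, and application:set_env/3 merely stands in for what the .app file, sys.config, or the command line would normally provide.

```erlang
-module(config_sketch).
-export([demo/0]).

%% A sys.config with the same effect would contain:
%%   [{ch_app, [{file, "/usr/local/log"}]}].
%% and be loaded with: erl -config sys
demo() ->
    %% Stand-in for the configuration sources described above.
    ok = application:set_env(ch_app, file, "/usr/local/log"),
    %% get_env/2 returns {ok, Val} when the parameter is set ...
    {ok, File} = application:get_env(ch_app, file),
    %% ... and undefined when it is not.
    undefined = application:get_env(ch_app, no_such_param),
    %% get_env/3 takes a default instead of returning undefined.
    Default = application:get_env(ch_app, no_such_param, 42),
    {File, Default}.
```

Calling config_sketch:demo() returns {"/usr/local/log", 42}.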

File-Reading Performance Showdown: Erlang vs. Other Languages

August 28th, 2013 25 comments


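A reader, 百岁, wrote:

Today my company held a technical competition. The task was to take a 1.1 GB text file and find the ten most frequent words in it. I spent two days writing the code in Erlang, but when I ran it on the company's 16-core machine the comparison came as a shock. My Erlang code took about 55 seconds. My colleagues used Java, C, and C++; the fastest C entry took only 2.6 seconds, and the Java one only 3.2 seconds. Next to those, my Erlang code was a giant snail, far too slow.

For details, see this post: http://www.iteye.com/topic/1131748

Reading a file and analyzing it is something scripting languages such as perl, python, and ruby do all the time. This reader's problem is a common one; more than one person has complained about the slowness.
Today we'll revisit that impression and let the data speak.

First, let's prepare the file. It consists entirely of random data and is 1 GB in size: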

$ dd if=/dev/urandom  of=test.dat count=1024 bs=1024K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 188.474 s, 5.7 MB/s
$ time dd if=test.dat of=/dev/null 
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB) copied, 1.16021 s, 925 MB/s

real    0m1.162s
user    0m0.219s
sys     0m0.941s
$ time dd if=test.dat of=/dev/null bs=1024k
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 0.264298 s, 4.1 GB/s

real    0m0.266s
user    0m0.000s
sys     0m0.267s

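We prepared a file of about 1 GB. Because buffered I/O is used, once the file has been written all of its data sits in the page cache, so as long as there is enough memory the results of this test do not depend on the I/O device. Reading the file with dd at its default block size (512 bytes, as the 2097152 records in the output show) took 1.16 seconds, while with a 1M block size it took 0.26 seconds, a bandwidth of 4.1 GB/s, far beyond what a real device can deliver.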

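Now let's read the same file with Erlang for comparison. There are three ways to read it:
1. Read the whole 1 GB file in one go.
2. One thread reads one block at a time, say 1M, until the file is done.
3. Several threads read, each taking a large segment and reading it in 1M blocks.
Then we compare the performance.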

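First, some background:
1. Erlang's file I/O is provided by the efile driver. The driver has an internal thread pool whose size is controlled by the +A option, so I/O is carried out by multiple threads.
2. Erlang files have two modes: raw mode and io mode. In raw mode the driver hands the data directly to the calling process; in io mode the data first goes through file_server for formatting and is then passed to the calling process.
3. Data can be returned as a binary or as a list. In list form each byte of the file content is an integer, which takes 8 bytes of memory on a 64-bit machine.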
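The first two reading strategies can be sketched as follows, using raw mode and binaries as described in the background notes. This is my own minimal illustration, not the benchmark code from the full post; the module and function names are hypothetical, and each function just returns the number of bytes read.

```erlang
-module(read_sketch).
-export([read_whole/1, read_blocks/2]).

%% Strategy 1: read the entire file into one binary.
read_whole(Path) ->
    {ok, Bin} = file:read_file(Path),
    byte_size(Bin).

%% Strategy 2: one process reads the file block by block,
%% in raw mode with binary results.
read_blocks(Path, BlockSize) ->
    {ok, Fd} = file:open(Path, [read, raw, binary]),
    try
        loop(Fd, BlockSize, 0)
    after
        file:close(Fd)
    end.

loop(Fd, BlockSize, Acc) ->
    case file:read(Fd, BlockSize) of
        {ok, Data} -> loop(Fd, BlockSize, Acc + byte_size(Data));
        eof -> Acc
    end.
```

For the test file above, read_sketch:read_blocks("test.dat", 1024 * 1024) would mirror the dd run with bs=1024k; wrapping the call in timer:tc/3 gives a timing to compare against.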

Key Erlang Environment Variables

August 23rd, 2013 No comments


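Erlang recently gained a comprehensive system information collector, the system_information module (covered in an earlier post). That module lists the following environment variables, all of which matter a great deal to how the system runs. Do you know them all?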

+%% get known useful erts environment
+
+os_getenv_erts_specific() ->
+    os_getenv_erts_specific([
+        "BINDIR",
+        "DIALYZER_EMULATOR",
+        "CERL_DETACHED_PROG",
+        "EMU",
+        "ERL_CONSOLE_MODE",
+        "ERL_CRASH_DUMP",
+        "ERL_CRASH_DUMP_NICE",
+        "ERL_CRASH_DUMP_SECONDS",
+        "ERL_EPMD_PORT",
+        "ERL_EMULATOR_DLL",
+        "ERL_FULLSWEEP_AFTER",
+        "ERL_LIBS",
+        "ERL_MALLOC_LIB",
+        "ERL_MAX_PORTS",
+        "ERL_MAX_ETS_TABLES",
+        "ERL_NO_VFORK",
+        "ERL_NO_KERNEL_POLL",
+        "ERL_THREAD_POOL_SIZE",
+        "ERLC_EMULATOR",
+        "ESCRIPT_EMULATOR",
+        "HOME",
+        "HOMEDRIVE",
+        "HOMEPATH",
+        "LANG",
+        "LC_ALL",
+        "LC_CTYPE",
+        "PATH",
+        "PROGNAME",
+        "RELDIR",
+        "ROOTDIR",
+        "TERM",
+        %"VALGRIND_LOG_XML",
+
+        %% heart
+        "COMSPEC",
+        "HEART_COMMAND",
+
+        %% run_erl
+        "RUN_ERL_LOG_ALIVE_MINUTES",
+        "RUN_ERL_LOG_ACTIVITY_MINUTES",
+        "RUN_ERL_LOG_ALIVE_FORMAT",
+        "RUN_ERL_LOG_ALIVE_IN_UTC",
+        "RUN_ERL_LOG_GENERATIONS",
+        "RUN_ERL_LOG_MAXSIZE",
+        "RUN_ERL_DISABLE_FLOWCNTRL",
+
+        %% driver getenv
+        "CALLER_DRV_USE_OUTPUTV",
+        "ERL_INET_GETHOST_DEBUG",
+        "ERL_EFILE_THREAD_SHORT_CIRCUIT",
+        "ERL_WINDOW_TITLE",
+        "ERL_ABORT_ON_FAILURE",
+        "TTYSL_DEBUG_LOG"
+    ]).

Go look them up in the documentation; they are all interesting control parameters.
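To see which of these variables are actually set on a given node, something like the sketch below works. It is my own illustration built on os:getenv/1, not the code from system_information, and the module and function names are made up.

```erlang
-module(env_sketch).
-export([set_vars/1]).

%% Return {Name, Value} pairs for the variables in Names that are set.
%% os:getenv/1 returns false for a variable that is not set.
set_vars(Names) ->
    [{Name, Value} || Name <- Names,
                      Value <- [os:getenv(Name)],
                      Value =/= false].
```

For example, env_sketch:set_vars(["ERL_CRASH_DUMP", "ERL_LIBS", "HEART_COMMAND"]) lists only the ones defined in the emulator's environment.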

Have fun!

Categories: Erlang探索 (Erlang Exploration), 源码分析 (Source Code Analysis)

More Notes on Crash Dump Generation

August 23rd, 2013 No comments


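In earlier posts we covered what a crash dump is for and how the heart watchdog works: after our program crashes, heart can bring it back up for us.

There are a few points to watch here:
1. The time after which the watchdog declares a failure, 65 seconds by default.
2. When the Erlang system crashes it writes a crash dump, and the operating system produces a core dump. How long does that actually take?

The code confirms this: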

/* heart.c */
...
/*  Maybe interesting to change */
/* Times in seconds */
#define  HEART_BEAT_BOOT_DELAY       60  /* 1 minute */
#define  SELECT_TIMEOUT               5  /* Every 5 seconds we reset the                                                  
                                            watchdog timer */

/* heart_beat_timeout is the maximum gap in seconds between two
   consecutive heart beat messages from Erlang, and HEART_BEAT_BOOT_DELAY
   is the extra delay that wd_keeper allows for, to give heart a
   chance to reboot in the "normal" way before the hardware watchdog
   enters the scene. heart_beat_report_delay is the time allowed for reporting
   before rebooting under VxWorks. */

int heart_beat_timeout = 60;
int heart_beat_report_delay = 30;
int heart_beat_boot_delay = HEART_BEAT_BOOT_DELAY;
...

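Both of these times affect how long it takes for the system to be restarted.
The crash dump's file name, the time allowed for writing it, and its priority are controlled by the following variables: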

ERL_CRASH_DUMP
If the emulator needs to write a crash dump, the value of this variable will be the file name of the crash dump file. If the variable is not set, the name of the crash dump file will be erl_crash.dump in the current directory.

ERL_CRASH_DUMP_NICE
Unix systems: If the emulator needs to write a crash dump, it will use the value of this variable to set the nice value for the process, thus lowering its priority. The allowable range is 1 through 39 (higher values will be replaced with 39). The highest value, 39, will give the process the lowest priority.

ERL_CRASH_DUMP_SECONDS
Unix systems: This variable gives the number of seconds that the emulator will be allowed to spend writing a crash dump. When the given number of seconds have elapsed, the emulator will be terminated by a SIGALRM signal.

If the environment variable is not set or it is set to zero seconds, ERL_CRASH_DUMP_SECONDS=0, the runtime system will not even attempt to write the crash dump file. It will just terminate.

If the environment variable is set to a negative value, e.g. ERL_CRASH_DUMP_SECONDS=-1, the runtime system will wait indefinitely for the crash dump file to be written.

This environment variable is used in conjunction with heart if heart is running:

ERL_CRASH_DUMP_SECONDS=0
Suppresses the writing of a crash dump file entirely, thus rebooting the runtime system immediately. This is the same as not setting the environment variable.

ERL_CRASH_DUMP_SECONDS=-1
Setting the environment variable to a negative value will cause the termination of the runtime system to wait until the crash dump file has been completely written.

ERL_CRASH_DUMP_SECONDS=S
Will wait for S seconds to complete the crash dump file and then terminate the runtime system.

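If we don't want a crash dump to be written, we can turn it off with -env ERL_CRASH_DUMP_SECONDS 0 and avoid the misery of a dump that takes far too long. Also, every crash dump is written to the same file name; we can change it at startup with -env ERL_CRASH_DUMP erl_crash_date_time.dump so that earlier dumps are not overwritten.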
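The rules quoted above for ERL_CRASH_DUMP_SECONDS can be summarized in a small helper, together with one way to build a dated dump file name. Both functions are my own sketch with made-up names: describe/1 only restates the documented behavior for a value as returned by os:getenv/1, and the name format is just one possible reading of erl_crash_date_time.dump.

```erlang
-module(dump_sketch).
-export([describe/1, dump_name/1]).

%% Classify an ERL_CRASH_DUMP_SECONDS value as returned by os:getenv/1.
%% Per the quoted text, an unset variable behaves like 0.
describe(false) ->
    no_dump;
describe(Str) ->
    case string:to_integer(Str) of
        {0, _} -> no_dump;
        {N, _} when is_integer(N), N < 0 -> wait_forever;
        {N, _} when is_integer(N) -> {wait_seconds, N};
        {error, _} -> invalid
    end.

%% Build a dump file name from a calendar datetime such as the one
%% returned by calendar:local_time/0.
dump_name({{Y, Mo, D}, {H, Mi, S}}) ->
    lists:flatten(
      io_lib:format("erl_crash_~4..0w~2..0w~2..0w_~2..0w~2..0w~2..0w.dump",
                    [Y, Mo, D, H, Mi, S])).
```

dump_sketch:dump_name({{2013,8,23},{20,36,25}}) returns "erl_crash_20130823_203625.dump", which could then be passed as the ERL_CRASH_DUMP value when starting the node.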

Have fun!

Categories: Erlang探索 (Erlang Exploration), 源码分析 (Source Code Analysis)

Erlang heart: The Last Line of Defense for High Reliability

August 23rd, 2013 No comments


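The programs we write can never all be bug-free, and any of them may crash. Very often we need a watchdog program that restarts the system when it detects that something is wrong. Watchdogs like this are found everywhere, from the kernel to all kinds of high-availability software. Erlang is no exception: it has one too, called heart.

Let's look at the flow and the result: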

$ export HEART_COMMAND="erl -heart"
$ erl -heart
heart_beat_kill_pid = 12640
Erlang R15B03 (erts-5.9.3.1)  [64-bit] [smp:16:16] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.3.1  (abort with ^G)
1> os:getpid().
"12640"
2> 
Press CTRL + Z to suspend Erlang

$ pstree -p

+-beam.smp(12640)-+-heart(12670)
| | | |-{beam.smp}(12647)
| | | |-{beam.smp}(12648)
| | | |-{beam.smp}(12650)
| | | |-{beam.smp}(12653)
| | | |-{beam.smp}(12654)
| | | |-{beam.smp}(12655)
| | | |-{beam.smp}(12656)
| | | |-{beam.smp}(12657)
| | | |-{beam.smp}(12658)
| | | |-{beam.smp}(12659)
| | | |-{beam.smp}(12660)
| | | |-{beam.smp}(12661)
| | | |-{beam.smp}(12662)
| | | |-{beam.smp}(12663)
| | | |-{beam.smp}(12664)
| | | |-{beam.smp}(12665)
| | | |-{beam.smp}(12666)
| | | |-{beam.smp}(12667)
| | | |-{beam.smp}(12668)
| | | `-{beam.smp}(12669)
| | `-pstree(13702)

$ heart: Fri Aug 23 20:36:25 2013: heart-beat time-out, no activity for 65 seconds
heart_beat_kill_pid = 27920

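We can see that erl was started again. Now a brief look at how it works:
heart consists of two parts: 1. an external program, heart; 2. an Erlang port module, heart.erl.

When heart is enabled (erl -heart …), the external heart program is started by the Erlang module heart.erl as a separate process and monitors the emulator. At regular intervals heart.erl reports its status to the external heart program. When the external heart detects no heartbeat, it takes action and re-runs the command specified by $HEART_COMMAND.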
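The heartbeat idea itself fits in a few lines of Erlang. The sketch below is not how heart is implemented (the real watchdog is an external C program talking over a port); it is a toy, with hypothetical names, where a process expects periodic beat messages and runs a callback when none arrives in time.

```erlang
-module(watchdog_sketch).
-export([start/2, beat/1]).

%% Start a watchdog that calls OnTimeout() if no beat arrives within
%% TimeoutMs milliseconds. OnTimeout plays the role of $HEART_COMMAND.
start(TimeoutMs, OnTimeout) ->
    spawn(fun() -> loop(TimeoutMs, OnTimeout) end).

%% Report that the monitored party is still alive.
beat(Watchdog) ->
    Watchdog ! beat,
    ok.

loop(TimeoutMs, OnTimeout) ->
    receive
        beat -> loop(TimeoutMs, OnTimeout)
    after TimeoutMs ->
        %% No heartbeat in time: take action, then stop.
        OnTimeout()
    end.
```

Suspending the sender, as CTRL + Z does to the emulator in the session above, is exactly the situation in which the after clause fires.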

Categories: Erlang探索 (Erlang Exploration), 源码分析 (Source Code Analysis)

Application "Coloring": Analysis and Uses

August 18th, 2013 No comments


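We know that a typical Erlang VM runs many applications. These apps depend on one another and cooperate, forming an ecosystem. A typical scenario is shown in the figure below:

[Figure: Screen Shot 2013-08-18 at 3.22.01 PM]

Every app contains many processes that work on its behalf and share some common traits. So how do we tell which app a given process belongs to? Think of our great motherland and its 56 ethnic groups: each has its own culture, dress, and even appearance, and is recognizably different from the others. Their genes carry something that is handed down from generation to generation. In the same way, the processes inside an app are born, age, and die just like people; they have a life cycle. What identifies them? A typical application has many levels of processes, usually arranged as a tree, much like our human organizations, as the figure below shows:

[Figure: Screen Shot 2013-08-18 at 3.21.45 PM]

Let's first look at the application documentation and a few key functions: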

which_applications() -> [{Application, Description, Vsn}]
Returns a list with information about the applications which are currently running. Application is the application name. Description and Vsn are the values of its description and vsn application specification keys, respectively.

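For example:

1> application:which_applications().
[{os_mon,"CPO CXC 138 46","2.2.9"},
 {sasl,"SASL CXC 138 11","2.2.1"},
 {stdlib,"ERTS CXC 138 10","1.18.3"},
 {kernel,"ERTS CXC 138 10","2.15.3"}]

We can see basic information about the apps we are running, such as name, version, and description, but nothing more detailed. So where does the information in the first and second figures come from?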
