Source Code Analysis | 系统技术非业余研究

Recon: a diagnostic tool for production Erlang systems

February 27th, 2014 Yu Feng 6 comments

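Original article. When reposting, please credit the source: 系统技术非业余研究 (Non-Amateur Research on Systems Technology).

Erlang systems are known for stability and reliability, but the runtime is itself written in C and has to manage complex things such as memory and locks, so it can crash too, and most crashes are caused by memory problems. For this reason the Erlang runtime provides powerful introspection facilities to help users diagnose problems. These facilities are so extensive, and the information so scattered, that users without deep experience find it hard to get an overall picture. On a production system you also have to consider the impact that reading this information has on the system itself. This is where Recon comes in; its source code is publicly available.

Recon's positioning is clear: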

Recon wants to be a set of tools usable in production to diagnose Erlang problems or inspect production environment safely.

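Its author, Ferd, also wrote the book Learn You Some Erlang for Great Good! (I am translating the Chinese edition); see http://learnyousomeerlang.com/
Ferd is known above all for explaining deep topics in simple terms, with very clear thinking. He currently works at Heroku on the logplex project, and Recon is a product of eating his own dog food while operating logplex. His write-up of that experience is a highly technical analysis that looks in particular at how the Erlang memory allocators work, how to observe them, and how to improve memory utilization.

Back to Recon: its documentation is very clear.

Recon consists of three main modules: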

recon
Main module, contains basic functionality to interact with the recon application. It includes functions to gather information about processes and the general state of the virtual machine, ports, and OTP behaviours running in the node. It also includes a few functions to facilitate RPC calls with distributed Erlang nodes.

recon_lib
Regroups useful functionality used by recon when dealing with data from the node. Would be an interesting place to look if you were looking to extend Recon’s functionality

recon_alloc
Regroups functions to deal with Erlang’s memory allocators, or particularly, to try to present the allocator data in a way that makes it simpler to discover the presence of possible problems.

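It also ships a set of scripts that help users find the cause when a crash dump occurs. Recon was designed from the start to have minimal impact on the system, so it is safe to use in production.

The most valuable module is recon_alloc. It hides most of the detail and complexity of the memory allocators and lets users see clearly how efficiently memory is being used.

With recon_alloc we have found many cases of low memory allocator utilization in production.

Have fun!

Note: newer versions of Recon also provide production-safe tracing facilities, to dig into the execution of programs and function calls as they are running.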
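
To give a feel for the kind of introspection that Recon packages up, here is a minimal sketch that lists the processes with the largest value of one attribute, using only the standard `erlang:processes/0` and `erlang:process_info/2` BIFs. The module name `proc_top` and the function `top/2` are hypothetical names chosen for this illustration; this is not Recon's implementation, although Recon exposes a comparable query as `recon:proc_count/2`.

```erlang
-module(proc_top).
-export([top/2]).

%% Return up to N {Pid, Value} pairs for the processes with the largest
%% value of Attr (for example memory, message_queue_len or reductions),
%% largest first. Hypothetical helper, not part of Recon.
top(Attr, N) when is_atom(Attr), is_integer(N), N >= 0 ->
    %% process_info/2 returns undefined for a process that has exited in
    %% the meantime; the pattern in the generator silently skips those.
    Pairs = [{Pid, Val} || Pid <- erlang:processes(),
                           {_, Val} <- [erlang:process_info(Pid, Attr)]],
    Sorted = lists:reverse(lists:keysort(2, Pairs)),
    lists:sublist(Sorted, N).
```

Calling `proc_top:top(memory, 3)` in a shell returns the three largest processes by memory. Cheap attributes such as `memory` are reasonable to sample this way; attributes that copy a lot of data (for example `messages`) are better avoided on a busy node.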

Categories: Exploring Erlang, Source Code Analysis Tags: recon

A brief review of the new features in Erlang R17

February 26th, 2014 Yu Feng 1 comment

The Erlang R17RC2 source code is ready.
The official schedule for the releases that follow is excerpted below:

Preliminary dates for the upcoming release:
Release  | erts, emu, comp | Code stop  | Documentation stop | Release date
17.0-rc2 | 2014-02-21      | 2014-02-21 | 2014-02-21         | 2014-02-26
17.0     | 2014-03-10      | 2014-03-17 | 2014-03-19         | 2014-03-26

We will focus the time between 17.0-rc2 and 17.0 on bug fixes, improvements, and testing. Therefore you are most welcome to submit patches regarding such issues and we will try our best to include them before 17.0 is released.
Especially bugs introduced in 17.0-rcX.

The highlights of the R17 release notes are excerpted below:

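Maps, a new dictionary data type (experimental). Support for a map data structure is a big change at the language level: besides the VM and the compiler, the surrounding tools such as the debugger, profiler, ei, and dialyzer all need substantial changes to support it.

The {active, N} socket option for TCP, UDP, and SCTP. See the post "The inet driver adds the {active,N} socket option".

A new (optional) scheduler utilization balancing mechanism. This changes the previous "full or not" behavior (a new scheduler was not used until the current ones were fully loaded) by adding a scheduling algorithm that tries to balance utilization across schedulers. See the post "R17's new scheduling strategy: +sub".

Migration of memory carriers has been enabled by default on all ERTS internal memory allocators. Memory carriers can migrate between schedulers, which improves memory utilization.

Increased garbage collection tenure rate. Garbage collection is friendlier and also more asynchronous.

Experimental "dirty schedulers" functionality. See the posts "A brief analysis of the dirty schedulers newly introduced in Erlang/OTP 17.0-rc1" and "The reason they were introduced: an example of NIF misuse".

Funs can now be given names. See the post "Erlang R16B03 released, R17 is under way".

Miscellaneous unicode support enhancements. Unicode is supported across the board.

A new, semantic version scheme for OTP and its applications. Version numbers are managed in a more fine-grained way.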
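
As a small illustration of the first item in the list, the sketch below builds, updates, and pattern-matches a map. The module `maps_demo` and its functions are hypothetical names invented for this example; only the map syntax and `maps:get/2` come from Erlang itself.

```erlang
-module(maps_demo).
-export([new_user/2, rename/2, name/1]).

%% Build a map with the `=>` (insert) syntax.
new_user(Name, Age) ->
    #{name => Name, age => Age}.

%% Update an existing key with `:=`; this fails if the key is missing,
%% which the pattern in the function head already guards against.
rename(#{name := _} = User, NewName) ->
    User#{name := NewName}.

%% Extract a value by matching on the key in the function head.
name(#{name := Name}) ->
    Name.
```

For example, `maps_demo:name(maps_demo:rename(maps_demo:new_user("joe", 30), "mike"))` evaluates to `"mike"`, and the original map is left unchanged because maps are immutable.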

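Overall, the VM is moving quickly toward more asynchrony and higher performance (memory efficiency and CPU scheduling efficiency), the system keeps getting more robust, and one major release a year is a fast pace.

Have fun.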

Categories: Exploring Erlang, Source Code Analysis Tags: R17

Erlang VM internal documentation

January 16th, 2014 Yu Feng 1 comment

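The Erlang runtime system is really a very powerful server: besides a complete implementation of distribution, it also delivers very high performance. That performance is obtained by squeezing the most out of CPU, memory, and locks. In a word, these high-performance implementations are a treasure trove.

But without good guidance an ordinary user can hardly dig up the treasure, because the performance gains are tightly coupled to the hardware and software architecture, and the trade-offs made under Erlang's message-oriented philosophy already go beyond the ordinary user's use cases.

Fortunately the Erlang development team has recognized this problem and has started to document how the runtime works internally.

Allow me to quote a little: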
Carrier Migration

The ERTS memory allocators manage memory blocks in two types of raw memory chunks. We call these chunks of raw memory carriers. Singleblock carriers which only contain one large block, and multiblock carriers which contain multiple blocks. A carrier is typically created using mmap() on unix systems. However, how a carrier is created is of minor importance. An allocator instance typically manages a mixture of single- and multiblock carriers.

Erlang R16 supports a colored console

December 27th, 2013 Yu Feng 6 comments

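By fixing the filtering in the tty driver, Erlang supports a colored console as of R16. This is very helpful for highlighting output in the various monitoring tools we build. From the R16 README: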

Support ANSI in console Unix platforms will no longer filter control sequences to the ttsl driver thus enabling ANSI and colors in console. (Thanks to Pedram Nimreezi)

On the application side, the logging system lager was the first to support it: "Colored terminal output (requires R16+)".

Let's try it:

$ erl
Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.3  (abort with ^G)
1> [io:fwrite("~s~s",[Level, Color])
1> ||
1> {Level, Color}<-
1> [
1> {debug,     "\e[0;38m" },
1> {info,      "\e[1;37m" },
1> {notice,    "\e[1;36m" },
1> {warning,   "\e[1;33m" },
1> {error,     "\e[1;31m" },
1> {critical,  "\e[1;35m" },
1> {alert,     "\e[1;44m" },
1> {emergency, "\e[1;41m" },
1> {eol,        "\e[0m\r\n"}
1> ]
1> ].
debuginfonoticewarningerrorcriticalalertemergencyeol
[ok,ok,ok,ok,ok,ok,ok,ok,ok]
2>

The result looks like this:

[Figure: screenshot of the colored console output]

Have fun.
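
In a monitoring tool it is usually more convenient to wrap each piece of text in its own color and reset afterwards. The following is a minimal sketch of such a helper; the module name `ansi_color` and its functions are hypothetical, and only three of the escape sequences from the session above are reused. It assumes a terminal that honors ANSI sequences (R16 or later on Unix, per the README excerpt).

```erlang
-module(ansi_color).
-export([wrap/2, print/2]).

%% Escape sequences copied from the shell session above.
%% Unknown levels get no color at all.
color(debug)   -> "\e[0;38m";
color(warning) -> "\e[1;33m";
color(error)   -> "\e[1;31m";
color(_Other)  -> "".

%% Return Text as iodata: color on, the text, then the reset
%% sequence "\e[0m" so later output is not affected.
wrap(Level, Text) ->
    [color(Level), Text, "\e[0m"].

%% Print one colored line to standard output.
print(Level, Text) ->
    io:put_chars([wrap(Level, Text), $\n]).
```

For example, `ansi_color:print(error, "disk full")` prints the message in bold red and then restores the default attributes.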

Categories: Exploring Erlang, Source Code Analysis Tags: ANSI, console

heroku_crashdumps: keeping crash dumps from being overwritten

December 22nd, 2013 Yu Feng Comments off

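When the Erlang VM dies it writes a crash dump that preserves as much of the state at that moment as possible; I have written quite a few posts on how to use and configure it.
In practice, however, the dump file is always named erl_crash.dump, so each crash overwrites the previous dump, which makes it harder to track down problems in production.

We can of course configure the name through an environment variable: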

ERL_CRASH_DUMP
If the emulator needs to write a crash dump, the value of this variable will be the file name of the crash dump file. If the variable is not set, the name of the crash dump file will be erl_crash.dump in the current directory

Alternatively, the heroku_crashdumps project saves us even more effort. See the project page; the core code is just these few lines:

  case dumpdir() of
        undefined -> ok;
        Dir ->
            File = string:join([os:getenv("INSTANCE_NAME"),
                                "boot",
                                datetime(now)], "_"),
            DumpFile = filename:join(Dir, File),
            error_logger:info_msg("at=setup_crashdumps dumpfile=~p", [DumpFile]),
            os:putenv("ERL_CRASH_DUMP", DumpFile)
    end.

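Its value lies in being packaged as an application, which has two benefits:
1. It is easy to package, because packaging tools, rebar included, work in units of applications.
2. It is simpler to use: the date and time are added to the crash dump name automatically.

Usage is simple enough (see the project README), so I won't go on about it. Here is a demonstration: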
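
The same idea can be sketched with the standard library alone: build a timestamped file name and export it through `ERL_CRASH_DUMP` at startup. The module `crashdump_name`, its functions, and the file name layout are hypothetical choices for this illustration and are not the code of heroku_crashdumps.

```erlang
-module(crashdump_name).
-export([setup/1, dump_file/2]).

%% Build "<Dir>/erl_crash_<YYYYMMDD>T<HHMMSS>.dump" for a given
%% calendar datetime tuple. The name layout is an arbitrary choice.
dump_file(Dir, {{Y, Mo, D}, {H, Mi, S}}) ->
    Name = lists:flatten(
             io_lib:format("erl_crash_~4..0B~2..0B~2..0BT~2..0B~2..0B~2..0B.dump",
                           [Y, Mo, D, H, Mi, S])),
    filename:join(Dir, Name).

%% Point ERL_CRASH_DUMP at a file named after the current UTC time,
%% so that a later crash does not overwrite an earlier dump.
setup(Dir) ->
    File = dump_file(Dir, calendar:universal_time()),
    true = os:putenv("ERL_CRASH_DUMP", File),
    File.
```

Calling `crashdump_name:setup("/var/tmp")` once during boot is enough; each restart of the node then gets its own dump file name.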

$ git clone https://github.com/heroku/heroku_crashdumps.git
Initialized empty Git repository in /home/chuba/heroku_crashdumps/heroku_crashdumps/.git/
remote: Counting objects: 11, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 11 (delta 2), reused 11 (delta 2)
Unpacking objects: 100% (11/11), done.
$ cd heroku_crashdumps/
$ ./rebar compile
==> heroku_crashdumps (compile)
Compiled src/heroku_crashdumps_app.erl

$ ERL_CRASH_DUMP="." INSTANCE_NAME="erl_crash.dump"  erl -pa ebin
Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.3  (abort with ^G)
1> heroku_crashdumps_app:start().

=INFO REPORT==== 22-Dec-2013::16:12:41 ===
at=setup_crashdumps dumpfile="./erl_crash.dump_boot_2013-12-22T08:12:41+00:00"true
2> 
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
       (v)ersion (k)ill (D)b-tables (d)istribution
A

Crash dump was written to: ./erl_crash.dump_boot_2013-12-22T08:12:41+00:00
Crash dump requested by userAborted
$ ls ./erl_crash.dump_boot_2013-12-22T08:12:41+00:00
./erl_crash.dump_boot_2013-12-22T08:12:41+00:00

Summary: laziness is a virtue!

Have fun!

Categories: Exploring Erlang, Source Code Analysis Tags: crashdump, heroku_crashdumps

Erlang R16B03 released, R17 is under way

December 21st, 2013 Yu Feng 2 comments

Erlang R16B03 has been released. The 03 release is usually the bug-fix release that goes into production. The official announcement:

OTP R16B03 is a service release with mostly a number of small corrections and user contributions. But there are some new functions worth mentioning as well, here are some of them:

A new memory allocation feature called “super carrier” has been introduced. It can for example be used for pre-allocation of all memory that the runtime system should be able to use. It is enabled by passing the +MMscs (size in MB) command line argument. For more information see the documentation of the +MMsco, +MMscrfsd, +MMscrpm, +MMscs, +MMusac, and, +Mlpm command line arguments in the erts_alloc(3) documentation.
The LDAP client (eldap application) now supports the start_tls operation. This upgrades an existing tcp connection to encryption using TLS, see eldap:start_tls/2 and /3.
The FTP client (inets application) now supports FTP over TLS (ftps).

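The biggest improvement is the super carrier; see my earlier post on it. On machines dedicated to a single application this feature makes memory use much more efficient. This release also greatly improves the efficiency and utilization of memory moved between cores, so it is well worth adopting. Memory utilization and fragmentation usually get little attention, but on production servers they deserve a close look.

With R16B03 out, the OTP team went straight into R17 development. The most anticipated part is the language improvements, including EEP 37 and the map data structure. EEP 37 has already been merged into master, and the EEP itself describes the feature in detail.

Put simply: until now it was impossible to write a fun like the following in the shell.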

fun Fact(N) when N > 0 ->
            N * Fact(N - 1);
        Fact(0) ->
            1
end.

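Yet we often need recursive anonymous functions like this, for example with spawn, so the limitation was inconvenient. EEP 37 solves exactly this problem.

Let's try it. First install R17: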

$ kerl build git git://github.com/erlang/otp.git master r17 && kerl install r17  r17
$ r17/bin/erl   
Erlang/OTP 17.0-rc0 [erts-6.0] [source-7d4e5e2] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false] [type-assertions] [debug-compiled] [lock-checking] [systemtap]

Eshell V6.0  (abort with ^G)
1> fun Fact(N) when N > 0 ->
1>             N * Fact(N - 1);
1>         Fact(0) ->
1>             1
1>     end.
#Fun<erl_eval.29.42696066>
2> F=e(-1).
#Fun<erl_eval.29.42696066>
3> F(10).
3628800

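EEP 37 touches a lot: the language specification, compiler, debugger, dialyzer, tracer, shell, Emacs mode, and so on, across the whole toolchain. That is why it goes into R17, even though, judging by the commits, the code was ready a year ago. Let's try it again: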

$ cat eep37.erl 
-module(eep37).

-compile(export_all).

-spec self() -> fun(() -> fun()).
self() ->
    fun Self() -> Self end.

-spec fact() -> fun((non_neg_integer()) -> non_neg_integer()).
fact() ->
    fun Fact(N) when N > 0 ->
            N * Fact(N - 1);
        Fact(0) ->
            1
    end.

$ r17/bin/erl 
Erlang/OTP 17.0-rc0 [erts-6.0] [source-7d4e5e2] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false] [type-assertions] [debug-compiled] [lock-checking] [systemtap]

Eshell V6.0  (abort with ^G)
1> eep37:fact().    
#Fun<eep37.1.16748913>
2> F=e(-1).
#Fun<eep37.1.16748913>
3> F(10).
3628800
4>
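
To make the difference concrete, the sketch below contrasts the usual pre-R17 workaround (passing the fun to itself as an extra argument) with a named fun, and shows a named fun used as a receive loop with spawn. The module `named_fun_demo` and its function names are hypothetical and were chosen only for this illustration.

```erlang
-module(named_fun_demo).
-export([fact_old/1, fact_new/1, start_echo/0]).

%% Before EEP 37: an anonymous fun cannot refer to itself,
%% so it receives itself as an explicit argument.
fact_old(N) ->
    F = fun(_Self, 0) -> 1;
           (Self, K) when K > 0 -> K * Self(Self, K - 1)
        end,
    F(F, N).

%% With EEP 37: the fun has a name that is visible inside its own body.
fact_new(N) ->
    F = fun Fact(0) -> 1;
            Fact(K) when K > 0 -> K * Fact(K - 1)
        end,
    F(N).

%% A named fun as a process loop: echoes {From, Msg} back to From
%% as {echo, Msg} until it receives stop.
start_echo() ->
    spawn(fun Loop() ->
                  receive
                      {From, Msg} -> From ! {echo, Msg}, Loop();
                      stop -> ok
                  end
          end).
```

Both `named_fun_demo:fact_old(10)` and `named_fun_demo:fact_new(10)` return 3628800, matching the shell sessions above; the named version simply reads the way the recursion is meant.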

Summary: the language keeps making its users happier.

Have fun!

Categories: Exploring Erlang, Source Code Analysis Tags: eep37, R16B03, R17, super carrier

Quantifying the cost of Erlang process scheduling

November 14th, 2013 Yu Feng Comments off

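We all know that one of Erlang's basic philosophies is "small messages, big computation": put all the information a computation needs into the message, and make the computation as large as possible, ideally far exceeding the cost of passing the message. But why? After all, Erlang message sending is very efficient, as this article reports: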

Roughly speaking, I’m seeing 3.4 million deliveries per second one-way, and 1.4 million roundtrips per second (2.8 million deliveries per second) in a ping-pong setup in the same environment as previously – a 2.8GHz Pentium 4 with 1MB cache.

Here are the concrete numbers on my machine:

$ erl 
Erlang R15B03 (erts-5.9.3.1) [source] [64-bit] [smp:16:16] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.9.3.1  (abort with ^G)
1> ipctest:pingpong().
832296.5692402497
2>

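That is about 830,000 ping-pong exchanges per second. The test program involves two Erlang processes, ping and pong.
One full round trip consists of: 1. the ping process runs; 2. ping waits for the pong message and is scheduled out; 3. pong runs; 4. pong waits for the ping message and is scheduled out. So each round trip involves two Erlang process scheduling operations.
This is a typical Erlang usage pattern. The question now is: how much does one Erlang process scheduling operation actually cost?
Looking at the ERTS implementation, Erlang calls the schedule() function to pick the next process to run, while the actual swap-in and swap-out are cheap. So let's measure the cost of schedule().

Time to bring out our mighty stap (SystemTap) and write a probe script first: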
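
The source of `ipctest` is not shown in this post, so the following is only a hypothetical stand-in that follows the four steps described above: two processes exchange messages and the number of round trips per second is measured with `timer:tc/1`. The module name `pingpong_sketch` and the message shapes are assumptions made for this illustration, and the figures it produces are not comparable with the number quoted above.

```erlang
-module(pingpong_sketch).
-export([run/1]).

%% Run N ping-pong round trips between the calling process (ping)
%% and a freshly spawned process (pong); return round trips per second.
run(N) when is_integer(N), N > 0 ->
    Pong = spawn_link(fun pong/0),
    {Micros, ok} = timer:tc(fun() -> ping(Pong, N) end),
    Pong ! stop,
    %% Guard against a zero reading on very short runs.
    N * 1000000 / max(Micros, 1).

%% Send one ping, then block in receive until the pong arrives.
%% Blocking here is the point where ping is scheduled out.
ping(_Pong, 0) ->
    ok;
ping(Pong, N) ->
    Pong ! {ping, self()},
    receive
        pong -> ping(Pong, N - 1)
    end.

%% Answer every ping and wait for the next one; blocking in receive
%% is where pong is scheduled out.
pong() ->
    receive
        {ping, From} -> From ! pong, pong();
        stop -> ok
    end.
```

A call such as `pingpong_sketch:run(1000000)` returns a float; every round trip it counts contains the two scheduling operations discussed above.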

$ cat sch.stp
global total, coll_sch, sch
global exclude_sys_schedule

probe process("beam.smp").function("schedule") {
      sch[tid()] = gettimeofday_ns();
      total++;
}

probe process("beam.smp").function("schedule").return {
      tid = tid();
      e = gettimeofday_ns() - sch[tid];
      if (exclude_sys_schedule && e > 10 * 1000 * 1000 ) coll_sch <<< 0;
      else coll_sch <<< e;
}

function print_colls () {
      prt_line = 0;
      if(@count(coll_sch) >0) {
            printf("total %d, avg %d ns\n", total, @avg(coll_sch));
            printf("===========erts schedule(ns)===========\n");
            print(@hist_log(coll_sch));
            prt_line = 1;
      }

      if(prt_line) printf("--------------------------------------------------------------\n");
      delete coll_sch;
      delete sch;
      delete total;
}

probe timer.s(1) {
      print_colls();
}

probe begin {
      exclude_sys_schedule = $1
      println("x:");
}

$ PATH=/usr/local/lib/erlang/erts-5.9.3.1/bin/:$PATH sudo stap sch.stp 1
x:

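When a scheduler is idle, or after it has scheduled enough processes, it needs to harvest epoll events, that is, it calls sys_schedule. This typically takes milliseconds, so we exclude it to keep it from heavily skewing the average.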

Categories: Exploring Erlang, Source Code Analysis, Tuning Tags: context_switches, schedule
