Archive

Archive for the ‘Source Code Analysis’ Category

Erlang 17.5 introduces the +hpds command-line flag to control the default process dictionary size

April 1st, 2015 4 comments

Original article; when reposting, please credit: reposted from 系统技术非业余研究

Erlang 17.5 has been released, and it adds a command-line argument that controls the default size of process dictionaries:

Erlang/OTP 17.5 has been released
Written by Henrik, 01 Apr 2015

Some highlights of the release are:
ERTS: Added command line argument option for setting the initial size of process dictionaries.

Source changes: https://github.com/erlang/otp/commit/c7a07bf984739bcc679d800e5383c01e1d07ffa5
Documentation: https://github.com/erlang/otp/commit/89987ada3c997fd9f1e1f8c8ed73da0394bda4ee

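The +hpds flag sets the initial size of each process dictionary; the default is 10 slots. A typical Erlang VM hosts tens of thousands of processes, and the OTP team discourages use of the process dictionary because it breaks functional-programming semantics, so the memory taken by these default-sized dictionaries is largely wasted. Lowering the value saves a meaningful amount of memory on memory-constrained embedded machines; raising it gives dictionary-heavy code some performance improvement.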
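One rough way to see the effect is to start the VM with different +hpds values and compare per-process memory. The module below is a minimal sketch, not a benchmark: the module name pd_mem and the function avg_memory/1 are hypothetical, and it assumes that process_info(Pid, memory) reflects the dictionary allocation on your ERTS version.

```erlang
-module(pd_mem).
-export([avg_memory/1]).

%% Spawn N processes that each touch the process dictionary once and then
%% block, and return the average number of bytes reported per process.
avg_memory(N) when is_integer(N), N > 0 ->
    Parent = self(),
    Pids = [spawn(fun() ->
                      put(key, value),          % make sure a dictionary exists
                      Parent ! {ready, self()},
                      receive stop -> ok end    % stay alive until told to stop
                  end) || _ <- lists:seq(1, N)],
    %% Wait until every process has written to its dictionary.
    lists:foreach(fun(P) -> receive {ready, P} -> ok end end, Pids),
    Total = lists:sum([begin
                           {memory, M} = process_info(P, memory),
                           M
                       end || P <- Pids]),
    lists:foreach(fun(P) -> P ! stop end, Pids),
    Total div N.
```

Compile it, then run pd_mem:avg_memory(10000) in a shell started with, say, erl +hpds 4 and again with erl +hpds 64, and compare the two averages.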

Have fun!

Categories: Erlang Exploration, Source Code Analysis

Distribution gains inet_dist_{listen,connect}_options for finer-grained tuning

April 1st, 2015 1 comment

Erlang 17.5 introduces inet_dist_{listen,connect}_options, which give finer control over the sockets that connect nodes to each other, so RPC performance can be tuned:

raimo/inet_tcp_dist-priority-option/OTP-12476:
Document kernel inet_dist_{listen,connect}_options
Test kernel inet_dist_{listen,connect}_options
Implement kernel inet_dist_{listen,connect}_options

Source: https://github.com/erlang/otp/commit/14ddc5594d74979a15a256a41fba5f1297aeaa1a

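As an Erlang cluster grows to a thousand or more nodes, the overhead and latency of RPC between nodes are amplified. Nodes talk to each other over TCP, and the underlying driver is gen_tcp, so in principle any option that is valid for gen_tcp can be set here to trade latency against throughput.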
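As a configuration sketch, the two kernel parameters can be set in a sys.config file. The buffer sizes below are illustrative assumptions, not recommendations; suitable values depend on the network and workload, and the distribution code may override options it needs to control itself.

```erlang
%% sys.config (illustrative values only)
[
 {kernel,
  [
   %% Options applied to the socket that accepts incoming distribution
   %% connections from other nodes.
   {inet_dist_listen_options,  [{sndbuf, 262144}, {recbuf, 262144}]},
   %% Options applied to sockets opened when this node connects out
   %% to another node.
   {inet_dist_connect_options, [{sndbuf, 262144}, {recbuf, 262144}]}
  ]}
].
```

A node would then be started with something like erl -name a@host -config sys, where the node name and file name are placeholders.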

Have fun!

Erlang 18 RC1 released

April 1st, 2015 1 comment

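At the end of March the official site announced that Erlang 18 RC1 is open for public testing (see here). By convention Erlang ships one major release per year; from R11 to today's R18, over seven years, I have watched a large system evolve through Erlang's development.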

Erlang/OTP 18.0-rc1 is available for testing.
This is an alpha release, which will be followed by a planned beta release in May and a final OTP 18.0 product release in June 2015.

The final 18.0 release is due in June. The items most worth looking forward to in this release are listed below; see News From the OTP TEAM, Berlin Erlang Factory Lite 2014.

Key problems to solve:
[Figure: f1]

Highlights of this release:
[Figure: f2]

Long term:
[Figure: f3]

The biggest highlights of this pre-release are:

Some highlights of the release are:

dialyzer: The -dialyzer() attribute can be used for suppressing warnings in a module by specifying functions or warning options. It can also be used for requesting warnings in a module.
erts: The time functionality has been extended. This includes a new API for time, as well as “time warp” modes which alters the behavior when system time changes. You are strongly encouraged to use the new API instead of the old API based on erlang:now/0. erlang:now/0 has been deprecated since it will always be a scalability bottleneck. For more information see the Time and Time Correction chapter of the ERTS User’s Guide. Here is a link http://www.erlang.org/documentation/doc-7.0-rc1/erts-7.0/doc/html/time_correction.html

erts: Beside the API changes and time warp modes a lot of scalability and performance improvements regarding time management has been made. Examples are:
scheduler specific timer wheels,
scheduler specific BIF timer management,
parallel retrieval of monotonic time and system time on OS:es that support it.
erts: The previously introduced “eager check I/O” feature is now enabled by default.
erts/compiler: enhanced support for maps. Big maps new uses a HAMT (Hash Array Mapped Trie) representation internally which makes them more efficient. There is now also support for variables as map keys.
ssl: Remove default support for SSL-3.0 and added padding check for TLS-1.0 due to the Poodle vulnerability.
ssl: Remove default support for RC4 cipher suites, as they are consider too weak.
stdlib: Allow maps for supervisor flags and child specs

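Among these, the semantics and usage of the timer API have been redefined, and a great deal of scalability work has gone into it: timers are now split up per scheduler, which should bring a large performance gain for timer-intensive workloads.
For details, see the talk given by Erlang VM developer Lukas Larsson.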
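The release notes quoted above recommend the new time API over erlang:now/0. A minimal sketch of measuring elapsed time with it follows; the module name time_demo and the function elapsed_us/1 are hypothetical, while erlang:monotonic_time/0 and erlang:convert_time_unit/3 are part of the API introduced in OTP 18.

```erlang
-module(time_demo).
-export([elapsed_us/1]).

%% Run Fun and return {ElapsedMicroseconds, Result}.
%% Monotonic time never jumps backwards, so the difference is safe to use
%% even if the system clock is adjusted while Fun runs.
elapsed_us(Fun) when is_function(Fun, 0) ->
    T0 = erlang:monotonic_time(),
    Result = Fun(),
    T1 = erlang:monotonic_time(),
    %% The target unit is given as an integer number of parts per second,
    %% so 1000000 means microseconds.
    {erlang:convert_time_unit(T1 - T0, native, 1000000), Result}.
```

For example, time_demo:elapsed_us(fun() -> lists:sum(lists:seq(1, 100000)) end) returns the elapsed microseconds together with the sum.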

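With the timer bottleneck solved in this release, what I personally look forward to most is multiple poll sets; once that lands, performance will be just about perfect!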

Have fun!

Categories: Erlang Exploration, Source Code Analysis

R17's new scheduling strategy: +sub

May 18th, 2014 9 comments

The R17 release notes mention:

OTP-11385 A new optional scheduler utilization balancing mechanism has
been introduced. For more information see the +sub command
line argument.

Characteristics impact: None, when not enabled. When enabled,
changed timing in the system, normally a small overhead due
to measuring of utilization and calculating balancing
information. On some systems, such as old Windows systems,
the overhead can be quite substantial. This time measurement
overhead highly depend on the underlying primitives provided
by the OS.

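This introduces a new scheduling strategy; for the implementation, see here.
The author is the renowned rickard-green, so the code quality is bound to be good.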

So what does this scheduler strategy do? The erl documentation explains it clearly:

+sub true|false
Enable or disable scheduler utilization balancing of load. By default scheduler utilization balancing is disabled and instead scheduler compaction of load is enabled which will strive for a load distribution which causes as many scheduler threads as possible to be fully loaded (i.e., not run out of work). When scheduler utilization balancing is enabled the system will instead try to balance scheduler utilization between schedulers. That is, strive for equal scheduler utilization on all schedulers.

Compare that with the description of the default scheduler strategy, +scl:

+scl true|false
Enable or disable scheduler compaction of load. By default scheduler compaction of load is enabled. When enabled, load balancing will strive for a load distribution which causes as many scheduler threads as possible to be fully loaded (i.e., not run out of work). This is accomplished by migrating load (e.g. runnable processes) into a smaller set of schedulers when schedulers frequently run out of work. When disabled, the frequency with which schedulers run out of work will not be taken into account by the load balancing logic.

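From this it is easy to see that the previous strategy keeps the low-ID schedulers busy first and pulls in the high-ID schedulers only when those are not enough, which saves power. On dedicated machines, however, scheduler power consumption is not the concern; what matters is getting every scheduler involved in the computation so that system latency goes down. In that case +sub true is what you want.
The only thing this feature depends on is a high-resolution clock, which Linux has no shortage of. It is disabled by default.
Let's write some code to verify it; fib:busy keeps the CPU computing flat out: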

$ cat fib.erl
-module(fib).
-export([fib/1, busy/0]).

fib(0) -> 1;
fib(1) -> 1;
fib(N) -> fib(N-1) + fib(N-2).
busy()-> fib(10), busy().

Try it under each scheduler strategy; +sbt db binds schedulers to CPUs, which makes the result easier to observe:

$ erl  +sbt db +sub true 
Erlang/OTP 17 [erts-6.0.1] [source] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.0.1  (abort with ^G)
1> [spawn(fun()-> fib:busy() end)||_<-lists:seq(1,8)].
[<0.34.0>,<0.35.0>,<0.36.0>,<0.37.0>,<0.38.0>,<0.39.0>,
 <0.40.0>,<0.41.0>]
2> 

CPU usage under each strategy, as shown by nmon:
+sub false
[Figure: nmon1]
+sub true
[Figure: nmon2]
The difference is very clear.
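The same comparison can be made from inside the VM instead of with nmon. The sketch below samples erlang:statistics(scheduler_wall_time) twice and reports the busy fraction of each scheduler; the module name sched_util and the function sample/1 are hypothetical helpers, not part of the original experiment.

```erlang
-module(sched_util).
-export([sample/1]).

%% Return [{SchedulerId, Utilization}] measured over Ms milliseconds,
%% where Utilization is a float between 0.0 (idle) and 1.0 (fully busy).
sample(Ms) when is_integer(Ms), Ms > 0 ->
    %% Wall-time accounting is off by default and must be enabled first.
    erlang:system_flag(scheduler_wall_time, true),
    S0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(Ms),
    S1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    %% Each entry is {Id, ActiveTime, TotalTime}; compare the two samples.
    [{Id, ratio(A1 - A0, T1 - T0)}
     || {{Id, A0, T0}, {Id, A1, T1}} <- lists:zip(S0, S1)].

%% Guard against a zero-length interval.
ratio(_Active, 0) -> 0.0;
ratio(Active, Total) -> Active / Total.
```

After spawning the eight fib:busy/0 processes, calling sched_util:sample(1000) under +sub false and then under +sub true should show how the load is spread across schedulers in each mode.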

Have fun.

A first taste of LLVM-based High Performance Erlang (HiPE)

March 25th, 2014 11 comments

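The upcoming R17A release introduces an important performance feature: "Support the LLVM backend in HiPE"; see here for the change. Erlang is a domain-specific language, designed from day one for the telecom industry's high-availability, clustered, hot-upgrade environments, and raw language performance was not a priority at first. Only with R11B did it gain SMP multiprocessor support, fully adapting to the multi-core trend in hardware, and since then it has been striding toward high performance.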

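The Erlang virtual machine is register-based; its performance is similar to Python's and roughly 7 times slower than C. For most clusters and network servers the bottleneck is I/O, which ERTS (the Erlang Run-Time System) handles extremely well, but heavy computation is a problem, because Erlang lacks the kind of powerful JIT that makes Java fast enough. The workaround is to write your own NIFs, drivers, or BIFs, but that compromises stability.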

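Erlang has long had its own HiPE, built mainly by Kostis Sagonas and his students at Uppsala University, starting in 1997. It improves performance considerably, but it has some architectural shortcomings, and because the HiPE group and the OTP team are two separate teams, its stability has not reached production quality. To address this, he worked with Christos Stavrakakis and Yiannis Tsiouris to reimplement HiPE on top of an LLVM back end. The result is ErLLVM, whose official site is here.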

The official description:

ErLLVM is a project aiming at providing multiple back ends for the High Performance Erlang (HiPE) with the use of the LLVM infrastructure.

The R17 release merges ErLLVM into the Erlang mainline. So where does ErLLVM improve on the technology? The figure below makes it clear.

[Figure: hipe_llvm_arch]

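The key point is that the old HiPE generated machine code from RTL by itself, whereas ErLLVM hands that job to LLVM, which specializes in it, and only performs a thin RTL-to-LLVM translation. The stability problem is thereby offloaded to LLVM, whose stability has been proven at community scale.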

[Figure: llvm_pipeline]

This neatly solves both the stability problem and the performance problem.

Positioning and performance of Erlang's new Map data type

March 12th, 2014 4 comments

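The biggest language-level change in Erlang R17 is the introduction of the Map data structure; see "Erlang R17新特性浅评" (A Brief Review of New Features in Erlang R17) and here.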

The details of Map are in EEP 43; see here.

Positioning:

A record replacement is just that, a replacement. It’s like asking the question, “What do we have?” instead of “What can we get?” The instant rebuttal would be “What do we need?” I say Maps.

Constraints it satisfies:

The new data-type shall have semantics, syntax and operations that:

> provides an association set from key terms to value terms which can be constructed, accessed and updated using language syntax
> can be uniquely distinguished from every other data-type in the language
> has no compile-time dependency for constructing, accessing or updating contents of maps nor for passing maps between modules, processes or over Erlang distribution
> can be used in matching expressions in the language
> has a one-to-one association between printing and parsing the data-type
> has a well defined order between terms of the type and other Erlang types
> has at most O(log N) time complexity in insert and lookup operations, where N is the number of key-value associations.
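The first and fourth constraints (construction, access, and update through language syntax, plus use in matching) can be illustrated with a short sketch. The module name map_demo and the keys used are made up for illustration.

```erlang
-module(map_demo).
-export([demo/0]).

demo() ->
    %% Construct a map with literal syntax; => adds associations.
    M0 = #{name => "joe", age => 30},
    %% := updates an existing key and fails if the key is missing.
    M1 = M0#{age := 31},
    %% => on an existing map inserts a new key (or overwrites an old one).
    M2 = M1#{lang => erlang},
    %% Maps can be used in matching expressions to extract values.
    #{name := Name, age := Age} = M2,
    {Name, Age, maps:get(lang, M2), maps:size(M2)}.
```

Calling map_demo:demo() returns {"joe", 31, erlang, 3}. No record definition or header file is needed, which reflects the "no compile-time dependency" constraint.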

Siyao kindly wrote an analysis of maps; see "Erlang 的新数据结构 map 浅析" (A Brief Analysis of Erlang's New Map Data Structure).

Categories: Erlang Exploration, Source Code Analysis

cowboy: a clean, high-performance web framework for Erlang

February 27th, 2014 4 comments

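Most distributed systems with any business value have to expose things like APIs, monitoring, and management interfaces, and HTTP is the de facto standard for these today. In other words, a distributed system needs a capable web framework so that business code is easy to write. Erlang was designed for this kind of work from day one, so it naturally has many web frameworks; well-known ones include mochiweb, cowboy, chicagoboss, misultin, and inets, and the competition is fierce. Today I want to recommend a newcomer: cowboy. The project is here.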

So what is Cowboy?

Cowboy is a small, fast and modular HTTP server written in Erlang.

Its positioning is very clear:

Cowboy aims to provide a complete HTTP stack in a small code base. It is optimized for low latency and low memory usage, in part because it uses binary strings.

Cowboy provides routing capabilities, selectively dispatching requests to handlers written in Erlang.

Because it uses Ranch for managing connections, Cowboy can easily be embedded in any other application.

No parameterized module. No process dictionary. Clean Erlang code.

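It currently also supports the WebSocket and SPDY protocols (sponsored by LeoFS), making it a complete and capable HTTP stack. Because it was developed fairly recently, it was designed with performance in mind and puts an emphasis on clean code; the author understands Erlang's strengths and weaknesses in production very well, and the implementation is elegant and efficient. Several projects now run on it, which in turn pushes it to evolve steadily and quickly, and the code is actively maintained. The competitor misultin, for example, apparently saw no hope of overtaking it and gave up.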

Misultin development has been discontinued.

There currently are three main webserver libraries which basically do similar things:
* Mochiweb
* Cowboy
* Misultin
Mochiweb has been around the block for a while and it’s proven solid in production, I can only recommend it for all basic webserver needs you might have. Cowboy has a very interesting approach since it allows to use multiple TCP and UDP protocols on top of a common acceptor pool. It is a very modern approach, is very actively maintained and many projects are starting to be built around it.

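Among C implementations nginx is the best, but once you write your own business code inside nginx, the performance of the whole system drops to the level of whoever wrote that code, and overall it will not be very high. With a cowboy-based implementation plus Erlang's natural strengths, a single machine can typically serve tens of thousands of QPS of business traffic while demanding very little of the implementer, which greatly boosts productivity. Hot upgrades and reliability come as a bonus.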

Summary: technology moves on, and our thinking should open up with it.

Have fun.

Categories: Erlang Exploration, Source Code Analysis