Erlang Explorations | 系统技术非业余研究

R17's New Scheduling Policy: +sub

May 18th, 2014 Yu Feng 9 comments

Original article; if you repost it, please credit 系统技术非业余研究.

OTP-11385 A new optional scheduler utilization balancing mechanism has
been introduced. For more information see the +sub command
line argument.

Characteristics impact: None, when not enabled. When enabled,
changed timing in the system, normally a small overhead due
to measuring of utilization and calculating balancing
information. On some systems, such as old Windows systems,
the overhead can be quite substantial. This time measurement
overhead highly depend on the underlying primitives provided
by the OS.

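R17 introduces a new scheduling policy; for the implementation, see here.
The author is the well-known rickard-green, so the code quality should be solid.

So what does this scheduler policy do? The erl documentation explains it clearly: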

+sub true|false
Enable or disable scheduler utilization balancing of load. By default scheduler utilization balancing is disabled and instead scheduler compaction of load is enabled which will strive for a load distribution which causes as many scheduler threads as possible to be fully loaded (i.e., not run out of work). When scheduler utilization balancing is enabled the system will instead try to balance scheduler utilization between schedulers. That is, strive for equal scheduler utilization on all schedulers.

Compare that with the description of the default scheduler policy, +scl:

+scl true|false
Enable or disable scheduler compaction of load. By default scheduler compaction of load is enabled. When enabled, load balancing will strive for a load distribution which causes as many scheduler threads as possible to be fully loaded (i.e., not run out of work). This is accomplished by migrating load (e.g. runnable processes) into a smaller set of schedulers when schedulers frequently run out of work. When disabled, the frequency with which schedulers run out of work will not be taken into account by the load balancing logic.

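From this it is easy to see that the previous policy first keeps the low-ID schedulers busy and only pulls in the higher-ID ones when those are not enough, which saves power. On dedicated machines, however, scheduler power consumption is not the priority; what matters is having all schedulers take part in the computation to reduce system latency. In that case +sub true is what you want.
The only thing this feature depends on is a high-resolution clock, which Linux has no shortage of. It is disabled by default.
Let's write some code to verify this. fib:busy keeps the CPU spinning on computation: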

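$ cat fib.erl
-module(fib).
-export([fib/1, busy/0]).

%% Naive doubly recursive Fibonacci, used here only to burn CPU.
fib(0) -> 1;
fib(1) -> 1;
fib(N) -> fib(N-1) + fib(N-2).

%% Loop forever so that the calling process never goes idle.
busy() -> fib(10), busy().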

Try it under each scheduler policy. +sbt db binds schedulers to CPUs, which makes the result easier to observe:

$ erl  +sbt db +sub true 
Erlang/OTP 17 [erts-6.0.1] [source] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.0.1  (abort with ^G)
1> [spawn(fun()-> fib:busy() end)||_<-lists:seq(1,8)].
[<0.34.0>,<0.35.0>,<0.36.0>,<0.37.0>,<0.38.0>,<0.39.0>,
 <0.40.0>,<0.41.0>]
2>

CPU usage under each policy, as shown by nmon:
[Figure: nmon CPU usage with +sub false]

[Figure: nmon CPU usage with +sub true]

The effect is very obvious.
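
nmon shows the difference from the OS side. As a rough cross-check from inside the VM, one could compare two samples of erlang:statistics(scheduler_wall_time) and compute the share of time each scheduler was active in between. The sketch below is only an illustration: the module name sched_util and its function names are made up for this example and are not part of the original experiment.

```erlang
-module(sched_util).
-export([sample/0, utilization/2, measure/1]).

%% Snapshot of {SchedulerId, ActiveTime, TotalTime} tuples, sorted by id.
%% The scheduler_wall_time flag must already be enabled (see measure/1).
sample() ->
    lists:sort(erlang:statistics(scheduler_wall_time)).

%% Per-scheduler utilization between two snapshots, as a float in 0.0..1.0.
utilization(S0, S1) ->
    [{Id, safe_div(A1 - A0, T1 - T0)}
     || {{Id, A0, T0}, {Id, A1, T1}} <- lists:zip(S0, S1)].

%% Avoid dividing by zero when no wall time elapsed between the snapshots.
safe_div(_, 0) -> 0.0;
safe_div(A, T) -> A / T.

%% Enable the measurement, wait Ms milliseconds, and report utilization.
measure(Ms) ->
    erlang:system_flag(scheduler_wall_time, true),
    S0 = sample(),
    timer:sleep(Ms),
    S1 = sample(),
    utilization(S0, S1).
```

After spawning the eight fib:busy processes, calling sched_util:measure(1000) in the shell under each policy should show whether the load sits on a few schedulers or is spread across them.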

Have fun.

Categories: Erlang Explorations, Source Code Analysis Tags: +sub, scheduler utilization

Tuning the Erlang Memory System

April 28th, 2014 Yu Feng 1 comment

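Lukas Larsson, a core VM developer, has been very active recently and has done a lot of work on the Erlang memory system, including contributions to the recon project.

He recently gave a talk at the Erlang Factory conference titled "Memory Allocators in the VM, Memory Management: Battle Stories"; see here.

The Erlang memory architecture is a complex system, and the average developer can hardly grasp it at a glance:

So we need an expert's experience to get us up to speed quickly. His slides are no longer available for download, but I grabbed a copy, available here. They cover principles, methods, and case studies, and are very good.

Have fun!

Categories: Erlang Explorations, Tuning Tags: memory, tuning, VM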

Misconceptions About Erlang's Fair Scheduling

April 27th, 2014 Yu Feng 4 comments

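Fair scheduling is one of Erlang's philosophies (or, one might say, one of the things it insists on). It runs from the time-slice allocation and preemption in the very first version of the BEAM code, through the insistence on fairness in BIFs in recent releases (for example, binary_to_term was heavily reworked in R17: the code is much more complex and somewhat slower, but when it meets a large binary it yields through the trap mechanism, gets rescheduled, and then resumes from where it stopped), to NIFs (which gained an interface for charging consumed time slices). These efforts ensure that an Erlang system is a fair system.

Many end systems and services benefit from this philosophy, cloud computing for example. Regardless of how big a user is or how heavy the workload, system-wide fairness guarantees that every user gets a chance to be served, which makes for a good user experience. Fairness has to run through the entire system, and in particular it has to be backed by the design philosophy, so that the designers of every module and subsystem follow it by tacit agreement. If a single one breaks it, the fairness of the rest of the system becomes meaningless. That is the core reason fairness is so hard to achieve.

In cloud services we often see a system that works well for internal users draw all kinds of complaints once it serves the public. A simple example is RDS (MySQL in the cloud). If fairness is ignored and each SQL statement is served as it arrives, regardless of its size and complexity, you immediately hit a problem: a few heavy users take up all of the system's resources, and other users' SQL never gets a chance to run. This can of course be addressed with resource isolation in the MySQL backend, such as using cgroups to limit CPU, memory, IO, and network consumption, but you also need precise accounting of what each SQL statement consumes so that it gives up control at the right moments. Fairness along this whole chain is very hard to get right, which is why most systems fall short: none achieves true fairness, and they differ only in degree.

Although Erlang emphasizes fairness at its core and practices what it preaches, implementation details mean that it cannot be fair in the following areas, which leaves pitfalls. As we know, the core idea is reduction counting (each function call counts as one reduction): every process is given a budget up front, for example 2000 reductions per time slice. A process gives up control when its budget is used up or when it cannot continue (for example, when it has to wait for a message), and the whole system is message driven.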
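
To make the reduction budget a little more concrete, here is a minimal sketch that reads the calling process's reduction counter before and after running a fun, using erlang:process_info/2. The module name reds and the function count/1 are hypothetical names chosen for this illustration. The absolute numbers depend on the OTP release and on how each BIF is charged, so treat the result as a relative indication only.

```erlang
-module(reds).
-export([count/1]).

%% Roughly how many reductions the calling process is charged while
%% running Fun. The two process_info calls are themselves charged, so
%% the result slightly overestimates the cost of Fun alone.
count(Fun) when is_function(Fun, 0) ->
    {reductions, R0} = erlang:process_info(self(), reductions),
    Fun(),
    {reductions, R1} = erlang:process_info(self(), reductions),
    R1 - R0.
```

For example, reds:count(fun() -> fib:fib(20) end) should report a much larger figure than reds:count(fun() -> fib:fib(5) end), which is the quantity the scheduler compares against the per-slice budget.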

Categories: Erlang Explorations Tags:

Erlang Memory Allocators: mbcs_pool

April 27th, 2014 Yu Feng 1 comment

The release notes for Erlang R17.0 devote quite a bit of space to the memory carrier migration feature:

Support for migration of memory carriers between memory allocator instances has been introduced.
By default this feature is not enabled and do not effect the characteristics of the system. When enabled it has the following impact on the characteristics of the system:

* Reduced memory footprint when the memory load is unevenly distributed between scheduler specific allocator instances.
* Depending on the default allocaton strategy used on a specific allocator there might or might not be a slight performance loss.
* When enabled on the fix_alloc allocator, a different strategy for management of fix blocks will be used.
* The information returned from erlang:system_info({allocator, A}), and erlang:system_info({allocator_sizes, A}) will be slightly different when this feature has been enabled. An mbcs_pool tuple will be present giving information about abandoned carriers, and in the fix_alloc case no fix_types tuple will be present.

For more information, see the documentation of the +M acul command line argument.

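So what is "migration of memory carriers between memory allocator instances", and what problem does it solve?
The official document erts/emulator/internal_doc/CarrierMigration.md (see here) already describes it very clearly.

Let me briefly restate it:
To improve performance, the Erlang memory allocators give each scheduler its own memory pool, which avoids a great deal of lock contention when allocating and freeing memory. But this also wastes memory. First, the default scheduler policy is "full load or not": if the low-ID schedulers are not saturated, the next scheduler is not used. Under high load more schedulers are brought in, and the memory on those schedulers is cached and stays in their pools. When the workload drops, the high-ID schedulers no longer get used because there is not enough pressure, so the memory held on those schedulers is wasted, and from the point of view of the whole VM the fragmentation rate is high. The Erlang VM is known for its stability, but it does crash sometimes, and nine times out of ten the cause is running out of memory. When designing a system we usually work backward from the data volume to the memory required, but if fragmentation or waste is severe, that estimate cannot be accurate, which can lead to disaster. The most direct remedy is this: when the memory utilization in a scheduler's pool falls below a certain level, give that memory up so that schedulers that need it can use it. This is the core problem that memory carrier migration solves.

This feature was added in R16, disabled by default, and is enabled by default in R17; in other words, +M acul now defaults to "de". It has the following consequences:
1. Allocation efficiency. The current strategy is to allocate from the scheduler's own pool first, fall back to the cpool if that fails, and only then request memory from the mseg or sys allocator. Efficiency and utilization cannot both be maximized.
2. An allocator's memory now lives not only in its own pool but also in the cpool, so be careful when accounting for memory.
3. How much waste is tolerated before a carrier is abandoned.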
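
As a starting point for the accounting concern in item 2, the sketch below pulls the mbcs_pool entries out of erlang:system_info({allocator, A}), which the release note above says are present once carrier migration is enabled. The module name carrier_pool and both function names are hypothetical. The sketch assumes the result is a list of {instance, N, Props} tuples; the layout of this data is not a stable interface, so check it against the OTP release you run.

```erlang
-module(carrier_pool).
-export([pools/1, pools_from/1]).

%% Return [{InstanceNo, PoolInfo}] for one allocator, for example
%% binary_alloc. system_info/1 returns false when the allocator is
%% disabled, in which case there is nothing to report.
pools(Alloc) ->
    case erlang:system_info({allocator, Alloc}) of
        Instances when is_list(Instances) -> pools_from(Instances);
        _ -> []
    end.

%% Pure helper: keep only the mbcs_pool property of each instance.
%% Instances without an mbcs_pool tuple are skipped silently.
pools_from(Instances) ->
    [{N, Pool} || {instance, N, Props} <- Instances,
                  {mbcs_pool, Pool} <- Props].
```

Calling carrier_pool:pools(binary_alloc) on a running node lists what each allocator instance currently has sitting in the pool, which is the part that is easy to overlook when adding up memory.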

A First Taste of LLVM-Based High-Performance Erlang (HiPE)

March 25th, 2014 Yu Feng 11 comments

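The upcoming R17A release introduces an important performance feature: "Support the LLVM backend in HiPE"; see here for the actual changes. As we know, Erlang is a domain-specific language, designed from day one for the telecom industry's needs of high availability, clustering, and hot code upgrades, and raw language performance was not a priority at first. SMP multiprocessor support only arrived in R11B, bringing Erlang in line with the multi-core hardware trend, and since then it has been moving steadily toward high performance.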

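Erlang's virtual machine is register based. Its performance is similar to Python's and roughly 7 times slower than C. For most clusters and network servers the bottleneck is IO, an area where ERTS (the Erlang runtime system) is very strong. But once heavy computation is involved things get awkward, because Erlang lacks anything as powerful as Java's JIT to make the language fast enough. The workaround is to write your own NIF, driver, or BIF, but that compromises stability.

Erlang has had its own HiPE for a long time. It was built mainly by Kostis Sagonas of Uppsala University and his students, starting in 1997. The performance gain is considerable, but the architecture has some shortcomings, and because the HiPE team and the OTP team are two separate teams, its stability never reached production quality. To address this, he worked with Christos Stavrakakis and Yiannis Tsiouris to reimplement HiPE on top of an LLVM backend. The result is ErLLVM, whose official site is here.

The official description: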

ErLLVM is a project aiming at providing multiple back ends for the High Performance Erlang (HiPE) with the use of the LLVM infastructure.

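The R17 release merges ErLLVM into the Erlang mainline. So where does ErLLVM improve technically? The figure below makes it clear.

The key point is that the old HiPE generated machine code from RTL by itself, whereas ErLLVM hands that job to LLVM, which specializes in it. ErLLVM only performs a thin RTL -> LLVM translation, so the stability problem is offloaded to LLVM, whose stability has been proven at community scale.

This neatly addresses both stability and performance.

Categories: Erlang Explorations, Architecture, Source Code Analysis Tags: hipe, llvm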

The Positioning and Performance of Erlang's New Map Data Type

March 12th, 2014 Yu Feng 4 comments

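The biggest language-level change in Erlang R17 is the introduction of the Map data structure; see Erlang R17新特性浅评 ("A Brief Review of New Features in Erlang R17") and here.

The details of Map are in EEP 43; see here.

Positioning: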

A record replacement is just that, a replacement. It’s like asking the question, “What do we have?” instead of “What can we get?” The instant rebuttal would be “What do we need?” I say Maps.

Constraints it satisfies:

The new data-type shall have semantics, syntax and operations that:

> provides an association set from key terms to value terms which can be constructed, accessed and updated using language syntax
> can be uniquely distinguished from every other data-type in the language
> has no compile-time dependency for constructing, accessing or updating contents of maps nor for passing maps between modules, processes or over Erlang distribution
> can be used in matching expressions in the language
> has a one-to-one association between printing and parsing the data-type
> has a well defined order between terms of the type and other Erlang types
> has at most O(log N) time complexity in insert and lookup operations, where N is the number of key-value associations.
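
A few of these constraints (construction, access, and update through language syntax, plus use in matching expressions) can be illustrated with a short sketch. The module name map_demo and the sample keys and values are invented for this example.

```erlang
-module(map_demo).
-export([demo/0]).

demo() ->
    %% Construct a map with language syntax; no record definition and
    %% no compile-time dependency is needed.
    M0 = #{name => "joe", age => 30},
    %% ':=' updates an existing key and fails if the key is missing.
    M1 = M0#{age := 31},
    %% '=>' inserts a new key (or overwrites an existing one).
    M2 = M1#{city => "Stockholm"},
    %% Maps can be used in matching expressions.
    #{name := Name} = M2,
    {Name, maps:get(age, M2), maps:size(M2)}.
```

Calling map_demo:demo() returns {"joe", 31, 3}.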

Siyao (思遥) kindly wrote an analysis of maps; see Erlang 的新数据结构 map 浅析 ("A Brief Analysis of Erlang's New Data Structure, map").

cowboy: A High-Performance, Clean Web Framework in Erlang

February 27th, 2014 Yu Feng 4 comments

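Any distributed system with real business value has to provide things like an API, monitoring, and a management UI, and HTTP is the de facto standard for these. In other words, a distributed system needs a capable web framework so that business logic is easy to write. Erlang was designed for this kind of work from day one, so it naturally has many web frameworks. Well-known ones include mochiweb, cowboy, chicagoboss, misultin, and inets, and competition is fierce. Today I want to recommend a rising star: cowboy. The project is here.

So what is Cowboy?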

Cowboy is a small, fast and modular HTTP server written in Erlang.

Its positioning is very clear:

Cowboy aims to provide a complete HTTP stack in a small code base. It is optimized for low latency and low memory usage, in part because it uses binary strings.

Cowboy provides routing capabilities, selectively dispatching requests to handlers written in Erlang.

Because it uses Ranch for managing connections, Cowboy can easily be embedded in any other application.

No parameterized module. No process dictionary. Clean Erlang code.

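It also supports the websocket and SPDY protocols (sponsored by leofs), so it is a complete and capable implementation of the HTTP stack. Because it was developed only recently, it was designed with performance in mind and emphasizes clean code. The author understands Erlang's strengths and weaknesses at the production level, and the implementation is elegant and efficient. Many projects are now built on it, which in turn pushes it to move steadily and quickly, and the code is actively maintained. A competitor, misultin, for instance, saw no hope of overtaking it and gave up: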

Misultin development has been discontinued.

There currently are three main webserver libraries which basically do similar things:
* Mochiweb
* Cowboy
* Misultin
Mochiweb has been around the block for a while and it’s proven solid in production, I can only recommend it for all basic webserver needs you might have. Cowboy has a very interesting approach since it allows to use multiple TCP and UDP protocols on top of a common acceptor pool. It is a very modern approach, is very actively maintained and many projects are starting to be built around it.

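Among C implementations nginx is the best. But if you write your own business code on top of nginx, you drag the whole system's performance down to the level of whoever wrote it, and overall performance will not be very high. An implementation based on cowboy, combined with Erlang's natural advantages, can typically handle tens of thousands of QPS of business traffic on a single machine while demanding very little of the implementer, which greatly boosts productivity. As a bonus you also get hot upgrades and solid reliability.

Summary: technology keeps advancing, and our thinking has to open up as well.

Have fun.

Categories: Erlang Explorations, Source Code Analysis Tags: cowboy, http, web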
