Flashcache使用的误区以及解决方案

Home > 工具介绍, 调优 > Flashcache使用的误区以及解决方案

October 28th, 2011 Yu Feng

原创文章，转载请注明： 转载自系统技术非业余研究

flashcache是facebook释放出来的开源的混合存储方案,用ssd来做cache提升IO设备的性能.很多硬件厂商也有类似的方案，比如说LSI raid卡. 但是这个方案是免费的软件方案，而且经过产品的考验，具体参见：
主页：https://github.com/facebook/flashcache
开源混合存储方案(Flashcache)： http://blog.yufeng.info/archives/1165
Flashcache新版重大变化： http://blog.yufeng.info/archives/1429

但是flashcache在使用中很多人会有个误区，导致性能很低。首先我们看下flashcache的设计背景和适用场景：

Introduction :
============
Flashcache is a write back block cache Linux kernel module. This
document describes the design, futures ideas, configuration, tuning of
the flashcache and concludes with a note covering the testability
hooks within flashcache and the testing that we did. Flashcache was
built primarily as a block cache for InnoDB but is general purpose and
can be used by other applications as well.

它是为数据库这样的应用的离散读写优化。如果你用在了顺序读写，就有非常大的性能问题。
那么为什么呢？我来分析下：

flashcache把内部的cache空间分成很多set, 是以set而不是整体为单位提供cache以及flush后备操作. 也就是说当一个set里面的dirty page达到一个预设的值的时候，就需要把这么dirty page 淘汰并且flush到后备设备去，以便腾出空间给更热的数据使用。
那么每个set多大呢？

To compute the target set for a given dbn
target set = (dbn / block size / set size) mod (number of sets)
Once we have the target set, linear probe within the set finds the
block. Note that a sequential range of disk blocks will all map onto a
given set.

set默认是 512*4k = 2M大小，也就是说如果你的这个set刚好是一个文件所在的块，而且每次这个文件都不停的顺序写，很快这个set都变成dirty, 那么flashcache就选择马上刷，这样加速效果就没有了。

幸好作者Mohan认识到了这个问题，提供了解决方案：

见https://github.com/facebook/flashcache/blob/master/doc/flashcache-sa-guide.txt 中的章节Tuning Sequential IO Skipping for better flashcache performance

引入了配置参数来解决这个问题：

dev.flashcache..skip_seq_thresh_kb:
Skip (don’t cache) sequential IO larger than this number (in kb).
0 (default) means cache all IO, both sequential and random.
Sequential IO can only be determined ‘after the fact’, so
this much of each sequential I/O will be cached before we skip
the rest. Does not affect searching for IO in an existing cache.

这样你可以把太大的顺序操作给过滤掉了，大大提升性能。

祝玩得开心！

Post Footer automatically generated by wp-posturl plugin for wordpress.

Categories: 工具介绍, 调优 Tags: Flashcache, skip_seq_thresh_kb

Comments (4)

jebtang

November 8th, 2011 at 01:40 | #1

Reply | Quote

不错，正好可以试试效果。最近发现同样是4K的block size，在相同的测试环境下，XFS的表现比EXT3好了不少，准备用blktracek看看他们的I/O到底有什么不同。
niko

June 30th, 2013 at 21:45 | #2

Reply | Quote

“set默认是 512*4k = 2M大小，也就是说如果你的这个set刚好是一个文件所在的块，而且每次这个文件都不停的顺序写，很快这个set都变成dirty, 那么flashcache就选择马上刷，这样加速效果就没有了”

这个解释不对的吧？ hash时连续的两个block会映射到相邻的set里吧,而不是同一个set。

niko Reply:
June 30th, 2013 at 9:55 pm
my mistake, sorry….
hihi356

July 28th, 2013 at 13:00 | #3

Reply | Quote

@niko

Note that a sequential range of disk blocks will all map onto a given set.
文档里这句话说的应该是映射到一个set里边吧
huangbt

March 4th, 2014 at 22:30 | #4

Reply | Quote

哥们，我如果把这个target set = (dbn / block size / set size) mod (number of sets) 算法改掉，改成同一个文件会映射到相邻的set里，你觉得怎么样。我的应用场景会有比较多的顺序IO，而且有些文件比较大（可能大于100M）

jesuszhu Reply:
August 12th, 2014 at 10:15 pm
行是行，可你这样弄的话，一个顺序写落到ssd上就成随机写了。

Comments are closed.

Erlang版TCP服务器对抗攻击解决方案简单的Map reduce用的收集函数

系统技术非业余研究