A Detailed Look at Calculating Server Memory Bandwidth and Measuring Its Usage

September 12th, 2011

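Original article. If you repost it, please credit the source: 系统技术非业余研究

Permalink: A Detailed Look at Calculating Server Memory Bandwidth and Measuring Its Usage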

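A while ago we hit a bottleneck while tuning MySQL and suspected that excessive memory copying was exhausting the memory bandwidth. On Linux, top reports CPU usage and iostat reports I/O device usage, but there is no comparable tool for measuring memory bandwidth usage. We can see a large number of memcpy calls and string copies (these can be measured with systemtap), but plain data movement operations cannot be counted that way. What we want is a hardware-level way to find out how many bytes the CPU has read from and written to main memory over a given period of time.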

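So our memory measurement goals come down to two questions: 1. What is the real memory bandwidth of a server like ours? 2. How much of that bandwidth does our application actually use?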
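As a rough illustration of the first question, the sketch below times a bulk buffer copy and reports the copy rate. It is not the measurement method used in this article: the function name `copy_bandwidth_mib_s` and its parameters are made up for illustration, it runs on a single thread, and interpreter overhead means the result is only a lower bound on what the hardware can deliver. Note also that each copy reads the buffer once and writes it once, so the actual memory traffic is roughly twice the reported copy rate.

```python
import time


def copy_bandwidth_mib_s(size_mib=64, repeats=5):
    """Return the best observed copy rate, in MiB/s, for a size_mib MiB buffer.

    Hypothetical helper for illustration only: single-threaded and subject to
    interpreter overhead, so it understates the hardware limit.
    """
    src = bytearray(size_mib * 1024 * 1024)
    dst = bytearray(len(src))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # one bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Keep the fastest run: it is the one least disturbed by other activity.
    return size_mib / best


if __name__ == "__main__":
    print("copy rate: %.0f MiB/s" % copy_bandwidth_mib_s())
```

The buffer should be much larger than the CPU cache (12288 KB on the machine described here), otherwise the copy is served mostly from cache rather than from main memory.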

First, let's look at our server configuration:

$ sudo ~/aspersa/summary 
# Aspersa System Summary Report ##############################
        Date | 2011-09-12 11:23:11 UTC (local TZ: CST +0800)
    Hostname | my031121.sqa.cm4
      Uptime | 13 days,  3:52,  2 users,  load average: 0.02, 0.01, 0.00
      System | Dell Inc.; PowerEdge R710; vNot Specified (<OUT OF SPEC>)
 Service Tag | DHY6S2X
     Release | Red Hat Enterprise Linux Server release 5.4 (Tikanga)
      Kernel | 2.6.18-164.el5
Architecture | CPU = 64-bit, OS = 64-bit
   Threading | NPTL 2.5
    Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-44).
     SELinux | Disabled
# Processor ##################################################
  Processors | physical = 2, cores = 12, virtual = 24, hyperthreading = yes
      Speeds | 24x2926.089
      Models | 24xIntel(R) Xeon(R) CPU X5670 @ 2.93GHz
      Caches | 24x12288 KB
# Memory #####################################################
       Total | 94.40G
        Free | 4.39G
        Used | physical = 90.01G, swap = 928.00k, virtual = 90.01G
     Buffers | 1.75G
      Caches | 7.85G
        Used | 78.74G
  Swappiness | vm.swappiness = 0
 DirtyPolicy | vm.dirty_ratio = 40, vm.dirty_background_ratio = 10
  Locator   Size     Speed             Form Factor   Type          Type Detail
  ========= ======== ================= ============= ============= ===========
  DIMM_A1   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A2   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A3   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A4   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A5   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A6   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B1   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B2   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B3   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B4   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B5   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_B6   8192 MB  1333 MHz (0.8 ns) DIMM          {OUT OF SPEC} Synchronous
  DIMM_A7   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
  DIMM_A8   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
  DIMM_A9   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
  DIMM_B7   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
  DIMM_B8   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
  DIMM_B9   {EMPTY}  Unknown           DIMM          {OUT OF SPEC} Synchronous
...

The Dell R710 has two X5670 CPUs, each with 6 cores and hyperthreading enabled, for a total of 24 logical CPUs. It is fitted with twelve 8192 MB (1333 MHz) memory modules.

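Our machine architecture has moved from the earlier FSB bus design to the current NUMA architecture (thanks to @fcicq for the information). See the figure below (source):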

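We can clearly see that each CPU has its own memory controller connected directly to memory, with 3 channels per controller, and that the CPUs are connected directly to each other through QPI. Data flows over both the memory controllers and QPI.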
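Given this layout, a theoretical peak can be estimated from the numbers above. The sketch below is only a back-of-the-envelope calculation under stated assumptions: each DDR3 channel has a 64-bit (8-byte) data bus, the "1333 MHz" in the report means 1333 million transfers per second, and the modules really run at that rate (a platform may clock memory lower when several DIMMs share a channel). The function name `peak_bandwidth_gb_s` is made up for illustration.

```python
def peak_bandwidth_gb_s(transfers_mt_s, channels, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_mt_s: transfer rate per channel in millions of transfers/s
    channels:       number of memory channels on the controller
    bus_bytes:      bytes moved per transfer (assumed 8 for a 64-bit channel)
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


# One socket: 3 channels at an assumed 1333 MT/s.
per_socket = peak_bandwidth_gb_s(1333, 3)
# Two sockets, each with its own memory controller.
total = 2 * per_socket

print("per socket: %.3f GB/s" % per_socket)  # per socket: 31.992 GB/s
print("both sockets: %.3f GB/s" % total)     # both sockets: 63.984 GB/s
```

This is an upper bound only. Measured bandwidth will be lower, and accesses to memory attached to the other socket additionally have to cross QPI.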