vmstat | 系统技术非业余研究

itop: a more convenient way to inspect interrupts on Linux

March 4th, 2011 Yu Feng 3 comments

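Original article. If you repost it, please credit the source: 系统技术非业余研究

While the company is busy moving offices, here is a quick write-up!

When writing network programs, you often need to know how hardware interrupts and softirqs are balanced: how many interrupts fire per second, and on which CPU they land.
On Linux, the interrupt sources can be read from /proc/interrupts: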

$ cat /proc/interrupts 
           CPU0       CPU1       
  0:     247701     250313   IO-APIC-edge      timer
  1:        501        567   IO-APIC-edge      i8042
  3:          1          1   IO-APIC-edge    
  8:          1          0   IO-APIC-edge      rtc0
  9:        256        240   IO-APIC-fasteoi   acpi
 12:       1134       1149   IO-APIC-edge      i8042
 16:        629        554   IO-APIC-fasteoi   nvidia
 17:      21313      20869   IO-APIC-fasteoi   firewire_ohci, eth1
 18:          0          0   IO-APIC-fasteoi   mmc0
 19:      51822      50079   IO-APIC-fasteoi   ata_piix, ata_piix
 20:       5605       5255   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb6
 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, uhci_hcd:usb7
 22:         33         33   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb5, uhci_hcd:usb8
 45:        337        247   PCI-MSI-edge      eth0
 46:        441        447   PCI-MSI-edge      hda_intel
NMI:          0          0   Non-maskable interrupts
LOC:     169176     174899   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:      42289      40236   Rescheduling interrupts
CAL:        154       1076   Function call interrupts
TLB:       5838       5365   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          5          5   Machine check polls
ERR:          1
MIS:          0
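To check the balance between CPUs programmatically, the listing above can be parsed into per-CPU counters. The following is only a minimal sketch: the function name parse_interrupts and the shortened SAMPLE text (a few rows copied from the output above) are illustrative assumptions, not part of any tool mentioned here.

```python
# Shortened sample in /proc/interrupts format (rows taken from the listing above).
SAMPLE = """\
           CPU0       CPU1
  0:     247701     250313   IO-APIC-edge      timer
 17:      21313      20869   IO-APIC-fasteoi   firewire_ohci, eth1
 45:        337        247   PCI-MSI-edge      eth0
ERR:          1
"""


def parse_interrupts(text):
    """Map each IRQ label to (per-CPU counts, description)."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # the first non-numeric field starts the description
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


if __name__ == "__main__":
    for irq, (counts, desc) in parse_interrupts(SAMPLE).items():
        total = sum(counts)
        # Percentage of this IRQ handled by each CPU.
        shares = [f"{100 * c / total:.0f}%" for c in counts] if total else []
        print(irq, desc, counts, shares)
```

On a live system the same function can be fed open("/proc/interrupts").read() instead of SAMPLE.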

Softirqs can be read from /proc/softirqs:

$ cat /proc/softirqs 
                CPU0       CPU1       
      HI:          0          0
   TIMER:     160508    1170976
  NET_TX:          2          2
  NET_RX:       3303       3165
   BLOCK:      50964      49198
BLOCK_IOPOLL:          0          0
 TASKLET:      24743      24284
   SCHED:      39483      41848
 HRTIMER:         34         40
     RCU:      92193      92592
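One quick way to spot an uneven distribution in this table is to compute, for each softirq type, the share handled by the busiest CPU. The sketch below assumes the /proc/softirqs layout shown above; the function name softirq_imbalance and the shortened SAMPLE are illustrative.

```python
# Shortened sample in /proc/softirqs format (rows taken from the listing above).
SAMPLE = """\
                CPU0       CPU1
      HI:          0          0
   TIMER:     160508    1170976
  NET_TX:          2          2
  NET_RX:       3303       3165
"""


def softirq_imbalance(text):
    """Return {softirq name: share of events handled by the busiest CPU}."""
    result = {}
    for line in text.strip("\n").splitlines()[1:]:  # skip the CPU header row
        name, _, rest = line.partition(":")
        counts = [int(x) for x in rest.split()]
        total = sum(counts)
        if total:  # rows that never fired carry no information
            result[name.strip()] = max(counts) / total
    return result


if __name__ == "__main__":
    for name, share in softirq_imbalance(SAMPLE).items():
        print(f"{name:>8}: busiest CPU handles {share:.0%}")
```

With two CPUs a value near 50% means the load is evenly spread; in the sample, TIMER is clearly skewed towards CPU1 while NET_RX is close to even.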

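The overall interrupt rate can be read from vmstat or dstat (in vmstat it is the "in" column under "system"). Note that when vmstat is run without an interval argument, the single line it prints is an average since boot: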

$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  0      0  44160 327144 876600    0    0   894   584  458 2295 11  5 70 15
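The "in" and "cs" columns correspond to system-wide counters that the kernel exposes in /proc/stat (the first number on the "intr" line and the "ctxt" line). As a rough sketch of how a per-second figure can be derived from two readings, the helper names below are assumptions and the two snapshot strings are made-up values, not real measurements.

```python
def counter(stat_text, key):
    """Return the first number after `key` in /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == key:
            return int(fields[1])
    raise ValueError(f"no {key!r} line found")


def per_second(before, after, key, seconds):
    """Rate of change of one counter between two snapshots."""
    return (counter(after, key) - counter(before, key)) / seconds


# Hypothetical snapshots taken one second apart (values invented for illustration).
BEFORE = "cpu 10 0 5 100\nintr 1000 400 3\nctxt 5000\n"
AFTER = "cpu 11 0 6 180\nintr 1458 600 3\nctxt 7295\n"

if __name__ == "__main__":
    print("in:", per_second(BEFORE, AFTER, "intr", 1.0))
    print("cs:", per_second(BEFORE, AFTER, "ctxt", 1.0))
```

On a live system, BEFORE and AFTER would be two reads of /proc/stat separated by a sleep of the chosen interval.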

itop offers a more convenient view. Its author, Hunz, writes in the source code:

It’s quite simple but it does its job.

It is simple, but it works:

On Ubuntu it can be installed with: apt-get install itop

$ itop
INT                NAME          RATE             MAX
  0 [PIC-edge      time]   628 Ints/s     (max:   628)
  1 [PIC-edge      i804]     4 Ints/s     (max:     4)
 17 [PIC-fasteoi   fire]     8 Ints/s     (max:    22)
 19 [PIC-fasteoi   ata_]     1 Ints/s     (max:    14)
 20 [PIC-fasteoi   ehci]    25 Ints/s     (max:    25)
 45 [MSI-edge      eth0]     1 Ints/s     (max:     1)

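It computes the number of interrupts per second for each interrupt source, which is much easier to read than the raw counters.

Have fun!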
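The per-second figures can be reproduced by hand: take two snapshots of /proc/interrupts, sum each IRQ across CPUs, and divide the difference by the elapsed time. The sketch below illustrates that idea only; it is not itop's actual code. The names irq_totals and irq_rates are assumptions, BEFORE reuses two rows from the earlier listing, and AFTER is a made-up snapshot imagined five seconds later.

```python
BEFORE = """\
           CPU0       CPU1
 20:       5605       5255   IO-APIC-fasteoi   ehci_hcd:usb2
 45:        337        247   PCI-MSI-edge      eth0
"""

# Hypothetical second snapshot, 5 seconds later (values invented).
AFTER = """\
           CPU0       CPU1
 20:       5670       5315   IO-APIC-fasteoi   ehci_hcd:usb2
 45:        340        249   PCI-MSI-edge      eth0
"""


def irq_totals(text):
    """Sum each IRQ's counters over all CPUs."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # number of CPU columns in the header
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = [int(f) for f in rest.split()[:ncpu] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for each IRQ between two snapshots."""
    old, new = irq_totals(before), irq_totals(after)
    return {irq: (new[irq] - old.get(irq, 0)) / seconds for irq in new}


if __name__ == "__main__":
    for irq, rate in irq_rates(BEFORE, AFTER, 5).items():
        print(f"{irq:>3} {rate:6.1f} Ints/s")
```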


Categories: Linux, Tool Introductions Tags: interrupts, itop, softirqs, vmstat, 中断 (interrupts)

Who is switching our processes on Linux?

October 8th, 2010 Yu Feng 14 comments


When building Linux servers, we often need to know who is causing context switches and why. Context switches are expensive; here are figures measured with LMbench:

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
my174.cm4 Linux 2.6.18- 6.1100 7.0200 6.1100 8.7400 7.7200 8.96000 9.62000

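Even on my fairly high-end server, a context switch costs roughly 6 to 10 us (about 8 us in the typical case). For a high-performance server that is unacceptable, so we want to get as much work done as possible within each time slice instead of wasting time on needless switches.

Curiosity killed the cat, so let's investigate who is switching our processes: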
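To put the roughly 8 us figure in perspective, the direct cost can be estimated as switches per second times cost per switch. The sketch below is only back-of-the-envelope arithmetic: the function name and the example rates are illustrative, and indirect costs such as cache and TLB misses after a switch are not included.

```python
def switch_overhead(switches_per_second, cost_us=8.0):
    """Fraction of one CPU spent purely on context switching.

    cost_us defaults to the ~8 us measured with LMbench above.
    """
    return switches_per_second * cost_us / 1_000_000


if __name__ == "__main__":
    # Hypothetical switch rates, from a quiet box to a heavily contended one.
    for rate in (1_000, 10_000, 100_000):
        print(f"{rate:>7} switches/s -> {switch_overhead(rate):.1%} of one CPU")
```

At a few hundred switches per second the cost is negligible, but at 100,000 switches per second most of one core would be spent on switching alone.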

[root@my174 admin]# dstat 1
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  0   0 100   0   0   0|   0     0 | 796B 1488B|   0     0 |1004   128 
  0   0 100   0   0   0|   0     0 | 280B  728B|   0     0 |1005   114 
  0   0 100   0   0   0|   0     0 | 280B  728B|   0     0 |1005   128 
  0   0 100   0   0   0|   0     0 | 280B  728B|   0     0 |1005   114 
  0   0 100   0   0   0|   0   320k| 280B  728B|   0     0 |1008   143 
...

We can see that csw is around 120 per second, but tools such as dstat or vmstat do not tell us who is responsible. Fine, let's do it ourselves.
Time to bring out our beloved SystemTap!

[root@my174 admin]# cat >cswmon.stp
#! /usr/bin/env stap
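#
# cswmon.stp: count context switches per (previous task, next task) pair
# and print the top 20 pairs every N seconds.
# Usage: stap cswmon.stp <seconds>
#

global csw_count    # [task_prev, task_next] -> number of switches
global idle_count   # accumulated value of the tapset's idle flag

# Fires each time a task is switched off a CPU.
probe scheduler.cpu_off {
  csw_count[task_prev, task_next]++
  idle_count+=idle
}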


function fmt_task(task_prev, task_next)
{
   return sprintf("%s(%d)->%s(%d)",
                                task_execname(task_prev), 
                                task_pid(task_prev), 
                                task_execname(task_next), 
                                task_pid(task_next))
}

function print_cswtop () {
  printf ("%45s %10s\n", "Context switch", "COUNT")
  foreach ([task_prev, task_next] in csw_count- limit 20) {
    printf("%45s %10d\n", fmt_task(task_prev, task_next), csw_count[task_prev, task_next])
  }
  printf("%45s %10d\n", "idle", idle_count)

  delete csw_count
  delete idle_count
}

probe timer.s($1) {
  print_cswtop ()
  printf("--------------------------------------------------------------\n")
}
CTRL+D

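At the configured interval, this script prints the 20 most frequent context switches, with the process names and PIDs on both sides of each switch. Let's look at the result: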

[root@my174 admin]# stap cswmon.stp 5
                               Context switch      COUNT
                swapper(0)->systemtap/11(908)        500
                systemtap/11(908)->swapper(0)        498
                swapper(0)->fct1-worker(2492)         50
                fct1-worker(2492)->swapper(0)         50
                swapper(0)->fct0-worker(2191)         50
                fct0-worker(2191)->swapper(0)         50
                      swapper(0)->bond0(3432)         50
                      bond0(3432)->swapper(0)         50
                      stapio(879)->swapper(0)         26
                      swapper(0)->stapio(879)         25
                      stapio(879)->swapper(0)         19
                      swapper(0)->stapio(879)         17
                   swapper(0)->watchdog/9(31)          5
                   watchdog/9(31)->swapper(0)          5
                    swapper(0)->mysqld(18346)          5
                    mysqld(18346)->swapper(0)          5
                  swapper(0)->watchdog/13(43)          5
                  watchdog/13(43)->swapper(0)          5
                  swapper(0)->watchdog/14(46)          5
                  watchdog/14(46)->swapper(0)          5
                                         idle        859
--------------------------------------------------------------
...

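We can see which process was switched out, which one was switched in, and how many times that happened. The last line prints the idle count, that is, how often the system had nothing to do and switched to the idle(0) process to rest. Rows that look identical, such as the two stapio(879) pairs, are most likely different threads of the same process, because the table is keyed by task rather than by PID.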
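When the report gets long, it can help to fold it into a per-process total of how often each task was switched in. The following sketch parses lines in the format printed above; the function name switched_in and the regular expression are assumptions based on that sample output, and REPORT is a shortened copy of it.

```python
import re
from collections import Counter

# Matches report lines such as "swapper(0)->stapio(879)   25".
LINE = re.compile(r"^\s*(.+?)\((\d+)\)->(.+?)\((\d+)\)\s+(\d+)\s*$")

REPORT = """\
                               Context switch      COUNT
                swapper(0)->systemtap/11(908)        500
                systemtap/11(908)->swapper(0)        498
                      swapper(0)->stapio(879)         25
                      swapper(0)->stapio(879)         17
                                         idle        859
"""


def switched_in(report):
    """Sum the switch counts per (name, pid) of the task being switched in."""
    totals = Counter()
    for line in report.splitlines():
        match = LINE.match(line)
        if match:  # the header and the trailing idle line do not match
            name, pid, count = match.group(3), int(match.group(4)), int(match.group(5))
            totals[(name, pid)] += count
    return totals


if __name__ == "__main__":
    for (name, pid), count in switched_in(REPORT).most_common():
        print(f"{name}({pid}): switched in {count} times")
```

Grouping by name and PID also merges rows that the script prints separately for what appear to be different threads of one process.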

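With an investigation like this, we get a clear picture of where the system's overhead comes from, which makes it easier to pinpoint problems.
Have fun!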


Categories: Linux Tags: context switch, csw, dstat, lmbench, stap, vmstat, 上下文切换 (context switch)
