diff options
Diffstat (limited to 'Notes/dtruss_notes.txt')
-rw-r--r-- | Notes/dtruss_notes.txt | 97 |
1 files changed, 97 insertions, 0 deletions
diff --git a/Notes/dtruss_notes.txt b/Notes/dtruss_notes.txt new file mode 100644 index 000000000000..8ecbecf864bc --- /dev/null +++ b/Notes/dtruss_notes.txt @@ -0,0 +1,97 @@ +************************************************************************** +* The following are additional notes on the dtruss program. +* +* $Id: dtruss_notes.txt 44 2007-09-17 07:47:20Z brendan $ +* +* COPYRIGHT: Copyright (c) 2007 Brendan Gregg. +************************************************************************** + + +* Can I trust the elapsed and on-cpu times? + +Firstly, lets see dtruss do something cool, + + # dtruss -eo -n bash + PID/LWP ELAPSD CPU SYSCALL(args) = return + 6215/1: 57 37 write(0x2, "h\0", 0x1) = 1 0 + 6215/1: 357210 45 read(0x0, "e\0", 0x1) = 1 0 + 6215/1: 53 37 write(0x2, "e\0", 0x1) = 1 0 + 6215/1: 359510 46 read(0x0, "l\0", 0x1) = 1 0 + 6215/1: 57 42 write(0x2, "l\0", 0x1) = 1 0 + 6215/1: 166495 47 read(0x0, "l\0", 0x1) = 1 0 + 6215/1: 56 40 write(0x2, "l\0", 0x1) = 1 0 + 6215/1: 346076 44 read(0x0, "o\0", 0x1) = 1 0 + 6215/1: 54 38 write(0x2, "o\0", 0x1) = 1 0 + 6215/1: 349852 45 read(0x0, " \0", 0x1) = 1 0 + 6215/1: 54 39 write(0x2, " \0", 0x1) = 1 0 + +In the above, the slow elapsed times for reads are due to the process context +switching off the CPU while we wait for the next keystroke. For example, +the second line shows an on-CPU time of 45 us and an elapsed time of 357210 us. +In fact, the elapsed times are equal to the inter-keystroke delays. + + +What about the writes? Their elapsed times are longer than the on-CPU times +also. Did we context switch off for them too? ... Lets run a different demo, + + # dtruss -eo date + ELAPSD CPU SYSCALL(args) = return + Mon Jul 25 21:41:40 EST 2005 + 44 23 resolvepath("/usr/bin/date\0", 0x80476CC, 0x3FF) = 13 0 + 10 1 sysconfig(0x6, 0xB25A1, 0xFEC1D444) = 4096 0 + 36 28 resolvepath("/usr/lib/ld.so.1\0", 0x80476CC, 0x3FF) = 12 0 + 18 9 xstat(0x2, 0x8047FEB, 0x8047AF8) = 0 0 + 25 16 open("/var/ld/ld.config\0", 0x0, 0x0) = -1 Err#2 + 27 18 xstat(0x2, 0xD27FBF38, 0x80473B0) = 0 0 + 17 9 resolvepath("/lib/libc.so.1\0", 0x8047438, 0x3FF) = 14 0 + 21 13 open("/lib/libc.so.1\0", 0x0, 0x0) = 3 0 + 30 22 mmap(0x10000, 0x1000, 0x5) = -763559936 0 + 15 6 mmap(0x10000, 0xCE000, 0x0) = -764411904 0 + 24 16 mmap(0xD2700000, 0xB5A45, 0x5) = -764411904 0 + 21 12 mmap(0xD27C6000, 0x5EB3, 0x3) = -763600896 0 + 18 9 mmap(0xD27CC000, 0x15C0, 0x3) = -763576320 0 + 14 5 munmap(0xD27B6000, 0x10000) = 0 0 + 186 176 memcntl(0xD2700000, 0x1B8D8, 0x4) = 0 0 + 17 7 close(0x3) = 0 0 + [...] + +For every syscall, the elapsed time is around 10 us (microseconds) slower +than the on-cpu time. These aren't micro context switches, this is due to +DTrace slowing down the program! The more closely we measure something the +more we effect it. (See Heisenberg's uncertainty principle). + +Ok, so for the above output we can tell that each elapsed time is around 10 us +longer than it should be. That's fine, since it's fairly consistant and not +a huge difference. This is an x86 server with a 867 MHz CPU. + + +Now lets try the same on an Ultra 5 with a 360 MHz CPU, + + # dtruss -eo date + ELAPSD CPU SYSCALL(args) = return + 216 142 resolvepath("/usr/bin/date\0", 0xFFBFF338, 0x3FF) = 13 0 + 234 187 resolvepath("/usr/lib/ld.so.1\0", 0xFFBFF338, 0x3FF) = 12 0 + 113 67 stat("/usr/bin/date\0", 0xFFBFF818, 0xFFBFFFEB) = 0 0 + 136 90 open("/var/ld/ld.config\0", 0x0, 0x0) = -1 Err#2 + 107 61 stat("/opt/onbld/lib/libc.so.1\0", 0xFFBFF330, 0xFFBFF55C) = -1 Err#2 + 98 54 stat("/opt/SUNWspro/lib/libc.so.1\0", 0xFFBFF330, 0xFFBFF55C) = -1 Err#2 + 96 53 stat("/opt/SUNWmlib/lib/libc.so.1\0", 0xFFBFF330, 0xFFBFF55C) = -1 Err#2 + 97 54 stat("/usr/sfw/lib/libc.so.1\0", 0xFFBFF330, 0xFFBFF55C) = -1 Err#2 + 96 53 stat("/lib/libc.so.1\0", 0xFFBFF330, 0xFFBFF55C) = 0 0 + 134 92 resolvepath("/lib/libc.so.1\0", 0xFFBFEF30, 0x3FF) = 14 0 + 109 69 open("/lib/libc.so.1\0", 0x0, 0x0) = 3 0 + 177 132 mmap(0x10000, 0x2000, 0x5) = -12976128 0 + [...] + +Now the time difference is around 40 us, and fairly consistant. + + +This difference is find so long as we bear it in mind. Or, run DTrace +on faster servers where the difference is much less. + + + +* The output appears shuffled? + +Read the answer to this in ALLsnoop_notes.txt. + |