Diffstat (limited to 'lib/libpmc/pmc.k8.3')
-rw-r--r-- | lib/libpmc/pmc.k8.3 | 703 |
1 files changed, 703 insertions, 0 deletions
diff --git a/lib/libpmc/pmc.k8.3 b/lib/libpmc/pmc.k8.3 new file mode 100644 index 000000000000..229f2a1ca8f2 --- /dev/null +++ b/lib/libpmc/pmc.k8.3 @@ -0,0 +1,703 @@ +.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" This software is provided by Joseph Koshy ``as is'' and +.\" any express or implied warranties, including, but not limited to, the +.\" implied warranties of merchantability and fitness for a particular purpose +.\" are disclaimed. in no event shall Joseph Koshy be liable +.\" for any direct, indirect, incidental, special, exemplary, or consequential +.\" damages (including, but not limited to, procurement of substitute goods +.\" or services; loss of use, data, or profits; or business interruption) +.\" however caused and on any theory of liability, whether in contract, strict +.\" liability, or tort (including negligence or otherwise) arising in any way +.\" out of the use of this software, even if advised of the possibility of +.\" such damage. +.\" +.\" $FreeBSD$ +.\" +.Dd September 17, 2008 +.Os +.Dt PMC.K8 3 +.Sh NAME +.Nm pmc.k8 +.Nd measurement events for +.Tn AMD +.Tn Athlon 64 +(K8 family) CPUs +.Sh LIBRARY +.Lb libpmc +.Sh SYNOPSIS +.In pmc.h +.Sh DESCRIPTION +AMD K8 PMCs are present in the +.Tn "AMD Athlon64" +and +.Tn "AMD Opteron" +series of CPUs. +They are documented in the +.Rs +.%B "BIOS and Kernel Developer's Guide for the AMD Athlon(tm) 64 and AMD Opteron Processors" +.%N "Publication No. 
26094" +.%D "April 2004" +.%Q "Advanced Micro Devices, Inc." +.Re +.Ss PMC Features +AMD K8 PMCs are 48 bits wide. +Each CPU contains 4 PMCs with the following capabilities: +.Bl -column "PMC_CAP_INTERRUPT" "Support" +.It Em Capability Ta Em Support +.It PMC_CAP_CASCADE Ta \&No +.It PMC_CAP_EDGE Ta Yes +.It PMC_CAP_INTERRUPT Ta Yes +.It PMC_CAP_INVERT Ta Yes +.It PMC_CAP_READ Ta Yes +.It PMC_CAP_PRECISE Ta \&No +.It PMC_CAP_SYSTEM Ta Yes +.It PMC_CAP_TAGGING Ta \&No +.It PMC_CAP_THRESHOLD Ta Yes +.It PMC_CAP_USER Ta Yes +.It PMC_CAP_WRITE Ta Yes +.El +.Ss Event Qualifiers +.Pp +Event specifiers for AMD K8 PMCs can have the following optional +qualifiers: +.Bl -tag -width indent +.It Li count= Ns Ar value +Configure the counter to increment only if the number of configured +events measured in a cycle is greater than or equal to +.Ar value . +.It Li edge +Configure the counter to only count negated-to-asserted transitions +of the conditions expressed by the other fields. +In other words, the counter will increment only once whenever a given +condition becomes true, irrespective of the number of clocks during +which the condition remains true. +.It Li inv +Invert the sense of comparison when the +.Dq Li count +qualifier is present, making the counter increment when the +number of events per cycle is less than the value specified by +the +.Dq Li count +qualifier. +.It Li mask= Ns Ar qualifier +Many event specifiers for AMD K8 PMCs need to be additionally +qualified using a mask qualifier. +These additional qualifiers are event-specific and are documented +along with their associated event specifiers below. +.It Li os +Configure the PMC to count events happening at privilege level 0. +.It Li usr +Configure the PMC to count events occurring at privilege levels 1, 2 +or 3. +.El +.Pp +If neither of the +.Dq Li os +or +.Dq Li usr +qualifiers is specified, the default is to enable both.
+.Ss AMD K8 Event Specifiers +The event specifiers supported on AMD K8 PMCs are: +.Bl -tag -width indent +.It Li k8-bu-cpu-clk-unhalted +Count the number of clock cycles when the CPU is not in the HLT or +STPCLK states. +.It Li k8-bu-fill-request-l2-miss Op Li ,mask= Ns Ar qualifier +Count fill requests that missed in the L2 cache. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li dc-fill +Count data cache fill requests. +.It Li ic-fill +Count instruction cache fill requests. +.It Li tlb-reload +Count TLB reloads. +.El +.Pp +The default is to count all types of requests. +.It Li k8-bu-internal-l2-request Op Li ,mask= Ns Ar qualifier +Count internally generated requests to the L2 cache. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li cancelled +Count cancelled requests. +.It Li dc-fill +Count data cache fill requests. +.It Li ic-fill +Count instruction cache fill requests. +.It Li tag-snoop +Count tag snoop requests. +.It Li tlb-reload +Count TLB reloads. +.El +.Pp +The default is to count all types of requests. +.It Li k8-dc-access +Count data cache accesses including microcode scratchpad accesses. +.It Li k8-dc-copyback Op Li ,mask= Ns Ar qualifier +Count data cache copyback operations. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li exclusive +Count operations for lines in the +.Dq exclusive +state. +.It Li invalid +Count operations for lines in the +.Dq invalid +state. +.It Li modified +Count operations for lines in the +.Dq modified +state. +.It Li owner +Count operations for lines in the +.Dq owner +state. +.It Li shared +Count operations for lines in the +.Dq shared +state. 
+.El +.Pp +The default is to count operations for lines in all the +above states. +.It Li k8-dc-dcache-accesses-by-locks Op Li ,mask= Ns Ar qualifier +Count data cache accesses by lock instructions. +This event is only available on processors of revision C or later +vintage. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li accesses +Count data cache accesses by lock instructions. +.It Li misses +Count data cache misses by lock instructions. +.El +.Pp +The default is to count all accesses. +.It Li k8-dc-dispatched-prefetch-instructions Op Li ,mask= Ns Ar qualifier +Count the number of dispatched prefetch instructions. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li load +Count load operations. +.It Li nta +Count non-temporal operations. +.It Li store +Count store operations. +.El +.Pp +The default is to count all operations. +.It Li k8-dc-l1-dtlb-miss-and-l2-dtlb-hit +Count L1 DTLB misses that are L2 DTLB hits. +.It Li k8-dc-l1-dtlb-miss-and-l2-dtlb-miss +Count L1 DTLB misses that are also misses in the L2 DTLB. +.It Li k8-dc-microarchitectural-early-cancel-of-an-access +Count microarchitectural early cancels of data cache accesses. +.It Li k8-dc-microarchitectural-late-cancel-of-an-access +Count microarchitectural late cancels of data cache accesses. +.It Li k8-dc-misaligned-data-reference +Count misaligned data references. +.It Li k8-dc-miss +Count data cache misses. +.It Li k8-dc-one-bit-ecc-error Op Li ,mask= Ns Ar qualifier +Count one bit ECC errors found by the scrubber. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li scrubber +Count scrubber detected errors. +.It Li piggyback +Count piggyback scrubber errors. 
+.El +.Pp +The default is to count both kinds of errors. +.It Li k8-dc-refill-from-l2 Op Li ,mask= Ns Ar qualifier +Count data cache refills from L2 cache. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li exclusive +Count operations for lines in the +.Dq exclusive +state. +.It Li invalid +Count operations for lines in the +.Dq invalid +state. +.It Li modified +Count operations for lines in the +.Dq modified +state. +.It Li owner +Count operations for lines in the +.Dq owner +state. +.It Li shared +Count operations for lines in the +.Dq shared +state. +.El +.Pp +The default is to count operations for lines in all the +above states. +.It Li k8-dc-refill-from-system Op Li ,mask= Ns Ar qualifier +Count data cache refills from system memory. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li exclusive +Count operations for lines in the +.Dq exclusive +state. +.It Li invalid +Count operations for lines in the +.Dq invalid +state. +.It Li modified +Count operations for lines in the +.Dq modified +state. +.It Li owner +Count operations for lines in the +.Dq owner +state. +.It Li shared +Count operations for lines in the +.Dq shared +state. +.El +.Pp +The default is to count operations for lines in all the +above states. +.It Li k8-fp-dispatched-fpu-ops Op Li ,mask= Ns Ar qualifier +Count the number of dispatched FPU ops. +This event is supported in revision B and later CPUs. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li add-pipe-excluding-junk-ops +Count add pipe ops excluding junk ops. +.It Li add-pipe-junk-ops +Count junk ops in the add pipe. 
+.It Li multiply-pipe-excluding-junk-ops +Count multiply pipe ops excluding junk ops. +.It Li multiply-pipe-junk-ops +Count junk ops in the multiply pipe. +.It Li store-pipe-excluding-junk-ops +Count store pipe ops excluding junk ops +.It Li store-pipe-junk-ops +Count junk ops in the store pipe. +.El +.Pp +The default is to count all types of ops. +.It Li k8-fp-cycles-with-no-fpu-ops-retired +Count cycles when no FPU ops were retired. +This event is supported in revision B and later CPUs. +.It Li k8-fp-dispatched-fpu-fast-flag-ops +Count dispatched FPU ops that use the fast flag interface. +This event is supported in revision B and later CPUs. +.It Li k8-fr-decoder-empty +Count cycles when there was nothing to dispatch (i.e., the decoder +was empty). +.It Li k8-fr-dispatch-stalls +Count all dispatch stalls. +.It Li k8-fr-dispatch-stall-for-segment-load +Count dispatch stalls for segment loads. +.It Li k8-fr-dispatch-stall-for-serialization +Count dispatch stalls for serialization. +.It Li k8-fr-dispatch-stall-from-branch-abort-to-retire +Count dispatch stalls from branch abort to retiral. +.It Li k8-fr-dispatch-stall-when-fpu-is-full +Count dispatch stalls when the FPU is full. +.It Li k8-fr-dispatch-stall-when-ls-is-full +Count dispatch stalls when the load/store unit is full. +.It Li k8-fr-dispatch-stall-when-reorder-buffer-is-full +Count dispatch stalls when the reorder buffer is full. +.It Li k8-fr-dispatch-stall-when-reservation-stations-are-full +Count dispatch stalls when reservation stations are full. +.It Li k8-fr-dispatch-stall-when-waiting-for-all-to-be-quiet +Count dispatch stalls when waiting for all to be quiet. +.\" XXX What does "waiting for all to be quiet" mean? +.It Li k8-fr-dispatch-stall-when-waiting-far-xfer-or-resync-branch-pending +Count dispatch stalls when a far control transfer or a resync branch +is pending. +.It Li k8-fr-fpu-exceptions Op Li ,mask= Ns Ar qualifier +Count FPU exceptions. 
+This event is supported in revision B and later CPUs. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li sse-and-x87-microtraps +Count SSE and x87 microtraps. +.It Li sse-reclass-microfaults +Count SSE reclass microfaults +.It Li sse-retype-microfaults +Count SSE retype microfaults +.It Li x87-reclass-microfaults +Count x87 reclass microfaults. +.El +.Pp +The default is to count all types of exceptions. +.It Li k8-fr-interrupts-masked-cycles +Count cycles when interrupts were masked (by CPU RFLAGS field IF was zero). +.It Li k8-fr-interrupts-masked-while-pending-cycles +Count cycles while interrupts were masked while pending (i.e., cycles +when INTR was asserted while CPU RFLAGS field IF was zero). +.It Li k8-fr-number-of-breakpoints-for-dr0 +Count the number of breakpoints for DR0. +.It Li k8-fr-number-of-breakpoints-for-dr1 +Count the number of breakpoints for DR1. +.It Li k8-fr-number-of-breakpoints-for-dr2 +Count the number of breakpoints for DR2. +.It Li k8-fr-number-of-breakpoints-for-dr3 +Count the number of breakpoints for DR3. +.It Li k8-fr-retired-branches +Count retired branches including exceptions and interrupts. +.It Li k8-fr-retired-branches-mispredicted +Count mispredicted retired branches. +.It Li k8-fr-retired-far-control-transfers +Count retired far control transfers (which are always mispredicted). +.It Li k8-fr-retired-fastpath-double-op-instructions Op Li ,mask= Ns Ar qualifier +Count retired fastpath double op instructions. +This event is supported in revision B and later CPUs. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li low-op-pos-0 +Count instructions with the low op in position 0. +.It Li low-op-pos-1 +Count instructions with the low op in position 1. 
+.It Li low-op-pos-2 +Count instructions with the low op in position 2. +.El +.Pp +The default is to count all types of instructions. +.It Li k8-fr-retired-fpu-instructions Op Li ,mask= Ns Ar qualifier +Count retired FPU instructions. +This event is supported in revision B and later CPUs. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li mmx-3dnow +Count MMX and 3DNow!\& instructions. +.It Li packed-sse-sse2 +Count packed SSE and SSE2 instructions. +.It Li scalar-sse-sse2 +Count scalar SSE and SSE2 instructions +.It Li x87 +Count x87 instructions. +.El +.Pp +The default is to count all types of instructions. +.It Li k8-fr-retired-near-returns +Count retired near returns. +.It Li k8-fr-retired-near-returns-mispredicted +Count mispredicted near returns. +.It Li k8-fr-retired-resyncs +Count retired resyncs (non-control transfer branches). +.It Li k8-fr-retired-taken-hardware-interrupts +Count retired taken hardware interrupts. +.It Li k8-fr-retired-taken-branches +Count retired taken branches. +.It Li k8-fr-retired-taken-branches-mispredicted +Count retired taken branches that were mispredicted. +.It Li k8-fr-retired-taken-branches-mispredicted-by-addr-miscompare +Count retired taken branches that were mispredicted only due to an +address miscompare. +.It Li k8-fr-retired-uops +Count retired uops. +.It Li k8-fr-retired-x86-instructions +Count retired x86 instructions including exceptions and interrupts. +.It Li k8-ic-fetch +Count instruction cache fetches. +.It Li k8-ic-instruction-fetch-stall +Count cycles in stalls due to instruction fetch. +.It Li k8-ic-l1-itlb-miss-and-l2-itlb-hit +Count L1 ITLB misses that are L2 ITLB hits. +.It Li k8-ic-l1-itlb-miss-and-l2-itlb-miss +Count ITLB misses that miss in both L1 and L2 ITLBs. +.It Li k8-ic-microarchitectural-resync-by-snoop +Count microarchitectural resyncs caused by snoops. 
+.It Li k8-ic-miss +Count instruction cache misses. +.It Li k8-ic-refill-from-l2 +Count instruction cache refills from L2 cache. +.It Li k8-ic-refill-from-system +Count instruction cache refills from system memory. +.It Li k8-ic-return-stack-hits +Count hits to the return stack. +.It Li k8-ic-return-stack-overflow +Count overflows of the return stack. +.It Li k8-ls-buffer2-full +Count load/store buffer2 full events. +.It Li k8-ls-locked-operation Op Li ,mask= Ns Ar qualifier +Count locked operations. +For revision C and later CPUs, the following qualifiers are supported: +.Pp +.Bl -tag -width indent -compact +.It Li cycles-in-request +Count the number of cycles in the lock request/grant stage. +.It Li cycles-to-complete +Count the number of cycles a lock takes to complete once it is +non-speculative and is the older load/store operation. +.It Li locked-instructions +Count the number of lock instructions executed. +.El +.Pp +The default is to count the number of lock instructions executed. +.It Li k8-ls-microarchitectural-late-cancel +Count microarchitectural late cancels of operations in the load/store +unit. +.It Li k8-ls-microarchitectural-resync-by-self-modifying-code +Count microarchitectural resyncs caused by self-modifying code. +.It Li k8-ls-microarchitectural-resync-by-snoop +Count microarchitectural resyncs caused by snoops. +.It Li k8-ls-retired-cflush-instructions +Count retired CFLUSH instructions. +.It Li k8-ls-retired-cpuid-instructions +Count retired CPUID instructions. +.It Li k8-ls-segment-register-load Op Li ,mask= Ns Ar qualifier +Count segment register loads. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Bl -tag -width indent -compact +.It Li cs +Count CS register loads. +.It Li ds +Count DS register loads. +.It Li es +Count ES register loads. +.It Li fs +Count FS register loads. +.It Li gs +Count GS register loads. +.\" .It Li hs +.\" Count HS register loads. 
+.\" XXX "HS" register? +.It Li ss +Count SS register loads. +.El +.Pp +The default is to count all types of loads. +.It Li k8-nb-memory-controller-bypass-saturation Op Li ,mask= Ns Ar qualifier +Count memory controller bypass counter saturation events. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li dram-controller-interface-bypass +Count DRAM controller interface bypass. +.It Li dram-controller-queue-bypass +Count DRAM controller queue bypass. +.It Li memory-controller-hi-pri-bypass +Count memory controller high priority bypasses. +.It Li memory-controller-lo-pri-bypass +Count memory controller low priority bypasses. +.El +.Pp +.It Li k8-nb-memory-controller-dram-slots-missed +Count memory controller DRAM command slots missed (in MemClks). +.It Li k8-nb-memory-controller-page-access-event Op Li ,mask= Ns Ar qualifier +Count memory controller page access events. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li page-conflict +Count page conflicts. +.It Li page-hit +Count page hits. +.It Li page-miss +Count page misses. +.El +.Pp +The default is to count all types of events. +.It Li k8-nb-memory-controller-page-table-overflow +Count memory control page table overflow events. +.It Li k8-nb-probe-result Op Li ,mask= Ns Ar qualifier +Count probe events. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li probe-hit +Count all probe hits. +.It Li probe-hit-dirty-no-memory-cancel +Count probe hits without memory cancels. +.It Li probe-hit-dirty-with-memory-cancel +Count probe hits with memory cancels. +.It Li probe-miss +Count probe misses. 
+.El +.It Li k8-nb-sized-commands Op Li ,mask= Ns Ar qualifier +Count sized commands issued. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li nonpostwrszbyte +.It Li nonpostwrszdword +.It Li postwrszbyte +.It Li postwrszdword +.It Li rdszbyte +.It Li rdszdword +.It Li rdmodwr +.El +.Pp +The default is to count all types of commands. +.It Li k8-nb-memory-controller-turnaround Op Li ,mask= Ns Ar qualifier +Count memory control turnaround events. +This event may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.\" XXX doc is unclear whether these are cycle counts or event counts +.It Li dimm-turnaround +Count DIMM turnarounds. +.It Li read-to-write-turnaround +Count read to write turnarounds. +.It Li write-to-read-turnaround +Count write to read turnarounds. +.El +.Pp +The default is to count all types of events. +.It Li k8-nb-ht-bus0-bandwidth Op Li ,mask= Ns Ar qualifier +.It Li k8-nb-ht-bus1-bandwidth Op Li ,mask= Ns Ar qualifier +.It Li k8-nb-ht-bus2-bandwidth Op Li ,mask= Ns Ar qualifier +Count events on the HyperTransport(tm) buses. +These events may be further qualified using +.Ar qualifier , +which is a +.Ql + +separated set of the following keywords: +.Pp +.Bl -tag -width indent -compact +.It Li buffer-release +Count buffer release messages sent. +.It Li command +Count command messages sent. +.It Li data +Count data messages sent. +.It Li nop +Count nop messages sent. +.El +.Pp +The default is to count all types of messages. +.El +.Ss Event Name Aliases +The following table shows the mapping between the PMC-independent +aliases supported by +.Lb libpmc +and the underlying hardware events used. 
+.Bl -column "branch-mispredicts" "Description" +.It Em Alias Ta Em Event +.It Li branches Ta Li k8-fr-retired-taken-branches +.It Li branch-mispredicts Ta Li k8-fr-retired-taken-branches-mispredicted +.It Li dc-misses Ta Li k8-dc-miss +.It Li ic-misses Ta Li k8-ic-miss +.It Li instructions Ta Li k8-fr-retired-x86-instructions +.It Li interrupts Ta Li k8-fr-taken-hardware-interrupts +.It Li unhalted-cycles Ta Li k8-bu-cpu-clk-unhalted +.El +.Sh SEE ALSO +.Xr pmc 3 , +.Xr pmc.k7 3 , +.Xr pmc.p4 3 , +.Xr pmc.p5 3 , +.Xr pmc.p6 3 , +.Xr pmc.tsc 3 , +.Xr pmclog 3 , +.Xr hwpmc 4 +.Sh HISTORY +The +.Nm pmc +library first appeared in +.Fx 6.0 . +.Sh AUTHORS +The +.Lb libpmc +library was written by +.An "Joseph Koshy" +.Aq jkoshy@FreeBSD.org .
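
The event specifier syntax documented in the man page above (an event name followed by optional qualifiers, with mask keywords joined by `+`) can be illustrated with a small sketch. This is not part of the commit or of libpmc: `build_k8_event_spec` is a hypothetical helper, and it assumes that all qualifiers are joined with commas, as the `,mask=qualifier` notation in the page suggests. It does not check that an event or mask keyword actually exists.

```python
def build_k8_event_spec(event, mask=(), count=None, edge=False, inv=False,
                        os=False, usr=False):
    """Assemble an event specifier string from the documented qualifiers.

    Hypothetical helper for illustration only; it performs no validation
    of the event name or the mask keywords.
    """
    # The man page describes 'inv' only in combination with 'count=',
    # so this sketch chooses to reject 'inv' on its own.
    if inv and count is None:
        raise ValueError("'inv' is only meaningful together with 'count='")

    parts = [event]
    if mask:
        # Mask keywords form a '+' separated set.
        parts.append("mask=" + "+".join(mask))
    if count is not None:
        parts.append(f"count={count}")
    if edge:
        parts.append("edge")
    if inv:
        parts.append("inv")
    # If neither 'os' nor 'usr' is given, the documented default is to
    # count at all privilege levels, so nothing needs to be appended.
    if os:
        parts.append("os")
    if usr:
        parts.append("usr")
    # Assumption: qualifiers are comma separated.
    return ",".join(parts)


if __name__ == "__main__":
    print(build_k8_event_spec("k8-dc-copyback",
                              mask=["modified", "owner"], usr=True))
```

A string of this shape is what a caller would pass as the counter specification when allocating a PMC; see pmc(3) for the actual allocation interface.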