Previous: Deterministic Profiler, Up: Profiling


14.2 Statistical Profiler

The sb-sprof module, loadable by

     (require :sb-sprof)

provides an alternate profiler which works by taking samples of the program execution at regular intervals, instead of instrumenting functions like sb-profile:profile does. You might find sb-sprof more useful than the deterministic profiler when profiling functions in the common-lisp-package, SBCL internals, or code where the instrumenting overhead is excessive.

14.2.1 Example Usage

     (require :sb-sprof)
     
     (declaim (optimize speed))
     
     (defun cpu-test (n)
       (let ((a 0))
         (dotimes (i (expt 2 n) a)
           (setf a (logxor a
                           (* i 5)
                           (+ a i))))))
     
     ;;;; CPU profiling
     
     ;;; Take up to 1000 samples of running (CPU-TEST 26), and give a flat
     ;;; table report at the end. Profiling will end one the body has been
     ;;; evaluated once, whether or not 1000 samples have been taken.
     (sb-sprof:with-profiling (:max-samples 1000
                               :report :flat
                               :loop nil)
       (cpu-test 26))
     
     ;;; Take 1000 samples of running (CPU-TEST 24), and give a flat
     ;;; table report at the end. The body will be re-evaluated in a loop
     ;;; until 1000 samples have been taken. A sample count will be printed
     ;;; after each iteration.
     (sb-sprof:with-profiling (:max-samples 1000
                               :report :flat
                               :loop t
                               :show-progress t)
       (cpu-test 24))
     
     ;;;; Allocation profiling
     
     (defun foo (&rest args)
       (mapcar (lambda (x) (float x 1d0)) args))
     
     (defun bar (n)
       (declare (fixnum n))
       (apply #'foo (loop repeat n collect n)))
     
     (sb-sprof:with-profiling (:max-samples 10000
                               :mode :alloc
                               :report :flat)
       (bar 1000))

14.2.2 Output

The flat report format will show a table of all functions that the profiler encountered on the call stack during sampling, ordered by the number of samples taken while executing that function.

                Self        Total        Cumul
       Nr  Count     %  Count     %  Count     % Function
     ------------------------------------------------------------------------
        1    165  38.3    165  38.3    165  38.3 SB-KERNEL:TWO-ARG-XOR
        2    141  32.7    141  32.7    306  71.0 SB-VM::GENERIC-+
        3     67  15.5    145  33.6    373  86.5 CPU-TEST-2

For each function, the table will show three absolute and relative sample counts. The Self column shows samples taken while directly executing that function. The Total column shows samples taken while executing that function or functions called from it (sampled to a platform-specific depth). The Cumul column shows the sum of all Self columns up to and including that line in the table.

The profiler also hooks into the disassembler such that instructions which have been sampled are annotated with their relative frequency of sampling. This information is not stored across different sampling runs.

     ;      6CF:       702E             JO L4              ; 6/242 samples
     ;      6D1:       D1E3             SHL EBX, 1
     ;      6D3:       702A             JO L4
     ;      6D5: L2:   F6C303           TEST BL, 3         ; 2/242 samples
     ;      6D8:       756D             JNE L8
     ;      6DA:       8BC3             MOV EAX, EBX       ; 5/242 samples
     ;      6DC: L3:   83F900           CMP ECX, 0         ; 4/242 samples

14.2.3 Platform support

This module is known not to work consistently on the Alpha platform, for technical reasons related to the implementation of a machine language idiom for marking sections of code to be treated as atomic by the garbage collector; However, it should work on other platforms, and the deficiency on the Alpha will eventually be rectified.

Allocation profiling is only supported on SBCL builds that use the generational garbage collector. Tracking of call stacks at a depth of more than two levels is only supported on x86 and x86-64.

14.2.4 Macros

— Macro: sb-sprof:with-profiling (&key sample-interval alloc-interval max-samples reset mode loop show-progress report) &body body

Repeatedly evaluate body with statistical profiling turned on. In multi-threaded operation, only the thread in which with-profiling was evaluated will be profiled by default. If you want to profile multiple threads, invoke the profiler with start-profiling.

The following keyword args are recognized:

:sample-interval <n>
Take a sample every <n> seconds. Default is *sample-interval*.
:alloc-interval <n>
Take a sample every time <n> allocation regions (approximately 8kB) have been allocated since the last sample. Default is *alloc-interval*.
:mode <mode>
If :cpu, run the profiler in cpu profiling mode. If :alloc, run the profiler in allocation profiling mode.
:max-samples <max>
Repeat evaluating body until <max> samples are taken. Default is *max-samples*.
:report <type>
If specified, call report with :type <type> at the end.
:reset <bool>
It true, call reset at the beginning.
:loop <bool>
If true (the default) repeatedly evaluate body. If false, evaluate if only once.

— Macro: sb-sprof:with-sampling (&optional on) &body body

Evaluate body with statistical sampling turned on or off.

14.2.5 Functions

— Function: sb-sprof:report &key type max min-percent call-graph stream show-progress

Report statistical profiling results. The following keyword args are recognized:

:type <type>
Specifies the type of report to generate. If :flat, show flat report, if :graph show a call graph and a flat report. If nil, don't print out a report.
:stream <stream>
Specify a stream to print the report on. Default is *standard-output*.
:max <max>
Don't show more than <max> entries in the flat report.
:min-percent <min-percent>
Don't show functions taking less than <min-percent> of the total time in the flat report.
:show-progress <bool>
If true, print progress messages while generating the call graph.
:call-graph <graph>
Print a report from <graph> instead of the latest profiling results.

Value of this function is a call-graph object representing the resulting call-graph.

— Function: sb-sprof:reset

Reset the profiler.

— Function: sb-sprof:start-profiling &key max-samples mode sample-interval alloc-interval sampling

Start profiling statistically if not already profiling. The following keyword args are recognized:

:sample-interval <n>
Take a sample every <n> seconds. Default is *sample-interval*.
:alloc-interval <n>
Take a sample every time <n> allocation regions (approximately 8kB) have been allocated since the last sample. Default is *alloc-interval*.
:mode <mode>
If :cpu, run the profiler in cpu profiling mode. If :alloc, run the profiler in allocation profiling mode.
:max-samples <max>
Maximum number of samples. Default is *max-samples*.
:sampling <bool>
If true, the default, start sampling right away. If false, start-sampling can be used to turn sampling on.

— Function: sb-sprof:stop-profiling

Stop profiling if profiling.

14.2.6 Variables

— Variable: sb-sprof:*max-samples*

Default number of samples taken.

— Variable: sb-sprof:*sample-interval*

Default number of seconds between samples.

14.2.7 Credits

sb-sprof is an SBCL port, with enhancements, of Gerd Moellmann's statistical profiler for CMUCL.