<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>

<HEAD>
  <link rel="stylesheet" href="designstyle.css">
  <title>Gperftools CPU Profiler</title>
</HEAD>

<BODY>

<p align=right>
  <i>Last modified
  <script type=text/javascript>
    var lm = new Date(document.lastModified);
    document.write(lm.toDateString());
  </script></i>
</p>

<p>This is the CPU profiler we use at Google.  There are three parts
to using it: linking the library into an application, running the
code, and analyzing the output.</p>

<p>On the off-chance that you should need to understand it, the CPU
profiler data file format is documented separately,
<a href="cpuprofile-fileformat.html">here</a>.</p>


<H1>Linking in the Library</H1>

<p>To install the CPU profiler into your executable, add
<code>-lprofiler</code> to the link-time step for your executable.
(It may also be possible to add the profiler at run time using
<code>LD_PRELOAD</code>, e.g.
<code>% env LD_PRELOAD="/usr/lib/libprofiler.so" &lt;binary&gt;</code>,
but this isn't necessarily recommended.)</p>

<p>This does <i>not</i> turn on CPU profiling; it just inserts the
code.  For that reason, it's practical to just always link
<code>-lprofiler</code> into a binary while developing; that's what we
do at Google.  (However, since any user can turn on the profiler by
setting an environment variable, it's not necessarily recommended to
install profiler-linked binaries into a production, running
system.)</p>


<H1>Running the Code</H1>

<p>There are several alternatives to actually turn on CPU profiling
for a given run of an executable:</p>

<ol>
  <li> <p>Define the environment variable CPUPROFILE to the filename
       to dump the profile to.  For instance, if you had a version of
       <code>/bin/ls</code> that had been linked against libprofiler,
       you could run:</p>
       <pre>% env CPUPROFILE=ls.prof /bin/ls</pre>
  </li>
  <li> <p>In addition to defining the environment variable CPUPROFILE
       you can also define CPUPROFILESIGNAL.  This allows profiling to be
       controlled via the signal number that you specify.  The signal number
       must be unused by the program under normal operation. Internally it
       acts as a switch, triggered by the signal, which is off by default.
       For instance, if you had a copy of <code>/bin/chrome</code> that had
       been linked against libprofiler, you could run:</p>
       <pre>% env CPUPROFILE=chrome.prof CPUPROFILESIGNAL=12 /bin/chrome &amp;</pre>
       <p>You can then trigger profiling to start:</p>
       <pre>% killall -12 chrome</pre>
       <p>Then, after a period of time, you can tell it to stop, which will
       generate the profile:</p>
       <pre>% killall -12 chrome</pre>
  </li>
  <li> <p>In your code, bracket the code you want profiled in calls to
       <code>ProfilerStart()</code> and <code>ProfilerStop()</code>.
       (These functions are declared in <code>&lt;gperftools/profiler.h&gt;</code>.)
       <code>ProfilerStart()</code> takes
       the profile-filename as an argument.</p>
  </li>
</ol>

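<p>As a rough illustration of alternatives 1 and 2 above, the following
Python sketch prepares the environment for a profiled run and sends the
toggle signal.  The helper names (<code>profiling_env</code>,
<code>start_profiled</code>, <code>toggle_profiling</code>) are made up
for this example and are not part of gperftools; only the
<code>CPUPROFILE</code> and <code>CPUPROFILESIGNAL</code> variables come
from the text above.</p>

```python
import os
import subprocess


def profiling_env(profile_path, signal_number=None, base_env=None):
    """Copy the environment and set CPUPROFILE (and, optionally,
    CPUPROFILESIGNAL) as in alternatives 1 and 2."""
    env = dict(os.environ if base_env is None else base_env)
    env["CPUPROFILE"] = profile_path
    if signal_number is not None:
        env["CPUPROFILESIGNAL"] = str(signal_number)
    return env


def start_profiled(argv, profile_path, signal_number=None):
    """Launch a binary that is assumed to be linked with -lprofiler."""
    return subprocess.Popen(argv, env=profiling_env(profile_path, signal_number))


def toggle_profiling(pid, signal_number, kill=os.kill):
    """Send the toggle signal once.  Per the text, the switch is off by
    default, so the first signal starts profiling and the next one stops
    it and writes the profile."""
    kill(pid, signal_number)
```

<p>With these helpers, <code>start_profiled(["/bin/chrome"], "chrome.prof", 12)</code>
followed by two calls to <code>toggle_profiling(proc.pid, 12)</code>
would mirror the <code>killall -12 chrome</code> sequence shown in the list.</p>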
<p>In Linux 2.6 and above, profiling works correctly with threads,
automatically profiling all threads.  In Linux 2.4, profiling only
profiles the main thread (due to a kernel bug involving itimers and
threads).  Profiling works correctly with sub-processes: each child
process gets its own profile with its own name (generated by combining
CPUPROFILE with the child's process id).</p>

<p>For security reasons, CPU profiling will not write to a file -- and
is thus not usable -- for setuid programs.</p>

<p>See the include-file <code>gperftools/profiler.h</code> for
advanced-use functions, including <code>ProfilerFlush()</code> and
<code>ProfilerStartWithOptions()</code>.</p>


<H2>Modifying Runtime Behavior</H2>

<p>You can more finely control the behavior of the CPU profiler via
environment variables.</p>

<table frame=box rules=sides cellpadding=5 width=100%>

<tr valign=top>
  <td><code>CPUPROFILE_FREQUENCY=<i>x</i></code></td>
  <td>default: 100</td>
  <td>
    Number of interrupts (samples) per second taken by the CPU profiler.
  </td>
</tr>

<tr valign=top>
  <td><code>CPUPROFILE_REALTIME=1</code></td>
  <td>default: [not set]</td>
  <td>
    If set to any value (including 0 or the empty string), use
    ITIMER_REAL instead of ITIMER_PROF to gather profiles.  In
    general, ITIMER_REAL is not as accurate as ITIMER_PROF, and also
    interacts badly with use of alarm(), so prefer ITIMER_PROF unless
    you have a reason to prefer ITIMER_REAL.
  </td>
</tr>

</table>


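<p>The two rows above can be read as follows.  This Python sketch only
models the documented semantics and does not call the profiler; the
function names are hypothetical, and parsing the frequency as a plain
integer is an assumption.</p>

```python
def sampling_interval_seconds(env):
    """Time between samples implied by CPUPROFILE_FREQUENCY (default 100)."""
    frequency = int(env.get("CPUPROFILE_FREQUENCY", "100"))
    return 1.0 / frequency


def uses_realtime_timer(env):
    """ITIMER_REAL is selected whenever CPUPROFILE_REALTIME is present,
    whatever its value, so "0" and "" also switch it on."""
    return "CPUPROFILE_REALTIME" in env
```

<p>Note the second function: setting <code>CPUPROFILE_REALTIME=0</code>
does not disable the real-time timer; per the table, only leaving the
variable unset does.</p>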
<h1><a name="pprof">Analyzing the Output</a></h1>

<p><code>pprof</code> is the script used to analyze a profile.  It has
many output modes, both textual and graphical.  Some give just raw
numbers, much like the <code>-pg</code> output of <code>gcc</code>,
and others show the data in the form of a dependency graph.</p>

<p>pprof <b>requires</b> <code>perl5</code> to be installed to run.
It also requires <code>dot</code> to be installed for any of the
graphical output routines, and <code>gv</code> to be installed for
<code>--gv</code> mode (described below).
</p>

<p>Here are some ways to call pprof.  These are described in more
detail below.</p>

<pre>
% pprof /bin/ls ls.prof
                       Enters "interactive" mode
% pprof --text /bin/ls ls.prof
                       Outputs one line per procedure
% pprof --gv /bin/ls ls.prof
                       Displays annotated call-graph via 'gv'
% pprof --gv --focus=Mutex /bin/ls ls.prof
                       Restricts to code paths including a .*Mutex.* entry
% pprof --gv --focus=Mutex --ignore=string /bin/ls ls.prof
                       Code paths including Mutex but not string
% pprof --list=getdir /bin/ls ls.prof
                       (Per-line) annotated source listing for getdir()
% pprof --disasm=getdir /bin/ls ls.prof
                       (Per-PC) annotated disassembly for getdir()
% pprof --text localhost:1234
                       Outputs one line per procedure for localhost:1234
% pprof --callgrind /bin/ls ls.prof
                       Outputs the call information in callgrind format
</pre>


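<p>When pprof is driven from a script, the invocations above can be
assembled programmatically.  The following Python sketch builds the
argument list only; <code>pprof_command</code> is a hypothetical helper,
and it uses no flags beyond those listed above.</p>

```python
def pprof_command(binary, profile, mode=None, focus=None, ignore=None):
    """Build an argv list for pprof.

    mode is an output flag without the leading dashes, for example
    "text", "gv", "callgrind", "list=getdir" or "disasm=getdir".
    None leaves the flag out, which selects interactive mode.
    """
    argv = ["pprof"]
    if mode is not None:
        argv.append("--" + mode)
    if focus is not None:
        argv.append("--focus=" + focus)
    if ignore is not None:
        argv.append("--ignore=" + ignore)
    argv += [binary, profile]
    return argv
```

<p>The result can be passed to <code>subprocess.run()</code>, provided
pprof and its dependencies are installed.</p>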
<h3>Analyzing Text Output</h3>

<p>Text mode has lines of output that look like this:</p>
<pre>
       14   2.1%  17.2%       58   8.7% std::_Rb_tree::find
</pre>

<p>Here is how to interpret the columns:</p>
<ol>
  <li> Number of profiling samples in this function
  <li> Percentage of profiling samples in this function
  <li> Percentage of profiling samples in the functions printed so far
  <li> Number of profiling samples in this function and its callees
  <li> Percentage of profiling samples in this function and its callees
  <li> Function name
</ol>

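<p>The six columns listed above map directly onto a small record.  The
following Python sketch parses one such line; the field names and the
regular expression are illustrative choices, not part of pprof, and
they assume the layout of the sample line shown above.</p>

```python
import re
from typing import NamedTuple


class TextRow(NamedTuple):
    local_samples: int         # column 1
    local_percent: float       # column 2
    running_percent: float     # column 3: functions printed so far
    cumulative_samples: int    # column 4: function plus callees
    cumulative_percent: float  # column 5
    function: str              # column 6


_ROW = re.compile(
    r"^\s*(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+([\d.]+)%\s+(.+?)\s*$"
)


def parse_text_row(line):
    """Return a TextRow, or None if the line is not a data row."""
    match = _ROW.match(line)
    if match is None:
        return None
    local, local_pct, running_pct, cum, cum_pct, name = match.groups()
    return TextRow(int(local), float(local_pct), float(running_pct),
                   int(cum), float(cum_pct), name)
```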
<h3>Analyzing Callgrind Output</h3>

<p>Use <a href="http://kcachegrind.sourceforge.net">kcachegrind</a> to
analyze your callgrind output:</p>
<pre>
% pprof --callgrind /bin/ls ls.prof &gt; ls.callgrind
% kcachegrind ls.callgrind
</pre>

<p>The cost is specified in 'hits', i.e. how many times a function
appears in the recorded call stack information. The 'calls' from
function a to b record how many times function b was found in the
stack traces directly below function a.</p>

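<p>To make the definitions of 'hits' and 'calls' concrete, here is a
minimal Python sketch that derives both from a list of sampled stack
traces.  It assumes each trace is listed caller first, so that "directly
below" means "immediately following"; the function name and the input
format are hypothetical and do not reflect how pprof stores its data.</p>

```python
from collections import Counter


def callgrind_costs(stacks):
    """stacks: iterable of stack traces, each a list of function names
    ordered from caller to callee.  Returns (hits, calls)."""
    hits = Counter()
    calls = Counter()
    for stack in stacks:
        # Every appearance of a function in a trace counts as one hit.
        for frame in stack:
            hits[frame] += 1
        # Each adjacent (caller, callee) pair counts as one call.
        for caller, callee in zip(stack, stack[1:]):
            calls[(caller, callee)] += 1
    return hits, calls
```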
<p>Tip: if you use a debug build, the output will include file and line
number information, and kcachegrind will show an annotated source
code view.</p>

<h3>Node Information</h3>

<p>In the various graphical modes of pprof, the output is a call graph
annotated with timing information, like so:</p>

<A HREF="pprof-test-big.gif">
<center><table><tr><td>
   <img src="pprof-test.gif">
</td></tr></table></center>
</A>

<p>Each node represents a procedure.  The directed edges indicate
caller to callee relations.  Each node is formatted as follows:</p>

<center><pre>
Class Name
Method Name
local (percentage)
<b>of</b> cumulative (percentage)
</pre></center>

<p>The last one or two lines contain the timing information.  (The
profiling is done via a sampling method, where by default we take 100
samples a second.  Therefore, one unit of time in the output corresponds
to about 10 milliseconds of execution time.) The "local" time is the
time spent executing the instructions directly contained in the
procedure (and in any other procedures that were inlined into the
procedure).  The "cumulative" time is the sum of the "local" time and
the time spent in any callees.  If the cumulative time is the same as
the local time, it is not printed.</p>

<p>For instance, the timing information for <code>test_main_thread()</code>
indicates that 155 units (about 1.55 seconds) were spent executing the
code in <code>test_main_thread()</code> and 200 units were spent while
executing <code>test_main_thread()</code> and its callees such as
<code>snprintf()</code>.</p>

<p>The size of the node is proportional to the local count.  The
percentage displayed in the node corresponds to the count divided by
the total run time of the program (that is, the cumulative count for
<code>main()</code>).</p>

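<p>The arithmetic behind these node labels is simple enough to sketch in
Python.  The function names are illustrative only; the default of 100
samples per second is the one stated above, and the total in the second
function stands for the cumulative count of <code>main()</code>.</p>

```python
def samples_to_seconds(samples, frequency=100):
    """Convert a sample count to approximate execution time.
    At the default 100 samples per second, one unit is about 10 ms."""
    return samples / frequency


def share_of_total(samples, total_samples):
    """Percentage shown in a node: its count over the program total."""
    return 100.0 * samples / total_samples
```

<p>For the node discussed above, <code>samples_to_seconds(155)</code>
gives 1.55 seconds of local time and <code>samples_to_seconds(200)</code>
gives 2.0 seconds of cumulative time.</p>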
<h3>Edge Information</h3>

<p>An edge from one node to another indicates a caller to callee
relationship.  Each edge is labelled with the time spent by the callee
on behalf of the caller.  E.g., the edge from
<code>test_main_thread()</code> to <code>snprintf()</code> indicates
that of the 200 samples in <code>test_main_thread()</code>, 37 are
because of calls to <code>snprintf()</code>.</p>

<p>Note that <code>test_main_thread()</code> has an edge to
<code>vsnprintf()</code>, even though <code>test_main_thread()</code>
doesn't call that function directly.  This is because the code was
compiled with <code>-O2</code>; the profile reflects the optimized
control flow.</p>

<h3>Meta Information</h3>

<p>The top of the display should contain some meta information
like:</p>
<pre>
      /tmp/profiler2_unittest
      Total samples: 202
      Focusing on: 202
      Dropped nodes with &lt;= 1 abs(samples)
      Dropped edges with &lt;= 0 samples
</pre>

<p>This section contains the name of the program, and the total
samples collected during the profiling run.  If the
<code>--focus</code> option is on (see the <a href="#focus">Focus</a>
section below), the legend also contains the number of samples being
shown in the focused display.  Furthermore, some unimportant nodes and
edges are dropped to reduce clutter.  The characteristics of the
dropped nodes and edges are also displayed in the legend.</p>

<h3><a name=focus>Focus and Ignore</a></h3>

<p>You can ask pprof to generate a display focused on a particular
piece of the program.  You specify a regular expression.  Any portion
of the call-graph that is on a path which contains at least one node
matching the regular expression is preserved.  The rest of the
call-graph is dropped on the floor.  For example, you can focus on the
<code>vsnprintf()</code> libc call in <code>profiler2_unittest</code>
as follows:</p>

<pre>
% pprof --gv --focus=vsnprintf /tmp/profiler2_unittest test.prof
</pre>
<A HREF="pprof-vsnprintf-big.gif">
<center><table><tr><td>
   <img src="pprof-vsnprintf.gif">
</td></tr></table></center>
</A>

<p>Similarly, you can supply the <code>--ignore</code> option to
ignore samples that match a specified regular expression.  E.g., if
you are interested in everything except calls to
<code>snprintf()</code>, you can say:</p>
<pre>
% pprof --gv --ignore=snprintf /tmp/profiler2_unittest test.prof
</pre>


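<p>The focus and ignore rules can be modelled as a per-path filter.  The
Python sketch below is a simplified reading of the description above,
not pprof's actual code; <code>keep_path</code> is a hypothetical name,
and the match is an unanchored search, in line with the earlier note that
<code>--focus=Mutex</code> behaves like <code>.*Mutex.*</code>.</p>

```python
import re


def keep_path(path, focus=None, ignore=None):
    """path: function names along one call path.
    A path survives --focus if at least one node matches, and is
    discarded by --ignore if any node matches."""
    if focus is not None and not any(re.search(focus, name) for name in path):
        return False
    if ignore is not None and any(re.search(ignore, name) for name in path):
        return False
    return True
```

<p>Because the search is unanchored, an ignore pattern of
<code>snprintf</code> would also match <code>vsnprintf</code> in this
model; a stricter pattern is needed to tell the two apart.</p>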
<h3>Interactive mode</h3>

<p>By default -- if you don't specify any flags to the contrary --
pprof runs in interactive mode.  At the <code>(pprof)</code> prompt,
you can run many of the commands described above.  You can type
<code>help</code> for a list of what commands are available in
interactive mode.</p>

<h3><a name=options>pprof Options</a></h3>

<p>For a complete list of pprof options, you can run <code>pprof
--help</code>.</p>

<h4>Output Type</h4>

<p>
<center>
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
  <td><code>--text</code></td>
  <td>
    Produces a textual listing.  (Note: If you have an X display, and
    <code>dot</code> and <code>gv</code> installed, you will probably
    be happier with the <code>--gv</code> output.)
  </td>
</tr>
<tr valign=top>
  <td><code>--gv</code></td>
  <td>
    Generates annotated call-graph, converts to postscript, and
    displays via gv (requires <code>dot</code> and <code>gv</code> be
    installed).
  </td>
</tr>
<tr valign=top>
  <td><code>--dot</code></td>
  <td>
    Generates the annotated call-graph in dot format and
    emits to stdout (requires <code>dot</code> be installed).
  </td>
</tr>
<tr valign=top>
  <td><code>--ps</code></td>
  <td>
    Generates the annotated call-graph in Postscript format and
    emits to stdout (requires <code>dot</code> be installed).
  </td>
</tr>
<tr valign=top>
  <td><code>--pdf</code></td>
  <td>
    Generates the annotated call-graph in PDF format and emits to
    stdout (requires <code>dot</code> and <code>ps2pdf</code> be
    installed).
  </td>
</tr>
<tr valign=top>
  <td><code>--gif</code></td>
  <td>
    Generates the annotated call-graph in GIF format and
    emits to stdout (requires <code>dot</code> be installed).
  </td>
</tr>
<tr valign=top>
  <td><code>--list=&lt;<i>regexp</i>&gt;</code></td>
  <td>
    <p>Outputs source-code listing of routines whose
    name matches &lt;regexp&gt;.  Each line
    in the listing is annotated with flat and cumulative
    sample counts.</p>

    <p>In the presence of inlined calls, the samples
    associated with inlined code tend to get assigned
    to a line that follows the location of the
    inlined call.  A more precise accounting can be
    obtained by disassembling the routine using the
    <code>--disasm</code> flag.</p>
  </td>
</tr>
<tr valign=top>
  <td><code>--disasm=&lt;<i>regexp</i>&gt;</code></td>
  <td>
    Generates disassembly of routines that match
    &lt;regexp&gt;, annotated with flat and
    cumulative sample counts and emits to stdout.
  </td>
</tr>
</table>
</center>

<h4>Reporting Granularity</h4>

<p>By default, pprof produces one entry per procedure.  However you can
use one of the following options to change the granularity of the
output.  The <code>--files</code> option seems to be particularly
useless, and may be removed eventually.</p>

<center>
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
  <td><code>--addresses</code></td>
  <td>
     Produce one node per program address.
  </td>
</tr>
<tr valign=top>
  <td><code>--lines</code></td>
  <td>
     Produce one node per source line.
  </td>
</tr>
<tr valign=top>
  <td><code>--functions</code></td>
  <td>
     Produce one node per function (this is the default).
  </td>
</tr>
<tr valign=top>
  <td><code>--files</code></td>
  <td>
     Produce one node per source file.
  </td>
</tr>
</table>
</center>

<h4>Controlling the Call Graph Display</h4>

<p>Some nodes and edges are dropped to reduce clutter in the output
display.  The following options control this effect:</p>

<center>
<table frame=box rules=sides cellpadding=5 width=100%>
<tr valign=top>
  <td><code>--nodecount=&lt;n&gt;</code></td>
  <td>
    This option controls the number of displayed nodes.  The nodes
    are first sorted by decreasing cumulative count, and then only
    the top N nodes are kept.  The default value is 80.
  </td>
</tr>
<tr valign=top>
  <td><code>--nodefraction=&lt;f&gt;</code></td>
  <td>
    This option provides another mechanism for discarding nodes
    from the display.  If the cumulative count for a node is
    less than this option's value multiplied by the total count
    for the profile, the node is dropped.  The default value
    is 0.005; i.e. nodes that account for less than
    half a percent of the total time are dropped.  A node
    is dropped if either this condition is satisfied, or the
    <code>--nodecount</code> condition is satisfied.
  </td>
</tr>
<tr valign=top>
  <td><code>--edgefraction=&lt;f&gt;</code></td>
  <td>
    This option controls the number of displayed edges.  First of all,
    an edge is dropped if either its source or destination node is
    dropped.  Otherwise, the edge is dropped if the sample
    count along the edge is less than this option's value multiplied
    by the total count for the profile.  The default value is
    0.001; i.e., edges that account for less than
    0.1% of the total time are dropped.
  </td>
</tr>
<tr valign=top>
  <td><code>--focus=&lt;re&gt;</code></td>
  <td>
    This option controls what region of the graph is displayed
    based on the regular expression supplied with the option.
    For any path in the callgraph, we check all nodes in the path
    against the supplied regular expression.  If none of the nodes
    match, the path is dropped from the output.
  </td>
</tr>
<tr valign=top>
  <td><code>--ignore=&lt;re&gt;</code></td>
  <td>
    This option controls what region of the graph is displayed
    based on the regular expression supplied with the option.
    For any path in the callgraph, we check all nodes in the path
    against the supplied regular expression.  If any of the nodes
    match, the path is dropped from the output.
  </td>
</tr>
</table>
</center>

<p>The dropped edges and nodes account for some count mismatches in
the display.  For example, the cumulative count for
<code>snprintf()</code> in the first diagram above was 41.  However
the local count (1) and the count along the outgoing edges (12+1+20+6)
add up to only 40.</p>


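<p>The node and edge thresholds, and the count mismatch they can cause,
can be sketched as follows.  This Python model follows the option
descriptions above but is not pprof's implementation; the function name,
the dictionary-based inputs, and the <code>tiny_helper</code> node in the
usage note are all hypothetical.</p>

```python
def visible_graph(cumulative, edges, total, nodecount=80,
                  nodefraction=0.005, edgefraction=0.001):
    """cumulative: {node: cumulative count}
    edges: {(source, destination): sample count}
    total: total sample count for the profile
    Returns (kept_nodes, kept_edges)."""
    # Keep the top N nodes by cumulative count ...
    ranked = sorted(cumulative, key=cumulative.get, reverse=True)
    # ... and drop those below nodefraction of the total.
    kept_nodes = {
        node for node in ranked[:nodecount]
        if cumulative[node] >= nodefraction * total
    }
    # An edge goes if either endpoint went, or if it is too small.
    kept_edges = {
        (src, dst): count for (src, dst), count in edges.items()
        if src in kept_nodes and dst in kept_nodes
        and count >= edgefraction * total
    }
    return kept_nodes, kept_edges
```

<p>With a total of 202 samples, the default node threshold is 1.01
samples, which appears consistent with the "Dropped nodes with &lt;= 1
abs(samples)" legend shown earlier.  A callee such as the hypothetical
<code>tiny_helper</code> with a cumulative count of 1 would therefore
disappear together with its incoming edge, leaving the visible edge
counts one short of the caller's cumulative count.</p>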
<h1>Caveats</h1>

<ul>
  <li> If the program exits because of a signal, the generated profile
       will be <font color=red>incomplete, and may perhaps be
       completely empty</font>.
  <li> The displayed graph may have disconnected regions because
       of the edge-dropping heuristics described above.
  <li> If the program linked in a library that was not compiled
       with enough symbolic information, all samples associated
       with the library may be charged to the last symbol found
       in the program before the library.  This will artificially
       inflate the count for that symbol.
  <li> If you run the program on one machine, and profile it on
       another, and the shared libraries are different on the two
       machines, the profiling output may be confusing: samples that
       fall within shared libraries may be assigned to arbitrary
       procedures.
  <li> If your program forks, the children will also be profiled
       (since they inherit the same CPUPROFILE setting).  Each process
       is profiled separately; to distinguish the child profiles from
       the parent profile and from each other, all children will have
       their process-id appended to the CPUPROFILE name.
  <li> Due to a hack we make to work around a possible gcc bug, your
       profiles may end up named strangely if the first character of
       your CPUPROFILE variable has ASCII value greater than 127.
       This should be exceedingly rare, but if you need to use such a
       name, just prepend <code>./</code> to your filename:
       <code>CPUPROFILE=./&Auml;gypten</code>.
</ul>


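<p>For forking programs, the per-child profiles can be gathered by
looking for files whose names extend the CPUPROFILE name.  The Python
sketch below assumes only that the child names start with the parent
name, as the caveat above states; it does not assume a particular
separator before the process id, and <code>find_profiles</code> is a
hypothetical helper.</p>

```python
import glob
import os


def find_profiles(cpuprofile):
    """Return (parent, children): the parent profile path if it exists,
    and the sorted paths of files whose names extend it."""
    parent = cpuprofile if os.path.exists(cpuprofile) else None
    children = sorted(
        path for path in glob.glob(glob.escape(cpuprofile) + "*")
        if path != cpuprofile
    )
    return parent, children
```

<p>Each returned path can then be passed to pprof together with the
binary, like any other profile.</p>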
<hr>
<address>Sanjay Ghemawat<br>
<!-- Created: Tue Dec 19 10:43:14 PST 2000 -->
<!-- hhmts start -->
Last modified: Fri May  9 14:41:29 PDT 2008
<!-- hhmts end -->
</address>
</BODY>
</HTML>