mirror of
				https://github.com/ossrs/srs.git
				synced 2025-03-09 15:49:59 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			391 lines
		
	
	
	
		
			13 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
			
		
		
	
	
			391 lines
		
	
	
	
		
			13 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
| <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
 | |
| <HTML>
 | |
| 
 | |
| <HEAD>
 | |
|   <link rel="stylesheet" href="designstyle.css">
 | |
|   <title>Gperftools Heap Profiler</title>
 | |
| </HEAD>
 | |
| 
 | |
| <BODY>
 | |
| 
 | |
| <p align=right>
 | |
|   <i>Last modified
 | |
|   <script type=text/javascript>
 | |
|     var lm = new Date(document.lastModified);
 | |
|     document.write(lm.toDateString());
 | |
|   </script></i>
 | |
| </p>
 | |
| 
 | |
| <p>This is the heap profiler we use at Google, to explore how C++
 | |
| programs manage memory.  This facility can be useful for</p>
 | |
| <ul>
 | |
|   <li> Figuring out what is in the program heap at any given time
 | |
|   <li> Locating memory leaks
 | |
|   <li> Finding places that do a lot of allocation
 | |
| </ul>
 | |
| 
 | |
| <p>The profiling system instruments all allocations and frees.  It
 | |
| keeps track of various pieces of information per allocation site.  An
 | |
| allocation site is defined as the active stack trace at the call to
 | |
| <code>malloc</code>, <code>calloc</code>, <code>realloc</code>, or,
 | |
| <code>new</code>.</p>
 | |
| 
 | |
| <p>There are three parts to using it: linking the library into an
 | |
| application, running the code, and analyzing the output.</p>
 | |
| 
 | |
| 
 | |
| <h1>Linking in the Library</h1>
 | |
| 
 | |
| <p>To install the heap profiler into your executable, add
 | |
| <code>-ltcmalloc</code> to the link-time step for your executable.
 | |
| Also, while we don't necessarily recommend this form of usage, it's
 | |
| possible to add in the profiler at run-time using
 | |
| <code>LD_PRELOAD</code>:
 | |
| <pre>% env LD_PRELOAD="/usr/lib/libtcmalloc.so" <binary></pre>
 | |
| 
 | |
| <p>This does <i>not</i> turn on heap profiling; it just inserts the
 | |
| code.  For that reason, it's practical to just always link
 | |
| <code>-ltcmalloc</code> into a binary while developing; that's what we
 | |
| do at Google.  (However, since any user can turn on the profiler by
 | |
| setting an environment variable, it's not necessarily recommended to
 | |
| install profiler-linked binaries into a production, running
 | |
| system.)  Note that if you wish to use the heap profiler, you must
 | |
| also use the tcmalloc memory-allocation library.  There is no way
 | |
| currently to use the heap profiler separate from tcmalloc.</p>
 | |
| 
 | |
| 
 | |
| <h1>Running the Code</h1>
 | |
| 
 | |
| <p>There are several alternatives to actually turn on heap profiling
 | |
| for a given run of an executable:</p>
 | |
| 
 | |
| <ol>
 | |
|   <li> <p>Define the environment variable HEAPPROFILE to the filename
 | |
|        to dump the profile to.  For instance, to profile
 | |
|        <code>/usr/local/bin/my_binary_compiled_with_tcmalloc</code>:</p>
 | |
|        <pre>% env HEAPPROFILE=/tmp/mybin.hprof /usr/local/bin/my_binary_compiled_with_tcmalloc</pre>
 | |
|   <li> <p>In your code, bracket the code you want profiled in calls to
 | |
|        <code>HeapProfilerStart()</code> and <code>HeapProfilerStop()</code>.
 | |
|        (These functions are declared in <code><gperftools/heap-profiler.h></code>.)
 | |
|        <code>HeapProfilerStart()</code> will take the
 | |
|        profile-filename-prefix as an argument.  Then, as often as
 | |
|        you'd like before calling <code>HeapProfilerStop()</code>, you
 | |
|        can use <code>HeapProfilerDump()</code> or
 | |
|        <code>GetHeapProfile()</code> to examine the profile.  In case
 | |
|        it's useful, <code>IsHeapProfilerRunning()</code> will tell you
 | |
|        whether you've already called HeapProfilerStart() or not.</p>
 | |
| </ol>
 | |
| 
 | |
| 
 | |
| <p>For security reasons, heap profiling will not write to a file --
 | |
| and is thus not usable -- for setuid programs.</p>
 | |
| 
 | |
| <H2>Modifying Runtime Behavior</H2>
 | |
| 
 | |
| <p>You can more finely control the behavior of the heap profiler via
 | |
| environment variables.</p>
 | |
| 
 | |
| <table frame=box rules=sides cellpadding=5 width=100%>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_ALLOCATION_INTERVAL</code></td>
 | |
|   <td>default: 1073741824 (1 Gb)</td>
 | |
|   <td>
 | |
|     Dump heap profiling information each time the specified number of
 | |
|     bytes has been allocated by the program.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_INUSE_INTERVAL</code></td>
 | |
|   <td>default: 104857600 (100 Mb)</td>
 | |
|   <td>
 | |
|     Dump heap profiling information whenever the high-water memory
 | |
|     usage mark increases by the specified number of bytes.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_TIME_INTERVAL</code></td>
 | |
|   <td>default: 0</td>
 | |
|   <td>
 | |
|     Dump heap profiling information each time the specified
 | |
|     number of seconds has elapsed.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAPPROFILESIGNAL</code></td>
 | |
|   <td>default: disabled</td>
 | |
|   <td>
 | |
|     Dump heap profiling information whenever the specified signal is sent to the
 | |
|     process.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_MMAP</code></td>
 | |
|   <td>default: false</td>
 | |
|   <td>
 | |
|     Profile <code>mmap</code>, <code>mremap</code> and <code>sbrk</code>
 | |
|     calls in addition
 | |
|     to <code>malloc</code>, <code>calloc</code>, <code>realloc</code>,
 | |
|     and <code>new</code>.  <b>NOTE:</b> this causes the profiler to
 | |
|     profile calls internal to tcmalloc, since tcmalloc and friends use
 | |
|     mmap and sbrk internally for allocations.  One partial solution is
 | |
|     to filter these allocations out when running <code>pprof</code>,
 | |
|     with something like
 | |
|     <code>pprof --ignore='DoAllocWithArena|SbrkSysAllocator::Alloc|MmapSysAllocator::Alloc</code>.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_ONLY_MMAP</code></td>
 | |
|   <td>default: false</td>
 | |
|   <td>
 | |
|     Only profile <code>mmap</code>, <code>mremap</code>, and <code>sbrk</code>
 | |
|     calls; do not profile
 | |
|     <code>malloc</code>, <code>calloc</code>, <code>realloc</code>,
 | |
|     or <code>new</code>.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>HEAP_PROFILE_MMAP_LOG</code></td>
 | |
|   <td>default: false</td>
 | |
|   <td>
 | |
|     Log <code>mmap</code>/<code>munmap</code> calls.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| </table>
 | |
| 
 | |
| <H2>Checking for Leaks</H2>
 | |
| 
 | |
| <p>You can use the heap profiler to manually check for leaks, for
 | |
| instance by reading the profiler output and looking for large
 | |
| allocations.  However, for that task, it's easier to use the <A
 | |
| HREF="heap_checker.html">automatic heap-checking facility</A> built
 | |
| into tcmalloc.</p>
 | |
| 
 | |
| 
 | |
| <h1><a name="pprof">Analyzing the Output</a></h1>
 | |
| 
 | |
| <p>If heap-profiling is turned on in a program, the program will
 | |
| periodically write profiles to the filesystem.  The sequence of
 | |
| profiles will be named:</p>
 | |
| <pre>
 | |
|            <prefix>.0000.heap
 | |
|            <prefix>.0001.heap
 | |
|            <prefix>.0002.heap
 | |
|            ...
 | |
| </pre>
 | |
| <p>where <code><prefix></code> is the filename-prefix supplied
 | |
| when running the code (e.g. via the <code>HEAPPROFILE</code>
 | |
| environment variable).  Note that if the supplied prefix
 | |
| does not start with a <code>/</code>, the profile files will be
 | |
| written to the program's working directory.</p>
 | |
| 
 | |
| <p>The profile output can be viewed by passing it to the
 | |
| <code>pprof</code> tool -- the same tool that's used to analyze <A
 | |
| HREF="cpuprofile.html">CPU profiles</A>.
 | |
| 
 | |
| <p>Here are some examples.  These examples assume the binary is named
 | |
| <code>gfs_master</code>, and a sequence of heap profile files can be
 | |
| found in files named:</p>
 | |
| <pre>
 | |
|   /tmp/profile.0001.heap
 | |
|   /tmp/profile.0002.heap
 | |
|   ...
 | |
|   /tmp/profile.0100.heap
 | |
| </pre>
 | |
| 
 | |
| <h3>Why is a process so big</h3>
 | |
| 
 | |
| <pre>
 | |
|     % pprof --gv gfs_master /tmp/profile.0100.heap
 | |
| </pre>
 | |
| 
 | |
| <p>This command will pop-up a <code>gv</code> window that displays
 | |
| the profile information as a directed graph.  Here is a portion
 | |
| of the resulting output:</p>
 | |
| 
 | |
| <p><center>
 | |
| <img src="heap-example1.png">
 | |
| </center></p>
 | |
| 
 | |
| A few explanations:
 | |
| <ul>
 | |
| <li> <code>GFS_MasterChunk::AddServer</code> accounts for 255.6 MB
 | |
|      of the live memory, which is 25% of the total live memory.
 | |
| <li> <code>GFS_MasterChunkTable::UpdateState</code> is directly
 | |
|      accountable for 176.2 MB of the live memory (i.e., it directly
 | |
|      allocated 176.2 MB that has not been freed yet).  Furthermore,
 | |
|      it and its callees are responsible for 729.9 MB.  The
 | |
|      labels on the outgoing edges give a good indication of the
 | |
|      amount allocated by each callee.
 | |
| </ul>
 | |
| 
 | |
| <h3>Comparing Profiles</h3>
 | |
| 
 | |
| <p>You often want to skip allocations during the initialization phase
 | |
| of a program so you can find gradual memory leaks.  One simple way to
 | |
| do this is to compare two profiles -- both collected after the program
 | |
| has been running for a while.  Specify the name of the first profile
 | |
| using the <code>--base</code> option.  For example:</p>
 | |
| <pre>
 | |
|    % pprof --base=/tmp/profile.0004.heap gfs_master /tmp/profile.0100.heap
 | |
| </pre>
 | |
| 
 | |
| <p>The memory-usage in <code>/tmp/profile.0004.heap</code> will be
 | |
| subtracted from the memory-usage in
 | |
| <code>/tmp/profile.0100.heap</code> and the result will be
 | |
| displayed.</p>
 | |
| 
 | |
| <h3>Text display</h3>
 | |
| 
 | |
| <pre>
 | |
| % pprof --text gfs_master /tmp/profile.0100.heap
 | |
|    255.6  24.7%  24.7%    255.6  24.7% GFS_MasterChunk::AddServer
 | |
|    184.6  17.8%  42.5%    298.8  28.8% GFS_MasterChunkTable::Create
 | |
|    176.2  17.0%  59.5%    729.9  70.5% GFS_MasterChunkTable::UpdateState
 | |
|    169.8  16.4%  75.9%    169.8  16.4% PendingClone::PendingClone
 | |
|     76.3   7.4%  83.3%     76.3   7.4% __default_alloc_template::_S_chunk_alloc
 | |
|     49.5   4.8%  88.0%     49.5   4.8% hashtable::resize
 | |
|    ...
 | |
| </pre>
 | |
| 
 | |
| <p>
 | |
| <ul>
 | |
|   <li> The first column contains the direct memory use in MB.
 | |
|   <li> The fourth column contains memory use by the procedure
 | |
|        and all of its callees.
 | |
|   <li> The second and fifth columns are just percentage
 | |
|        representations of the numbers in the first and fourth columns.
 | |
|   <li> The third column is a cumulative sum of the second column
 | |
|        (i.e., the <code>k</code>th entry in the third column is the
 | |
|        sum of the first <code>k</code> entries in the second column.)
 | |
| </ul>
 | |
| 
 | |
| <h3>Ignoring or focusing on specific regions</h3>
 | |
| 
 | |
| <p>The following command will give a graphical display of a subset of
 | |
| the call-graph.  Only paths in the call-graph that match the regular
 | |
| expression <code>DataBuffer</code> are included:</p>
 | |
| <pre>
 | |
| % pprof --gv --focus=DataBuffer gfs_master /tmp/profile.0100.heap
 | |
| </pre>
 | |
| 
 | |
| <p>Similarly, the following command will omit all paths subset of the
 | |
| call-graph.  All paths in the call-graph that match the regular
 | |
| expression <code>DataBuffer</code> are discarded:</p>
 | |
| <pre>
 | |
| % pprof --gv --ignore=DataBuffer gfs_master /tmp/profile.0100.heap
 | |
| </pre>
 | |
| 
 | |
| <h3>Total allocations + object-level information</h3>
 | |
| 
 | |
| <p>All of the previous examples have displayed the amount of in-use
 | |
| space.  I.e., the number of bytes that have been allocated but not
 | |
| freed.  You can also get other types of information by supplying a
 | |
| flag to <code>pprof</code>:</p>
 | |
| 
 | |
| <center>
 | |
| <table frame=box rules=sides cellpadding=5 width=100%>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>--inuse_space</code></td>
 | |
|   <td>
 | |
|      Display the number of in-use megabytes (i.e. space that has
 | |
|      been allocated but not freed).  This is the default.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>--inuse_objects</code></td>
 | |
|   <td>
 | |
|      Display the number of in-use objects (i.e. number of
 | |
|      objects that have been allocated but not freed).
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>--alloc_space</code></td>
 | |
|   <td>
 | |
|      Display the number of allocated megabytes.  This includes
 | |
|      the space that has since been de-allocated.  Use this
 | |
|      if you want to find the main allocation sites in the
 | |
|      program.
 | |
|   </td>
 | |
| </tr>
 | |
| 
 | |
| <tr valign=top>
 | |
|   <td><code>--alloc_objects</code></td>
 | |
|   <td>
 | |
|      Display the number of allocated objects.  This includes
 | |
|      the objects that have since been de-allocated.  Use this
 | |
|      if you want to find the main allocation sites in the
 | |
|      program.
 | |
|   </td>
 | |
| 
 | |
| </table>
 | |
| </center>
 | |
| 
 | |
| 
 | |
| <h3>Interactive mode</a></h3>
 | |
| 
 | |
| <p>By default -- if you don't specify any flags to the contrary --
 | |
| pprof runs in interactive mode.  At the <code>(pprof)</code> prompt,
 | |
| you can run many of the commands described above.  You can type
 | |
| <code>help</code> for a list of what commands are available in
 | |
| interactive mode.</p>
 | |
| 
 | |
| 
 | |
| <h1>Caveats</h1>
 | |
| 
 | |
| <ul>
 | |
|   <li> Heap profiling requires the use of libtcmalloc.  This
 | |
|        requirement may be removed in a future version of the heap
 | |
|        profiler, and the heap profiler separated out into its own
 | |
|        library.
 | |
|      
 | |
|   <li> If the program linked in a library that was not compiled
 | |
|        with enough symbolic information, all samples associated
 | |
|        with the library may be charged to the last symbol found
 | |
|        in the program before the library.  This will artificially
 | |
|        inflate the count for that symbol.
 | |
| 
 | |
|   <li> If you run the program on one machine, and profile it on
 | |
|        another, and the shared libraries are different on the two
 | |
|        machines, the profiling output may be confusing: samples that
 | |
|        fall within the shared libaries may be assigned to arbitrary
 | |
|        procedures.
 | |
| 
 | |
|   <li> Several libraries, such as some STL implementations, do their
 | |
|        own memory management.  This may cause strange profiling
 | |
|        results.  We have code in libtcmalloc to cause STL to use
 | |
|        tcmalloc for memory management (which in our tests is better
 | |
|        than STL's internal management), though it only works for some
 | |
|        STL implementations.
 | |
| 
 | |
|   <li> If your program forks, the children will also be profiled
 | |
|        (since they inherit the same HEAPPROFILE setting).  Each
 | |
|        process is profiled separately; to distinguish the child
 | |
|        profiles from the parent profile and from each other, all
 | |
|        children will have their process-id attached to the HEAPPROFILE
 | |
|        name.
 | |
|      
 | |
|   <li> Due to a hack we make to work around a possible gcc bug, your
 | |
|        profiles may end up named strangely if the first character of
 | |
|        your HEAPPROFILE variable has ascii value greater than 127.
 | |
|        This should be exceedingly rare, but if you need to use such a
 | |
|        name, just set prepend <code>./</code> to your filename:
 | |
|        <code>HEAPPROFILE=./Ägypten</code>.
 | |
| </ul>
 | |
| 
 | |
| <hr>
 | |
| <address>Sanjay Ghemawat
 | |
| <!-- Created: Tue Dec 19 10:43:14 PST 2000 -->
 | |
| </address>
 | |
| </body>
 | |
| </html>
 |