Performance Analyzer Suggestion
This is an enhancement request for the output of the Performance Analysis Tool.
Each call stack sample consists of a list of code lines / instructions, of which all but the last represent function calls.
Could you build a list of those code lines, merged from all the call stack samples, and for each one determine the percentage of call stacks it appears on? For example, if a given code line appears on 100 out of 1000 call stack samples, it would have a percentage of 10%. If the line appears multiple times in a given sample, that only counts as appearing once in that sample.
Then if you could sort the code lines by that percentage and display the top 50 or so, with their percentages, that would be most useful.
This would be most helpful in finding lines of code that account for significant percentages of execution time, even if they are function calls, because any such line, if it is on X% of the call stacks, would save X% of the execution time if it could be eliminated.
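The computation requested above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything the Performance Analyzer provides: the function name `line_percentages`, the representation of a sample as a list of "file:line" strings, and the sample data are all assumptions made for the example.

```python
from collections import Counter

def line_percentages(samples, top=50):
    """Return (line, percent) pairs for the lines found on the most call stacks.

    samples: list of call stacks, each a list of code-line labels
             (outermost caller first, sampled line last).
    """
    if not samples:
        return []
    counts = Counter()
    for stack in samples:
        # set() makes a line that repeats within one stack (e.g. recursion)
        # count only once for that sample.
        counts.update(set(stack))
    total = len(samples)
    # Sort by descending count, then by label so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(line, 100.0 * n / total) for line, n in ranked[:top]]

# Hypothetical samples, invented for illustration only.
samples = [
    ["main.c:10", "work.c:42", "util.c:7"],
    ["main.c:10", "work.c:42", "work.c:42", "util.c:9"],
    ["main.c:10", "io.c:3"],
    ["main.c:10", "work.c:42", "util.c:7"],
]

for line, pct in line_percentages(samples):
    print(f"{pct:6.1f}%  {line}")
```

In this made-up data, work.c:42 is on 3 of the 4 stacks, so it is reported at 75% even though it appears twice in one of them.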
Don't tell me about the wonders of call graphing, counting, and timing. Those don't work nearly as well as what I'm asking for.
Thanks
By MikeD276a at 2007-11-26 22:11:55

# 1
RTFM :-)
er_print has had a "lines" command that does what you want for several releases.
It does not compute the percentage of callstacks in which the line appears,
rather it computes metrics, just as it would for functions. Computing the
fraction of callstacks with a line in them is not a reasonable computation --
the numbers should be weighted by the metric values associated with each
event.
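To illustrate the weighting point, here is a rough Python sketch. It is not a description of how er_print is implemented; the function name `line_metrics`, the event layout, and the numbers are assumptions for the example. Each event carries a metric value (for instance seconds of CPU time or wait time), and a line's inclusive metric is the sum over events whose stack contains it, while its exclusive metric is the sum over events where it is the innermost line.

```python
from collections import defaultdict

def line_metrics(events):
    """Accumulate exclusive and inclusive metric values per code line.

    events: iterable of (stack, weight) pairs, where stack lists code-line
            labels outermost-first and weight is the event's metric value.
    """
    excl = defaultdict(float)
    incl = defaultdict(float)
    for stack, weight in events:
        if not stack:
            continue
        # Exclusive: only the innermost (last) line is charged.
        excl[stack[-1]] += weight
        # Inclusive: every distinct line on the stack is charged once.
        for line in set(stack):
            incl[line] += weight
    return excl, incl

# Hypothetical events with unequal weights, invented for illustration only.
events = [
    (["main:10", "work:42"], 0.010),
    (["main:10", "work:42", "lock:7"], 0.250),
    (["main:10", "io:3"], 0.010),
]

excl, incl = line_metrics(events)
total = sum(weight for _, weight in events)
for line in sorted(incl, key=incl.get, reverse=True):
    print(f"{100.0 * incl[line] / total:6.1f}%  {line}")
```

When every event has the same weight, the inclusive share of a line equals the fraction of call stacks it appears on. With unequal weights the two differ: in this made-up data lock:7 is on one stack in three but accounts for most of the total.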
The Analyzer has a "Lines" tab (not visible by default, but enabled from
the Data Presentation Dialog, Tabs tab) to show the same data.
Both of those can take a while to compute, as they require reading line number
tables from every single object file/a.out/.so in the experiment.
Marty Itzkowitz,
Project lead, Sun Studio Performance Tools
# 2
Ok, Marty, you got me. I was reading
http://docs.sun.com/source/816-2458/index.html
and
http://developers.sun.com/sunstudio/articles/perftools.html
and I missed the parts about line metrics.
Lines/instructions that are calls to functions are most interesting to me because:
1) you can tell exactly how much time would be saved if they were removed, and
2) some of them are likely to be not entirely necessary, so you might actually find a way to remove them.
Many mods that you can make to any software to make it run faster consist of making fewer calls to functions that take time, and this metric highlights those calls.
I am actually more familiar with the Microsoft tools, but I don't know why programmers are not encouraged to focus on the lines that have the greatest time attributed to them.
# 4
Here's the output below for lines from a simple mttest experiment.
The columns may not line up, because they're intended for a fixed-width
font, and I don't know how to get a fixed-width font in this window.
The manual you cite is 5 or more releases earlier than the current one.
The current page is at
http://docs.sun.com/source/819-3687/er_printRef.html#84040
You should be using Sun Studio 11, as patched, not any earlier release.
122% er_print -lines test.1.er
Objects sorted by metric: Exclusive User CPU Time
Excl.     Incl.     Incl.      Incl.      Name
User CPU  User CPU  Sync Wait  Sync Wait
 sec.      sec.      sec.      Count
155.739   155.739   221.863    100        <Total>
 20.034    20.034     0.         0        computeB, line 1306 in "mttest.c"
 12.499    12.499     0.         0        computeI, line 1365 in "mttest.c"
 12.459    12.459     0.         0        compute, line 1290 in "mttest.c"
 12.329    12.329     0.         0        computeG, line 1349 in "mttest.c"
 12.269    12.269     0.         0        computeD, line 1322 in "mttest.c"
 11.898    11.898     0.         0        computeA, line 1298 in "mttest.c"
 11.858    11.858     0.         0        computeE, line 1330 in "mttest.c"
 11.468    11.468     0.         0        computeC, line 1314 in "mttest.c"
 11.378    11.378     0.         0        computeH, line 1357 in "mttest.c"
 10.087    10.087     0.         0        <Function: take_deferred_signal, instructions without line numbers>
  8.136     8.136     0.         0        addone, line 1387 in "mttest.c"
  7.755    19.113     0.         0        computeF, line 1341 in "mttest.c"
  3.192     3.192     0.         0        <Function: mutex_trylock, instructions without line numbers>
  2.182     2.182     0.         0        addone, line 1388 in "mttest.c"
  2.011    13.660     0.         0        <Function: slow_trylock, instructions without line numbers>
  1.891     1.891     0.         0        do_work, line 694 in "mttest.c"
  1.561    11.648     0.         0        <Function: mutex_trylock_adaptive, instructions without line numbers>
  1.041     1.041     0.         0        addone, line 1386 in "mttest.c"
  0.761     0.761     0.         0        <Function: pthread_mutex_trylock, instructions without line numbers>
  0.650    18.263     0.         0        trylock_global, line 884 in "mttest.c"