Sunstudio 12 debugging issues
Hello,
First let me congratulate you on the new release. Having used Sunstudio 11 for a couple of years to debug large applications, I can say that the new release is definitely an important improvement.
After a couple of hours, I can say it is faster and slicker; the interface is much more responsive than the Sunstudio 11 IDE (a Windows-like experience), and the usability is great. Feature-wise, the built-in profiler and memory checker are more accessible, the debugger is more feature-packed, and the new "project" concept and automatic build are also great.
Attaching the debugger to our large binary went from 2 minutes down to 1 minute. However, here is the problem: while attaching the new dbx 7.6 from the console works as expected, attaching it to the same process from the Sunstudio 12 IDE behaves differently: the breakpoints are ignored. Even though I am able to set a breakpoint in the C code, the debugger never stops, as if it does not see the breakpoint. Setting the breakpoint from the Sunstudio integrated dbx console makes no difference (it does highlight the breakpoint in red correctly, though). dbx finds the object files, sees the process, and attaches to it, yet never stops.
Also weird (maybe it explains the problem?) is that Sunstudio/dbx says there are 43 LWPs, while running dbx from the terminal (outside Sunstudio) shows 44 LWPs on the same process. Most threads are Java; it seems like the C thread that I want to debug is "invisible" when running dbx from Sunstudio. I tried fiddling with the LWP optimization features in the dbx options (from the Sunstudio IDE), but the problem remains. Is there a way to investigate the issue? Note that the whole binary has been recompiled/relinked with the Sunstudio 12 compiler...
dbx from terminal (breakpoints work fine through "stop in" statements in C code):
(dbx) attach 4605
Attached to process 4605 with 44 LWPs
t@1 (l@1) stopped in __lwp_park at 0xfe7e4cd0
0xfe7e4cd0: __lwp_park+0x0010: ta%icc,0x00000008
Current function is Sys_GetMessage
stop in __rtInsCompute
(2) stop in __rtInsCompute
(dbx) cont
t@1 (l@1) stopped in __rtInsCompute at line 8606 in file "rt_instr.c"
dbx from sunstudio 12 IDE on same process (note lower number of LWPs):
Attached to process 4605 with 43 LWPs
t@1 (l@1) stopped in __lwp_park at 0xfe7e4cd0
0xfe7e4cd0: __lwp_park+0x0010: ta%icc,0x00000008
cd /tera/sunstudio/sunstudio12/SUNWspro/bin
(dbx) intercept -set -unhandled, -unexpected
stop in `mxdebug`rt_instr.c`#__rtInsCompute
(dbx) cont
-> never stops with same test case
I tried the above process many times, restarting Sunstudio, etc., and always got the same behavior and the same difference in LWPs.
Thanks
Philippe
PhilMa at 2007-11-27 6:33:47

# 1
Looks like a mystery to me, too. I'll try to ask around; maybe others have a clue. Meanwhile, can you provide us with this information:
- your platform (Solaris or Linux?),
- version of OS,
- how many LWPs does pstack report (it's a Solaris command; I don't know the Linux equivalent, unfortunately)?
- can you make sure that it is really the "C" thread that is disappearing (compare the 'lwps' or 'threads' command output from command-line dbx and from the GUI; you can use the Dbx console in the IDE to enter dbx commands).
I also noticed that "stop in `mxdebug`rt_instr.c`#__rtInsCompute" is not prefixed with "(dbx)" -- is it an artifact of copy-paste? What happens if you type "stop in __rtInsCompute" from the dbx console? Does this breakpoint appear in the output of the dbx 'status' command?
Thanks for sharing your experience, by the way!
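To make the 'lwps' / 'threads' comparison suggested above less error-prone with 40+ LWPs, the two listings could be pasted into text and diffed mechanically. The following is only a rough sketch, not a dbx feature: it assumes the saved output mentions LWPs in the `l@N` form seen in the transcripts above, and the function names `lwp_ids` and `diff_lwps` are made up for illustration.

```python
import re

# Assumption: LWP ids appear as "l@<number>", as in "t@1 (l@1) stopped in ...".
LWP_ID = re.compile(r"\bl@(\d+)\b")


def lwp_ids(dbx_output):
    """Return the set of LWP ids mentioned in a saved dbx listing."""
    return {int(num) for num in LWP_ID.findall(dbx_output)}


def diff_lwps(terminal_output, ide_output):
    """Report which LWP ids show up in only one of the two listings."""
    term = lwp_ids(terminal_output)
    ide = lwp_ids(ide_output)
    return {
        "only_in_terminal": sorted(term - ide),
        "only_in_ide": sorted(ide - term),
    }
```

Calling `diff_lwps(terminal_text, ide_text)` on the two pasted listings returns the ids present in only one of them. Since threads can come and go between the two attaches, a difference here is a hint about where to look, not proof that the IDE is hiding a thread.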
# 2
Hello Maxim
Thank you for the prompt reply. Regarding the platform, it is Solaris 8 on a Sun v440 (SunOS v5.8).
The findings are actually quite surprising, but I cannot explain them.
First I reran the test case under Sunstudio12/dbx versus standalone dbx from the command line. In fact both sometimes failed, sometimes worked... it turned out that if dbx or Sunstudio is launched from the application folder where the binary being debugged is stored, then the debugger stops on the breakpoint. Otherwise if dbx or sunstudio is started from another folder (same machine though), then the debugger fails to stop on breakpoints.
I am not sure if this makes sense to you. The difference in thread counts is not the issue, as the process sometimes kills threads and starts new ones, as does the JVM. I am not sure whether it is some kind of path/environment problem. It is less critical now, as starting the tool from the right location solves the issue, but it would still be good to understand and fix.
I will continue with the tests.
Many thanks for your help
Regards
Philippe
PhilMa at 2007-7-12 17:59:48

# 3
Thanks for the detailed description!
I'm still puzzled, but here are some considerations:
First of all, Solaris 8 is an unsupported platform for Sun Studio 12, especially when it comes to dbx and multithreading.
Solaris 8 has a different threading model compared to Solaris 9/10/11, and dbx 7.6 now relies on the newer one (while dbx 7.5 was made to support both). I know for sure that single-stepping will not work in a Solaris 8 multi-threaded environment, or at least will not work reliably.
All this shouldn't affect breakpoints, however.
As for the run directory, I haven't the foggiest idea why it helps; it looks like a coincidence to me.
Sorry, not much help from my side...
# 4
Phil,
Considering that you have a large MT application, most of what I see points to
the application getting perturbed and some previously unobserved MT-related
unpredictability rearing its ugly head.
Switching to a new compiler can alter the speed of the code, memory allocation, or
the order of function call parameter evaluation, which in turn may govern how
your application proceeds.
In your second post you said that the problem comes and goes regardless
of whether the app is debugged through the IDE or command-line dbx. This
points to garden-variety MT-related variability.
What caught my eye in your email though is your mention of attach time going
from 2 minutes to 1 minute. We're always on the lookout for "large" apps
which stress the scalability of dbx. Would you mind contacting me directly?
I'd like to learn a bit more about the nature of your app (statistical) and
will try to get you to profile dbx itself.