Uncaught exception on Solaris using the Sun C++ 5.4 compiler

I have built a shared object library that throws an exception. When this shared object is called by one of the companies we work with, the exception is never caught and the process dumps core. On our system the exception is caught. On Linux and Windows, the exception is caught. I am building the shared object using Sun CC 5.4 with the latest patches on Solaris 5.8. The system on which the exception is not caught is a Solaris 5.9 machine with Sun CC 5.5. A list of the patches installed on that machine is below.

The shared object throws the exception here:

BEGIN CODE**********************

if ( idx < 0 || idx >= mNobs )
    throw adco_ETsInvalidIndex( this, idx, year, period );

This exception's class and the classes it derives from are as follows (along with a comment that explains how the base adco_cError class works):

class adco_ETsInvalidIndex: public adco_cErrorStore
{
private:
    // static const char* msgtext;
    // static const char* alttext;
public:
    adco_ETsInvalidIndex( const adco_TimeSeries* ts, int i, int y, int p );
    adco_ETsInvalidIndex( const adco_TimeSeries* ts, int idx );
};

class adco_cErrorStore: public adco_cErrorImp {
public:
    adco_cErrorStore( const adco_cEnumElem sev, const char* msg = NULL );
    adco_cErrorStore( const adco_cErrorStore& e);
    ~adco_cErrorStore();
};

class adco_cErrorImp: public adco_cError {
public:
    adco_cErrorImp( const adco_cEnumElem sev, char* msg = NULL );
    adco_cErrorImp( const adco_cErrorImp& e )
        : mSeverity(e.mSeverity), mMessage( e.mMessage ), mMsgLen( e.mMsgLen ), mGError(e.mGError) {}
    adco_cEnumElem severity() const { return mSeverity; }
    const char* message() const { return mMessage; }
    unsigned int msglen() const { return mMsgLen; }
    adco_cEnumElem msgelement() const { return mGError; }
protected:
    adco_cEnumElem mSeverity;
    char* mMessage;
    unsigned int mMsgLen;
    adco_cEnumElem mGError;
};

BASE CLASS******************************

//
// Base class for all thrown adco model errors.
// When a specific error is thrown, its copy constructor
// is used to copy the error from the error context to the heap.
// After the stack is cut back, destroying the original error instance,
// the base copy constructor is used to copy a BASE CLASS adco_cError
// instance into the error handler (catch block) context.
// This base class instance will contain an mRealMe that points
// to the heap copy of the real error.
// When the handler context is exited, the heap copy of the real error
// and the base class instance in the handler context are destroyed.
// For this reason, the base copy constructor is smart. An attempt to
// copy a base class error instance out of the handler will yield an
// mRealMe that points to itself, and the original error info is lost.
//

class ADPPMDL_API adco_cError
{
protected:
    const adco_cError* mRealMe;
    adco_cError() { mRealMe = this; }
public:
    adco_cError( const adco_cError& e);
    virtual adco_cEnumElem severity() const;
    virtual const char* message() const;
    virtual unsigned int msglen() const;
    virtual adco_cEnumElem msgelement() const; // in Global System ERRORDEFS list, or NULL_ELEM
}; /* class ADPPMDL_API adco_cError */

END CODE***************************************************

This throw gets through multiple catch statements. The main one is catch( adco_cError& err ), and there is also a catch( ... ). Some of these handlers are in the same library that throws the exception; others are in wrapper libraries. The exception gets through all of them.
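For anyone trying to reproduce this outside our code base, the handler arrangement can be reduced to roughly the following. This is only a sketch: BaseError, InvalidIndexError, checkedIndex, and whichHandler are hypothetical stand-ins for adco_cError, adco_ETsInvalidIndex, and our real functions, and the mRealMe machinery is left out.

```cpp
#include <string>

// Hypothetical stand-in for the adco_cError base class.
class BaseError {
public:
    virtual ~BaseError() {}
    virtual const char* message() const { return "base error"; }
};

// Hypothetical stand-in for adco_ETsInvalidIndex.
class InvalidIndexError : public BaseError {
public:
    explicit InvalidIndexError(int idx) : mIdx(idx) {}
    virtual const char* message() const { return "invalid index"; }
    int index() const { return mIdx; }
private:
    int mIdx;
};

// Throws when idx is outside [0, nobs), mirroring the range check in the post.
int checkedIndex(int idx, int nobs) {
    if (idx < 0 || idx >= nobs)
        throw InvalidIndexError(idx);
    return idx;
}

// Reports which handler, if any, received the exception.
std::string whichHandler(int idx, int nobs) {
    try {
        checkedIndex(idx, nobs);
        return "none";
    } catch (const BaseError& err) {
        // The derived error is expected to land here, by reference to its base.
        return std::string("base: ") + err.message();
    } catch (...) {
        // Fallback for anything not derived from BaseError.
        return "catch-all";
    }
}
```

On a working toolchain an out-of-range index should be reported by the first handler. On the failing system neither handler runs.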

This works on Windows and Linux. It also works when called on our own machine. When we send the libraries (both the C wrapper and the library that throws the exception) to one of the other companies that we work with, however, the exception is not caught.

Details on our setup:

CC: Forte Developer 7 C++ 5.4 Patch 111715-17 2005/10/13

uname -X

System = SunOS

Node = aablade1

Release = 5.8

KernelID = Generic_108528-13

Machine = sun4u

BusType = <unknown>

Serial = <unknown>

Users = <unknown>

OEM# = 0

Origin# = 1

NumCPU = 1

I have installed all the latest patches for Forte Developer 7 on Solaris 8 from http://developers.sun.com/sunstudio/downloads/patches/index.jsp

I am not linking with -B symbolic. The wrapped libraries (which are a C wrapper around the library that throws the exception and in turn load the library that throws the exception) are loaded with RTLD_GLOBAL by the company's system that does not catch the exception.
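Since the other company only calls the C wrapper, the boundary can be pictured as below: every extern "C" entry point turns exceptions into status codes so that nothing unwinds into the caller's frames. This is a sketch of the general idea, not our actual wrapper. wrap_value_at, valueAt, and the WRAP_* codes are hypothetical names, and std::out_of_range stands in for the adco error classes. It would not help if the handlers themselves are being bypassed, which is what seems to happen on the failing system.

```cpp
#include <stdexcept>

// Hypothetical status codes returned across the C boundary.
enum { WRAP_OK = 0, WRAP_KNOWN_ERROR = 1, WRAP_UNKNOWN_ERROR = 2 };

// Stand-in for a C++ library call that may throw.
static double valueAt(const double* data, int nobs, int idx) {
    if (idx < 0 || idx >= nobs)
        throw std::out_of_range("index out of range");
    return data[idx];
}

// C-callable entry point: no exception is allowed to leave this function.
extern "C" int wrap_value_at(const double* data, int nobs, int idx, double* out) {
    try {
        *out = valueAt(data, nobs, idx);
        return WRAP_OK;
    } catch (const std::exception&) {
        // Known error type: report it as a status code.
        return WRAP_KNOWN_ERROR;
    } catch (...) {
        // Anything else is still stopped at the boundary.
        return WRAP_UNKNOWN_ERROR;
    }
}
```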

Details on the system on which the exception is not caught:

General Information

Host Name is medusa

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

Manufacturer is Sun (Sun Microsystems)

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

System Model is Fire 880

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

ROM Version is OBP 4.7.5 2003/01/08 11:36
Number of CPUs is 8
CPU Type is sparc
App Architecture is sparc
Kernel Architecture is sun4u
OS Name is SunOS
OS Version is 5.9
Kernel Version is SunOS Release 5.9 Version Generic_117171-05 [UNIX(R) System V Release 4.0]

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

Kernel Information

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

SysConf Information

Max combined size of argv[] and envp[] is 1048320
Max processes allowed to any UID is 29995
Clock ticks per second is 100
Max simultaneous groups per user is 16
Max open files per process is 256
System memory page size is 8192
Job control supported is TRUE
Savid ids (seteuid()) supported is TRUE
Version of POSIX.1 standard supported is 199506
Version of the X/Open standard supported is 3
Max log name is 8
Max password length is 8
Number of processors (CPUs) configured is 8
Number of processors (CPUs) online is 8
Total number of pages of physical memory is 4456448
Number of pages of physical memory not currently in use is 2791486
Max number of I/O operations in single list I/O call is 4096
Max amount a process can decrease its async I/O priority level is 0
Max number of timer expiration overruns is 2147483647
Max number of open message queue descriptors per process is 32
Max number of message priorities supported is 32
Max number of realtime signals is 8
Max number of semaphores per process is 2147483647
Max value a semaphore may have is 2147483647
Max number of queued signals per process is 32
Max number of timers per process is 32
Supports asyncronous I/O is TRUE
Supports File Synchronization is TRUE
Supports memory mapped files is TRUE
Supports process memory locking is TRUE
Supports range memory locking is TRUE
Supports memory protection is TRUE
Supports message passing is TRUE
Supports process scheduling is TRUE
Supports realtime signals is TRUE
Supports semaphores is TRUE
Supports shared memory objects is TRUE
Supports syncronized I/O is TRUE
Supports timers is TRUE

sysinfo: /dev/ksyms is not a 32-bit kernel namelist

Device Information

SUNW,Sun-Fire-880

cpu0 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu1 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu2 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu3 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu4 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu5 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu6 is a "900 MHz SUNW,UltraSPARC-III+" CPU

cpu7 is a "900 MHz SUNW,UltraSPARC-III+" CPU

Patches installed on this machine:

111721-04

111722-04

115113-01

113823-03

114806-02

114807-02

114808-02

114809-02

115108-02

115109-02

115110-02

115111-02

115112-02

115980-02

113820-03

112760-18

117828-01

112763-19

113817-15

114801-08

112762-16

108434-21

108435-21

111711-15

111712-15

Any ideas what might cause this? We have spent a lot of time trying to figure out the cause.

Thanks.

Will

will@ad-co.com

[8726 byte] By [cursbambno] at [2007-11-26 11:19:07]
# 1
Hi,

Is the customer compiling/linking with the same or compatible options?

Paul
Paul_Floyd at 2007-7-7 3:34:24 > top of Java-index,Development Tools,Solaris and Linux Development Tools...
# 2

Thanks for looking at this Paul.

My compilation options for the c++ library are

-KPIC -O

My linking options for the c++ library are

-G -lCrun {most objects and libraries} -z allextract -Bstatic -B direct {one static library of ours that we link to}

The customer's compilation options (remember, they only call the C wrapper directly) are:

-DUSE_SMART_HEAP -DUSE_SH_ALLOC -template=wholeclass -instances=explicit -features=except -features=rtti -Drindex=rindex -Dindex=index -KPIC -g -DDEBUG -mt -DNO_CHASEN -DBILL -g

and linking flags are:

-DUSE_SMART_HEAP -DUSE_SH_ALLOC -template=wholeclass -instances=explicit -features=except -features=rtti -Drindex=rindex -Dindex=index -KPIC -g -DDEBUG -mt -DNO_CHASEN -DBILL -g -staticlib=Cstd,Crun -xildoff -library=%none -lCstd -lintexcmo -ladpstdsub -ladabssub -ladppmdl -lnsl

cursbambno at 2007-7-7 3:34:24
# 3

Have the customer update their compiler and C++ runtime libraries. Even though they are using a later compiler and a later Solaris, they might have older versions than you are using, with bugs not yet fixed.

In case that isn't clear, you might be depending on a bug fix in the compiler or runtime libraries that was fixed after C++ 5.5 or Solaris 9 was released. The customer might not have the bug fix.

clamage45 at 2007-7-7 3:34:24
# 4
Clamage,

Sorry, I am not sure what you mean. We are using the Sun 5.4 compiler on Solaris 8, and they are using the Sun 5.5 compiler on Solaris 9.

Do you mean have them update their patches?

Thanks.

Will
cursbambno at 2007-7-7 3:34:24
# 5

Update to the customer's compilation and linking options. They left out a couple options.

In addition to the options previously listed, they are also using

-dalign -xarch=v8plusa -xlibmopt

at both the compilation and linking stage.

If they take out the -mt option, they still get the core dump (caused by the uncaught exception).
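One way to confirm that the core dump really comes from the uncaught-exception path is to install a terminate handler in a test program before calling into the library. The following is a minimal sketch using only the standard std::set_terminate; the names reportAndAbort and installTerminateReport are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: print a marker, then abort so a core file is still written.
void reportAndAbort() {
    std::fputs("terminate called: exception was not caught\n", stderr);
    std::abort();
}

// Installs the handler and returns the one that was installed before it.
std::terminate_handler installTerminateReport() {
    return std::set_terminate(reportAndAbort);
}
```

If the marker appears just before the core dump, the runtime failed to find any matching handler, rather than the process crashing for some unrelated reason.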

Thanks.

Will

cursbambno at 2007-7-7 3:34:25
# 6

> Clamage,

>

> Sorry I am not sure what you mean. We are using the

> sun 5.4 compiler/solaris 8 and they are using the sun

> 5.5 compiler/solaris 9.

>

> Do you mean have them update their patches?

I posted my comments before I saw your additional information. Perhaps the other suggestions here will solve the problem. If not, have your customer update to all the latest Sun Studio 8 patches and C++ runtime library patches.

I don't ordinarily focus on such fine distinctions, but when a library fails when used with a newer compiler on a newer OS version, I would look at patch skew.

For example, if you are using the latest patches of C++ 5.4 and SUNWlibC and they are using unpatched C++ 5.5 and SUNWlibC, they might be running into a bug that has been fixed in the later updates you are using.
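As a first step in checking for skew, each component can record which compiler built it. The sketch below assumes the Sun compiler's predefined __SUNPRO_CC macro, which I believe is 0x540 for C++ 5.4 and 0x550 for C++ 5.5; compilerTag is a hypothetical helper name. It identifies the compiler release only and does not reveal the patch level of the installed runtime libraries.

```cpp
#include <cstdio>
#include <string>

// Returns a short description of the compiler that built this translation unit.
std::string compilerTag() {
    char buf[64];
#if defined(__SUNPRO_CC)
    // Sun C++ encodes its release in hexadecimal in this macro.
    std::sprintf(buf, "Sun C++ 0x%x", (unsigned)__SUNPRO_CC);
#elif defined(__GNUC__)
    std::sprintf(buf, "GNU C++ %d.%d", __GNUC__, __GNUC_MINOR__);
#else
    std::sprintf(buf, "unknown compiler");
#endif
    return std::string(buf);
}
```

Printing this tag from the library, the wrapper, and the calling program shows at a glance whether all three were built with the same compiler release.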

clamage45 at 2007-7-7 3:34:25
# 7
The solution was to not use -lCrun when linking the shared object at build time.
airbag81 at 2007-7-7 3:34:25
# 8

A C++ shared library should always be linked with libCrun, to avoid incorrect library initialization order. If you had to remove the option to get the program to work, something else is wrong, and you have a problem waiting to happen.

libCrun must be linked dynamically. Perhaps it is being linked statically or with direct linkage. Either will cause multiple versions of libCrun components to appear in the final program, which can cause a program crash.

Add the -dryrun option to the command that you use to create the shared library, and restore the -lCrun option. Examine the ld command line in the output, and see if -lCrun is preceded by -Bstatic or -Bdirect, but not by -Bdynamic. If so, either the command line options are not in the right order, or you have run into a CC driver bug.

A command line to create a C++ shared library should look like this:

CC -G -o mylib.so -zdefs *.o <project libraries> -Bdynamic -lCstd -lCrun -lc

The system libraries are listed last, to ensure correct initialization order.

If you don't use libCstd, omit the -lCstd.

You might need to add other system libraries.

You need -Bdynamic only if you have some other -B option preceding the system libraries.

Use the -zdefs option so that the linker complains about any unresolved references. That will ensure that you list all libraries that mylib.so depends on.

clamage45 at 2007-7-7 3:34:25