libumem.so aborts due to invalid free of basic_string __nullref member

Hi,

I'm fairly new to Solaris development. I've been trying to debug a problem for the last few days and I'm at my wits' end. I've been trying, unsuccessfully, to use libumem to track down a memory corruption issue, but I've had trouble just getting started. I don't know if it's a problem with the compiler or with the version of the STL included with Sun Studio 11.

When I run my application using LD_PRELOAD to interpose the libumem library, the application immediately aborts with the following stack trace:

libc.so.1`_lwp_kill+8(6, 0, 20f90, ff36b7cc, ff38a000, ff38abc4)
libumem.so.1`umem_do_abort+0x1c(8, ffbfec78, 6, 20ecc, ff37680c, 0)
libumem.so.1`umem_err_recoverable+0x7c(ff377818, a, 20dc4, ff3cca0c, ff38d0d0, ff377823)
libumem.so.1`process_free+0x114(ff351660, 1, 0, 3e3a1000, 1ee5c, ff3b3904)
libCrun.so.1`void operator delete+4(ff351660, fffb, 4, ffffffff, 3fee8, fc00)
libCstd_isa.so.1`char*std::basic_string<char,std::char_traits<char>,std::allocator<char> >::replace+0x3c4(55fc8, 3ff10, 0, ff351688, 0, ff351688)
libCstd_isa.so.1`std::basic_string<char,std::char_traits<char>,std::allocator<char> >&std::basic_string<char,std::char_traits<char>,std::allocator<char> >::operator=+0xe4(55fc8, ff341434, 0, 1cf6c, ff16a780, ff351688)
libtestlib.so`void print+0x254(ffbff6c8, 1530c, 55fc8, ff156dfc, 55fc8, 55fd8)
main+0x1c(1, ffbff734, ffbff73c, 25400, feee0000, feee0040)

It looks like there are multiple versions of the __nullref symbol present, when I believe there should be only one. I believe these multiple symbols cause comparison failures inside the STL, which ultimately lead to the application crashing.

I've managed to create a testcase that demonstrates the problem (code attached inline).

I'm using the following compiler:

CC: Sun C++ 5.8 2005/10/13

=====START testlib.cpp ==========================================
#include <string>
#include <iostream>

using namespace std;

void print(const string &s)
{
    string l;
    string *p = new string();
    string *q = new string();
    *p = "foo";
    *q = "bar";
    cout << *p << *q << endl;
}
=====END testlib.cpp ============================================

=====START testlib.map ==========================================
VERSION {
    global:
        __1cFprint6FrknDstdMbasic_string4Ccn0ALchar_traits4Cc__n0AJallocator4Cc__v_ ;
    local:
        * ;
};
=====END testlib.map ============================================

I compiled testlib.cpp with the following command line:

CC -G -Qoption ld "-M,testlib.map" -o libtestlib.so testlib.cpp

I then compiled the following test program that uses the libtestlib.so library:

=====START test.cpp =============================================
#include <iostream>
#include <string>

using namespace std;

extern void print(const string &s);

int main(void)
{
    print("foo");
    return 0;
}
=====END test.cpp ===============================================

CC -o test test.cpp -KPIC -Bdynamic -L"." -l"testlib" "-xarch=v8plus"

This works successfully when I use the 5.3 version of the compiler (Sun WorkShop 6 update 2 C++ 5.3 Patch 111685-02 2001/09/18). One difference I've noticed between the two compilers is that their versions of the STL define the __nullref member differently. The newer compiler adds a __SUNW_GLOBAL modifier to the declaration, which seems to be completely undocumented (Google yields no useful information).

Any help is greatly appreciated.

Scott

[3862 byte] By [scott@cognos] at [2007-11-26 11:42:08]
# 1

I don't know much about libumem, but if you're on SPARC, you can use the memory access checking feature of dbx, the Sun Studio debugger. Just load your program into the debugger (dbx a.out), issue check -access, and then run it. Be sure to have the latest dbx 7.5 patch (121023-03), since there have been a number of fixes related to run-time checking lately.

MaximKartashev at 2007-7-7 11:48:44 > top of Java-index,Development Tools,Solaris and Linux Development Tools...
# 2

I just tried using dbx with check -access and I got the exact same result as with libumem:

...
RTC: Enabling Error Checking...
RTC: Running program...
Bad free (baf):
Attempting to free an unallocated block at address 0xebd112a8
stopped in operator delete at 0xe95061b8
0xe95061b8: operator delete+0x0004:     call     free [PLT]     ! 0xe951a938
(dbx) where
=>[1] operator delete(0xebd112a8, 0xfffb, 0x4, 0xffffffff, 0x63cd8, 0xfc00), at 0xe95061b8
  [2] std::basic_string<char,std::char_traits<char>,std::allocator<char> >::replace(0x4b238, 0x63d00, 0x0, 0xebd112d0, 0x0, 0xebd112d0), at 0xe1d0d9b0
  [3] std::basic_string<char,std::char_traits<char>,std::allocator<char> >::operator=(0x4b238, 0xebd010b8, 0x0, 0x1cf6c, 0xe951a780, 0xebd112d0), at 0xe1d0ee38
  [4] print(0xffbff5a0, 0x154cc, 0x4b218, 0x4b218, 0x4b238, 0x4b238), at 0xebd00c94
  [5] main(0x1, 0xffbff60c, 0xffbff614, 0x25800, 0xff310040, 0xff310080), at 0x10f14
(dbx)

Again, this just leads me to believe that this is either a bug in the compiler/STL implementation, or I'm just missing a compiler/linker flag.

Thanks for the check -access tip though; I didn't know that dbx had that functionality.

Scott

scott@cognos at 2007-7-7 11:48:44
# 3

The string class in the default libCstd depends on only one copy of __nullref being present. If there are multiple copies, the program will crash.

If you have multiple copies of __nullref, one of two things is wrong.

1. You don't have the required updates of the SUNWlibC C++ Runtime Library patch (versions newer than the required version are OK). The updated version is required on all systems that run the compiler, or that run programs created by the compiler. It is always safe to install the latest version.

2. You are not building the program correctly. In particular:

A. libCstd must be linked dynamically (libCstd.so.1) not statically (libCstd.a). Be sure that no component of the application, including any application shared libraries, links the static version of libCstd (or libCrun, by the way).

B. You must not use -Bstatic or -Bdirect when linking libCstd (or libCrun). If -Bstatic or -Bdirect appears anywhere on a link command, -Bdynamic must appear before linking libCstd or libCrun.

clamage45 at 2007-7-7 11:48:44