studio12/dbx: access checking broken for setlocale() call in "de" locale

It seems that access checking crashes the dbx target when it is running in the "de" locale and calls setlocale().

% dbx -V

Sun Dbx Debugger 7.6 SunOS_i386 2007/05/03

(running on opensolaris x86 / build 68)

% locale

LANG=de_DE.ISO8859-1

LC_CTYPE=de_DE.ISO8859-1

LC_NUMERIC=de_DE.ISO8859-1

LC_TIME=de_DE.ISO8859-1

LC_COLLATE=de_DE.ISO8859-1

LC_MONETARY=de_DE.ISO8859-1

LC_MESSAGES=de_DE.ISO8859-1

LC_ALL=de_DE.ISO8859-1

% cat hello.c

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>

void
func(void)
{
    int i;

    printf("hello, world!\n%d\n", i);
}

int
main(int argc, char **argv)
{
    char x[32768];
    long *p = malloc(32768);
    int i;
    long s;

    setlocale(LC_ALL, "");
    for (i = 0; i < 32768 / sizeof(long); i++)
        s += p[i];
    printf("%ld\n", s);
    func();
    exit(1);
}

% cc -g -o hello hello.c

% bcheck -access hello

Reading hello

Reading ld.so.1

Reading rtcapihook.so

Reading libc.so.1

Reading libdl.so.1

Reading rtcaudit.so

Reading libmapmalloc.so.1

Reading libgen.so.1

Reading libm.so.2

Reading rtcboot.so

Reading librtc.so

access checking - ON

Running: hello

(process id 1662)

RTC: Enabling Error Checking...

RTC: Running program...

Reading disasm.so

Reading de_DE.ISO8859-1.so.3

terminating signal 11 SIGSEGV

% dbx -C hello

Reading hello

Reading ld.so.1

Reading rtcapihook.so

Reading libc.so.1

Reading libdl.so.1

Reading rtcaudit.so

Reading libmapmalloc.so.1

Reading libgen.so.1

Reading libm.so.2

Reading rtcboot.so

Reading librtc.so

(dbx) check -all

access checking - ON

memuse checking - ON

(dbx) run

Running: hello

(process id 1668)

RTC: Enabling Error Checking...

RTC: Running program...

Reading disasm.so

Reading de_DE.ISO8859-1.so.3

terminating signal 11 SIGSEGV

(dbx) where

dbx: program is not active

There's a new core dump in the current

directory:

% pstack /tmp/core

core '/tmp/core' of 1952:/tmp/hello

ee188106 _clear_internal_mbstate (803ed8c, 803ec28, ee1c0837, 803ed8c, ee1f5000, f) + 21

ee1c067f __charmap_init (803ed8c, ee1f5000, f, 10, 803ef38, ee1bca1f) + 27

ee1c0837 __locale_init (803ed8c) + 27

ee1bca1f setlocale (6, 8050b74, 0, 0, 8060d48, 0) + 9ff

08050a9a main(1, 8046f8c, 8046f94, ee1fa540, 8046f80, 805095f) + 2a

080509bd _start(1, 8047134, 0, 804713f, 804715a, 80471b8) + 7d

% pflags /tmp/core

core '/tmp/core' of 1952:/tmp/hello

data model = _ILP32 flags = RLC|BPTADJ|MSACCT|MSFORK

flttrace = 0x00000004

sigtrace = 0xfffffeff 0xffffffff

HUP|INT|QUIT|ILL|TRAP|ABRT|EMT|FPE|BUS|SEGV|SYS|PIPE|ALRM|TERM|USR1|USR2|CLD|PWR|WINCH|URG|POLL|STOP|TSTP|CONT|TTIN|TTOU|VTALRM|PROF|XCPU|XFSZ|WAITING|LWP|FREEZE|THAW|CANCEL|LOST|XRES|JVM1|JVM2|RTMIN|RTMIN+1|RTMIN+2|RTMIN+3|RTMAX-3|RTMAX-2|RTMAX-1|RTMAX

entryset = 0x00000403 0x04000000 0x00000000 0x00400000

0x80004000 0x00000000 0x00000000 0x00000000

exitset = 0x00000002 0x00000000 0x00000000 0x00400000

0x40004000 0x00000000 0x00000000 0x00000000

/1:flags = 0

sigmask = 0xfffffeff,0x0000ffff cursig = SIGSEGV

Workaround: run access checking in the "C" locale.
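
When applying this workaround it can help to confirm which locale the target actually picks up from its environment. The following is a minimal sketch, not part of the original report: the helper names locale_from_environment() and is_c_locale() are hypothetical, and only the standard setlocale() call is assumed. It should be run outside of access checking, since it makes the same setlocale(LC_ALL, "") call as hello.c.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: adopt the locale named by the environment, as
 * hello.c does, and return the name the C library resolved it to.
 * Returns NULL if the environment names a locale that cannot be set. */
const char *locale_from_environment(void)
{
    return setlocale(LC_ALL, "");
}

/* Hypothetical helper: 1 if the given locale name is the "C" locale
 * (also accepting its POSIX alias), 0 otherwise. */
int is_c_locale(const char *name)
{
    if (name == NULL)
        return 0;
    return strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}
```

Calling is_c_locale(locale_from_environment()) from a small main() after starting the program with LC_ALL=C is one way to check that the workaround environment is really in effect.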

Juergen.Keila at 2007-11-27 8:44:16
# 1
I can reproduce this using SS12 FCS, but the latest build doesn't have this problem, so the fix should be in the first SS12 patch for dbx. It should be available in July, if I'm not mistaken.
MaximKartasheva at 2007-7-12 20:44:59
# 2

> I can reproduce this using SS12 FCS, but latest build

> doesn't have this problem so the fix should be in

> first SS12 patch for dbx. It should be available in

> July, if I'm not mistaken.

Fine.

Here's another access check crash - maybe with the same root cause:

% cat langinfo.c

#include <langinfo.h>

int
main(int argc, char **argv)
{
    char *lang = nl_langinfo(CODESET);
}

% cc -g -o langinfo langinfo.c

% bcheck -access langinfo

Reading langinfo

Reading ld.so.1

Reading rtcapihook.so

Reading libc.so.1

Reading libdl.so.1

Reading rtcaudit.so

Reading libmapmalloc.so.1

Reading libgen.so.1

Reading libm.so.2

Reading rtcboot.so

Reading librtc.so

access checking - ON

Running: langinfo

(process id 2575)

RTC: Enabling Error Checking...

RTC: Running program...

Reading disasm.so

terminating signal 11 SIGSEGV

% pstack /tmp/core

core '/tmp/core' of 2575:/tmp/langinfo

ee1abfef pthread_key_create_once_np (ee1f78c8, ee16bae0) + 2f

ee16bb7e tsdalloc (3, 80, 0, 8046f50, ee1f5000, feffcd68) + 37

ee1c0e0a __nl_langinfo_std (ee1face0, 31, feffa7d0, 8046f68, 80509cf, 31) + 22

ee1cb513 nl_langinfo (31, 0, 8046f50, 80470d8, 8046f88, 805092d) + 27

080509cf main(1, 8046f94, 8046f9c, 80508cf, 8050a08, fefd3a40) + f

0805092d _start(1, 8047140, 0, 804714e, 804715b, 80471b9) + 7d

% pflags /tmp/core

core '/tmp/core' of 2575:/tmp/langinfo

data model = _ILP32 flags = RLC|BPTADJ|MSACCT|MSFORK

flttrace = 0x00000004

sigtrace = 0xfffffeff 0xffffffff

HUP|INT|QUIT|ILL|TRAP|ABRT|EMT|FPE|BUS|SEGV|SYS|PIPE|ALRM|TERM|USR1|USR2|CLD|PWR|WINCH|URG|POLL|STOP|TSTP|CONT|TTIN|TTOU|VTALRM|PROF|XCPU|XFSZ|WAITING|LWP|FREEZE|THAW|CANCEL|LOST|XRES|JVM1|JVM2|RTMIN|RTMIN+1|RTMIN+2|RTMIN+3|RTMAX-3|RTMAX-2|RTMAX-1|RTMAX

entryset = 0x00000403 0x04000000 0x00000000 0x00400000

0x80004000 0x00000000 0x00000000 0x00000000

exitset = 0x00000002 0x00000000 0x00000000 0x00400000

0x40004000 0x00000000 0x00000000 0x00000000

/1:flags = 0

sigmask = 0xfffffeff,0x0000ffff cursig = SIGSEGV

Unfortunately, this time I've not yet found a workaround.

Juergen.Keila at 2007-7-12 20:44:59
# 3

> Here's another access check crash - maybe with the

> same root cause:

And another one, with getexecname(3C):

% cat execname.c

#include <stdlib.h>

int
main(int argc, char **argv)
{
    const char *exe_nm = getexecname();
}

% cc -g -o execname execname.c

% bcheck -access ./execname

Reading execname

Reading ld.so.1

Reading rtcapihook.so

Reading libc.so.1

Reading libdl.so.1

Reading rtcaudit.so

Reading libmapmalloc.so.1

Reading libgen.so.1

Reading libm.so.2

Reading rtcboot.so

Reading librtc.so

access checking - ON

Running: execname

(process id 2665)

RTC: Enabling Error Checking...

RTC: Running program...

Reading disasm.so

terminating signal 11 SIGSEGV

% pstack /tmp/core

core '/tmp/core' of 2665:/tmp/./execname

ee14a674 _getaux (7de, 8046f48, ee14aecd, 7de, feffa7d0, 8046f58) + 34

ee14a78e getauxptr (7de, feffa7d0, 8046f58, feffcd68, 8046f58, 80509cb) + e

ee14aecd getexecname (8046f4c, 80470d4, 8046f84, 805092d, 1, 8046f90) + 1d

080509cb main(1, 8046f90, 8046f98, 8060a44, ee1fa540, 8046f84) + b

0805092d _start(1, 804713c, 0, 804714c, 8047159, 80471b7) + 7d

% pflags /tmp/core

core '/tmp/core' of 2665:/tmp/./execname

data model = _ILP32 flags = RLC|BPTADJ|MSACCT|MSFORK

flttrace = 0x00000004

sigtrace = 0xfffffeff 0xffffffff

HUP|INT|QUIT|ILL|TRAP|ABRT|EMT|FPE|BUS|SEGV|SYS|PIPE|ALRM|TERM|USR1|USR2|CLD|PWR|WINCH|URG|POLL|STOP|TSTP|CONT|TTIN|TTOU|VTALRM|PROF|XCPU|XFSZ|WAITING|LWP|FREEZE|THAW|CANCEL|LOST|XRES|JVM1|JVM2|RTMIN|RTMIN+1|RTMIN+2|RTMIN+3|RTMAX-3|RTMAX-2|RTMAX-1|RTMAX

entryset = 0x00000403 0x04000000 0x00000000 0x00400000

0x80004000 0x00000000 0x00000000 0x00000000

exitset = 0x00000002 0x00000000 0x00000000 0x00400000

0x40004000 0x00000000 0x00000000 0x00000000

/1:flags = 0

sigmask = 0xfffffeff,0x0000ffff cursig = SIGSEGV

Juergen.Keila at 2007-7-12 20:44:59
# 4
nl_langinfo() works fine with the latest dbx (please wait for the first patch to become available to get the fix), but getexecname() is still broken. I'll file a bug and post the bug ID here.
MaximKartasheva at 2007-7-12 20:44:59
# 5
The bug ID is 6573845; it should be visible on bugs.sun.com within 24 hours.
MaximKartasheva at 2007-7-12 20:44:59
# 6

> nl_langinfo() works fine with latest dbx (please wait

> for the first patch to be available to get the fix),

> but getexecname() is still broken. I'll file a bug

> and post bug ID here.

Oh, so all these rtc crashes do not have the same root cause?

That is, if I find more rtc crashes like the ones in setlocale(), nl_langinfo() and getexecname(), would it still be useful to know which functions are affected?

Juergen.Keila at 2007-7-12 20:44:59
# 7

The cause might be the same - nested signals. Dbx replaces memory access instructions with an instruction that generates SIGSEGV; when that happens, the signal handler installed by dbx is invoked and performs all necessary checks. If the application tries to install its own handler for SIGSEGV, it might fail, thus the "Terminating signal 11" message.

The only possible fix for this is to skip instrumentation of functions that can cause nested signals; there's a built-in command for that in dbx, rtc skippatch (type 'help rtc skippatch' in dbx for more information), and there's an internal list of such functions, which is frequently updated. However, it is not always easy to determine which functions should not be instrumented.

So if you find more errors like this, please report them here.

MaximKartasheva at 2007-7-12 20:44:59
# 8

> The cause might be the same - nested signals. Dbx

> replaces memory access instructions with an

> instruction that generates SIGSEGV; when it happens,

> signal handler installed by dbx is invoked and it

> performs all necessary checks. If the application

> tries to install its own handler for SIGSEGV, it

> might fail, thus the "Terminating signal 11"

> message.

I'm not sure I understand this. My sample programs do

not install SIGSEGV handlers.

But I can imagine that the issue is that dbx rtc is trying

to use setlocale(), nl_langinfo(), getexecname() while

running in the SIGSEGV handler, checking the memory

access (maybe because rtc has found some problem

and is trying to report it), and now trips over the patched

memory access instructions in these functions.

And indeed, using the dbx commands

(dbx) check -all

(dbx) rtc skippatch libc.so.1 -f _setlocale _nl_langinfo _getexecname thr_keycreate_once _thr_keycreate_once pthread_key_create_once_np _pthread_key_create_once_np

seems to work around these issues.

> The only possible fix for this is to skip

> instrumentation of functions that can cause nested

> signals; there's built-in command for that in dbx,

> rtc skippatch (type 'help rtc skippatch' in dbx to

> see more info) and there's internal list of such

> functions, which is frequently updated. However, it

> is not always easy to determine which function should

> not be instrumented.

>

> So if you find more errors like this, please report

> them here.

readdir64_r() is the next one:

% cat readdir.c

#define _POSIX_PTHREAD_SEMANTICS 1
#include <sys/types.h>
#include <dirent.h>

int
main(int argc, char **argv)
{
    struct dirent dent, *dent_result;
    DIR *dir = opendir("/");

    readdir_r(dir, &dent, &dent_result);
}

% cc -g -o readdir readdir.c `getconf LFS_CFLAGS`

% bcheck -access readdir

Reading readdir

Reading ld.so.1

Reading rtcapihook.so

Reading libc.so.1

Reading libdl.so.1

Reading rtcaudit.so

Reading libmapmalloc.so.1

Reading libgen.so.1

Reading libm.so.2

Reading rtcboot.so

Reading librtc.so

access checking - ON

Running: readdir

(process id 4269)

RTC: Enabling Error Checking...

RTC: Running program...

Reading disasm.so

terminating signal 11 SIGSEGV

% pstack core

core 'core' of 4269:/tmp/readdir

ee166897 readdir64_r (fefa0300, 8046f40, 8046f3c, fefa0300, ee1fcdf0, 133f) + 27

08050a29 main(1, 8046f88, 8046f90, 8046f7c, 805090f, 8050a60) + 29

0805096d _start(1, 8047130, 0, 804713d, 8047158, 80471b6) + 7d

Juergen.Keila at 2007-7-12 20:44:59
# 9
Okay, I've updated the bug report.
MaximKartasheva at 2007-7-12 20:44:59
# 10

> So if you find more errors like this, please report

> them here.

The smedia_get_handle() function in libsmedia.so

triggers *lots* of rtc SIGSEGV problems:

% cat smedia.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/smedia.h>

int
main(int argc, char **argv)
{
    char *devname = "/dev/removable-media/rdsk/c0t0d0p0";
    int fd;

    if (argv[1])
        devname = argv[1];
    fd = open(devname, O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        perror(devname);
        exit(1);
    }
    smedia_get_handle(fd);
}

% cc -g -o smedia smedia.c -lsmedia

(Run it with the raw device for your system's cd/dvd device, e.g.

something like ./smedia /dev/rdsk/c6t0d0p0)

So far, I had to add at least these libc.so functions to

rtc skippatch, but it's still crashing:

rtc skippatch libc.so.1 -f _clear_internal_mbstate \

_thr_keycreate_once _pthread_key_create_once_np \

_getaux _fstat readdir64_r \

_nsc_proc_is_cache _nsc_getdoorbsize nss_dbop_search \

_nsc_initdoor_fp _nsc_proc_is_cache _nsc_proc_is_cache \

_s_fcntl __door_info htonl ntohl \

membar_consumer

Most of the rtc crashes are inside a clnt_create() call, which is

using netdir_getbyname(), and the name service switch routines.

Note: My test box is a NIS client, using a nsswitch.nis configuration

in /etc/nsswitch.conf.

Typical crash:

e7d8cbc7 s_fcntl (e7df6650, e7df669c, e7df6650, e7df5000, 8, fdac00b8)

e7d52f59 _nsc_try1door (e7df6650, 80406b0, 80406b4, 80406b8, 804066c, 0) + 21

e7d532c3 _nsc_trydoorcall_ext (80406b0, 80406b4, 80406b8, db527718, 0, e7df5000) + 21b

e7d60eff _nsc_search (db59f258, db527718, 4, 8040758, 8066e84, 0) + bf

e7d5faf6 nss_search (db59f258, db527718, 4, 8040758) + 32

db52b36f _switch_getipnodebyname_r (8040b89, 8066e84, 8066e98, 2120, 1a, 3) + 7b

db52a44a _get_hostserv_inetnetdir_byname (8066dc8, 8040858, 8040838) + a9e

db525e0d netdir_getbyname (8066dc8, 8040898, 80408d8) + c9

db54baf8 _getclnthandle_timed (8040b89, 8066dc8, 8040918, db59f9e8) + 1ac

db54c5a4 __rpcb_findaddr_timed (1873b, 1, 8066dc8, 8040b89, 80409b4, db59f9e8) + 43c

db540b6e clnt_tp_create_timed (8040b89, 1873b, 1, 8066dc8, 0) + 42

db5404b5 clnt_create_timed (8040b89, 1873b, 1, 0, 0) + 179

db540336 clnt_create (8040b89, 1873b, 1, 0, 5, 8064790) + 26

dc9017aa is_server_running (80648a0, e230296d, 8064790, dc913000, f3142c3c, f311464c) + 3e

dc901db7 get_handle_from_fd (5, e2312a24, 8, 0, 80413a8, d0abf2df) + e3

dc9015af smedia_get_handle (5, 403, 8041340, f3114651, 64, f3142c3c) + 1b

Juergen.Keila at 2007-7-12 20:44:59