-xalias_level effect on getting signal SEGV

Hi,

I'm using a third-party tool which generates *.pc and *.c source files. The error comes from ex_set(), which calls strncpy(). This is what dbx says:

--

dbxenv suppress_startup_message 6.2

Reading ALL_ACBS_LLOANquery0.exe

dbx: warning: core object name "ALL_ACBS_LLOANq" matches

object name "ALL_ACBS_LLOANquery0.exe" within the limit of 14. assuming they match

core file header read successfully

Reading ld.so.1

Reading libclntsh.so.9.0

Reading libnsl.so.1

Reading libsocket.so.1

Reading libgen.so.1

Reading libdl.so.1

Reading libaio.so.1

Reading librt.so.1

Reading libm.so.1

Reading libthread.so.1

Reading libc.so.1

Reading libwtc9.so

Reading libmp.so.2

Reading libmd5.so.1

Reading libc_psr.so.1

detected a multithreaded program

program terminated by signal SEGV (access to address exceeded protections)

0xffffffff7d43ddbc: strncpy+0x0664: stb %i3, [%i0]

(/opt/SUNWspro/WS6U2/bin/sparcv9/dbx) where -v

current thread: t@1

=>[1] strncpy(0x100043ad8, 0x100278f61, 0x2, 0x32, 0x800, 0x100043ad8), at 0xffffffff7d43ddbc

[2] ex_set(0xffffffff7fffdf30, 0x100153d50, 0x1, 0x11289c, 0x100151b70, 0x3), at 0x100037908

[3] tr_trans_5(0x14d8, 0x1000, 0x588, 0x1400, 0x15a0, 0x5), at 0x100025600

[4] pf_process_5_data(0xffffffffffffffff, 0x1336dc, 0x3, 0x13383c, 0x1400, 0x17a0), at 0x1000140a0

[5] pf_process_5_row(0x1, 0x100147740, 0x1800, 0x1820, 0x100152d50, 0xffffffff7fffda18), at 0x100013f18

[6] pf_process_group_5(0x1850, 0x0, 0x10014cc78, 0x0, 0x100152d50, 0x100150190), at 0x100013d4c

[7] main(0x3, 0xffffffff7fffe328, 0xffffffff7fffe348, 0x0, 0x0, 0x100000000), at 0x100008cb8

(/opt/SUNWspro/WS6U2/bin/sparcv9/dbx)

-

And here is ex_set():

-

struct ex_part * ex_set( struct ex_part *target
                        ,struct ex_part *arg1
                        ,EX_BOOLEAN static_memory)
{
    /*
     * Function: ex_set
     * Receives: two ex_parts (target and arg1) and
     *           a boolean flag telling if the target ex_part
     *           is in static (compiled) memory.
     * Returns: 'target' ex_part, set to the value of 'arg1' ex_part.
     *
     * Example:
     * target = ex_set(target,arg1);
     *
     * This function returns the ex_part it was sent.
     * However, it can MALLOC space for target->string_data, if
     * the third argument(static_memory, is FALSE.
     * It transfers type, status, location, and data from ex_part arg1
     * to the ex_part target.
     */

    /* copy type, status_ind and location_ind from arg1 into a. */
    target->type = arg1->type;
    target->status_ind= arg1->status_ind;
    target->location_ind = arg1->location_ind;
    target->data = arg1->data;

    /*
     * If arg1 has string_data to copy (arg1->alloc_len > 0)
     * we may need to allocate memory for target->string_data
     * unless target is static memory, such as buffer 2 or 3.
     */
    if (!static_memory
        && arg1->data_len > target->alloc_len) {
        /** adding memory for target string_data to avoid truncation */
        ex_str_in_part_alloc(target, arg1->data_len);
    }

    if ((target->alloc_len) && (arg1->data_len > 0)){
        /** copy the string data, truncating if necessary */
        if (arg1->data_len > 0) {
            if (target->alloc_len >= arg1->data_len + 1) {
                /** no truncation */
                target->data_len = arg1->data_len;
                strncpy(target->string_data, arg1->string_data, arg1->data_len);
                target->string_data[arg1->data_len] = NULL_CHAR;
            } else {
                /** truncate to fit target */
                target->data_len = target->alloc_len - 1;
                strncpy(target->string_data, arg1->string_data, target->data_len);
                target->string_data[target->data_len] = NULL_CHAR;
            }
        } else {
            memset(target->string_data, NULL_CHAR, target->alloc_len);
            target->data_len = 0;
        }
    } else {
        target->data_len = 0;
    } /** end if: (arg1->data_len > 0) */

    return target;
} /** end ex_set */

-

ex_set() is called frequently (247 times) by the main program.

The following are the make messages during compilation:

-

Building ALL_ACBS_LLOANquery0.exe ...

/usr/ccs/bin/make -f /u01/app/dsldev/oracle/product/9.2.0/precomp/demo/proc/demo_proc.mk PROCFLAGS="MODE=ANSI" PCCSRC=ALL_ACBS_LLOANquery0 I_SYM=include= pc1

proc MODE=ANSI iname=ALL_ACBS_LLOANquery0 include=. include=/u01/app/dsldev/oracle/product/9.2.0/precomp/public include=/u01/app/dsldev/oracle/product/9.2.0/rdbms/public include=/u01/app/dsldev/oracle/product/9.2.0/rdbms/demo include=/u01/app/dsldev/oracle/product/9.2.0/plsql/public include=/u01/app/dsldev/oracle/product/9.2.0/network/public

Pro*C/C++: Release 9.2.0.7.0 - Production on Thu Mar 15 14:46:20 2007

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

System default option values taken from: /u01/app/dsldev/oracle/product/9.2.0/precomp/admin/pcscfg.cfg

cc -xO3 -Xa -xstrconst -dalign -xF-xildoff -errtags=yes -v -xarch=v9 -xchip=ultra3 -W2,-AKNR_S -Wd,-xsafe=unboundsym -Wc,-Qiselect-funcalign=32 -xcode=abs44 -Wc,-Qgsched-trace_late=1 -Wc,-Qgsched-T5 -xalias_level=weak -D_REENTRANT -DSS_64BIT_SERVER -DBIT64 -DMACHINE64 -K PIC -DPRECOMP -I. -I/u01/app/dsldev/oracle/product/9.2.0/precomp/public -I/u01/app/dsldev/oracle/product/9.2.0/rdbms/public -I/u01/app/dsldev/oracle/product/9.2.0/rdbms/demo -I/u01/app/dsldev/oracle/product/9.2.0/plsql/public -I/u01/app/dsldev/oracle/product/9.2.0/network/public -DSLMXMX_ENABLE -DSLTS_ENABLE -D_SVID_GETTOD -D_REENTRANT-c ALL_ACBS_LLOANquery0.c

cc -xO3 -Xa -xstrconst -dalign -xF-xildoff -errtags=yes -v -xarch=v9 -xchip=ultra3 -W2,-AKNR_S -Wd,-xsafe=unboundsym -Wc,-Qiselect-funcalign=32 -xcode=abs44 -Wc,-Qgsched-trace_late=1 -Wc,-Qgsched-T5 -xalias_level=weak -D_REENTRANT -DSS_64BIT_SERVER -DBIT64 -DMACHINE64 -K PIC -DPRECOMP -I. -I/u01/app/dsldev/oracle/product/9.2.0/precomp/public -I/u01/app/dsldev/oracle/product/9.2.0/rdbms/public -I/u01/app/dsldev/oracle/product/9.2.0/rdbms/demo -I/u01/app/dsldev/oracle/product/9.2.0/plsql/public -I/u01/app/dsldev/oracle/product/9.2.0/network/public -DSLMXMX_ENABLE -DSLTS_ENABLE -D_SVID_GETTOD -D_REENTRANT-c exfcns.c

cc -xarch=v9 -o ALL_ACBS_LLOANquery0.exe ALL_ACBS_LLOANquery0.o exfcns.o -L/u01/app/dsldev/oracle/product/9.2.0/lib/ -lclntsh `cat /u01/app/dsldev/oracle/product/9.2.0/lib/ldflags``cat /u01/app/dsldev/oracle/product/9.2.0/lib/sysliblist` -R/u01/app/dsldev/oracle/product/9.2.0/lib -laio -lposix4 -lm -lthread

Some points to consider:

1. The same makefile is used for all other similar applications (more than 100 apps).

2. Only this executable has the problem.

3. ex_set() is defined in exfcns.c, which is always part of the compilation for all apps and never changes from one app to another.

4. The exe file is not multi-threaded, even though "-lthread" is used in the makefile.

5. Important: I do NOT get the error if I compile the main program without the -xO3 option (which causes -xalias_level to be ignored), compile exfcns.c with -xO3, and link the two objects. (If I compile both source files with -xO3, or both without it, I get the error.)

Hope my posting is not too long, I just wanted to provide all required information.

Please let me know what the explanation for this might be.

I really appreciate your help and comments.

Thanks,

Babak

[7744 byte] By [babakfa] at [2007-11-26 21:47:53]
# 1

What happens if you just remove the -xalias_level=weak without changing anything else?

If the problem goes away, I see these possibilities:

1. You have a pointer alias in the program that violates the promises made by using -xalias_level=weak. The pointer alias is not necessarily in this function.

2. Use of a wild pointer (see below).

3. Compiler bug.

I don't know of a way to find pointer aliasing that violates the requirements of -xalias_level=weak. You would have to check all casts and uses of unions that result in a T1* being cast (or converted via a union) to a T2*, where neither T1 nor T2 is char or void.
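As a hypothetical illustration of the kind of cast to look for (none of these names come from the generated code), the sketch below shows a T1* to T2* conversion where neither type is char or void, next to a byte-copy alternative that avoids the pointer cast. It assumes the optimizer is allowed to treat accesses through different basic types as non-aliasing; the helper name double_bits is invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Risky pattern (left as a comment on purpose): a double* converted
 * to a uint64_t*, where neither type is char or void.
 *
 *     uint64_t bits = *(uint64_t *)&value;
 */

/* Hypothetical helper: obtain the bit pattern of a double by copying
 * its bytes, so no object is accessed through a mismatched pointer. */
uint64_t double_bits(double value)
{
    uint64_t bits;

    /* The byte copy is only meaningful if both types have the same size. */
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Searching the sources for casts of this shape, and for unions that are written through one member and read through another, is one way to narrow down candidates by hand.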

If the problem remains after removing the -xalias_level=weak option we have these possibilities.

1. The call to strncpy involves overlapping data areas. The results are then undefined, and the compiler optimization level can make a difference in what happens.

2. Use of a wild pointer (see below).

3. Compiler bug.

The stack trace shows function parameters, but in optimized code, the values shown are often not the real values. Here's how you can find out the actual arguments.

Under dbx:

(dbx) stop in strncpy -count infinity

(dbx) run

[the program crashes]

(dbx) status

You will see something like count = 132/infinity

Now you know that the crash happened during the 132nd call, so you can stop at the 132nd entry to the function.

(dbx) stop in strncpy -count 132

(dbx) run

Now you can inspect the function arguments. See whether they overlap. If so, that's your problem. Either arrange for them not to overlap, or use memmove (which is defined for overlapping regions) instead of strncpy.
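The overlap test itself can be sketched in a few lines of C. This is only an illustration under stated assumptions: regions_overlap is a hypothetical helper, not part of the generated code, and it assumes that converting pointers to uintptr_t gives comparable addresses, which holds on flat-address platforms such as SPARC V9.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-byte regions starting at
 * dst and src share at least one byte, 0 otherwise. */
int regions_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d, s;

    assert(dst != NULL && src != NULL);
    if (n == 0)
        return 0;               /* empty regions cannot overlap */

    d = (uintptr_t)dst;
    s = (uintptr_t)src;

    /* The regions overlap when the distance between the two start
     * addresses is smaller than the region length. */
    return (d < s) ? (s - d < n) : (d - s < n);
}
```

Applied to the first three values that dbx reports for the stopped strncpy call (destination, source, count), it answers the overlap question directly.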

Using a wild pointer (uninitialized pointer, invalid pointer, pointer to a deleted or out-of-scope object) can result in any kind of program failure. Usually the failure is distant in space and time from the error that caused it. Dbx Run Time Checking (RTC) will find most uses of wild pointers.

(dbx) check all

(dbx) run

On SPARC, RTC checks for all kinds of invalid access and tells you where they happen. On x86, checking is somewhat limited, due to lack of needed hardware support.

If in the end you suspect a compiler bug, file it via your Sun Services representative. If you don't have a service contract, you can file a bug report at bugs.sun.com.

Either way, you will need to provide a stand-alone test case that demonstrates the problem.

clamage45a at 2007-7-10 3:38:21
# 2
Thanks for your detailed response. I will follow your recommendations and let you know the result.
babakfa at 2007-7-10 3:38:21
# 3

It failed even without -xalias_level. Everything in the code looks fine, including the arguments to strncpy: calling strlen() on both strncpy() arguments through dbx just before the crash did not produce any error.

Anyway, since the C code is generated by the third-party tool I'm using, I cannot ship modified C code to Production, so I tried to achieve the same functionality a different way through its user interface, and it worked with my standard compiler options (-xO3 and -xalias_level=weak).

It seems to be a bug in the tool, but I can't do anything about it except prove to the vendor that there is such a problem in their code.

I appreciate your thoughts and help.

Thanks.

babakfa at 2007-7-10 3:38:21