Overflow - Compiler Settings
We are trying to migrate old code, currently on HP-UX, to Solaris 9/SPARC. Although the configuration is for 64-bit, the code will be compiled as 32-bit using Sun Studio 11.
We came across an issue where using strcpy can cause problems.
Example...
char var1[9];
char var2[9];
strcpy(var1, "ABCDEFGH");
strcpy(var2, "123456789");
printf ("Print 1 %s\n", var1);
printf ("Print 2 %s\n", var2);
This leads to incorrect results due to overflow.
gcc compiler and HP-UX compiler are able to handle this bad code.
I understand that the solution is to change the bad code, but is there any compiler flag or setting that will take care of such errors and thus minimise the change effort?
By [Madhu_Rao] at [2007-11-26 8:45:45]

# 1
> This leads to incorrect results due to overflow.
> gcc compiler and HP-UX compiler are able to handle
> this bad code.
Can you describe how they "handle" this bad code? It is very likely that this "handling" is due to a different layout of local variables, so the problem does not show up in [b]this[/b] case but can be seen in another.
# 2
Enlarging on Maxim's comments:
The strcpy function copies from the start of the source array up to and including the first null byte it finds. In this case, it copies the 9 digits plus a null byte. But the destination array is declared to hold only 9 bytes. What happens after that depends on how variables are allocated, and whether the 10th byte is available and not being used for something else.
I think you will find that on HP-UX if you switch the order of the array declarations that the code will also fail. The difference is due to the order of allocation of the variables, not because HP-UX somehow "handles" the code.
You can't fit 10 bytes into a 9-byte array. Your choices are to make the destination array big enough to hold the source, or to not copy more source bytes than will fit.
You certainly cannot depend on this code working on different systems, or on the same system if you make even minor changes in surrounding code.
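The second choice above (not copying more bytes than will fit) can be sketched as follows. This is only an illustration: `copy_bounded` is a hypothetical helper, not something from this thread or from any library, and truncating silently may or may not be acceptable for the real code.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_bounded: hypothetical helper, for illustration only.
 * Copies src into dst, writing at most dst_size bytes including the
 * terminating null byte. Returns 0 if src fit, 1 if it was truncated.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0)
        return 1;                    /* no room even for the terminator */

    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';    /* always terminate the result */
        return 1;
    }

    memcpy(dst, src, len + 1);       /* len characters plus the null byte */
    return 0;
}
```

With `char var2[9]`, the call `copy_bounded(var2, sizeof var2, "123456789")` stores `12345678` and returns 1. Declaring `char var2[10]` instead (the first choice) lets all nine digits and the null byte fit.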
# 3
Thanks for the two replies.
With this program, on HP-UX or when compiled with gcc:
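#include <stdio.h>
#include <string.h>   /* declares strcpy */
int main(void)
{
char var1[9];
char var2[9];
strcpy(var1, "ABCDEFGH");
strcpy(var2, "123456789");   /* the bug under discussion: 10 bytes into a 9-byte array */
printf ("Print 1 %s\n", var1);
printf ("Print 2 %s\n", var2);
return 0;
}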
we see :
Print 1 ABCDEFGH
Print 2 123456789
Switching to
char var2[9];
char var1[9];
also gives the same result.
On Solaris 9 we see:
Print 1
Print 2 123456789
Switching to
char var2[9];
char var1[9];
gives
Print 1 ABCDEFGH
Print 2 123456789
Everything was compiled with the cc command.
What we have seen is that with HP/gcc the arrays somehow start on 8-byte boundaries, versus 1-byte boundaries on Sun.
I am not sure what the HP/gcc compilers are doing to "handle" this; that is what I need to know.
# 4
All right. This is what happens with g++-generated code (I use Sun Studio debugger, dbx, to illustrate it):
[code]
stopped in main at line 13 in file "a.cc"
   13   printf ("Print 1 %s\n", var1);
(dbx) x &var2/30c
0x080470b4: '1' '2' '3' '4' '5' '6' '7' '8' '9' '\0' '\006' '\b' '\0' '\0' '\0' '\0'
0x080470c4: 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' '\0' 'p' '\004' '\b' 'э' 'p'
[/code]
As you can see, var2 precedes var1 in memory and there is a lot of free space between them (7 bytes), so the 10th byte written by strcpy() lands in that free space instead of in var1. This free space mostly contains zeroes (more or less accidentally, which means that you can't count on it).
Now, here is what happens with code generated by Sun's CC:
[code]
stopped in main at line 13 in file "a.cc"
   13   printf ("Print 1 %s\n", var1);
(dbx) ph &var1
&var1 = 0x80470bf <-- here var1 begins
(dbx) ph &var2
&var2 = 0x80470b6 <-- here var2 begins. 0x80470bf-0x80470b6 = 0x9 => no space between arrays
(dbx) x &var2/30c
0x080470b6: '1' '2' '3' '4' '5' '6' '7' '8' '9' '\0' 'B' 'C' 'D' 'E' 'F' 'G'
0x080470c6: 'H' '\0' 'Э' 'q' '\004' '\b' 'Т' 'p' '\004' '\b' '┼' '\b' '\005' '\b'
[/code]
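Here we have var2 immediately followed by var1 with no space between them. The second strcpy() (the one into var2) thus writes its string termination character ('\0') one byte past the end of var2, which is the very beginning of var1. That is why var1 prints as an empty string.
Does this explain the differences you see between gcc and the Sun C compiler?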
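The effect of the two layouts in the dbx dumps can be reproduced without relying on undefined behavior. The sketch below is an illustration only: `var1_length_with_stride` is a hypothetical function, and it places both "arrays" inside one larger buffer so that the 10-byte copy stays in bounds. A stride of 9 mimics the Sun layout (0x80470bf - 0x80470b6), and a stride of 16 mimics the g++ layout (0x080470c4 - 0x080470b4).

```c
#include <stddef.h>
#include <string.h>

/*
 * var1_length_with_stride: hypothetical helper, for illustration only.
 * Places a 9-byte "var2" region at the start of one 32-byte buffer and
 * a 9-byte "var1" region stride bytes later, then performs the same two
 * copies as the program in this thread. Every write stays inside the
 * 32-byte buffer, so the behavior is well defined here, unlike in the
 * original program. Returns the length of the string seen in var1,
 * or (size_t)-1 if stride is outside the supported range 9..23.
 */
size_t var1_length_with_stride(size_t stride)
{
    char storage[32] = {0};
    char *var2 = storage;
    char *var1;

    if (stride < 9 || stride > 23)
        return (size_t)-1;

    var1 = storage + stride;

    strcpy(var1, "ABCDEFGH");    /* 9 bytes: exactly fills the var1 region */
    strcpy(var2, "123456789");   /* 10 bytes: the '\0' lands at offset 9 */

    return strlen(var1);
}
```

With a stride of 9 the terminator of the second copy overwrites the 'A' in var1, so the function returns 0, matching the empty "Print 1" line on Solaris. With a stride of 16 the terminator falls into the unused gap and the function returns 8.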
# 5
Maxim,
Thanks for taking the time to explain. Actually, we had done a similar thing and had seen the differences. What I need to know is why the Sun compiler behaves differently. Is this something internal to the compiler, and is there any way we can change it with flags or settings?
That's all.
# 6
The point here is that the program has undefined behavior and cannot be expected to work on any system. You had the bad luck to use the code on a system where it worked by accident. If you want it to work elsewhere, you need to make the code valid.
I hope it is obvious that you cannot depend on being able to store 10 bytes in a 9-byte array.
# 7
> What I need to know is why the Sun compiler behaves differently..
To understand this, one needs to dig deep into the compiler implementation (more specifically, the compiler's back-end), because this is [b]implementation-dependent[/b].
Simply put, when the compiler assigns an address to a variable of a complex type, it [b]must[/b] align it to at least the alignment of the largest member within it. In your case that member is char, so the compiler only has to align the whole array on an 8-bit (1-byte) boundary.
Why do other compilers align it differently? There can be reasons, but you can't [i]count[/i] on this alignment; it can be changed and the code generated by the compiler will still be considered valid.
> is this something internal to the compiler..and is there anyway we
>can change it by flags/some settings.
There's only an option to specify the [i]maximum[/i] assumed memory alignment, but you need the [i]minimum[/i]. Anyway, I totally agree with Stephen that you cannot depend on [i]wrong[/i] code to execute as if it were correct.
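The arithmetic behind the "free space" seen in the dbx dumps can be written down as a small sketch. `spare_bytes` is a hypothetical helper, and the slot sizes used with it are simply read off the addresses in the dumps (a distance of 16 bytes for g++, 9 bytes for Sun CC); they are not documented rules of either compiler.

```c
#include <stddef.h>

/*
 * spare_bytes: hypothetical helper, for illustration only.
 * If a compiler places each object in a slot whose size is rounded up
 * to a multiple of slot_multiple, this returns how many unused bytes
 * follow an object of object_size bytes inside its slot.
 * slot_multiple must be greater than zero.
 */
size_t spare_bytes(size_t object_size, size_t slot_multiple)
{
    size_t remainder = object_size % slot_multiple;

    return remainder == 0 ? 0 : slot_multiple - remainder;
}
```

For a 9-byte array, `spare_bytes(9, 16)` gives the 7 spare bytes visible in the g++ dump, while `spare_bytes(9, 1)` gives 0, which is the tightly packed Sun layout. Both are valid choices for a char array, which is why the stray tenth byte is harmless in one build and destructive in the other.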
# 8
Thanks for all the help so far; it just confirms our worst fears.
As I understand it, correcting the code is the recommended (probably the only) option. Since we are talking about thousands of files, is there an easy way to identify such code, say with a tool (it needs to be tried, tested and reliable) or with some compiler flag (Sun Studio or gcc) that at least generates a warning? This information would be very useful.
# 9
Such a tool would be dbx, the Sun Studio debugger. On SPARC, it has a very useful memory access checking feature, which should help in identifying illegal memory accesses, even if an access does not immediately lead to an error.
However, in this particular case it won't help, because strcpy() writes to [b]allocated[/b] memory, although this memory belongs to another variable. It should help if the strings are heap-allocated.
As for a compiler option, I fear there are no options that suit your needs.
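Short of a compiler option, one way to get a compile-time error for the exact pattern in the original post is a wrapper macro. The sketch below is an assumption-laden illustration: `CHECKED_STRCPY` and `copy_that_fits` are hypothetical names, not part of Sun Studio, gcc, or any library. It only works when the destination is a real array (not a pointer) and the source is a string literal or an array, because it relies on sizeof seeing the full object; for pointers it would give misleading results.

```c
#include <string.h>

/*
 * CHECKED_STRCPY: hypothetical macro, for illustration only.
 * When src does not fit into dst, the array type char[-1] is formed,
 * which the compiler must reject, so the build fails at that line.
 * Valid only for true arrays and string literals, never for pointers.
 */
#define CHECKED_STRCPY(dst, src) \
    ((void)sizeof(char[(sizeof(src) <= sizeof(dst)) ? 1 : -1]), \
     strcpy((dst), (src)))

/* Small demonstration: 8 letters plus the null byte fit in 9 bytes. */
const char *copy_that_fits(void)
{
    static char var1[9];

    CHECKED_STRCPY(var1, "ABCDEFGH");
    /* CHECKED_STRCPY(var1, "123456789"); would not compile: 10 > 9 */
    return var1;
}
```

Replacing strcpy with such a macro across thousands of files is still a code change, but it is a mechanical one, and every place where a literal is too large for its destination array then shows up as a build error rather than as a run-time surprise.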