SPARC performance "bug"

[I've submitted this as a bug report at bugs.sun.com. I'd still be interested in comments, however.]

Compile the following code with Sun Studio Express 2006-06, using
"-fast -xarch=v9b -xia" (on an UltraIIICu arch). (I require -xia because
this is interval-based code, but it doesn't seem to affect the results.)

[code]

#include <algorithm>
#include <cmath>

#define MIN_MAX_MAG 4

#if MIN_MAX_MAG == 1

void min_max_mag(double& x, double& y) {
    if (fabs(x) > fabs(y))
        std::swap(x, y);
}

#elif MIN_MAX_MAG == 2

void min_max_mag(double& x, double& y) {
    if (fabs(x) > fabs(y)) {
        double t = x;
        x = y;
        y = t;
    }
}

#elif MIN_MAX_MAG == 3

void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    bool do_swap = fabs(x) > fabs(y);
    if (do_swap)
        x_ = y;
    if (do_swap)
        y_ = x;
    y = y_;
    x = x_;
}

#elif MIN_MAX_MAG == 4

void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (fabs(x) > fabs(y)) {
        x_ = y;
        y_ = x;
    }
    y = y_;
    x = x_;
}

#elif MIN_MAX_MAG == 5

void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (fabs(x) > fabs(y))
        x_ = y;
    if (fabs(x) > fabs(y))
        y_ = x;
    y = y_;
    x = x_;
}

#endif

void add_error(double x, double y, double& r, double& e) {
    r = x + y;
    min_max_mag(x, y);
    double t = y - r;
    e = x + t;
}

[/code]
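For readers wondering what add_error is for: the post does not say, but the sequence (sum, order by magnitude, subtract, add) looks like the familiar "fast two-sum" pattern, in which e recovers the rounding error of r = x + y. The sketch below is an interpretation under that assumption, using variant 1 of min_max_mag and plain IEEE double arithmetic with no value-changing optimizations.

```cpp
#include <cmath>
#include <utility>

// Variant 1 from the post: after the call, |x| <= |y|.
void min_max_mag(double& x, double& y) {
    if (std::fabs(x) > std::fabs(y))
        std::swap(x, y);
}

// Same body as in the post. Assumed reading: r is the rounded sum and
// e is the part of the exact sum that was lost when rounding to r.
void add_error(double x, double y, double& r, double& e) {
    r = x + y;            // rounded sum
    min_max_mag(x, y);    // now y has the larger magnitude
    double t = y - r;     // larger operand minus rounded sum
    e = x + t;            // smaller operand plus that difference
}
```

With x = 1e16 and y = 1.0 the sum rounds back to 1e16 in double precision, and e comes out as 1.0, the part that was dropped. This only holds if the compiler evaluates the expressions as written.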

All the versions of min_max_mag produce the same "result".
The difference is that with MIN_MAX_MAG 1 and 2 the compiler produces code
that uses branches, whereas with MIN_MAX_MAG 3, 4 and 5 it uses
move-on-condition (fmov* %fcc*) instructions.
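The claim that the variants agree can be checked directly. The sketch below copies variants 1 and 4 from the listing under distinct names (min_max_mag_1 and min_max_mag_4 are renamed here so both can coexist) and compares them on a few sample inputs. variants_agree and self_check are hypothetical helpers, not part of the original code, and the check says nothing about which instructions a compiler emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>

// Variant 1 from the post: std::swap behind a branch.
void min_max_mag_1(double& x, double& y) {
    if (std::fabs(x) > std::fabs(y))
        std::swap(x, y);
}

// Variant 4 from the post: copy to locals, overwrite both inside one if.
void min_max_mag_4(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (std::fabs(x) > std::fabs(y)) {
        x_ = y;
        y_ = x;
    }
    y = y_;
    x = x_;
}

// Hypothetical helper: true if both variants leave (x, y) in the same state.
bool variants_agree(double x, double y) {
    double x1 = x, y1 = y;
    double x4 = x, y4 = y;
    min_max_mag_1(x1, y1);
    min_max_mag_4(x4, y4);
    return x1 == x4 && y1 == y4;
}

// Hypothetical helper: compare the variants on every pair of sample values.
void self_check() {
    const double samples[] = {0.0, 1.0, -1.0, 2.5, -3.0, 1e300, -1e-300};
    const std::size_t n = sizeof(samples) / sizeof(samples[0]);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            assert(variants_agree(samples[i], samples[j]));
}
```

NaN inputs are left out of the samples on purpose, since == would report a mismatch even when both variants behave identically.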

"Should" the compiler be generating move-on-condition instructions for
MIN_MAX_MAG 1 and 2? (Where "should" is hard to quantify.)
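For comparison, the same selection can also be spelled with conditional expressions, which states the "pick one of two values" intent most directly. The function below is a hypothetical extra variant (min_max_mag_select is not in the original listing), and whether any particular compiler turns it into move-on-condition instructions is untested here. It is only a sketch of another source form that could be tried.

```cpp
#include <cmath>

// Hypothetical sixth variant (not in the original post): both outputs are
// chosen with conditional expressions instead of assignments under an if.
void min_max_mag_select(double& x, double& y) {
    const double x0 = x;
    const double y0 = y;
    const bool do_swap = std::fabs(x0) > std::fabs(y0);
    x = do_swap ? y0 : x0;   // smaller magnitude ends up in x
    y = do_swap ? x0 : y0;   // larger magnitude ends up in y
}
```

Reading both inputs into locals first keeps the two selections independent of the order in which x and y are written back.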

(Note: using Sun Studio 11, only MIN_MAX_MAG 5 had move-on-condition
instructions (although, interestingly, only when inlined into add_error, not
in min_max_mag itself); the others (3 and 4) had branches.
I'm glad to see there's some improvement that will be available in SS12.)

It is interesting to compare the output from SSE 2006-06 (Sun Studio Express)
for MIN_MAX_MAG 3, 4 and 5. They are subtly different, when I believe they
should be the same.

[2591 byte] By [slashlib] at [2007-11-26 10:15:48]
# 1

It sounds like you are expecting source code that is logically equivalent to generate identical (and the best) object code. I think that's asking quite a lot of the compiler, particularly when some of the code uses library functions, and some is a hand-coded "equivalent".

That said, we are always concerned about performance issues, particularly when reasonable source code yields poor object code.

clamage45 at 2007-7-7 2:08:28
# 2
RFE 6473069 has been filed for this problem, and should be visible at bugs.sun.com. (A day or so might be required for the report to be propagated.)
clamage45 at 2007-7-7 2:08:28