SPARC performance "bug"
[I've submitted this as a bug report at bugs.sun.com. I'd still be interested in comments, however.]
Compile the following code with Sun Studio Express 2006-06, using
"-fast -xarch=v9b -xia" (on an UltraIIICu arch). (I require -xia because
this is interval-based code, but it doesn't seem to affect the results.)
[code]
#include <algorithm>
#include <cmath>
#define MIN_MAX_MAG 4
#if MIN_MAX_MAG == 1
void min_max_mag(double& x, double& y) {
    if (fabs(x) > fabs(y))
        std::swap(x, y);
}
#elif MIN_MAX_MAG == 2
void min_max_mag(double& x, double& y) {
    if (fabs(x) > fabs(y)) {
        double t = x;
        x = y;
        y = t;
    }
}
#elif MIN_MAX_MAG == 3
void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    bool do_swap = fabs(x) > fabs(y);
    if (do_swap)
        x_ = y;
    if (do_swap)
        y_ = x;
    y = y_;
    x = x_;
}
#elif MIN_MAX_MAG == 4
void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (fabs(x) > fabs(y)) {
        x_ = y;
        y_ = x;
    }
    y = y_;
    x = x_;
}
#elif MIN_MAX_MAG == 5
void min_max_mag(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (fabs(x) > fabs(y))
        x_ = y;
    if (fabs(x) > fabs(y))
        y_ = x;
    y = y_;
    x = x_;
}
#endif
void add_error(double x, double y, double& r, double& e) {
    r = x + y;
    min_max_mag(x, y);
    double t = y - r;
    e = x + t;
}
[/code]
All the versions of min_max_mag produce the same "result".
The difference is that with MIN_MAX_MAG 1 and 2 the compiler produces code
that uses branches, whereas with MIN_MAX_MAG 3, 4 and 5 it uses
move-on-condition (fmov* %fcc*) instructions.
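The claim that the variants give the same "result" can be spot-checked with a small harness. This is only a sketch: the function names min_max_mag_1, min_max_mag_4 and variants_agree are hypothetical (the post selects a variant with the MIN_MAX_MAG macro instead), and only variants 1 and 4 are compared here.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>

// Hypothetical name: variant 1 from the post (branch plus std::swap).
void min_max_mag_1(double& x, double& y) {
    if (std::fabs(x) > std::fabs(y))
        std::swap(x, y);
}

// Hypothetical name: variant 4 from the post (copies, one conditional block).
void min_max_mag_4(double& x, double& y) {
    double x_ = x;
    double y_ = y;
    if (std::fabs(x) > std::fabs(y)) {
        x_ = y;
        y_ = x;
    }
    y = y_;
    x = x_;
}

// Returns true when both variants leave (x, y) in the same final state.
// NaN inputs are not handled: NaN never compares equal to itself.
bool variants_agree(double x, double y) {
    double x1 = x, y1 = y;
    double x4 = x, y4 = y;
    min_max_mag_1(x1, y1);
    min_max_mag_4(x4, y4);
    return x1 == x4 && y1 == y4;
}
```

Calling variants_agree on a few pairs (larger magnitude first, smaller magnitude first, equal magnitudes) covers both the swap and the no-swap path. It says nothing about which instructions the compiler emits; that still has to be read from the assembly.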
"Should" the compiler be generating move-on-condition instructions for
MIN_MAX_MAG 1 and 2? (Where "should" is hard to quantify.)
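For comparison, one more formulation that may be worth trying is a pair of conditional expressions on a single precomputed flag. This is a sketch only: min_max_mag_select is a hypothetical name that is not in the code above, and I have not checked whether Sun Studio turns it into move-on-condition instructions or into branches.

```cpp
#include <cmath>

// Hypothetical extra variant: select each output with a conditional
// expression instead of an if statement. Same result as the other variants.
void min_max_mag_select(double& x, double& y) {
    const double x0 = x;
    const double y0 = y;
    const bool do_swap = std::fabs(x0) > std::fabs(y0);
    x = do_swap ? y0 : x0;  // smaller magnitude ends up in x
    y = do_swap ? x0 : y0;  // larger magnitude ends up in y
}
```

Both outputs are computed from the saved copies x0 and y0, so the two selects are independent of each other, which is the property that variants 3, 4 and 5 also share.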
(Note: using Sun Studio 11, only MIN_MAX_MAG 5 had move-on-condition
instructions, although interestingly only when inlined into add_error, not
in min_max_mag itself; the others (3 and 4) had branches.
I'm glad to see there's some improvement that will be available in SS12.)
It is interesting to compare the output from Sun Studio Express 2006-06 for
MIN_MAX_MAG 3, 4 and 5. The generated code is subtly different in each case,
when I believe it should be the same.

