The current release of the SSE/SSE2 intrinsics covers only the packed integral instructions, through the header sunmedia_intrin.h, located in
.../[intel-S2|intel-Linux]/prod/include/cc/sys
We have implemented a set of intrinsics that is fully compatible with icc and gcc, with matching header files mmintrin.h, xmmintrin.h, emmintrin.h and pmmintrin.h, which will be made available to our compilers in the very near future. These will include all packed integral, floating point, cache and SSE3 intrinsics.
For example:
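    #include <emmintrin.h>    /* SSE2 intrinsics and the __m128d type */

    __m128d dm1;              /* result: two packed doubles */
    double dp1, dp2;
    double *dp3;              /* must point to 16-byte aligned storage */

    void foo()
    {
        __m128d d1, d2;

        d1 = _mm_set_pd(dp1, dp2);   /* build a vector from two scalars */
        d2 = _mm_load_pd(dp3);       /* aligned load of two doubles */
        dm1 = _mm_add_pd(d1, d2);    /* element-wise add */
    }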
One possible code sequence generated for this function is:
    foo:
        movsd    dp1,%xmm0
        unpcklpd dp2,%xmm0
        movl     dp3,%eax
        movapd   (%eax),%xmm1
        addpd    %xmm1,%xmm0
        movapd   %xmm0,dm1
        ret
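To check what the intrinsics in the example compute, the same three calls can be wrapped in a function that writes its result back to memory. This is only a minimal sketch: add_pairs and its parameter names are hypothetical and not part of any header, and the alignment assertions are an added safeguard, since _mm_load_pd and _mm_store_pd both require 16-byte aligned addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>   /* SSE2: __m128d, _mm_set_pd, _mm_load_pd, ... */

/* Hypothetical helper (illustration only, not from any shipped header).
   Computes out[0] = lo + src[0] and out[1] = hi + src[1].
   src and out must both be 16-byte aligned. */
void add_pairs(double hi, double lo, const double *src, double *out)
{
    /* Guard the alignment precondition of the aligned load/store. */
    assert(((uintptr_t)(const void *)src % 16) == 0);
    assert(((uintptr_t)(void *)out % 16) == 0);

    /* _mm_set_pd places its FIRST argument in the high element
       and its second argument in the low element. */
    __m128d a = _mm_set_pd(hi, lo);

    /* Aligned load: low element = src[0], high element = src[1]. */
    __m128d b = _mm_load_pd(src);

    /* Element-wise add, then aligned store of both results. */
    _mm_store_pd(out, _mm_add_pd(a, b));
}
```

With src holding {10.0, 20.0} in 16-byte aligned storage (for instance declared with C11 _Alignas(16), assuming the compiler supports it), the call add_pairs(2.0, 1.0, src, out) leaves out holding {11.0, 22.0}. If alignment cannot be guaranteed, _mm_loadu_pd and _mm_storeu_pd are the unaligned counterparts.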
Alfred Huang