Hi,
I have reduced my problem to this test:
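#include <stdio.h>
#include <immintrin.h>

int main()
{
    int i;
    float tmp[16];

    /* Fill the destination with 5.0 and print it. */
    for (i = 0; i < 16; i++) {
        tmp[i] = 5.0f;
        printf("%f ", tmp[i]);
    }
    printf("\n");

    /* All 16 lanes hold 10.0; the mask has only bit 6 set. */
    __m512 __vtmp = _mm512_set1_ps(10.0f);
    __mmask16 mask = 0x0040;
    _mm512_mask_extpackstorelo_ps(&tmp, mask, __vtmp, _MM_DOWNCONV_PS_NONE, 0);

    /* Print the destination again after the masked pack-store. */
    for (i = 0; i < 16; i++) {
        printf("%f ", tmp[i]);
    }
    printf("\n");
    return 0;
}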
According to the description in the ISA manual, with the mask 0x0040 the first element of 'tmp' should not be written. However, the output of this code is:
5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
10.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000 5.000000
Looking at the assembly, for some reason the mask 0x0040 is being translated to $1:
stmxcsr 64(%rsp) #4.1 c1
movl $1, %eax #11.23 c2
vprefetche0 (%rsp) #10.9 c2
orl $32832, 64(%rsp) #4.1 c6
kmov %eax, %k1 #11.23 c6
ldmxcsr 64(%rsp) #4.1 c10
vbroadcastsd .L_2il0floatpacket.1(%rip), %zmm0{%k1} #11.23 c11
xorl %ecx, %ecx #8.5 c15
movl $1084227584, %edx #10.9 c15
xorl %r12d, %r12d #8.5 c19
vpackstorelpd %zmm0, 72(%rsp){%k1} #11.23 c19
movl %edx, %ebx #11.23 c23
movq %rcx, %r15
Am I missing something?
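One possibility worth checking is that a pack-store does not behave like an ordinary masked store: as far as I understand it, the selected lanes are compacted and written contiguously starting at the base address, so the bit position in the mask does not choose the destination slot. Below is a minimal scalar sketch of that reading. The function name packstore_model is hypothetical (it is not an Intel intrinsic), and the model is an assumption: it ignores the fact that the "lo" variant only writes up to the next 64-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a pack-store (not an Intel API).
   Assumption: lanes whose mask bit is set are written one after another,
   starting at dst[0]; lanes whose bit is clear are skipped entirely.
   Returns the number of elements written. */
static size_t packstore_model(float *dst, uint16_t mask, const float src[16])
{
    size_t written = 0;
    int lane;

    assert(dst != NULL);
    assert(src != NULL);

    for (lane = 0; lane < 16; lane++) {
        if (mask & (1u << lane)) {
            /* The destination index counts selected lanes, not lane numbers. */
            dst[written++] = src[lane];
        }
    }
    return written;
}
```

Under this model, calling packstore_model(tmp, 0x0040, src) with every src element equal to 10.0f writes a single value to tmp[0] and leaves tmp[6] untouched, which would match the output above. It might also explain the $1 in the assembly: when all lanes hold the same value and only one bit is set, storing lane 0 with mask 1 gives the same result.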
I'm using icc (ICC) 14.0.2 20140120
Thank you