Gathers two or four quadword values from memory, addressed by the given base address, qword indices, and scale, under control of the given qword mask. The corresponding Intel® AVX2 instruction is VPGATHERQQ.
Syntax
extern __m128i _mm_mask_i64gather_epi64(__m128i def_vals, __int64 const * base, __m128i vindex, __m128i vmask, const int scale);
extern __m256i _mm256_mask_i64gather_epi64(__m256i def_vals, __int64 const * base, __m256i vindex, __m256i vmask, const int scale);
Arguments
def_vals | the vector of qword values copied to the destination when the most significant bit of the corresponding element of the vector mask is '0'. |
base | the base address used to reference the loaded qword elements. |
vindex | the vector of qword indices used to reference the loaded qword elements. |
vmask | the vector of qword elements used as a vector mask; only the most significant bit of each qword is used as a mask. |
scale | compile-time constant scale factor (1, 2, 4, or 8) used to address the loaded qword elements; it is multiplied by the corresponding element from 'vindex' to form a byte offset from 'base'. |
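As an illustration of how these arguments fit together, the following is a minimal sketch of a call to the 128-bit form. It assumes an x86-64 target and a GCC- or Clang-compatible compiler (the `target("avx2")` attribute is compiler-specific); the helper name `gather_two_qwords_demo` and the sample table are hypothetical and not part of any Intel header.

```c
#include <immintrin.h>

/* Hypothetical helper: gathers table[3] into lane 0 and leaves lane 1
   masked off. Returns 1 if the result is as expected, 0 otherwise. */
__attribute__((target("avx2")))
static int gather_two_qwords_demo(void)
{
    /* Source table; elements are 8 bytes apart, hence scale = 8. */
    static const long long table[4] = { 10, 20, 30, 40 };

    /* _mm_set_epi64x takes the high lane first, then the low lane. */

    /* def_vals: fallback values (lane 0 = -1, lane 1 = -2). */
    __m128i def_vals = _mm_set_epi64x(-2, -1);

    /* vindex: lane 0 addresses table[3], lane 1 addresses table[0]. */
    __m128i vindex = _mm_set_epi64x(0, 3);

    /* vmask: only bit 63 of each qword is used. All-ones enables
       lane 0; zero disables lane 1. */
    __m128i vmask = _mm_set_epi64x(0, -1);

    /* scale must be a compile-time constant. */
    __m128i r = _mm_mask_i64gather_epi64(def_vals, table, vindex, vmask, 8);

    long long out[2];
    _mm_storeu_si128((__m128i *)out, r);

    /* Lane 0 was loaded from table[3]; lane 1 kept its default. */
    return out[0] == 40 && out[1] == -2;
}
```

The function should only be called on a processor that supports AVX2; with GCC or Clang, `__builtin_cpu_supports("avx2")` is one way to check at run time.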
Description
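The intrinsics conditionally load two or four quadword values from memory using the base address, qword indices, and scale. For each element whose mask bit (the most significant bit of the corresponding qword in vmask) is set, the value is loaded from the byte address base + index * scale; otherwise the corresponding element of def_vals is copied to the result.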
Below is the pseudo-code for the intrinsics:
_mm_mask_i64gather_epi64():
result[63:0] = (vmask[63]==1) ? (mem[base+vindex[63:0]*scale]) : (def_vals[63:0]);
result[127:64] = (vmask[127]==1) ? (mem[base+vindex[127:64]*scale]) : (def_vals[127:64]);
_mm256_mask_i64gather_epi64():
result[63:0] = (vmask[63]==1) ? (mem[base+vindex[63:0]*scale]) : (def_vals[63:0]);
result[127:64] = (vmask[127]==1) ? (mem[base+vindex[127:64]*scale]) : (def_vals[127:64]);
result[191:128] = (vmask[191]==1) ? (mem[base+vindex[191:128]*scale]) : (def_vals[191:128]);
result[255:192] = (vmask[255]==1) ? (mem[base+vindex[255:192]*scale]) : (def_vals[255:192]);
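The pseudo-code above can also be read as a per-element loop. The following scalar sketch in portable C models that behavior for either width; it is an illustrative model only, not how the instruction is implemented, and the function name `emulate_mask_i64gather_epi64` and its array-based parameters are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of the masked qword gather for `lanes` elements
   (2 for the 128-bit form, 4 for the 256-bit form).
   All names here are illustrative, not part of any Intel header. */
static void emulate_mask_i64gather_epi64(int64_t *result,
                                         const int64_t *def_vals,
                                         const void *base,
                                         const int64_t *vindex,
                                         const int64_t *vmask,
                                         int scale,
                                         int lanes)
{
    const unsigned char *bytes = (const unsigned char *)base;
    for (int i = 0; i < lanes; ++i) {
        /* Only the most significant bit (bit 63) of each mask qword is tested. */
        int selected = (int)(((uint64_t)vmask[i] >> 63) & 1u);
        if (selected) {
            /* Effective byte address: base + vindex[i] * scale. */
            memcpy(&result[i], bytes + vindex[i] * scale, sizeof(int64_t));
        } else {
            /* Masked-off lane: no memory access, copy the default value. */
            result[i] = def_vals[i];
        }
    }
}
```

A model like this can serve as a reference when checking the output of vectorized code, since it makes the role of each argument explicit.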