Permutes single-precision floating-point elements of the source vector into the destination vector. The corresponding Intel® AVX2 instruction is VPERMPS.
Arguments
val | the vector of 32-bit single-precision floating-point elements to be permuted |
offsets | the vector of eight 32-bit elements, each holding a 3-bit offset (a value in the range [0, 7]) in its low three bits; each offset selects the source element for the corresponding element of the 256-bit result |
Description
Use the offset value in each dword element of the vector offsets to select a single-precision floating-point element from the source vector val. Each selected element is copied to the corresponding element of the destination vector. Each offset is evaluated independently, so the same element of the source vector can be copied to more than one element of the destination vector.
Below is the pseudo-code for the intrinsic:
RESULT[31:0] <- (VAL[255:0] >> (OFFSETS[2:0] * 32))[31:0];
RESULT[63:32] <- (VAL[255:0] >> (OFFSETS[34:32] * 32))[31:0];
RESULT[95:64] <- (VAL[255:0] >> (OFFSETS[66:64] * 32))[31:0];
RESULT[127:96] <- (VAL[255:0] >> (OFFSETS[98:96] * 32))[31:0];
RESULT[159:128] <- (VAL[255:0] >> (OFFSETS[130:128] * 32))[31:0];
RESULT[191:160] <- (VAL[255:0] >> (OFFSETS[162:160] * 32))[31:0];
RESULT[223:192] <- (VAL[255:0] >> (OFFSETS[194:192] * 32))[31:0];
RESULT[255:224] <- (VAL[255:0] >> (OFFSETS[226:224] * 32))[31:0];
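The pseudo-code reduces to RESULT[i] = VAL[OFFSETS[i] & 7] for i = 0 to 7, with no restriction on repeated offsets. The following is a minimal sketch of that rule, not a definitive implementation. The helper name permute8x32_ps and its array-based interface are hypothetical and chosen only for illustration. When the compiler targets AVX2, the sketch calls the _mm256_permutevar8x32_ps intrinsic from immintrin.h; otherwise it falls back to a scalar loop that mirrors the pseudo-code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Hypothetical helper, not part of the intrinsics API:
 * dst[i] = src[idx[i] & 7] for i = 0..7.
 * Only the low three bits of each index are used. */
static void permute8x32_ps(float dst[8], const float src[8],
                           const uint32_t idx[8])
{
    assert(dst != NULL && src != NULL && idx != NULL);
#ifdef __AVX2__
    /* Unaligned loads, so the caller's arrays need no special alignment. */
    __m256  val     = _mm256_loadu_ps(src);
    __m256i offsets = _mm256_loadu_si256((const __m256i *)idx);
    _mm256_storeu_ps(dst, _mm256_permutevar8x32_ps(val, offsets));
#else
    /* Scalar fallback: build the result in a temporary so that dst may
     * alias src, then copy it out. */
    float tmp[8];
    for (int i = 0; i < 8; ++i)
        tmp[i] = src[idx[i] & 7u];
    memcpy(dst, tmp, sizeof tmp);
#endif
}
```

With this helper, the offsets {7, 6, 5, 4, 3, 2, 1, 0} reverse the eight elements, and the offsets {3, 3, 3, 3, 3, 3, 3, 3} broadcast element 3 of the source to every element of the result.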
Parent topic: Intrinsics for Permute Operations