Miscellaneous Operations
Parent topic: Vector Intrinsics
_mm512_alignr_epi32/ _mm512_mask_alignr_epi32
Shifts int32 elements right and concatenates vectors. Corresponding instruction is VALIGND. This intrinsic only applies to...
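As a rough illustration of the shift-and-concatenate behavior summarized above, here is a portable scalar sketch in C. It is not the intrinsic itself. It assumes the usual VALIGND convention, in which the second operand supplies the low half of the concatenation, and a count in the range 0 to 15. The names alignr_i32_model and LANES are hypothetical.

```c
#include <stdint.h>

#define LANES 16 /* number of int32 elements in a 512-bit vector */

/* Hypothetical scalar model, not an Intel API.
   Assumption: b supplies the low 16 elements of the concatenation and
   a supplies the high 16. Returns 0 on success, -1 if count is not 0..15. */
static int alignr_i32_model(int32_t dst[LANES], const int32_t a[LANES],
                            const int32_t b[LANES], unsigned count)
{
    int32_t concat[2 * LANES];

    if (count >= LANES)
        return -1;

    /* Build the 32-element concatenation: b in the low half, a in the high half. */
    for (int i = 0; i < LANES; ++i) {
        concat[i] = b[i];
        concat[LANES + i] = a[i];
    }

    /* Shifting right by count elements means element i comes from i + count. */
    for (int i = 0; i < LANES; ++i)
        dst[i] = concat[i + count];

    return 0;
}
```

With a count of 3, for example, the first 13 output elements come from elements 3 to 15 of b and the last 3 come from elements 0 to 2 of a.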
_mm512_i32lo[ext]scatter_epi64/ _mm512_mask_i32lo[ext]scatter_epi64
Scatters an int64 vector using int32 indices. Corresponding instruction is VPSCATTERDQ. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture). Syntax: Without...
_mm512_extloadunpackhi_ps/ _mm512_mask_extloadunpackhi_ps
Loads high 64-byte aligned portion of unaligned doubleword stream, unpacks mask-enabled elements that fall in that portion, and stores those elements in float32 vector. Corresponding instruction is...
_mm512_load_ps/ _mm512_mask_load_ps
Loads float32 vector. Corresponding instruction is VMOVAPS. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture). Syntax, without mask: extern __m512 __cdecl...
_mm256_or_si256
Performs bitwise logical OR operation on signed integer vectors. The corresponding Intel® AVX2 instruction is VPOR. Syntax: extern __m256i _mm256_or_si256(__m256i s1, __m256i s2); Arguments: s1: signed integer...
_mm256_mullo_epi16/32
Multiplies signed packed 16/32-bit integer data elements of two vectors and stores low bits. The corresponding Intel® AVX2 instruction is VPMULLW or VPMULLD. Syntax: extern __m256i...
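To make "stores low bits" concrete, the following is a minimal scalar sketch in C of what happens in a single 16-bit lane. It is a portable model, not the AVX2 intrinsic, and the helper name mullo_i16_model is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane, not an Intel API.
   Computes the full 32-bit product and keeps only its low 16 bits. */
static int16_t mullo_i16_model(int16_t a, int16_t b)
{
    int32_t full = (int32_t)a * (int32_t)b; /* exact product fits in 32 bits */
    uint16_t low = (uint16_t)full;          /* well-defined modular truncation */

    /* Reinterpret the low 16 bits as a signed value without relying on
       implementation-defined narrowing conversions. */
    if (low >= 0x8000u)
        return (int16_t)((int32_t)low - 0x10000);
    return (int16_t)low;
}
```

For instance, 300 * 300 is 90000 (0x15F90), so the lane keeps 0x5F90, which is 24464; the upper bits of the product are discarded.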
_mm256_subs_epi8/16
Subtracts the signed 8/16-bit integer data elements with saturation of two vectors. The corresponding Intel® AVX2 instruction is VPSUBSB or VPSUBSW. Syntax: extern __m256i _mm256_subs_epi8(__m256i s1,...
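Saturation here means that a result outside the representable range is clamped rather than wrapped. The sketch below models one signed 8-bit lane in portable C. It is an illustration only, not the AVX2 intrinsic, and the helper name subs_i8_model is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of one signed 8-bit lane, not an Intel API.
   Subtracts b from a and clamps the result to the int8_t range. */
static int8_t subs_i8_model(int8_t a, int8_t b)
{
    int diff = (int)a - (int)b; /* cannot overflow in int */

    if (diff > INT8_MAX)
        diff = INT8_MAX;        /* clamp instead of wrapping around */
    if (diff < INT8_MIN)
        diff = INT8_MIN;

    return (int8_t)diff;
}
```

In this model, -128 minus 1 stays at -128 and 127 minus -1 stays at 127, whereas ordinary wrapping arithmetic would flip the sign.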
_addcarry_u32(), _addcarry_u64()
Computes the sum of two 32/64-bit unsigned integer values and a carry-in, and returns the carry-out produced by the sum. The corresponding 4th Generation Intel® Core™ Processor extension...
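The carry-in/carry-out behavior described above can be sketched in portable C for the 32-bit case. This is a scalar model rather than the intrinsic. The helper name addcarry32_model is hypothetical, and its parameter order (carry-in, two operands, pointer to the sum) is an assumption chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical portable model of a 32-bit add-with-carry, not an Intel API.
   Writes the 32-bit sum to *sum_out and returns the carry-out (0 or 1).
   Assumption: any nonzero carry_in is treated as a carry of 1. */
static unsigned char addcarry32_model(unsigned char carry_in,
                                      uint32_t a, uint32_t b,
                                      uint32_t *sum_out)
{
    /* Widen to 64 bits so the carry shows up as bit 32. */
    uint64_t wide = (uint64_t)a + (uint64_t)b + (carry_in ? 1u : 0u);

    *sum_out = (uint32_t)wide;
    return (unsigned char)(wide >> 32);
}
```

Chaining calls, with each carry-out fed in as the next carry-in, is how such a primitive is typically used to add multi-word integers one limb at a time.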
_mm_spflt
Sets performance monitoring filtering mask. Corresponding instruction is spflt. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture). Syntax: extern unsigned...
Valgrind (Version 3.7.0) errors using Intel MKL 11 routines and Intel C++...
Hi, I'm fairly new to using Valgrind and the Intel C++ Compiler, and I ran into the following problem. I compiled and built an executable from the following source code in example.cpp: #include "mkl.h" int...
The Chronicles of Phi - part 4 - Hyper-Thread Phalanx – tiled_HT2
The prior part (3) of this blog showed the effects of the first-level implementation of the Hyper-Thread Phalanx. The change in programming yielded a 9.7% improvement in performance for the small model,...
Questions about Intel C++ Studio XE 2013
Q: Intel C++ Studio XE 2013: when does the next version come out? Q: Intel C++ Studio XE 2013: does this C++ compiler come with any build tools? Q: Intel C++ Studio XE 2013: what can I do if I find a compiler...
Optimization features.
Hello, I'd like to talk about two weird things in the compiler's optimization process. #1: sqrtsd seems to be preferred over sqrtpd... (I just got a 30% performance boost by forcing the use of 2 sqrtpd...
Uninstalling the Intel® C++ Compiler (VS 2008)
The following applies to C++ for Microsoft Visual Studio* 2008. If you uninstall the Intel® C++ Compiler from your system, it is no longer available in Visual Studio* to open Intel® C++ Projects, and an...
Switching Back to the Visual C++* Compiler
The following applies to C++ for Microsoft Visual Studio* 2010 and 2008. If your project is using the Intel® C++ Compiler, you can choose to switch back to the Microsoft Visual C++* Compiler by doing...
User and Reference Guide for the Intel® C++ Compiler 14.0
Document number: 328222-002US. Intel® C++ Composer XE 2013 SP1 - Windows* OS, Linux* OS, OS X*. Legal Information. Start Here.
Conditional compilation bug or feature?
Hi, I have software with 2 versions of an implemented algorithm, so it uses different versions of the same class (in my case it's SSE vs AVX, but really, it's not relevant) in 2 different compiler...
The Chronicles of Phi - part 5 - Plesiochronous phasing barrier – tiled_HT3
For the next optimization, I knew what I wanted to do; I just didn’t know what to call it. In looking for words that describe loosely-synchronous, I came across plesiochronous: In telecommunications, a...
Performance Guide Select a Configuration dialog box
The following applies to C++ for Microsoft Visual Studio* 2012, 2010, and 2008. The Select a Configuration dialog box is part of the Performance Guide. To access the Performance Guide, click Tools>...
Using Code Coverage in the Visual Studio* IDE
The following applies to Microsoft Visual Studio* 2012, 2010, and 2008. The code coverage tool provides the ability to determine how much application code is executed when a specific workload is applied...