My program crashes because the compiler generates vectorized code where I think it shouldn't. Calling that code with unaligned input causes SIGBUS or SIGSEGV. To reproduce the problem, you need the source code of the Armadillo C++ library, version 4.100.1 (http://arma.sourceforge.net/download.html).
Here is the program:
#include <armadillo>
#include <iostream>

int main() {
    arma::Mat<double> A(25, 25);
    std::cout << A.colptr(0) << std::endl << A.colptr(1) << std::endl;
    A.col(1) *= 2.0; // crash here
    return 0;
}
I am using Intel Composer XE 2013-SP1 [icpc (ICC) 14.0.1 20131010] on Mac OS X 10.8.5 with XCode 3.2.1 to compile like so:
$ icpc -mkl -Iarmadillo-4.100.1/include arma_bug.cpp -o arma_bug -vec-report=6
arma_bug.cpp(7): (col. 14) remark: vectorization support: reference dest_45828 has aligned access
arma_bug.cpp(7): (col. 14) remark: vectorization support: reference dest_45828 has aligned access
arma_bug.cpp(7): (col. 14) remark: vectorization support: unroll factor set to 4
arma_bug.cpp(7): (col. 14) remark: LOOP WAS VECTORIZED
arma_bug.cpp(7): (col. 14) remark: vectorization support: reference dest_45828 has aligned access
arma_bug.cpp(7): (col. 14) remark: vectorization support: reference dest_45828 has aligned access
arma_bug.cpp(7): (col. 14) remark: vectorization support: unroll factor set to 4
arma_bug.cpp(7): (col. 14) remark: LOOP WAS VECTORIZED
The crash occurs in the following loop in the Armadillo library, at arrayops_meat.hpp:839:
{
    for(uword i=0; i<n_elem; ++i)
    {
        dest[i] *= val;
    }
}
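The compiler assumes that dest is aligned (the vectorization report says "has aligned access") and vectorizes the loop on that basis. That assumption is wrong: the dest array can be unaligned. Here, with 25 rows of 8-byte doubles, column 1 starts 200 bytes after column 0, so if column 0 is 16-byte aligned, column 1 cannot be. This unaligned access causes the crash. Placing #pragma novector before this loop solves the problem.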
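To make the alignment argument concrete without needing Armadillo, here is a minimal sketch. The helper names (is_aligned, column_byte_offset, scale_in_place) are hypothetical and are not part of Armadillo. The 16-byte boundary is my assumption about what the generated aligned loads require, and the code assumes 8-byte doubles. The last function has the same shape as the loop above, with the #pragma novector workaround applied only when compiling with icpc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: byte offset of column `col` from the start of a
// column-major matrix of doubles with `n_rows` rows.
inline std::size_t column_byte_offset(std::size_t n_rows, std::size_t col) {
    return n_rows * col * sizeof(double);
}

// Same shape as the loop in question. The pragma is Intel-specific, so it
// is guarded to keep other compilers from warning about it.
inline void scale_in_place(double* dest, double val, std::size_t n_elem) {
#if defined(__INTEL_COMPILER)
#pragma novector
#endif
    for (std::size_t i = 0; i < n_elem; ++i) {
        dest[i] *= val;
    }
}
```

For the 25 x 25 matrix in my program, column_byte_offset(25, 1) is 200, and 200 % 16 is 8, so is_aligned(A.colptr(1), 16) should be false whenever is_aligned(A.colptr(0), 16) is true.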
I am hoping that someone from Intel could reproduce this and confirm that it is a compiler issue.