Dear Intel developers,
I'm using the Intel compiler 15 on an Intel(R) Xeon(R) CPU E5-2670 processor. I allocated an array of structures of arrays using _mm_malloc as follows:
    struct traces_32 { float* r; float* i; };
    typedef struct traces_32 traces32;
    ...
    traces32* traces = (traces32*)_mm_malloc(ntr * sizeof(traces32), 16);
    for (i = 0; i < ntr; i++) {
        traces[i].r = (float *)_mm_malloc(nsamples * sizeof(float), 16);
        traces[i].i = (float *)_mm_malloc(nsamples * sizeof(float), 16);
    }
My problem is that the first access, _mm_load_ps(&traces[0].r[0]), makes the program crash with a segmentation fault, but only in debug mode. The optimized build (-O3 -xHost) works fine. With _mm_loadu_ps(&traces[0].r[0]) both builds work.
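To narrow this down, a check along the following lines could confirm whether the address handed to the load really is 16-byte aligned. This is only a minimal sketch: the helper names is_aligned, assert_aligned16 and load4_checked are hypothetical and not part of any Intel API, and it assumes a compiler that provides <xmmintrin.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_malloc, _mm_free, _mm_load_ps, _mm_loadu_ps */

/* Hypothetical helper: returns 1 if p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical helper: aborts in builds with assertions enabled
   if p is not 16-byte aligned. */
static void assert_aligned16(const void *p)
{
    assert(is_aligned(p, 16));
}

/* Hypothetical helper: loads 4 floats starting at p. The aligned load is
   used only when the address allows it; otherwise it falls back to the
   unaligned load. */
static __m128 load4_checked(const float *p)
{
    return is_aligned(p, 16) ? _mm_load_ps(p) : _mm_loadu_ps(p);
}
```

In my case the call would be assert_aligned16(traces[0].r) right after the allocation loop, and load4_checked(&traces[0].r[0]) in place of the failing load. If the assertion fires in the debug build, the pointer itself is not what I expect it to be; if it does not, the problem is presumably somewhere else.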
Why does the aligned load fail? And why only in debug mode?
Thanks.