Dear Intel developers,
I'm using the Intel compiler (version 15) on an Intel(R) Xeon(R) CPU E5-2670 processor. I allocated an array of structures of arrays using _mm_malloc as follows:
struct traces_32 {
    float* r;
    float* i;
};
typedef struct traces_32 traces32;
...
traces32* traces = (traces32*)_mm_malloc(ntr * sizeof(traces32), 16);
for (i = 0; i < ntr; i++) {
    traces[i].r = (float *)_mm_malloc(nsamples * sizeof(float), 16);
    traces[i].i = (float *)_mm_malloc(nsamples * sizeof(float), 16);
}
My problem is that the first access, _mm_load_ps(&traces[0].r[0]), causes a segmentation fault, but only in the debug build. The optimized build (-O3 -xHost) works fine. Using _mm_loadu_ps(&traces[0].r[0]) instead works in both builds.
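For reference, here is a minimal self-contained sketch of the allocation and the first aligned load, with an alignment check added before the load. It is only an illustration, not my real code: the names is_aligned_16 and first_aligned_load are made up for this post, and the values written into r[0..3] are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_malloc, _mm_free, _mm_load_ps, _mm_cvtss_f32 */

struct traces_32 {
    float *r;
    float *i;
};
typedef struct traces_32 traces32;

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical reproducer: allocates ntr traces of nsamples floats each,
 * fills the first four samples of traces[0].r with 1, 2, 3, 4, performs
 * one aligned load from that address and returns the lowest lane.
 * Returns -1.0f if the arguments are unusable or an allocation fails. */
static float first_aligned_load(size_t ntr, size_t nsamples)
{
    float first = -1.0f;
    size_t k;
    int ok = 1;
    traces32 *traces;

    if (ntr == 0 || nsamples < 4)
        return first;

    traces = (traces32 *)_mm_malloc(ntr * sizeof(traces32), 16);
    if (traces == NULL)
        return first;

    /* Clear the pointers first so cleanup is safe after a partial failure. */
    for (k = 0; k < ntr; k++) {
        traces[k].r = NULL;
        traces[k].i = NULL;
    }

    for (k = 0; k < ntr && ok; k++) {
        traces[k].r = (float *)_mm_malloc(nsamples * sizeof(float), 16);
        traces[k].i = (float *)_mm_malloc(nsamples * sizeof(float), 16);
        ok = (traces[k].r != NULL && traces[k].i != NULL);
    }

    if (ok) {
        __m128 v;

        for (k = 0; k < 4; k++)
            traces[0].r[k] = (float)(k + 1);

        /* _mm_load_ps needs a 16-byte aligned address; check it explicitly. */
        assert(is_aligned_16(traces[0].r));

        /* The argument is a pointer: traces[0].r is the same as &traces[0].r[0]. */
        v = _mm_load_ps(traces[0].r);
        first = _mm_cvtss_f32(v);
    }

    for (k = 0; k < ntr; k++) {
        if (traces[k].r != NULL) _mm_free(traces[k].r);
        if (traces[k].i != NULL) _mm_free(traces[k].i);
    }
    _mm_free(traces);
    return first;
}
```

Calling first_aligned_load(ntr, nsamples) from main in a debug build should show whether the assert fires before the load. If it does, the address passed to _mm_load_ps is not the one returned by _mm_malloc.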
Why does the aligned load fail? And why only in debug mode?
Thanks.