I have a very simple loop in C:
for (i=0; i < len; ++i) { beta[index[i]] += d * value[i]; }
In this loop, beta and value are double arrays, while index is an integer array. beta itself can be very long (potentially millions of elements), but len is typically much smaller, say, 5% of the length of beta. All the arrays are independent of each other, and we can also assume that no two entries in index are the same.

What bugs me is that no matter what I do, nothing seems to speed this loop up. So far I have tried the restrict keyword, #pragma ivdep, manual unrolling, prefetching (though I may not have picked the right unroll factor or prefetch lookahead for these last two), and even using MKL to first gather the values to be updated, do the update with daxpy, and then scatter the results back.
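For concreteness, here is a minimal sketch of three of the variants described above: the plain loop with restrict, a software-prefetch version, and a gather / update / scatter version. It is an illustration under assumptions, not the exact code that was benchmarked. The function names, the lookahead parameter, and the scratch buffer are hypothetical. __builtin_prefetch is a GCC/Clang builtin, and the gather, axpy, and scatter steps are written as plain loops standing in for the MKL calls.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: restrict promises the compiler that the arrays do not alias. */
void update_baseline(double *restrict beta, const int *restrict index,
                     const double *restrict value, double d, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        beta[index[i]] += d * value[i];
}

/* Prefetch variant: while updating element i, request the cache line that
 * will be touched `lookahead` iterations later. The best lookahead is
 * machine dependent and has to be found by measurement (assumption). */
void update_prefetch(double *restrict beta, const int *restrict index,
                     const double *restrict value, double d, size_t len,
                     size_t lookahead)
{
    for (size_t i = 0; i < len; ++i) {
        if (i + lookahead < len)
            __builtin_prefetch(&beta[index[i + lookahead]], 1, 0);
        beta[index[i]] += d * value[i];
    }
}

/* Gather / update / scatter: copy the touched entries of beta into a dense
 * scratch buffer of at least len doubles, update them with a contiguous
 * axpy-style loop, then write them back. Plain loops stand in for MKL. */
void update_gather_scatter(double *restrict beta, const int *restrict index,
                           const double *restrict value, double d, size_t len,
                           double *restrict scratch)
{
    assert(scratch != NULL || len == 0);
    for (size_t i = 0; i < len; ++i)   /* gather */
        scratch[i] = beta[index[i]];
    for (size_t i = 0; i < len; ++i)   /* dense update: scratch += d * value */
        scratch[i] += d * value[i];
    for (size_t i = 0; i < len; ++i)   /* scatter */
        beta[index[i]] = scratch[i];
}
```

All three compute the same result as long as the entries of index are distinct; they differ only in how the memory accesses to beta are issued.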
Any suggestions on what else could be done with this?
Thanks,
--Laci