Hi,
I am working on an application which I'm pretty sure is memory bound. I tried a simple OpenMP parallelization of the kernel, but there was no speedup, which is consistent with the kernel being memory bound.
However, if Intel's newer architectures really look like this: http://software.intel.com/sites/default/files/m/d/4/1/d/8/5-3-figure-1.gif then shouldn't I be able to pin one thread to a core in the second group of four cores to get more memory bandwidth?
It seems like pinning a thread to a core might take some work, so I wanted to check whether this makes sense before I try it.
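
For reference, here is a rough sketch of what the pinning itself could look like. It assumes Linux with glibc, where sched_setaffinity and the CPU_* macros are available. The helper names first_allowed_core and pin_current_thread_to_core are made up for illustration, and which core numbers belong to the second group of four cores is machine-specific, so it would have to be checked (for example with lscpu) rather than taken from this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes sched_setaffinity and the CPU_* macros in glibc */
#endif
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: return the lowest-numbered core the calling thread
   is currently allowed to run on, or -1 on failure. */
int first_allowed_core(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int core = 0; core < CPU_SETSIZE; ++core)
        if (CPU_ISSET(core, &set))
            return core;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single core.
   Returns 0 on success and -1 on failure. */
int pin_current_thread_to_core(int core_id)
{
    if (core_id < 0 || core_id >= CPU_SETSIZE)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    assert(CPU_COUNT(&set) == 1);   /* the mask names exactly one core */

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Inside an OpenMP parallel region, each thread could call pin_current_thread_to_core with a core number derived from omp_get_thread_num(). It may also be possible to avoid code changes entirely: OpenMP 4.0 and later runtimes read the OMP_PROC_BIND and OMP_PLACES environment variables, and Intel's runtime reads KMP_AFFINITY.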
Thanks