How to use the auto-dispatching for AVX (plus some more questions)

Hi,

1) I spent more than a day playing with AVX intrinsics, only to find out that although I got my intrinsics code almost as fast as my assembler code (with ICC actually slightly faster), ICC itself produced even better code! So it seems I'm going with ICC after all, but:

- I need the software to work on everything from SSE2 upwards, hence /arch:SSE2

- I want the auto-dispatcher for AVX, since I found that the AVX code is faster on Sandy Bridge and very much faster on Haswell

So I used /QaxCORE-AVX, but it made no difference, and in the debugger I verified that it didn't create (or run) AVX code; it was using just SSE2. It did create great AVX code with /arch:AVX, but that wouldn't work on older CPUs, so it is not usable. So how can I enable the dispatching?

2) My software is full of vectorizable loops such as

for (int i=0; i<cnt; i++) dst[i] = (a[i] + b[i]) * c[i];

In these cases I know that I want this particular part of the code dispatched for multiple architectures (perhaps even FMA and newer in some cases). So should I mark these parts of the code somehow? Or how does this work? Is there a guide on writing code so that it is easier to vectorize?
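
To make the question concrete, here is a minimal sketch of how such a loop could be isolated into its own function. The function name mul_add_arrays and the restrict qualifiers are assumptions added for illustration and are not taken from the actual project. restrict is standard C99 and promises the compiler that the arrays do not overlap, which generally makes auto-vectorization easier. Nothing in this sketch is specific to ICC's dispatcher.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dst[i] = (a[i] + b[i]) * c[i] for i in [0, cnt).
 * The restrict qualifiers are an assumption that none of the arrays
 * overlap; calling this with aliased buffers is undefined behavior. */
void mul_add_arrays(float *restrict dst,
                    const float *restrict a,
                    const float *restrict b,
                    const float *restrict c,
                    size_t cnt)
{
    /* Precondition check only; compiled out when NDEBUG is defined. */
    assert(cnt == 0 || (dst && a && b && c));

    /* Simple counted loop with unit stride and no dependencies between
     * iterations, which is the shape auto-vectorizers handle best. */
    for (size_t i = 0; i < cnt; i++) {
        dst[i] = (a[i] + b[i]) * c[i];
    }
}
```

A call would then look like mul_add_arrays(dst, a, b, c, cnt). Whether the compiler emits several code paths for this function still depends on the dispatch options asked about in 1).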

3) Does vectorization (and other optimizations) work the same way on OSX as on Windows? I'll need both, and I'm a little bit scared, as things are usually much more problematic on OSX.

4) I actually compiled a big project with ICC and compared the real-time performance, and sadly the difference compared to MSVC is not very noticeable. The code ICC produced is about 30% bigger, which makes me wonder whether, even though ICC produces better vectorized code, the code is so big that instruction-cache misses become frequent enough to degrade performance back to the original level.

5) Can I use profile-guided optimization with just ICC, without buying VTune? Is it worth the trouble at all?

Thanks in advance!
