“FUNCTION WAS VECTORIZED” but the loop at the function call site is not vectorized

This question is related to an existing question on Stack Overflow, with the difference that here AVX is the target ISA and the function to be vectorized is more complex. When I use the __attribute__((vector(...))) declaration on the function definition:

__attribute__((vector(linear(a),linear(b))))
inline void foo(float* restrict a, float* restrict b) { 
   ...
   for(j=0; j<n; j++) {   
      // do something with a[j*STRIDE] and b[j*STRIDE]
   }
   for(j=n-1; j>=0; j--) {
      // do something with a[j*STRIDE] and b[j*STRIDE]
   }
}

the compiler reports the following for the function foo():

 foo.hpp(56): (col. 101) remark: FUNCTION WAS VECTORIZED
 foo.hpp(56): (col. 101) remark: FUNCTION WAS VECTORIZED

When I call the function using array notation or a single for loop:

int main() {
  ...
  #pragma omp parallel for
  for(k=0; k<n; k++) {
    int base = k*256*256;

    FP* __restrict a = &h_a[base];
    FP* __restrict b = &h_b[base];

    __assume_aligned(a,32);
    __assume_aligned(b,32);

    foo(&a[0:256], &b[0:256]); // line 337
    // OR: for(i=0; i<n; i++) { foo(&a[i], &b[i]); }
  }
}

the compiler refuses to vectorize the call:

 main.c(337): (col. 3) remark: loop was not vectorized: existence of vector dependence
 main.c(337): (col. 3) remark: loop was not vectorized: existence of vector dependence
 main.c(337): (col. 3) remark: loop was not vectorized: not inner loop

The Intel compiler flags used are:

 icc -O3 -xAVX -ip -restrict -parallel -fopenmp -vec-report2 -openmp-report2

The question: if the compiler could vectorize the function foo(), why can it not use the vectorized version at the call site (main.c:337)? The remarks suggest that the compiler analysed the function again instead of simply inserting the already compiled vector code.

Note: I also tried a for loop instead of array notation, with #pragma ivdep and with #pragma simd, but none of them helped. The actual code is too large to fit conveniently in this post.
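
For illustration, the loop-based call pattern can be sketched in OpenMP 4.0 syntax, used here as a stand-in for the Intel-specific attribute. This is a minimal sketch, not the actual code: the names foo_sketch and run, the value of DEPTH, and the arithmetic inside the two sweeps are hypothetical placeholders for the elided "do something" parts, and #pragma omp declare simd is substituted for __attribute__((vector(...))). It only shows the shape of the elemental function and its call site; whether a given compiler actually vectorizes the call is exactly what the question is about.

```c
#define STRIDE 256  /* assumed: matches the 256 used at the call site */
#define DEPTH  4    /* hypothetical inner trip count (n in foo) */

/* Elemental function: one call handles one strided column.
   linear(a) linear(b) states that both pointers advance by one
   element from one SIMD lane to the next. */
#pragma omp declare simd linear(a) linear(b) notinbranch
static inline void foo_sketch(float *a, const float *b)
{
    /* forward sweep (placeholder arithmetic) */
    for (int j = 0; j < DEPTH; j++) {
        a[j * STRIDE] += b[j * STRIDE];
    }
    /* backward sweep (placeholder arithmetic) */
    for (int j = DEPTH - 1; j >= 0; j--) {
        a[j * STRIDE] *= 2.0f;
    }
}

/* Call site: the simd pragma requests vectorization of the loop over i,
   which is where a vector variant of foo_sketch could be called. */
static void run(float *a, const float *b)
{
    #pragma omp simd
    for (int i = 0; i < STRIDE; i++) {
        foo_sketch(&a[i], &b[i]);
    }
}
```

Each iteration i touches only a[i + j*STRIDE] with i < STRIDE, so different iterations never access the same element; this independence is what the linear clauses and the simd pragma assert to the compiler. Without OpenMP enabled, the pragmas are ignored and the code runs as plain scalar C.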
