I came across the following code while going through Intel's auto-vectorization tutorial:
for (int i=0; i<length; i++) {
    float s = b[i]*b[i] - 4*a[i]*c[i];
    if ( s >= 0 ) {
        s = sqrt(s);
        x2[i] = (-b[i]+s)/(2.*a[i]);
        x1[i] = (-b[i]-s)/(2.*a[i]);
    } else {
        x2[i] = 0.;
        x1[i] = 0.;
    }
}
The tutorial says: “if” statements are allowed if they can be implemented as masked assignments, which is usually the case.
So how exactly do masked assignments work? I think I have a rough grasp of the masked assignment concept from here.
Are these bit vectors calculated by the compiler by applying some heuristics, or at run time (BPU)?
I understand that elements whose mask bit is TRUE are grouped together, and likewise those whose bit is FALSE, and that each group is then executed together.
But how exactly does this mechanism work on an Intel machine?
Also, what happens if the code is not straight-line, i.e.:
for (i=0; i<lengthA; i++)
    for (j=0; j<lengthB; j++) {
        float s = a[i][j]*b[j][i];
        if ( s >= 0 ) {
            c[i][j] = sqrt(s);
        } else {
            c[i][j] = 0.01;
        }
    }
A simplified, tutorial-style explanation would be very helpful. Eagerly awaiting your reply.