Channel: Intel® C++ Compiler

Auto-vectorization and std::vector

Hi,

I would like to understand why the Intel compiler fails to vectorize some basic loops when using std::vector. The following code tests three different "arrays": a C-style array, a home-made vector class, and std::vector. I time a loop that just does v[j] = j for every element of the array. The results (timings in microseconds, each followed by the last element of the array) are the following:

fayard@speed:Desktop$ icpc -std=c++11 -Ofast vector-simd.cpp -o main
fayard@speed:Desktop$ ./main
std::vector:	362787
999999
HomeMade:	164704
999999
C-array:	166045
999999

fayard@speed:Desktop$ g++-4.9 -std=c++11 -Ofast vector-simd.cpp -o main
fayard@speed:Desktop$ ./main
std::vector:	186377
999999
HomeMade:	179809
999999
C-array:	176598
999999

A quick look at the assembly code (or at the output of -vec-report2) shows that the Intel compiler does not vectorize the loop with std::vector, whereas gcc 4.9 has no problem doing so. I would like to understand:

- Why is my own vector class fine for vectorization while std::vector is not? I would like to find a change to my class that prevents vectorization (in order to understand what happens with std::vector), but I can't. I have tried moving the class to another file, and even putting the accessor in the .cpp file instead of the .h file, but compiling with -ipo still triggers vectorization. Can anyone give me a hint?

- Is the culprit the Intel compiler or the standard library? As far as I understand, icpc uses the libc++ library from clang (I am on OS X), and g++-4.9 uses libstdc++. I have tried to make icpc use libstdc++ and it still does not vectorize, but it picks up the libstdc++ that is installed on OS X, not the one that I compiled with gcc 4.9. Is there a way to make icpc use the standard library that I compiled myself?

- Can anyone find code for which my home-made class does not vectorize but the C array does?

Thanks for your help,

François

#include <chrono>
#include <cstddef>   // size_t
#include <iostream>
#include <vector>

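// Minimal home-made vector: an owning pointer plus an element count.
class MyVector {
private:
    int n_elements;
    int* data;
public:
    MyVector(int in_n_elements) {
        n_elements = in_n_elements;
        data = new int[n_elements];
    }
    // Release the buffer allocated in the constructor.
    ~MyVector() {
        delete[] data;
    }
    // Copying would free the same buffer twice, so forbid it.
    MyVector(const MyVector&) = delete;
    MyVector& operator=(const MyVector&) = delete;
    // Unchecked element access, like std::vector::operator[].
    int& operator[](size_t i){
        return data[i];
    }
};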


int main (int argc, char const *argv[])
{
    const int n_elements {1000000};
    const int n_iterations {1000};

    {
        std::vector<int> v(n_elements);

        std::chrono::steady_clock::time_point timeStart, timeEnd;
        timeStart = std::chrono::steady_clock::now();
        for(size_t i = 0; i < n_iterations; ++i)
        {
            for(size_t j = 0; j < n_elements; ++j)
            {
                v[j] = j;
            }
        }
        timeEnd = std::chrono::steady_clock::now();
        std::cout << "std::vector:\t"<<
            std::chrono::duration_cast<std::chrono::microseconds>(timeEnd -
            timeStart).count() << std::endl;

        std::cout << v[n_elements-1] << std::endl;
    }


    {
        MyVector v(n_elements);

        std::chrono::steady_clock::time_point timeStart, timeEnd;
        timeStart = std::chrono::steady_clock::now();
        for(size_t i = 0; i < n_iterations; ++i)
        {
            for(size_t j = 0; j < n_elements; ++j)
            {
                v[j] = j;
            }
        }
        timeEnd = std::chrono::steady_clock::now();
        std::cout << "HomeMade:\t"<<
            std::chrono::duration_cast<std::chrono::microseconds>(timeEnd -
            timeStart).count() << std::endl;

        std::cout << v[n_elements-1] << std::endl;
    }

    {
        int v[n_elements];

        std::chrono::steady_clock::time_point timeStart, timeEnd;
        timeStart = std::chrono::steady_clock::now();
        for(size_t i = 0; i < n_iterations; ++i)
        {
            for(size_t j = 0; j < n_elements; ++j)
            {
                v[j] = j;
            }
        }
        timeEnd = std::chrono::steady_clock::now();
        std::cout << "C-array:\t"<<
            std::chrono::duration_cast<std::chrono::microseconds>(timeEnd -
            timeStart).count() << std::endl;

        std::cout << v[n_elements-1] << std::endl;
    }

    return 0;
}
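
One experiment that might help narrow down the first question is sketched below. It rests on an assumption I have not verified: that the obstacle is the way the compiler sees the access through std::vector's operator[], rather than the storage itself. The helper names fill_via_subscript and fill_via_raw_pointer are hypothetical and are not part of the benchmark above; the second one reads the element pointer once with data() so that the loop body only touches a plain int*.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: fills v through operator[], like the timed loop
// in the benchmark.
void fill_via_subscript(std::vector<int>& v) {
    const std::size_t n = v.size();
    for (std::size_t j = 0; j < n; ++j) {
        v[j] = static_cast<int>(j);
    }
}

// Hypothetical helper: same fill, but the element pointer is fetched once
// before the loop, so the loop body works on a plain int*.
void fill_via_raw_pointer(std::vector<int>& v) {
    int* const p = v.data();            // std::vector::data() exists since C++11
    const std::size_t n = v.size();
    for (std::size_t j = 0; j < n; ++j) {
        p[j] = static_cast<int>(j);
    }
}
```

Both helpers write the same values, so only the generated code should differ. Compiling them with icpc and -vec-report2 would show which of the two loops, if any, gets vectorized; if only the raw-pointer version does, that would point at the operator[] access path rather than at the storage.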

 

