int vs long long for vectorization in 64 bit architecture

Revision en1, by codemastercpp, 2021-07-12 01:09:39

We know that, setting aside the extra memory usage, long long is faster than int on x64.

It’s possible to access data items that are narrower than the width of a machine’s data bus, but it’s typically more costly than accessing items whose widths match that of the data bus. For example, when reading a 16-bit value on a 64-bit machine, a full 64 bits worth of data must still be read from memory. The desired 16-bit field then has to be masked off and possibly shifted into place within the destination register.
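As a rough illustration (my own sketch, not part of the original post), the functions below can be pasted into a compiler explorer and built with something like clang++ -O2 for x86-64: the 16-bit load typically compiles to a zero-extending move (movzx), which is the mask-and-shift step described above folded into a single instruction, while the 64-bit load is a plain mov.

    #include <cstdint>

    // Illustrative only: compare the generated code for these two loads.
    std::uint64_t load64(const std::uint64_t* p) { return p[0]; }  // plain 64-bit mov
    std::uint16_t load16(const std::uint16_t* p) { return p[0]; }  // usually movzx (zero-extend)

    // Summing a 16-bit array into a wider accumulator: every element has to be
    // widened after it is loaded, which is the per-element cost described above.
    std::uint64_t sum16(const std::uint16_t* a, int n) {
        std::uint64_t s = 0;
        for (int i = 0; i < n; ++i) s += a[i];
        return s;
    }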

But if we are also relying on vectorization to speed up execution, ints perform better.

I say this based on the fact that the vectorization width is higher with 32-bit integers.

For example, consider the following loop:

    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];

With #define int long long

    main.cpp:114:5: remark: vectorized loop (vectorization width: 2, interleaved count: 2) [-Rpass=loop-vectorize]
        for (int i = 0; i < n; i++)

Without #define int long long

    main.cpp:114:5: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
        for (int i = 0; i < n; i++)
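For reference, a minimal self-contained version of the experiment looks roughly like this (my reconstruction: the array size, the global arrays and the printf are assumptions, only the loop itself is from the post). Compiling it with clang++ -O3 -Rpass=loop-vectorize, once as-is and once with the define uncommented, produces the same kind of remarks.

    #include <cstdio>

    // Uncomment to rerun the experiment with 64-bit elements.
    // #define int long long

    const int N = 1 << 16;   // arbitrary size, large enough to matter
    int a[N], b[N], c[N];

    signed main() {          // "signed" so the define does not break main's return type
        int n = N;
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];             // the loop the vectorizer reports on
        std::printf("%d\n", (signed)c[0]);  // keep the result observable
    }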

So plain scalar access speeds up but vectorization slows down with 64-bit integers. What would the final outcome be? Would 64-bit integers be better, or 32-bit?
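One way to settle it for a particular machine and compiler is simply to time both element types on the same loop. A rough sketch (names, sizes and repetition count are my own choices, and this is not a rigorous benchmark):

    #include <chrono>
    #include <cstdio>
    #include <vector>

    // Time c[i] = a[i] + b[i] for element type T and return elapsed nanoseconds.
    template <class T>
    long long run(int n, int reps) {
        std::vector<T> a(n, 1), b(n, 2), c(n);
        auto t0 = std::chrono::steady_clock::now();
        for (int r = 0; r < reps; r++) {
            a[r % n] = (T)r;                 // perturb the input so the work is not hoisted
            for (int i = 0; i < n; i++)
                c[i] = a[i] + b[i];
        }
        auto t1 = std::chrono::steady_clock::now();
        std::printf("checksum %lld\n", (long long)c[n / 2]);  // keep c observable
        return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    }

    int main() {
        const int n = 1 << 20, reps = 200;
        std::printf("int:       %lld ns\n", run<int>(n, reps));
        std::printf("long long: %lld ns\n", run<long long>(n, reps));
    }

The numbers from the two runs are what actually decide the question for a given workload and compiler.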
