Cache Bank Conflict
constexpr int N = 1;
for (auto j = 0u; j < 1024 / N; ++j) {
    for (auto i = 0u; i < N; ++i) {
        sum += arr1[i] + arr2[i];
    }
}
^ Is this faster?
constexpr int N = 2;
for (auto j = 0u; j < 1024 / N; ++j) {
    for (auto i = 0u; i < N; ++i) {
        sum += arr1[i] + arr2[i];
    }
}
^ Is this faster?
* The benchmark was run on an AMD Ryzen 9.
* For the full benchmark code, please refer here.
* For illustration purposes only; see the FAQ for more details.
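
To try the two variants locally, something like the following minimal sketch could be used. It is not the author's benchmark: the element type (`std::uint32_t`), the array length (`kTotal = 1024`), and the helper names `run_kernel` and `time_kernel_ns` are all assumptions made for illustration. The volatile reads are also an addition, there only to stop an optimizing compiler from hoisting the loads out of the loops.

```cpp
#include <array>
#include <chrono>
#include <cstdint>

// Assumed setup: the snippets above do not show the array type or length.
constexpr unsigned kTotal = 1024;
using Array = std::array<std::uint32_t, kTotal>;

// Same loop nest as the two snippets, with N as a template parameter.
template <unsigned N>
std::uint64_t run_kernel(const Array& arr1, const Array& arr2) {
    static_assert(N > 0 && kTotal % N == 0, "N must divide kTotal");
    // Volatile views force a real load on every iteration (an addition,
    // not part of the original snippets).
    const volatile std::uint32_t* a = arr1.data();
    const volatile std::uint32_t* b = arr2.data();
    std::uint64_t sum = 0;
    for (auto j = 0u; j < kTotal / N; ++j) {
        for (auto i = 0u; i < N; ++i) {
            const std::uint32_t x = a[i];
            const std::uint32_t y = b[i];
            sum += std::uint64_t{x} + y;
        }
    }
    return sum;
}

// Hypothetical timing helper: average nanoseconds per run_kernel<N> call.
// `repeats` must be greater than zero; `sink` keeps the results observable.
template <unsigned N>
double time_kernel_ns(const Array& arr1, const Array& arr2,
                      unsigned repeats, std::uint64_t& sink) {
    const auto start = std::chrono::steady_clock::now();
    for (auto r = 0u; r < repeats; ++r) {
        sink += run_kernel<N>(arr1, arr2);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Calling `time_kernel_ns<1>(arr1, arr2, repeats, sink)` and `time_kernel_ns<2>(arr1, arr2, repeats, sink)` on the same arrays gives two per-call averages to compare. Expect the numbers to depend on the CPU, the compiler, and the optimization flags.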