Cache Bank Conflict

constexpr int N = 1;

for (auto j = 0u; j < 1024 / N; ++j) {
  for (auto i = 0u; i < N; ++i) {
    sum += arr1[i] + arr2[i];
  }
}
^ Is this faster?
constexpr int N = 2;

for (auto j = 0u; j < 1024 / N; ++j) {
  for (auto i = 0u; i < N; ++i) {
    sum += arr1[i] + arr2[i];
  }
}
^ Is this faster?

* The benchmark was run on an AMD Ryzen 9.

* For the full benchmark code, please refer here.

* For illustration purposes only; see the FAQ for more details.