TLB Miss
for (auto i = 0u; i < arr.size(); ++i) {
    sum += arr[i];
}
^ Is this faster?
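// Number of ints in one page, assuming 4 KiB pages.
constexpr std::size_t PAGE_STRIDE = 4096 / sizeof(int);
std::size_t idx = 0;
for (auto i = 0u; i < arr.size(); ++i) {
    sum += arr[idx];
    // Jump one page ahead each iteration, wrapping around at the end.
    idx = (idx + PAGE_STRIDE) % arr.size();
}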
^ Is this faster?
* The benchmark was run on an AMD Ryzen 9.
* For the full benchmark code, please refer here.
* For illustration purposes only; see the FAQ for more details.
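
To try the comparison locally, the two loops can be wrapped in functions and timed. The following is a minimal sketch, not the benchmark code referenced above. It assumes 4 KiB pages, and the names `sum_sequential`, `sum_page_stride`, and `time_ms` are hypothetical helpers introduced here for illustration. If the array length is odd, the page-stride walk visits every element exactly once, so both functions should return the same sum and only the access order differs.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 4 KiB pages, so this many ints fit in one page.
constexpr std::size_t PAGE_STRIDE = 4096 / sizeof(int);

// Sequential walk: consecutive elements, many accesses per page.
std::int64_t sum_sequential(const std::vector<int>& arr) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        sum += arr[i];
    }
    return sum;
}

// Page-stride walk: each access lands one page past the previous one,
// wrapping around at the end of the array. When arr.size() is odd it
// shares no factor with the power-of-two stride, so the walk visits
// every element exactly once.
std::int64_t sum_page_stride(const std::vector<int>& arr) {
    assert(!arr.empty());
    std::int64_t sum = 0;
    std::size_t idx = 0;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        sum += arr[idx];
        idx = (idx + PAGE_STRIDE) % arr.size();
    }
    return sum;
}

// Hypothetical helper: wall-clock time of one call, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A typical use would be to fill a large `std::vector<int>` with an odd number of elements (large enough to span far more pages than the TLB can hold), then call something like `time_ms([&] { s = sum_sequential(arr); })` and the same for `sum_page_stride`, printing `s` afterwards so the compiler cannot discard the loop. Timings will vary with the CPU, page size, and compiler flags.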