Batch Processing

__attribute__((noinline))
void ProcessOne(int& value) {
  value = value * 3 + 1;
  if (value > 1000) value %= 1000;
}

for (auto& v : arr)
  ProcessOne(v);
^ Is this faster?
__attribute__((noinline))
void ProcessBatch(int* data, int size) {
  for (int i = 0; i < size; ++i) {
    data[i] = data[i] * 3 + 1;
    if (data[i] > 1000) data[i] %= 1000;
  }
}

ProcessBatch(arr.data(), static_cast<int>(arr.size()));
^ Is this faster?
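To try the comparison locally, a minimal timing sketch could look like the following. It is not the benchmark referred to on this page: `MakeInput`, `TimeMs`, and `CompareVariants` are hypothetical helpers introduced for illustration, the input size and round count are arbitrary, and `__attribute__((noinline))` assumes GCC or Clang. Wall-clock timing with `std::chrono::steady_clock` is coarse, so treat the numbers as a rough indication only.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <vector>

// The two variants under comparison, same logic as in the snippets.
__attribute__((noinline))
void ProcessOne(int& value) {
  value = value * 3 + 1;
  if (value > 1000) value %= 1000;
}

__attribute__((noinline))
void ProcessBatch(int* data, int size) {
  for (int i = 0; i < size; ++i) {
    data[i] = data[i] * 3 + 1;
    if (data[i] > 1000) data[i] %= 1000;
  }
}

// Hypothetical helper: deterministic input with values in [0, 1000).
std::vector<int> MakeInput(int n) {
  std::vector<int> v(n);
  for (int i = 0; i < n; ++i) v[i] = (i * 7) % 1000;
  return v;
}

// Hypothetical helper: runs fn `rounds` times, returns elapsed milliseconds.
template <typename Fn>
double TimeMs(int rounds, Fn&& fn) {
  auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < rounds; ++r) fn();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical driver: times both variants on identical inputs, prints the
// timings, and returns true if both produced the same final data.
bool CompareVariants(int n, int rounds) {
  assert(n > 0 && rounds > 0);
  std::vector<int> a = MakeInput(n);
  std::vector<int> b = a;

  double one_ms = TimeMs(rounds, [&] {
    for (auto& v : a) ProcessOne(v);  // one call per element
  });
  double batch_ms = TimeMs(rounds, [&] {
    ProcessBatch(b.data(), static_cast<int>(b.size()));  // one call per array
  });

  std::printf("per-element: %.3f ms, batch: %.3f ms\n", one_ms, batch_ms);
  return a == b;
}
```

Calling `CompareVariants(1 << 20, 100)` from `main` in an optimized build (for example `-O2`) prints one timing per variant. The returned flag only confirms that both variants compute the same values; it says nothing about which one is faster.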

* The benchmark was run on an AMD Ryzen 9.

* For the full benchmark code, please refer here.

* For illustration purposes only; see the FAQ for more details.