Misalign — Which is Faster?

A misaligned memory access occurs when data is not stored at an address that is a multiple of its size (e.g., an 8-byte int64_t at an address not divisible by 8). Modern x86 CPUs handle misaligned accesses in hardware, at a small performance cost.
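
The alignment rule above can be written as a small check. This is a minimal sketch in C: the helper name is_naturally_aligned is hypothetical, it assumes the size is a power of two, and it assumes that converting a pointer to uintptr_t yields the numeric address (true on mainstream platforms, but implementation-defined in C).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when addr is a multiple of size.
 * Assumes size is a power of two (1, 2, 4, 8, ...), so that
 * (size - 1) is a mask of the low address bits. */
bool is_naturally_aligned(const void *addr, size_t size)
{
    return ((uintptr_t)addr & (size - 1)) == 0;
}
```

For an 8-byte int64_t, an address such as 0x1000 passes the check, while 0x1004 does not (although 0x1004 is still aligned for a 4-byte value).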

On modern x86 the penalty is small (roughly 5%) because the hardware handles misaligned accesses transparently, splitting an access across two cache lines when it has to. On some other architectures (e.g., older ARM cores), a misaligned access can instead cause a fault or be significantly slower.
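
Because behavior differs between architectures, portable C code usually avoids dereferencing a misaligned pointer directly. One common approach, sketched here with a hypothetical helper named load_i64_unaligned, is to copy the bytes with memcpy and let the compiler choose the instructions that suit the target:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads an int64_t from any byte address.
 * memcpy has no alignment requirement, so this is well-defined
 * even when src is not a multiple of 8. */
int64_t load_i64_unaligned(const void *src)
{
    int64_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

On x86, optimizing compilers typically turn this into a single load instruction; on targets that cannot load misaligned data, they can fall back to narrower accesses. The exact code generated depends on the compiler and flags.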

The performance impact is larger when a misaligned access crosses a cache line boundary (cache lines are typically 64 bytes on x86), because the CPU must then fetch two cache lines instead of one.
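
Whether a given access straddles two cache lines is simple arithmetic on the address. The sketch below assumes a 64-byte line, which is common on x86 but not universal; the macro CACHE_LINE_SIZE and the function crosses_cache_line are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u /* assumption: typical x86 line size */

/* Hypothetical helper: returns true when the size bytes starting at
 * addr span two cache lines. Assumes 1 <= size <= CACHE_LINE_SIZE. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For example, an 8-byte access at address 60 covers bytes 60 through 67 and so touches two lines, while an 8-byte access at address 4 is misaligned but stays within a single line.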

References:

Wikipedia - Data Structure Alignment