Memory Alignment
Understand memory alignment for performance and correctness
Learn how CPUs access memory and how proper alignment makes your code faster and prevents crashes on some architectures.
A Simple Example
#include <iostream>

struct Unoptimized {
    char a;    // 1 byte
               // 3 bytes padding
    int b;     // 4 bytes
    char c;    // 1 byte
               // 7 bytes padding
    double d;  // 8 bytes
};

struct Optimized {
    double d;  // 8 bytes
    int b;     // 4 bytes
    char a;    // 1 byte
    char c;    // 1 byte
               // 2 bytes padding
};

int main() {
    std::cout << "Unoptimized size: " << sizeof(Unoptimized) << " bytes\n";
    std::cout << "Optimized size: " << sizeof(Optimized) << " bytes\n";
    std::cout << "\nAlignment requirements:\n";
    std::cout << "char: " << alignof(char) << "\n";
    std::cout << "int: " << alignof(int) << "\n";
    std::cout << "double: " << alignof(double) << "\n";
    return 0;
}
Breaking It Down
What is Memory Alignment?
- What it means: An object's address must be a multiple of its type's alignment requirement, which for fundamental types usually equals the size (int on a 4-byte boundary, double on an 8-byte boundary)
- Why: CPUs read memory in fixed-size chunks (typically 4 or 8 bytes) - aligned data fits cleanly within one chunk
- Consequence: Misaligned data can straddle two chunks, requiring multiple reads and bit-shifting, which is slower
- Remember: A 4-byte int at address 0x1000 is aligned; the same int at 0x1002 is misaligned
Struct Padding
- What it does: Compiler inserts unused bytes to align each member
- Rule: Each member is aligned according to its type (typically char=1, int=4, double=8)
- Total size: Struct size is rounded up to a multiple of the largest member alignment, so every element of an array of the struct stays aligned
- Remember: Order matters! Group larger types first to minimize padding
Performance Impact
- Cache efficiency: Aligned data fits cleanly in cache lines (typically 64 bytes)
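- Speed difference: The penalty depends on the CPU. It is often negligible on modern x86 when the access stays inside one cache line, but an access that straddles a cache line or page boundary can be several times slower
- Architecture differences: x86 tolerates most misaligned accesses (sometimes at a cost), while ARM may fault depending on the core, the instruction, and the system configuration. In C++, a misaligned access through a typed pointer is undefined behavior on every platform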
- Remember: For large arrays of structs, alignment savings multiply significantly
Controlling Alignment
- sizeof(T): Returns total size including padding
- alignof(T): Returns alignment requirement for type T
- alignas(N): Forces specific alignment (e.g., alignas(64) for cache line)
- Remember: Use these tools to inspect and control memory layout
Why This Matters
- CPUs access aligned memory at least as fast as misaligned memory, and sometimes several times faster when a misaligned access straddles a cache line.
- Misaligned access can be slower or even crash on some ARM configurations and other architectures.
- Compilers add padding to structs to satisfy alignment requirements - a poorly-ordered struct can waste significant memory.
- Understanding alignment helps you write cache-friendly, performance-critical code for games, databases, and systems programming.
Critical Insight
Think of memory alignment like parking cars in parking spots. If each spot is 8 feet wide and your car is 6 feet, you still take up a full spot - the 2 feet is "padding".
Now imagine you have a bus (8 feet), a car (4 feet), and two motorcycles (1 foot each), and each vehicle must start at a distance that is a multiple of its own length. Parked in that order, they sit back to back in 14 feet. Parked motorcycle-bus-car-motorcycle, the bus has to skip 7 feet to reach its 8-foot mark, and the row stretches to 21 feet. The bus forces alignment, leaving a gap before it.
This is exactly how struct members work. Put large members first (double, then int, then char) to minimize padding and save memory.
Best Practices
Order struct members by size: Put the largest, most strictly aligned types first (double, long, int, short, char) to minimize padding.
Use alignof() to check requirements: Understand alignment needs with alignof(T) before optimizing.
Pad explicitly for clarity: Use char padding[3]; to document intentional padding instead of relying on implicit padding.
Consider cache lines: For high-performance code, align structs to 64-byte cache line boundaries with alignas(64).
Common Mistakes
Ignoring struct order: Careless member ordering can inflate a struct by 50% (24 bytes instead of 16 in the example above), and alternating small and large members can nearly double its size.
Assuming packed layout: Without #pragma pack or __attribute__((packed)), compilers add padding automatically.
Assuming portability: Alignment rules vary by architecture (x86 vs ARM) and compiler, so check layouts with sizeof and alignof instead of hard-coding sizes.
Array implications: Padding in a struct is multiplied by array size - a struct with 10 bytes of padding in an array of 10,000 wastes about 100 KB.
Quick Quiz
- Why do compilers add padding to structs?
- How can you minimize padding in a struct?
- What happens on ARM processors if you access misaligned data?