Intermediate 14 min

Binary Files

Learn how to work with binary files to store and retrieve data efficiently in its raw memory format.

A Simple Example

#include <iostream>
#include <sstream>
#include <cstring>

struct Player {
    char name[50];
    int level;
    double health;
    int score;
};

void saveBinary(const Player& player, std::stringstream& buffer) {
    buffer.write(reinterpret_cast<const char*>(&player), sizeof(Player));
}

Player loadBinary(std::stringstream& buffer) {
    Player player{};  // Value-initialize so no field is left indeterminate
    buffer.seekg(0);  // Reset to beginning
    buffer.read(reinterpret_cast<char*>(&player), sizeof(Player));
    return player;
}

int main() {
    // Create a buffer to simulate binary file I/O
    std::stringstream buffer{std::ios::binary | std::ios::in | std::ios::out};

    Player player{};  // Zero the struct so unused name bytes are not garbage
    std::strcpy(player.name, "DragonSlayer");
    player.level = 42;
    player.health = 87.5;
    player.score = 125000;

    // Save to binary buffer (like writing to a file)
    saveBinary(player, buffer);
    std::cout << "Game saved to buffer!\n";
    std::cout << "Binary size: " << sizeof(Player) << " bytes\n";

    // Load from binary buffer (like reading from a file)
    Player loaded{loadBinary(buffer)};
    std::cout << "\nLoaded save:\n";
    std::cout << "Name: " << loaded.name << "\n";
    std::cout << "Level: " << loaded.level << "\n";
    std::cout << "Health: " << loaded.health << "\n";
    std::cout << "Score: " << loaded.score << "\n";

    // For real files, include <fstream> and open the stream in binary mode:
    // std::ofstream out{"save.dat", std::ios::binary};  // writing
    // std::ifstream in{"save.dat", std::ios::binary};   // reading

    return 0;
}

Breaking It Down

Opening Binary Files

  • Use std::ios::binary flag when opening files
  • std::ofstream file{filename, std::ios::binary} for writing
  • std::ifstream file{filename, std::ios::binary} for reading
  • Remember: Without the binary flag, some platforms translate newlines (on Windows, '\n' is written as "\r\n"), which changes the bytes on disk
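The open calls above can be sketched with a real file as follows. This is a minimal illustration, not a complete save system: the helper names `saveInt` and `loadInt` and the path passed to them are assumptions made for the example.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes one int to `path` in binary mode.
// Returns false if the file could not be opened or the write failed.
bool saveInt(const std::string& path, int value) {
    std::ofstream file{path, std::ios::binary};
    if (!file) {
        return false;  // e.g. missing directory or no write permission
    }
    file.write(reinterpret_cast<const char*>(&value), sizeof(value));
    file.flush();  // push buffered bytes out so errors show up now
    return static_cast<bool>(file);
}

// Hypothetical helper: reads one int back from `path`.
// Returns false if the file is missing or too short.
bool loadInt(const std::string& path, int& value) {
    std::ifstream file{path, std::ios::binary};
    if (!file) {
        return false;
    }
    file.read(reinterpret_cast<char*>(&value), sizeof(value));
    return static_cast<bool>(file);
}
```

Both helpers test the stream right after construction, since opening a file can fail for reasons unrelated to the data being written.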

write() and read() - Binary I/O

  • write(pointer, size): Writes raw bytes to file
  • read(pointer, size): Reads raw bytes from file
  • write() takes a const char* and read() takes a char*, each followed by a byte count (std::streamsize)
  • Remember: Use reinterpret_cast to convert object pointers to char*
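The same two calls also work on whole arrays, since an array is one contiguous run of bytes. The sketch below round-trips four doubles through an in-memory buffer; the function name `roundTripDoubles` is an assumption made for the example.

```cpp
#include <array>
#include <ios>
#include <sstream>

// Writes four doubles as raw bytes, then reads them back into a new array.
std::array<double, 4> roundTripDoubles(const std::array<double, 4>& values) {
    std::stringstream buffer{std::ios::binary | std::ios::in | std::ios::out};
    const auto byteCount =
        static_cast<std::streamsize>(sizeof(double) * values.size());

    // write() only reads from the source, so it takes a const char*.
    buffer.write(reinterpret_cast<const char*>(values.data()), byteCount);

    std::array<double, 4> result{};
    buffer.seekg(0);  // Read from the start of the buffer
    // read() fills the destination, so it takes a non-const char*.
    buffer.read(reinterpret_cast<char*>(result.data()), byteCount);
    return result;
}
```

The byte count is the element size multiplied by the element count, not the element count alone.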

reinterpret_cast for Binary I/O

  • What it does: Reinterprets a pointer as a different type
  • Usage: reinterpret_cast<char*>(&object) for read/write
  • Tells compiler to treat object as raw bytes
  • Remember: Only safe for POD (Plain Old Data) types; in modern C++ the precise requirement is that the type is trivially copyable

When Binary I/O is Appropriate

  • Fixed-size structs with simple data types
  • Performance-critical data storage
  • Working with existing binary formats (images, audio, etc.)
  • Remember: Not portable across different architectures without careful design

Why This Matters

  • Text files are human-readable but inefficient. Binary files store data in raw memory format - perfect for game saves, scientific data, image files, or database-like storage.
  • A double stored as text can take 16 or more bytes ("3.14159265358979" is 16 characters, and an exact round trip needs up to 17 significant digits), but only 8 bytes in binary format. Binary I/O is also faster since there's no conversion overhead.
  • Understanding binary I/O is essential for performance-critical applications and working with non-text data formats.
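The size difference can be checked directly. The sketch below formats a double with enough digits to read it back exactly, so the length of the result can be compared with `sizeof(double)`; the function name `toRoundTripText` is an assumption made for the example.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Formats `value` with max_digits10 significant digits, the number
// needed to recover the exact same double when parsing the text again.
std::string toRoundTripText(double value) {
    std::ostringstream text;
    text << std::setprecision(std::numeric_limits<double>::max_digits10)
         << value;
    return text.str();
}
```

Comparing `toRoundTripText(x).size()` with `sizeof(double)` shows the cost for a given value. Text is not always larger: a value such as 2.0 formats as the single character "2", so the saving applies to full-precision data rather than to every number.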

Critical Insight

Binary I/O is just a memory dump. When you write a struct, you're copying its exact memory layout to disk. When you read it back, you're copying disk bytes directly into memory. This is blazing fast but platform-specific - different architectures might have different endianness or padding.

Think of it like taking a photograph of memory and saving it. It's fast and efficient, but only works if the "camera" (architecture) that took the photo is the same as the one viewing it. For portable binary formats, you need to serialize each field explicitly or use a library like Protocol Buffers.

Best Practices

Always use std::ios::binary flag: Without this flag, newline characters get converted on some platforms, corrupting binary data.

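Only write POD types directly: Plain Old Data types (no virtual functions and no members that own dynamic memory) are safe for direct binary I/O. A raw pointer member can be copied byte for byte, but the address it holds is meaningless when read back.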

Check stream state after operations: Verify read() and write() succeeded by checking the stream state.

Document binary format: Keep clear documentation of the binary format for future reference and debugging.

Common Mistakes

Forgetting std::ios::binary: In text mode, some platforms rewrite newline bytes (Windows writes '\n' as "\r\n"), so any data byte that happens to equal 0x0A changes the file and corrupts binary data.

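Writing std::string directly: A std::string object holds internal bookkeeping (typically a pointer to heap memory), not just its characters, so dumping the object saves an address instead of the text. Write the length and characters separately.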

Platform assumptions: Binary files aren't portable across different architectures without careful handling of endianness and padding.
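Both platform details can be inspected from code. The sketch below repeats the Player struct from the example at the top of the lesson and measures its padding and the host byte order; the function names are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Player {
    char name[50];
    int level;
    double health;
    int score;
};

// Sum of the member sizes, ignoring any padding the compiler inserts.
constexpr std::size_t playerFieldBytes() {
    return sizeof(Player::name) + sizeof(Player::level) +
           sizeof(Player::health) + sizeof(Player::score);
}

// Padding bytes inside Player on the current platform.
constexpr std::size_t playerPaddingBytes() {
    return sizeof(Player) - playerFieldBytes();
}

// True if the least significant byte of an integer is stored first.
bool isLittleEndian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a typical x86-64 compiler the fields add up to 66 bytes while `sizeof(Player)` is 72, so 6 bytes of every saved record are padding. Another compiler or architecture may lay the struct out differently, which is why a raw dump written on one cannot be assumed to load on the other.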

Writing objects with pointers: Pointers become invalid when read back. Only write POD types or serialize member by member.

Debug Challenge

This code opens a binary file incorrectly. Find the line that needs to change:

#include <fstream>

int main() {
    std::ofstream file{"data.bin"};
    int num{42};
    file.write(reinterpret_cast<char*>(&num), sizeof(int));
    return 0;
}

Quick Quiz

  1. Why is binary I/O faster than text I/O?
  • Binary files are smaller
  • The operating system optimizes binary files
  • Binary mode uses better compression
  • No conversion between memory format and text representation
  2. What does reinterpret_cast do in file I/O?
  • Changes the type interpretation of a pointer
  • Compresses the data
  • Validates the data before writing
  • Converts data to text format
  3. Can you safely write a std::vector directly to a binary file?
  • No, it contains pointers to heap memory
  • Yes, but only for POD types
  • Yes, if you use std::ios::binary mode
  • Yes, always safe
