What is Random File I/O?

Random file I/O allows you to jump to any position within a file and read or write data there, rather than processing the file sequentially from start to finish. This is accomplished using the seekg() and seekp() functions to reposition the internal file pointer.

The file pointer mechanism

Every file stream maintains an internal file pointer that tracks the current read/write position within the file. When you read from or write to a file, the operation occurs at the file pointer's current location. By default, opening a file for reading or writing positions the file pointer at the beginning. However, opening a file in append mode (std::ios::app) moves the pointer to the end before every write, so new data is always added after the existing content rather than overwriting it.

Random access using seekg() and seekp()

All previous file operations have been sequential - reading or writing file contents in order from start to finish. However, C++ supports random file access, allowing you to jump to specific positions within a file. This capability proves invaluable when working with record-based files where you want to retrieve a specific record without reading all preceding data.

Random file access works by manipulating the file pointer using seekg() (for input streams) and seekp() (for output streams). The function names derive from "get" and "put" respectively. Although some stream types maintain independent read and write positions, file streams always keep these positions synchronized. You can use seekg() and seekp() interchangeably with file streams.

Both functions accept two parameters: an offset specifying how many bytes to move, and an ios flag indicating the reference point for that offset:

Seek Flag    Reference Point
beg          Beginning of the file (default)
cur          Current file pointer position
end          End of the file

Positive offsets move the file pointer toward the file's end, while negative offsets move it toward the beginning.

Basic seeking examples

#include <fstream>

int main()
{
    std::ifstream dataFile{ "records.dat" };

    dataFile.seekg(20, std::ios::cur);    // Move forward 20 bytes from current position
    dataFile.seekg(-10, std::ios::cur);   // Move backward 10 bytes from current position
    dataFile.seekg(50, std::ios::beg);    // Move to byte offset 50 (50 bytes past the file start)
    dataFile.seekg(50);                   // Same as above (beg is default)
    dataFile.seekg(-30, std::ios::end);   // Move to 30 bytes before file end

    return 0;
}

Moving to the file's beginning or end is particularly simple:

#include <fstream>

int main()
{
    std::ifstream dataFile{ "records.dat" };

    dataFile.seekg(0, std::ios::beg);     // Move to beginning
    dataFile.seekg(0, std::ios::end);     // Move to end

    return 0;
}

Important warning about text file seeking

Seeking to positions other than the file's beginning in text files can produce unexpected results. Newline characters ('\n') are actually abstractions:

  • On Windows, newlines are represented as two bytes: CR (carriage return) + LF (line feed)
  • On Unix-based systems, newlines are represented as one byte: LF (line feed)

Because a newline occupies one or two bytes depending on how the file is encoded, the offset needed to seek past a newline in either direction differs between platforms.

Additionally, some operating systems pad files with trailing zero bytes (bytes with value 0). Seeking to the end of a file, or to an offset from the end, produces different results on such systems.

Practical example: Reading specific positions

Let's use seeking with a simple configuration file. Suppose we have server.cfg:

ServerName=ProductionDB
Port=5432
MaxConnections=100
Timeout=30

Here's how to seek to different positions:

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream config{ "server.cfg" };

    if (!config)
    {
        std::cerr << "ERROR: Could not open server.cfg\n";
        return 1;
    }

    std::string data{};

    config.seekg(11);  // Skip "ServerName=" (11 characters)
    std::getline(config, data);
    std::cout << "Server: " << data << '\n';

    config.seekg(5, std::ios::cur);  // Skip "Port=" (5 characters); getline() already consumed the newline
    std::getline(config, data);
    std::cout << "Port: " << data << '\n';

    config.seekg(-11, std::ios::end);  // 11 bytes before the end: "Timeout=30" plus a trailing LF
    std::getline(config, data);
    std::cout << "Near end: " << data << '\n';

    return 0;
}

Expected output, assuming the file uses single-byte LF newlines and ends with one (results vary with other newline encodings):

Server: ProductionDB
Port: 5432
Near end: Timeout=30

Note: This example demonstrates seeking mechanics but isn't production-ready. Real configuration parsers should handle files more robustly.

Binary mode for reliable seeking

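Seeking works more reliably in binary mode, because the stream performs no newline translation and every offset counts raw bytes. Open files in binary mode using std::ifstream::binary (the same flag can also be written as std::ios::binary):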

#include <fstream>
#include <iostream>

int main()
{
    std::ifstream binaryFile{ "data.bin", std::ifstream::binary };

    if (!binaryFile)
    {
        std::cerr << "ERROR: Could not open data.bin\n";
        return 1;
    }

    // Seeking in binary mode is predictable and platform-independent
    binaryFile.seekg(100);
    // Read operations...

    return 0;
}

Determining file size with tellg() and tellp()

The tellg() and tellp() functions return the file pointer's absolute position. This capability helps determine file sizes:

#include <fstream>
#include <iostream>

int main()
{
    std::ifstream logFile{ "application.log" };

    if (!logFile)
    {
        std::cerr << "ERROR: Could not open application.log\n";
        return 1;
    }

    logFile.seekg(0, std::ios::end);
    std::cout << "Log file size: " << logFile.tellg() << " bytes\n";

    return 0;
}

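On Windows, if application.log contains the following three lines, each ending in a CR+LF newline:

[INFO] Service started
[WARNING] High memory usage
[ERROR] Connection timeout

The output might be (75 characters of text plus three 2-byte newlines):

Log file size: 81 bytes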

Note: Results vary across platforms due to different newline representations and potential file padding with trailing zeros.

Reading and writing simultaneously with fstream

The fstream class supports simultaneous reading and writing - with an important caveat: you cannot arbitrarily switch between reading and writing. After performing a read or write operation, you must modify the file position (using a seek operation) before switching to the other operation type.

If you don't want to actually move the file pointer, seek to the current position:

#include <fstream>

int main()
{
    std::fstream dataFile{ "records.dat", std::ios::in | std::ios::out };

    // Read operation
    int value{};
    dataFile >> value;

    // Must seek before switching to write, even to stay at current position
    dataFile.seekg(dataFile.tellg(), std::ios::beg);

    // Write operation
    dataFile << value + 10;

    return 0;
}

Note: Although dataFile.seekg(0, std::ios::cur) seems like it should work, some compilers appear to optimize it away, so the stream never registers the switch. Use the pattern shown above for reliable behavior.

Also, unlike with ifstream, where a loop condition such as while (inputFile) can be used to detect whether there is more to read, the same test does not work reliably with fstream.

Practical example: Log file processor

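Let's build a program that reads a log file, finds all ERROR entries, and replaces "ERROR" with "FIXED". Because both words are five characters long, the replacement can overwrite the original in place without shifting the rest of the file: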

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::fstream logFile{ "system.log", std::ios::in | std::ios::out };

    if (!logFile)
    {
        std::cerr << "ERROR: Could not open system.log\n";
        return 1;
    }

    char ch{};

    while (logFile.get(ch))
    {
        if (ch == 'E')
        {
            // Peek ahead to check if this is "ERROR"
            char next[4]{};
            std::streampos currentPos{ logFile.tellg() };

            logFile.read(next, 4);

            if (std::string(next, 4) == "RROR")
            {
                // Move back to the 'E'
                logFile.seekg(currentPos - 1);

                // Write "FIXED" over "ERROR"
                logFile << "FIXED";

                // Seek to current position to enable reading again
                logFile.seekg(logFile.tellg(), std::ios::beg);
            }
            else
            {
                // False alarm, return to reading position
                logFile.seekg(currentPos);
            }
        }
    }

    std::cout << "Log processing complete\n";

    return 0;
}

If system.log initially contains:

[INFO] System startup
[ERROR] Database connection failed
[WARNING] Retry attempt 1
[ERROR] Authentication failed

After running the program, it becomes:

[INFO] System startup
[FIXED] Database connection failed
[WARNING] Retry attempt 1
[FIXED] Authentication failed

Additional useful file functions

The is_open() function returns true if the stream is currently open:

#include <fstream>
#include <iostream>

int main()
{
    std::fstream dataFile{};

    if (!dataFile.is_open())
    {
        std::cout << "No file is currently open\n";
    }

    dataFile.open("data.txt", std::ios::in | std::ios::out);

    if (dataFile.is_open())
    {
        std::cout << "data.txt is now open\n";
    }

    return 0;
}

To delete files, use the remove() function from <cstdio>:

#include <cstdio>
#include <iostream>

int main()
{
    if (std::remove("temporary.tmp") == 0)
    {
        std::cout << "Temporary file deleted\n";
    }
    else
    {
        std::cout << "Could not delete file\n";
    }

    return 0;
}

Critical warning about pointer persistence

Never write pointer addresses to files. While you can technically write address values to disk, this creates severe bugs. Variables occupy different memory addresses each time a program runs.

Example problem scenario:

  1. Your program stores connectionCount at address 0x00FF4420 with value 100
  2. You write both the value (100) and the address (0x00FF4420) to disk
  3. Later, you restart your program and load these values
  4. Now connectionCount lives at 0x00FF4418 instead
  5. Your loaded pointer still points to 0x00FF4420 - which now contains garbage data or belongs to a completely different variable
  6. Accessing this pointer causes undefined behavior, crashes, or corrupted data

Solution: Always save actual data values, never addresses. Reconstruct your program's data structures from the saved values when loading.

Random file access provides powerful capabilities for working with structured data files. Combined with sequential file I/O, these tools enable building sophisticated file-based data systems.

Summary

File pointer mechanism: Every file stream maintains an internal file pointer tracking the current read/write position. Operations occur at this position, and the pointer advances automatically after each operation.

Random access with seekg() and seekp(): Use seekg() for input streams and seekp() for output streams to jump to specific positions. File streams keep read and write positions synchronized, so either function works with fstream.

Seek parameters: Both functions accept an offset (bytes to move) and an ios flag indicating the reference point: beg (beginning), cur (current position), or end (end of file).

Positive and negative offsets: Positive offsets move toward the file's end, negative offsets move toward the beginning.

Text file seeking limitations: Seeking to positions other than the beginning in text files produces platform-dependent results due to variable newline representations, and seeking relative to the end is also affected by potential file padding.

Binary mode for reliability: Open files in binary mode using std::ifstream::binary for predictable, platform-independent seeking behavior.

Determining file size: Use tellg() or tellp() to get the file pointer's absolute position. Seek to the end and call tellg() to determine file size (results may vary across platforms due to newline encoding).

Simultaneous read/write with fstream: The fstream class supports both operations, but you must perform a seek operation when switching between reading and writing, even if you want to stay at the current position.

is_open() function: Returns true if a stream is currently connected to a file, allowing verification of file status.

File deletion: Use std::remove() from <cstdio> to delete files programmatically.

Never persist pointers: Pointer addresses are invalid across program runs. Variables occupy different memory addresses each time, making saved addresses useless or dangerous.

Understanding random file access enables efficient record-based file processing, allowing direct access to specific data without reading entire files sequentially.