Stream states and input validation

Stream state management

The ios_base class defines several state flags that indicate the current condition of a stream:

Flag     Meaning
goodbit  No error flags are set; the stream is ready for operations
badbit   An unrecoverable error occurred (e.g., the underlying device failed during a read or write)
eofbit   The stream reached the end of available data
failbit  A recoverable error occurred (e.g., type mismatch during extraction)

While these flags technically reside in ios_base, we typically access them through ios (e.g., std::ios::failbit) for brevity.

The ios class provides convenient member functions to check these states:

Member function   Meaning
good()            Returns true when no error flags are set (the stream is in the goodbit state)
bad()             Returns true when an unrecoverable error occurred (badbit set)
eof()             Returns true when end-of-data was reached (eofbit set)
fail()            Returns true when an operation failed (failbit or badbit set)
clear()           Clears all error flags and restores the stream to the goodbit state
clear(state)      Replaces the current state with the specified state flags
rdstate()         Returns the currently set state flags
setstate(state)   Adds the specified state flags without clearing the others
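
The difference between setstate(state) and clear(state) is easy to mix up, so here is a minimal sketch of it. The helper names (stateAfterSetstate, stateAfterClearWithArgument, hasFlag) are made up for this illustration and are not part of the standard library.

```cpp
#include <ios>
#include <sstream>

// Illustrative helper: setstate() adds flags to whatever is already set.
std::ios::iostate stateAfterSetstate()
{
    std::istringstream in{};
    in.setstate(std::ios::failbit); // state: failbit
    in.setstate(std::ios::eofbit);  // state: failbit | eofbit
    return in.rdstate();
}

// Illustrative helper: clear(state) replaces the whole state.
std::ios::iostate stateAfterClearWithArgument()
{
    std::istringstream in{};
    in.setstate(std::ios::failbit | std::ios::eofbit);
    in.clear(std::ios::eofbit);     // failbit is dropped, only eofbit remains
    return in.rdstate();
}

// Tests one flag in a state value with a bitwise AND.
// goodbit has the value zero, so test for it with state == std::ios::goodbit.
bool hasFlag(std::ios::iostate state, std::ios::iostate flag)
{
    return (state & flag) == flag;
}
```

Because the flags are a bitmask, several can be set at once, which is why rdstate() is usually combined with a bitwise test like hasFlag() rather than compared against a single flag.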

The failbit flag requires the most attention because it activates when users provide invalid input. Consider this example:

#include <iostream>

int main()
{
    std::cout << "Enter your employee ID: ";
    int employeeId{};
    std::cin >> employeeId;

    return 0;
}

This code expects an integer. If the user enters "Manager" instead, std::cin cannot extract anything into employeeId, and the failbit activates.

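When a stream enters an error state (anything other than goodbit), subsequent operations on that stream are ignored until the error is cleared using clear(). Note that clear() only resets the flags: the characters that caused the failure stay in the stream and must be discarded separately (for example, with ignore()).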
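
To watch this happen without typing at a console, the following sketch feeds text to a std::istringstream, which behaves like std::cin for extraction. The helper functions are hypothetical names chosen for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Illustrative helper: lists the state-checking functions that return true.
std::string describeState(const std::istream& in)
{
    if (in.good())
    {
        return "good";
    }

    std::string names{};
    const auto add = [&names](const char* name) {
        if (!names.empty()) names += ' ';
        names += name;
    };

    if (in.bad())  add("bad");
    if (in.fail()) add("fail"); // true for failbit or badbit
    if (in.eof())  add("eof");
    return names;
}

// Illustrative helper: tries to extract an int and reports the resulting state.
std::string stateAfterReadingInt(const std::string& text)
{
    std::istringstream in{ text };
    int number{};
    in >> number;
    return describeState(in);
}

// Illustrative helper: after a failed int extraction, tries to read a word.
// Returns an empty string if the stream refused the second extraction.
std::string readWordAfterFailedInt(const std::string& text, bool clearFirst)
{
    std::istringstream in{ text };

    int number{};
    in >> number;       // fails for text such as "Manager"

    if (clearFirst)
    {
        in.clear();     // resets the flags; the unread characters are still there
    }

    std::string word{};
    in >> word;         // does nothing while failbit is still set
    return word;
}
```

Calling readWordAfterFailedInt("Manager", false) leaves the word empty because the second extraction is skipped, while passing true lets the same characters be read as a string once the flags are reset.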

Input validation strategies

Input validation ensures user input meets specific criteria. Validation strategies typically fall into two categories: string validation and numeric validation.

String validation accepts all user input as text, then validates the string against formatting rules. For example, validating an email address might require an '@' symbol and a domain name. Modern C++ provides a regular expression library for complex pattern matching. However, regular expressions have performance costs - use them only when the convenience justifies the overhead or manual validation becomes too complex.
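
As a rough illustration of manual string validation, the sketch below checks the email rule mentioned above. The function name looksLikeEmail and the exact rules (exactly one '@', a non-empty local part, a domain containing an interior '.') are assumptions made for this example; real email syntax is considerably more permissive.

```cpp
#include <string_view>

// Illustrative, deliberately loose check. Not a full email validator.
bool looksLikeEmail(std::string_view text)
{
    const auto at{ text.find('@') };

    // Require an '@' that is not the first character
    if (at == std::string_view::npos || at == 0)
    {
        return false;
    }

    // Reject a second '@'
    if (text.find('@', at + 1) != std::string_view::npos)
    {
        return false;
    }

    // Require a domain with a '.' that is neither its first nor its last character
    const std::string_view domain{ text.substr(at + 1) };
    const auto dot{ domain.find('.') };

    return dot != std::string_view::npos && dot != 0 && domain.back() != '.';
}
```

A handful of find() calls covers this rule set without the cost of a regular expression; once the rules grow beyond a few such checks, a pattern may become the more readable option.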

Numeric validation ensures user input is both a valid number and falls within acceptable ranges (e.g., between 1 and 100). Additionally, we must handle cases where users enter non-numeric characters entirely.

The <cctype> header provides useful character classification functions:

Function            Returns non-zero if...
std::isalnum(int)   parameter is a letter or digit
std::isalpha(int)   parameter is a letter
std::iscntrl(int)   parameter is a control character
std::isdigit(int)   parameter is a digit
std::isgraph(int)   parameter is a printable non-whitespace character
std::isprint(int)   parameter is a printable character (including the space character)
std::ispunct(int)   parameter is a printable character that is neither alphanumeric nor whitespace
std::isspace(int)   parameter is whitespace
std::isxdigit(int)  parameter is a hexadecimal digit (0-9, a-f, A-F)
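
One detail matters when calling these functions on the characters of a string: the argument must be representable as an unsigned char (or be EOF), so a plain char that happens to be negative causes undefined behavior. A small sketch of the usual workaround follows; isHexString is a hypothetical name used only here.

```cpp
#include <algorithm>
#include <cctype>
#include <string_view>

// Illustrative helper: true if text is non-empty and every character is a hex digit.
bool isHexString(std::string_view text)
{
    return !text.empty() && std::all_of(text.begin(), text.end(), [](char ch) {
        // Convert to unsigned char first so negative char values are safe
        return std::isxdigit(static_cast<unsigned char>(ch)) != 0;
    });
}
```

The same cast applies to every function in the table above.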

String validation examples

Let's validate a username that must contain only letters and underscores:

#include <algorithm>
#include <cctype>
#include <iostream>
#include <ranges>
#include <string>
#include <string_view>

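bool isValidUsername(std::string_view username)
{
    // Reject empty input, which all_of would otherwise accept.
    // The cast avoids undefined behavior: std::isalpha requires a value
    // representable as unsigned char.
    return !username.empty() && std::ranges::all_of(username, [](char ch) {
        return std::isalpha(static_cast<unsigned char>(ch)) || ch == '_';
    });
}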

int main()
{
    std::string username{};

    do
    {
        std::cout << "Create username (letters and underscores only): ";
        std::getline(std::cin, username);
    } while (!isValidUsername(username));

    std::cout << "Username '" << username << "' registered successfully!\n";

    return 0;
}

This validation isn't perfect - users could enter "___" or "aaa___bbb___ccc". We could refine the criteria to require at least one letter and limit consecutive underscores.
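
One possible refinement is sketched below. It assumes "limit consecutive underscores" means "never two in a row", which is only one reasonable reading, and isValidUsernameStrict is a name invented for this example.

```cpp
#include <cctype>
#include <string_view>

// Illustrative refinement: letters and single underscores only,
// with at least one letter somewhere in the name.
bool isValidUsernameStrict(std::string_view username)
{
    bool hasLetter{ false };
    char previous{ '\0' };

    for (char ch : username)
    {
        if (std::isalpha(static_cast<unsigned char>(ch)))
        {
            hasLetter = true;
        }
        else if (ch != '_' || previous == '_')
        {
            return false; // illegal character, or a second underscore in a row
        }

        previous = ch;
    }

    return hasLetter; // also rejects the empty string
}
```

With these rules, "___" and "aaa___bbb___ccc" are rejected while "alex_smith" is accepted.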

Pattern-based validation

For fixed-format input like phone numbers or product codes, we can match against a pattern. Let's create a validator where:

  • # matches any digit
  • @ matches any letter
  • _ matches any whitespace
  • ? matches any character
  • Other characters must match exactly

#include <algorithm>
#include <cctype>
#include <iostream>
#include <map>
#include <ranges>
#include <string>
#include <string_view>

bool inputMatchesPattern(std::string_view input, std::string_view pattern)
{
    if (input.length() != pattern.length())
    {
        return false;
    }

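    // Each validator converts to unsigned char before calling the <cctype>
    // function, which has undefined behavior for negative char values.
    // Lambdas also avoid taking the address of a standard library function.
    static const std::map<char, bool (*)(char)> validators{
        { '#', [](char ch) { return std::isdigit(static_cast<unsigned char>(ch)) != 0; } },
        { '_', [](char ch) { return std::isspace(static_cast<unsigned char>(ch)) != 0; } },
        { '@', [](char ch) { return std::isalpha(static_cast<unsigned char>(ch)) != 0; } },
        { '?', [](char) { return true; } }
    };

    // validators is static, so the lambda can use it without capturing it
    return std::ranges::equal(input, pattern, [](char ch, char mask) -> bool {
        auto found{ validators.find(mask) };

        if (found != validators.end())
        {
            return found->second(ch);
        }

        return ch == mask;
    });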
}

int main()
{
    std::string productCode{};

    do
    {
        std::cout << "Enter product code (ABC-###-####): ";
        std::getline(std::cin, productCode);
    } while (!inputMatchesPattern(productCode, "@@@-###-####"));

    std::cout << "Product code " << productCode << " validated!\n";

    return 0;
}

This approach enforces exact format matching. However, it has limitations: the special characters (#, @, _, ?) can't appear as literal characters in valid input, and it can't handle variable-length patterns like "at least two words separated by spaces". For such requirements, use the character-by-character approach from the previous section or regular expressions.
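
For the variable-length case, a regular expression is one option. The sketch below uses std::regex from the <regex> header; the function name hasAtLeastTwoWords and the assumption that a "word" consists only of ASCII letters are choices made for this example.

```cpp
#include <regex>
#include <string>

// Illustrative helper: true if input is two or more words of ASCII letters,
// separated by one or more spaces, with nothing else before or after.
bool hasAtLeastTwoWords(const std::string& input)
{
    // Built once and reused, because constructing a std::regex is relatively costly
    static const std::regex pattern{ R"([A-Za-z]+( +[A-Za-z]+)+)" };

    // regex_match requires the whole string to match the pattern
    return std::regex_match(input, pattern);
}
```

Here "Ada Lovelace" matches, while "Ada" and "Ada " (with a trailing space) do not.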

Numeric validation

The straightforward approach to numeric validation uses the extraction operator and checks the failbit:

#include <iostream>
#include <limits>

int main()
{
    int serverPort{};

    while (true)
    {
        std::cout << "Enter server port (1-65535): ";
        std::cin >> serverPort;

        if (std::cin.fail())
        {
            std::cin.clear();
            std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
            std::cout << "Invalid input. Please enter a number.\n";
            continue;
        }

        if (serverPort < 1 || serverPort > 65535)
        {
            std::cout << "Port must be between 1 and 65535.\n";
            continue;
        }

        break;
    }

    std::cout << "Server port set to: " << serverPort << '\n';

    return 0;
}

If the user enters an integer, extraction succeeds. std::cin.fail() returns false, we skip the error handling, and (assuming a valid range) we exit the loop.

If the user enters text starting with a letter, extraction fails. std::cin.fail() returns true, we clear the error state, discard the bad input, and loop again.

However, there's a subtle issue with input like "8080abc". The extraction operator reads "8080" successfully into serverPort, leaves "abc" in the input stream, and does NOT set the failbit. This creates two problems:

  1. If this input should be valid, you have garbage ("abc") remaining in the stream
  2. If this input should be invalid, it wasn't rejected (and you have garbage in the stream)
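
The behavior is easy to confirm with a string stream standing in for std::cin. In this sketch, ExtractionResult and extractInt are hypothetical names introduced only for the demonstration.

```cpp
#include <sstream>
#include <string>

// Illustrative record of what one int extraction did to a line of input.
struct ExtractionResult
{
    bool failed{};          // was failbit set by the extraction?
    int value{};            // the value that was extracted (0 on failure)
    std::string leftover{}; // characters the extraction left in the stream
};

ExtractionResult extractInt(const std::string& line)
{
    std::istringstream in{ line };
    ExtractionResult result{};

    in >> result.value;
    result.failed = in.fail();

    in.clear();                        // allow reading whatever is left
    std::getline(in, result.leftover); // collect the unread characters
    return result;
}
```

For "8080abc", the result reports no failure, a value of 8080, and "abc" as the leftover text, which is exactly the situation described above.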

Cleaning up after successful extraction

To solve the first problem, always clear remaining input after successful extraction:

#include <iostream>
#include <limits>

int main()
{
    int serverPort{};

    while (true)
    {
        std::cout << "Enter server port (1-65535): ";
        std::cin >> serverPort;

        if (std::cin.fail())
        {
            std::cin.clear();
            std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
            std::cout << "Invalid input. Please enter a number.\n";
            continue;
        }

        std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

        if (serverPort < 1 || serverPort > 65535)
        {
            std::cout << "Port must be between 1 and 65535.\n";
            continue;
        }

        break;
    }

    std::cout << "Server port set to: " << serverPort << '\n';

    return 0;
}
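
To see why the cleanup matters, consider a program that reads two values in a row. The following sketch simulates that with a string stream; readSecondInt is a hypothetical helper written for this example.

```cpp
#include <ios>
#include <limits>
#include <optional>
#include <sstream>
#include <string>

// Illustrative helper: reads two ints from simulated input and returns the
// second one, optionally discarding the rest of the first line in between.
std::optional<int> readSecondInt(const std::string& input, bool discardRestOfLine)
{
    std::istringstream in{ input };

    int first{};
    in >> first;

    if (discardRestOfLine)
    {
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }

    int second{};
    if (in >> second)
    {
        return second;
    }

    return std::nullopt; // the second extraction failed
}
```

With the input "8080abc\n443\n", skipping the ignore() call makes the second extraction run into "abc" and fail, whereas discarding the rest of the line lets it read 443.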

Rejecting partially invalid input

For the second problem, we need to verify that nothing but the newline character remained after extraction. We can check this with gcount(), which returns the number of characters extracted by the most recent unformatted input operation, here the call to ignore(). The discarded newline counts as one character, so a gcount() greater than 1 means extra characters preceded it:

#include <iostream>
#include <limits>

int main()
{
    int serverPort{};

    while (true)
    {
        std::cout << "Enter server port (1-65535): ";
        std::cin >> serverPort;

        if (std::cin.fail())
        {
            std::cin.clear();
            std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
            std::cout << "Invalid input. Please enter a number.\n";
            continue;
        }

        std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

        if (std::cin.gcount() > 1)
        {
            std::cout << "Invalid input. Extra characters detected.\n";
            continue;
        }

        if (serverPort < 1 || serverPort > 65535)
        {
            std::cout << "Port must be between 1 and 65535.\n";
            continue;
        }

        break;
    }

    std::cout << "Server port set to: " << serverPort << '\n';

    return 0;
}
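
The counts that gcount() reports can be checked in isolation. This sketch again substitutes a string stream for std::cin; charactersDiscardedAfterInt is a hypothetical helper name.

```cpp
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Illustrative helper: extracts an int, discards the rest of the line,
// and reports how many characters ignore() removed (newline included).
std::streamsize charactersDiscardedAfterInt(const std::string& line)
{
    std::istringstream in{ line };

    int value{};
    in >> value;

    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return in.gcount();
}
```

A clean "8080\n" yields 1, "8080abc\n" yields 4, and "8080 \n" yields 2. The last case shows that this check also rejects input with trailing spaces, which may or may not be what a given program wants.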

String-based numeric validation

An alternative approach reads input as a string and then converts it to a number. This method sometimes proves simpler than direct numeric extraction:

#include <charconv>
#include <iostream>
#include <limits>
#include <optional>
#include <string>
#include <string_view>

std::optional<int> extractPort(std::string_view portStr)
{
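    int result{};
    const auto end{ portStr.data() + portStr.length() };

    // from_chars stops at the first character it cannot parse, so also check
    // that it consumed the whole string; otherwise "8080abc" would be accepted
    const auto [ptr, ec]{ std::from_chars(portStr.data(), end, result) };

    if (ec != std::errc{} || ptr != end)
    {
        return {};
    }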

    if (result < 1 || result > 65535)
    {
        return {};
    }

    return result;
}

int main()
{
    int serverPort{};

    while (true)
    {
        std::cout << "Enter server port (1-65535): ";
        std::string portStr{};

        if (!std::getline(std::cin >> std::ws, portStr))
        {
            std::cin.clear();
            std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
            continue;
        }

        auto extracted{ extractPort(portStr) };

        if (!extracted)
        {
            std::cout << "Invalid port number. Try again.\n";
            continue;
        }

        serverPort = *extracted;
        break;
    }

    std::cout << "Server port set to: " << serverPort << '\n';

    return 0;
}

Whether this approach requires more or less code depends on your specific validation requirements and constraints.

Input validation in C++ demands significant effort. Fortunately, validation logic like numeric string parsing can be encapsulated into reusable functions that work across many situations. Consider building a library of validation utilities for your projects.
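
As a starting point for such a library, here is one possible shape for a pair of reusable helpers. The names parseIntInRange and promptForInt, their parameters, and the decision to return std::nullopt when input ends are all assumptions of this sketch rather than an established design.

```cpp
#include <charconv>
#include <iostream>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Illustrative utility: parses the whole of text as an int in [minValue, maxValue].
std::optional<int> parseIntInRange(std::string_view text, int minValue, int maxValue)
{
    int value{};
    const char* const last{ text.data() + text.size() };
    const auto [ptr, ec]{ std::from_chars(text.data(), last, value) };

    // Reject parse errors, trailing characters, and out-of-range values
    if (ec != std::errc{} || ptr != last || value < minValue || value > maxValue)
    {
        return std::nullopt;
    }

    return value;
}

// Illustrative utility: prompts until a line parses, or until input ends.
std::optional<int> promptForInt(std::istream& in, std::ostream& out,
                                std::string_view prompt, int minValue, int maxValue)
{
    std::string line{};

    while (true)
    {
        out << prompt;

        if (!std::getline(in, line))
        {
            return std::nullopt; // end of input: give up rather than loop forever
        }

        if (const auto value{ parseIntInRange(line, minValue, maxValue) })
        {
            return value;
        }

        out << "Invalid input. Try again.\n";
    }
}
```

A caller could then write something like promptForInt(std::cin, std::cout, "Enter server port (1-65535): ", 1, 65535) and handle the empty result once, instead of repeating the loop in every program. Taking the streams as parameters also makes the helper testable with string streams.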

Summary

Stream state flags: The ios_base class defines goodbit (valid state), badbit (critical error), eofbit (end of data), and failbit (recoverable error). These flags indicate the current condition of a stream.

State checking functions: The ios class provides good(), bad(), eof(), and fail() member functions to check state flags, plus clear() to reset flags and rdstate() to query current state.

failbit activation: The failbit flag activates when users provide invalid input (like entering text when a number is expected). Once a stream enters an error state, subsequent operations are ignored until clear() is called.

Input validation strategies: String validation accepts all input as text and validates against formatting rules. Numeric validation ensures input is both a valid number and within acceptable ranges.

Character classification: The <cctype> header provides functions like std::isalnum(), std::isalpha(), std::isdigit(), and std::isspace() for validating character types.

Pattern-based validation: Create custom pattern matchers using character classification functions to enforce fixed-format input like phone numbers or product codes.

Numeric validation with extraction: Use the extraction operator and check failbit to validate numeric input, then clear the error state and ignore remaining input on failure.

Handling partially valid input: Input like "8080abc" succeeds for integer extraction but leaves garbage in the stream. Use gcount() after ignore() to detect extra characters and reject such input.

String-based numeric validation: Read input as a string first, then use std::from_chars() or similar functions to convert to numbers, providing more control over the validation process.

Reusable validation: Encapsulate validation logic into reusable functions that can be applied across different input scenarios, building a library of validation utilities.

Understanding stream states and validation techniques enables building robust user input handling that gracefully handles errors and prevents invalid data from entering your program.