C-style string symbolic constants

In the previous lesson, we discussed how to create and initialize C-style string objects:

#include <iostream>

int main()
{
    char username[]{ "Alex" }; // C-style string
    std::cout << username << '\n';

    return 0;
}

C++ supports two different ways to create C-style string symbolic constants:

#include <iostream>

int main()
{
    const char username[] { "Alex" };       // case 1: const C-style string initialized with C-style string literal
    const char* const status{ "Active" };   // case 2: const pointer to C-style string literal

    std::cout << username << ' ' << status << '\n';

    return 0;
}

This prints:

Alex Active

While the above two methods produce the same results, C++ deals with the memory allocation for these slightly differently.

In case 1, "Alex" is put into (probably read-only) memory somewhere. Then the program allocates memory for a C-style array of length 5 (four explicit characters plus the null terminator), and initializes that memory with the string "Alex". So conceptually we end up with two copies of "Alex": the string literal itself, and the array owned by username. Since username is const (and will never be modified), making a copy is inefficient.

In case 2, how the compiler handles this is implementation-defined. What usually happens is that the compiler places the string "Active" into read-only memory somewhere, and then initializes the pointer with the address of that string. No copy is made.

For optimization purposes, multiple string literals may be consolidated into a single value. For example:

const char* label1{ "Button" };
const char* label2{ "Button" };

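These are two different string literals with the same value. Because these literals are constants, the compiler may opt to save memory by combining them into a single shared string literal, with both label1 and label2 pointing to the same address. Whether this happens is up to the compiler, so don't write code that depends on the two addresses being equal (or different).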

Type deduction with const C-style strings

Type deduction using a C-style string literal is fairly straightforward:

    auto text1{ "Button" };  // type deduced as const char*
    auto* text2{ "Button" }; // type deduced as const char*
    auto& text3{ "Button" }; // type deduced as const char(&)[7]

Outputting pointers and C-style strings

You may have noticed something interesting about the way std::cout handles pointers of different types.

Consider the following example:

#include <iostream>

int main()
{
    int numbers[]{ 10, 20, 30, 40, 50 };
    char letters[]{ "Welcome!" };
    const char* pointer{ "Button" };

    std::cout << numbers << '\n'; // numbers will decay to type int*
    std::cout << letters << '\n'; // letters will decay to type char*
    std::cout << pointer << '\n'; // pointer is already type const char*

    return 0;
}

When tested, this printed:

00D5F820
Welcome!
Button

Why did the int array print an address, but the char array and the char pointer print as strings?

The answer is that the output streams (e.g. std::cout) make some assumptions about your intent. If you pass it a non-char pointer, it will simply print the contents of that pointer (the address that the pointer is holding). However, if you pass it an object of type char* or const char*, it will assume you're intending to print a string. Consequently, instead of printing the pointer's value (an address), it will print the string being pointed to instead!

While this is desired most of the time, it can lead to unexpected results. Consider the following case:

#include <iostream>

int main()
{
    char letter{ 'W' };
    std::cout << &letter;

    return 0;
}

In this case, the programmer is intending to print the address of variable letter. However, &letter has type char*, so std::cout tries to print this as a string! And because letter is a single char that is not followed by a null terminator, we get undefined behavior.

When tested, this printed:

W╠╠╠╠╜╡8;¿■B

Why did it do this? Well, first it assumed &letter (which has type char*) was a C-style string. So it printed the 'W', and then kept going. Next in memory was a bunch of garbage. Eventually, it ran into some memory holding a 0 value, which it interpreted as a null terminator, so it stopped. What you see may be different depending on what's in memory after variable letter.

This case is somewhat unlikely to occur in real life (as you're not likely to actually want to print memory addresses), but it is illustrative of how things work under the hood, and how programs can inadvertently go off the rails.

If you actually want to print the address held by a char pointer, static_cast it to type const void*:

#include <iostream>

int main()
{
    const char* pointer{ "Button" };

    std::cout << pointer << '\n';                           // print pointer as C-style string
    std::cout << static_cast<const void*>(pointer) << '\n'; // print address held by pointer

    return 0;
}

Favor std::string_view for C-style string symbolic constants

There is little reason to use C-style string symbolic constants in modern C++. Instead, favor constexpr std::string_view objects, which tend to be just as fast (if not faster) and behave more consistently.

Best Practice
Avoid C-style string symbolic constants in favor of constexpr std::string_view.

Summary

Two forms of C-style string constants: Use const char array[] (makes a copy of the string literal) or const char* const (points directly to the string literal). Both produce the same output but differ in memory management.

Memory inefficiency of arrays: const char username[] = "Alex" creates two copies of "Alex": one in read-only memory (the literal) and one in the array. This is wasteful for constants that never change.

Pointer approach: const char* const status = "Active" typically stores the string once in read-only memory and just holds a pointer to it. More memory-efficient for string constants.

String literal consolidation: Compilers may consolidate multiple identical string literals into a single value in memory, with all pointers referencing the same address. Whether this happens is up to the compiler, so code should not depend on it.

Type deduction: auto text1 = "Button" deduces const char*. auto* text2 = "Button" also deduces const char*. auto& text3 = "Button" deduces const char(&)[7] (array reference).

Output stream behavior: std::cout treats char* and const char* pointers specially, printing the string they point to rather than the address. Other pointer types print their address values.

Unexpected behavior: Printing &letter where letter is a single char variable makes std::cout treat the address as a C-style string. Because no null terminator follows letter, this is undefined behavior; in practice the output often continues into garbage memory until it hits a null byte.

Printing char pointer addresses: Cast to const void* to print the address held by the pointer instead of the string: std::cout << static_cast<const void*>(pointer).

Modern alternative: Use constexpr std::string_view instead of C-style string constants. It's typically as fast or faster, behaves more consistently, and avoids the quirks of C-style strings.

C-style string constants exist for legacy compatibility but modern C++ offers better alternatives that avoid the special-case behaviors and potential pitfalls of char pointers.