Null-Terminated Character Arrays
Work with null-terminated character arrays from C legacy code.
C-style strings
C-style arrays allow us to define a sequential collection of elements. A string is a collection of sequential characters (such as "Welcome!"). C-style string literals have type const char[9] (8 explicit characters plus 1 hidden null-terminator character).
If you hadn't connected the dots before, it should be obvious now that C-style strings are just C-style arrays whose element type is char or const char!
Although C-style string literals are fine to use in our code, C-style string objects have fallen out of favor in modern C++ because they are hard to use and dangerous (with std::string and std::string_view being the modern replacements). Regardless, you may still run across uses of C-style string objects in older code, and we would be remiss not to cover them at all.
Therefore, in this lesson, we'll take a look at the most important points regarding C-style string objects in modern C++.
Defining C-style strings
To define a C-style string variable, simply declare a C-style array variable of char (or const char / constexpr char):
char message[12]{}; // an array of 12 char, indices 0 through 11
const char greeting[]{ "Welcome" }; // an array of 8 const char, indices 0 through 7
constexpr char word[] { "morning" }; // an array of 8 const char, indices 0 through 7
Remember that we need an extra character for the implicit null terminator.
When defining C-style strings with an initializer, we highly recommend omitting the array length and letting the compiler calculate the length. That way if the initializer changes in the future, you won't have to remember to update the length, and there is no risk in forgetting to include an extra element to hold the null terminator.
C-style strings will decay
In lesson 17.8, we discussed how C-style arrays will decay into a pointer in most circumstances. Because C-style strings are C-style arrays, they will decay - C-style string literals decay into a const char*, and C-style string arrays decay into either a const char* or char* depending on whether the array is const. And when a C-style string decays into a pointer, the length of the string (encoded in the type information) is lost.
This loss of length information is the reason C-style strings have a null-terminator. The length of the string can be (inefficiently) regenerated by counting the number of elements between the start of the string and the null terminator.
Outputting a C-style string
When outputting a C-style string, std::cout outputs characters until it encounters the null terminator. This null terminator marks the end of the string, so that decayed strings (which have lost their length information) can still be printed.
#include <iostream>

void display(char text[])
{
    std::cout << text << '\n'; // output string
}

int main()
{
    char greeting[]{ "Welcome" };
    std::cout << greeting << '\n'; // outputs Welcome

    display(greeting);

    return 0;
}
If you try to print a string that does not have a null terminator (e.g. because the null-terminator was overwritten somehow), the result will be undefined behavior. The most likely outcome in this case will be that all the characters in the string are printed, and then it will just keep printing everything in adjacent memory slots (interpreted as a character) until it happens to hit a byte of memory containing a 0 (which will be interpreted as a null terminator)!
Inputting C-style strings
Consider the case where we are asking the user to type words separated by spaces (e.g. learning programming language). How many characters will the user enter? We have no idea.
Because C-style strings are fixed-size arrays, the solution is to declare an array larger than we are ever likely to need:
#include <iostream>

int main()
{
    char words[255] {}; // declare array large enough to hold 254 characters + null terminator
    std::cout << "Enter your words: ";
    std::cin >> words;
    std::cout << "You entered: " << words << '\n';

    return 0;
}
Prior to C++20, std::cin >> words would extract as many characters as possible to words (stopping at the first non-leading whitespace). Nothing is stopping the user from entering more than 254 characters (either unintentionally, or maliciously). And if that happens, the user's input will overflow the words array and undefined behavior will result.
**Array overflow** or **buffer overflow** is a computer security issue that occurs when more data is copied into storage than the storage can hold. In such cases, the memory just beyond the storage will be overwritten, leading to undefined behavior. Malicious actors can potentially exploit such flaws to overwrite the contents of memory, hoping to change the program's behavior in some advantageous way.
In C++20, operator>> was changed so that it only works for inputting non-decayed C-style strings. This allows operator>> to extract only as many characters as the C-style string's length will allow, preventing overflow. But this also means you can no longer use operator>> to input to decayed C-style strings.
The recommended way of reading C-style strings using std::cin is as follows:
#include <iostream>
#include <iterator> // for std::size

int main()
{
    char words[255] {}; // declare array large enough to hold 254 characters + null terminator
    std::cout << "Enter your words: ";
    std::cin.getline(words, std::size(words));
    std::cout << "You entered: " << words << '\n';

    return 0;
}
This call to cin.getline() will read up to 254 characters (including whitespace) into words. If the user enters more than that, getline() stops after 254 characters, sets the stream's failbit, and leaves the excess characters in the input buffer, so nothing is written past the end of the array. Because getline() takes a length, we can provide the maximum number of characters to accept. With a non-decayed array, this is easy - we can use std::size() to get the array length. With a decayed array, we have to determine the length in some other way. And if we provide the wrong length, our program may malfunction or have security issues.
In modern C++, when storing inputted text from the user, it's safer to use std::string, as std::string will adjust automatically to hold as many characters as needed.
Modifying C-style strings
One important point to note is that C-style strings follow the same rules as C-style arrays. This means you can initialize the string upon creation, but you cannot assign values to it using the assignment operator after that!
char message[]{ "Welcome" }; // ok
message = "Goodbye"; // not ok!
This makes using C-style strings a bit awkward.
Since C-style strings are arrays, you can use the [] operator to change individual characters in the string:
#include <iostream>

int main()
{
    char message[]{ "Welcome" };
    std::cout << message << '\n';
    message[0] = 'w';
    std::cout << message << '\n';

    return 0;
}
This program prints:
Welcome
welcome
Getting the length of a C-style string
Because C-style strings are C-style arrays, you can use std::size() (or in C++20, std::ssize()) to get the length of the string as an array. There are two caveats here:
- This doesn't work on decayed strings.
- It returns the length of the C-style array, not the length of the string it holds.
#include <iostream>
#include <iterator>

int main()
{
    char text[255]{ "Welcome" }; // 7 characters + null terminator
    std::cout << "length = " << std::size(text) << '\n'; // prints length = 255

    char *pointer { text };
    std::cout << "length = " << std::size(pointer) << '\n'; // compile error

    return 0;
}
An alternate solution is to use the strlen() function, which lives in the <cstring> header. strlen() will work on decayed arrays, and returns the length of the string being held, excluding the null terminator:
#include <cstring> // for std::strlen
#include <iostream>

int main()
{
    char text[255]{ "Welcome" }; // 7 characters + null terminator
    std::cout << "length = " << std::strlen(text) << '\n'; // prints length = 7

    char *pointer { text };
    std::cout << "length = " << std::strlen(pointer) << '\n'; // prints length = 7

    return 0;
}
However, std::strlen() is slow, as it has to traverse the string character by character, counting until it hits the null terminator. The longer the string, the longer the call takes.
Other C-style string manipulating functions
Because C-style strings are the primary string type in C, the C language provides many functions for manipulating C-style strings. These functions have been inherited by C++ as part of the <cstring> header.
Here are a few of the most useful that you may see in older code:
- strlen() - returns the length of a C-style string
- strcpy(), strncpy(), strcpy_s() - overwrites one C-style string with another
- strcat(), strncat() - appends one C-style string to the end of another
- strcmp(), strncmp() - compares two C-style strings (returns 0 if equal)
Except for strlen(), we generally recommend avoiding these.
Avoid non-const C-style string objects
Unless you have a specific, compelling reason to use non-const C-style strings, they are best avoided, as they are awkward to work with and are prone to overruns, which will cause undefined behavior (and are potential security issues).
In the rare case that you do need to work with C-style strings or fixed buffer sizes (e.g. for memory-limited devices), we'd recommend using a well-tested 3rd party fixed-length string library designed for the purpose.
Avoid non-const C-style string objects in favor of `std::string`.
Summary
C-style strings are arrays: C-style strings are C-style arrays with element type char or const char. String literals like "Welcome" have type const char[8] (7 characters plus null-terminator).
Null-terminator requirement: C-style strings must end with a null character ('\0') to mark the end of the string. This allows decayed strings (which lose length information) to still be printed and processed.
Decay and length loss: Like all C-style arrays, C-style strings decay to pointers when used in expressions, losing length information. The null-terminator enables functions to determine where the string ends by counting characters until hitting the null.
Outputting C-style strings: std::cout prints characters until encountering the null-terminator. If the null-terminator is missing or overwritten, output will continue into adjacent memory, causing undefined behavior.
Buffer overflow vulnerability: Prior to C++20, inputting to C-style strings with std::cin >> could overflow if users entered more characters than the array holds. This is a serious security vulnerability that malicious actors can exploit.
Safe input: Use std::cin.getline(array, std::size(array)) to read input safely. This prevents overflow by storing at most the specified number of characters (including the null terminator). std::size() only works with non-decayed arrays; for a decayed array, the length must be supplied some other way.
Modifying C-style strings: Individual characters can be modified with operator[], but the entire array cannot be reassigned after initialization. Attempting str = "new value" fails because arrays aren't modifiable lvalues.
Getting length: std::size() returns the array length (including null-terminator) and doesn't work on decayed strings. std::strlen() from <cstring> returns the string length (excluding null-terminator) and works on decayed strings, but is slow because it must traverse the entire string.
Other C-string functions: <cstring> provides strcpy(), strcat(), strcmp(), etc., but these are generally avoided in modern C++ except for strlen() in specific scenarios.
Modern alternatives: Use std::string for modifiable strings and std::string_view for read-only strings. C-style string objects are dangerous, hard to use, and prone to security vulnerabilities.
C-style strings exist for legacy compatibility but should be avoided in new C++ code in favor of safer, more convenient standard library alternatives.