Pure virtual functions, abstract base classes, and interface classes

So far, all of the virtual functions we have written have a body (a definition). However, C++ allows you to create a special kind of virtual function called a pure virtual function (or abstract function) that has no body at all! A pure virtual function simply acts as a placeholder that is meant to be redefined by derived classes.

To create a pure virtual function, rather than define a body for the function, we append = 0 to its declaration.

#include <string_view>

class Handler
{
public:
    std::string_view greet() const { return "Hello"; } // a normal non-virtual function

    virtual std::string_view getType() const { return "Handler"; } // a normal virtual function

    virtual int getPriority() const = 0; // a pure virtual function

    int calculate() = 0; // Compile error: can not set non-virtual functions to 0
};

When we add a pure virtual function to our class, we are effectively saying, "it is up to the derived classes to implement this function".

Using a pure virtual function has two main consequences: First, any class with one or more pure virtual functions becomes an abstract base class, which means that it can not be instantiated! Consider what would happen if we could create an instance of Handler:

int main()
{
    Handler h{}; // We can't instantiate an abstract base class, but for the sake of example, pretend this was allowed
    h.getPriority(); // what would this do?

    return 0;
}

Because there's no definition for getPriority(), what would h.getPriority() resolve to?

Second, any derived class must define a body for this function, or that derived class will be considered an abstract base class as well.

A pure virtual function example

Let's take a look at an example of a pure virtual function in action. In a previous lesson, we wrote a simple Logger base class and derived FileLogger and ConsoleLogger classes from it. Here's the code as we left it:

#include <string>
#include <string_view>

class Logger
{
protected:
    std::string m_name{};

    // We're making this constructor protected because
    // we don't want people creating Logger objects directly,
    // but we still want derived classes to be able to use it.
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

public:
    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const { return "???"; }

    virtual ~Logger() = default;
};

class FileLogger: public Logger
{
public:
    FileLogger(std::string_view name)
        : Logger{ name }
    {
    }

    std::string_view getOutput() const override { return "File"; }
};

class ConsoleLogger: public Logger
{
public:
    ConsoleLogger(std::string_view name)
        : Logger{ name }
    {
    }

    std::string_view getOutput() const override { return "Console"; }
};

We've prevented people from creating objects of type Logger by making the constructor protected. However, it is still possible to create derived classes that do not redefine function getOutput().

For example:

#include <iostream>
#include <string>
#include <string_view>

class Logger
{
protected:
    std::string m_name{};

    // We're making this constructor protected because
    // we don't want people creating Logger objects directly,
    // but we still want derived classes to be able to use it.
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

public:
    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const { return "???"; }

    virtual ~Logger() = default;
};

class NetworkLogger : public Logger
{
public:
    NetworkLogger(std::string_view name)
        : Logger{ name }
    {
    }

    // We forgot to redefine getOutput
};

int main()
{
    NetworkLogger nl{"server123"};
    std::cout << nl.getName() << " via " << nl.getOutput() << '\n';

    return 0;
}

This will print:

server123 via ???

What happened? We forgot to redefine function getOutput(), so nl.getOutput() resolved to Logger::getOutput(), which isn't what we wanted.

A better solution to this problem is to use a pure virtual function:

#include <string>
#include <string_view>

class Logger // This Logger is an abstract base class
{
protected:
    std::string m_name{};

public:
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const = 0; // note that getOutput is now a pure virtual function

    virtual ~Logger() = default;
};

There are a couple of things to note here. First, getOutput() is now a pure virtual function. This means Logger is now an abstract base class, and can not be instantiated. Consequently, we do not need to make the constructor protected any longer (though it doesn't hurt). Second, because our NetworkLogger class was derived from Logger, but we did not define NetworkLogger::getOutput(), NetworkLogger is also an abstract base class. Now when we try to compile this code:

#include <iostream>
#include <string>
#include <string_view>

class Logger // This Logger is an abstract base class
{
protected:
    std::string m_name{};

public:
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const = 0; // note that getOutput is now a pure virtual function

    virtual ~Logger() = default;
};

class NetworkLogger: public Logger
{
public:
    NetworkLogger(std::string_view name)
        : Logger{ name }
    {
    }

    // We forgot to redefine getOutput
};

int main()
{
    NetworkLogger nl{ "server123" };
    std::cout << nl.getName() << " via " << nl.getOutput() << '\n';

    return 0;
}

The compiler will give us an error because NetworkLogger is an abstract base class and we can not create instances of abstract base classes.

This tells us that we will only be able to instantiate NetworkLogger if NetworkLogger provides a body for getOutput().

Let's go ahead and do that:

#include <iostream>
#include <string>
#include <string_view>

class Logger // This Logger is an abstract base class
{
protected:
    std::string m_name{};

public:
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const = 0; // note that getOutput is now a pure virtual function

    virtual ~Logger() = default;
};

class NetworkLogger: public Logger
{
public:
    NetworkLogger(std::string_view name)
        : Logger{ name }
    {
    }

    std::string_view getOutput() const override { return "Network"; }
};

int main()
{
    NetworkLogger nl{ "server123" };
    std::cout << nl.getName() << " via " << nl.getOutput() << '\n';

    return 0;
}

Now this program will compile and print:

server123 via Network

A pure virtual function is useful when we have a function that we want to put in the base class, but only the derived classes know what it should return. A pure virtual function makes it so the base class can not be instantiated, and the derived classes are forced to define these functions before they can be instantiated. This helps ensure the derived classes do not forget to redefine functions that the base class was expecting them to.

Just like with normal virtual functions, pure virtual functions can be called using a reference (or pointer) to a base class:

int main()
{
    NetworkLogger nl{ "server123" };
    Logger& l{ nl };

    std::cout << l.getOutput(); // resolves to NetworkLogger::getOutput(), prints "Network"

    return 0;
}

In the above example, l.getOutput() resolves to NetworkLogger::getOutput() via virtual function resolution.

A reminder

Any class with pure virtual functions should also have a virtual destructor.

Pure virtual functions with definitions

It turns out that we can create pure virtual functions that have definitions:

#include <string>
#include <string_view>

class Logger // This Logger is an abstract base class
{
protected:
    std::string m_name{};

public:
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const = 0; // The = 0 means this function is pure virtual

    virtual ~Logger() = default;
};

std::string_view Logger::getOutput() const  // even though it has a definition
{
    return "generic";
}

In this case, getOutput() is still considered a pure virtual function because of the "= 0" (even though it has been given a definition) and Logger is still considered an abstract base class (and thus can't be instantiated). Any class that inherits from Logger needs to provide its own definition for getOutput() or it will also be considered an abstract base class.

When providing a definition for a pure virtual function, the definition must be provided outside the class definition; a declaration inside the class cannot have both = 0 and a body.

This paradigm can be useful when you want your base class to provide a default implementation for a function, but still force any derived classes to provide their own implementation. However, if the derived class is happy with the default implementation provided by the base class, it can simply call the base class implementation directly. For example:

#include <iostream>
#include <string>
#include <string_view>

class Logger // This Logger is an abstract base class
{
protected:
    std::string m_name{};

public:
    Logger(std::string_view name)
        : m_name{ name }
    {
    }

    const std::string& getName() const { return m_name; }
    virtual std::string_view getOutput() const = 0; // note that getOutput is a pure virtual function

    virtual ~Logger() = default;
};

std::string_view Logger::getOutput() const
{
    return "generic"; // some default implementation
}

class RemoteLogger: public Logger
{

public:
    RemoteLogger(std::string_view name)
        : Logger{name}
    {
    }

    std::string_view getOutput() const override // this class is no longer abstract because we defined this function
    {
        return Logger::getOutput(); // use Logger's default implementation
    }
};

int main()
{
    RemoteLogger rl{"cloud-service"};
    std::cout << rl.getName() << " via " << rl.getOutput() << '\n';

    return 0;
}

The above code prints:

cloud-service via generic

This capability isn't commonly used.

A destructor can be made pure virtual, but must be given a definition so that it can be called when a derived object is destructed.

Interface classes

An interface class is a class that has no member variables, and where all of the functions are pure virtual! Interfaces are useful when you want to define the functionality that derived classes must implement, but leave the details of how the derived class implements that functionality entirely up to the derived class.

Interface classes are often named beginning with an I. Here's a sample interface class:

#include <string_view>

class ISerializable
{
public:
    virtual bool save(std::string_view filename) = 0;
    virtual bool load(std::string_view filename) = 0;

    virtual bool clear() = 0;

    virtual ~ISerializable() = default; // make a virtual destructor in case we delete an ISerializable pointer, so the proper derived destructor is called
};

Any class inheriting from ISerializable must provide implementations for all three functions in order to be instantiated. You could derive a class named JsonSerializer, where save() stores to a JSON file, load() loads from a JSON file, and clear() removes the data. You could derive another class called BinarySerializer, where save() and load() work with binary files, and clear() resets the data.

Now, let's say you need to write some code that uses serialization. If you write your code so it includes JsonSerializer or BinarySerializer directly, then you're effectively stuck using that kind of serialization (at least without recoding your program). For example, the following function effectively forces callers of processData() to use a JsonSerializer, which may or may not be what they want.

#include <cmath>

double processData(double value, JsonSerializer& serializer)
{
    if (value < 0.0)
    {
        serializer.save("Error: negative value");
        return 0.0;
    }

    return std::sqrt(value);
}

A much better way to implement this function is to use ISerializable instead:

#include <cmath>

double processData(double value, ISerializable& serializer)
{
    if (value < 0.0)
    {
        serializer.save("Error: negative value");
        return 0.0;
    }

    return std::sqrt(value);
}

Now the caller can pass in any class that conforms to the ISerializable interface. If they want the data to go to a JSON file, they can pass in an instance of JsonSerializer. If they want it to go to a binary file, they can pass in an instance of BinarySerializer. Or if they want to do something you haven't even thought of, such as sending it to a database when there's an error, they can derive a new class from ISerializable (e.g. DatabaseSerializer) and use an instance of that! By using ISerializable, your function becomes more independent and flexible.

Don't forget to include a virtual destructor for your interface classes, so that the proper derived destructor will be called if a pointer to the interface is deleted.

Interface classes have become extremely popular because they are easy to use, easy to extend, and easy to maintain. In fact, some modern languages, such as Java and C#, have an "interface" keyword that lets programmers define an interface directly, without having to explicitly mark every member function as abstract. Furthermore, although Java and C# do not allow a class to inherit from more than one normal class, they do allow a class to implement as many interfaces as you like. Because interfaces have no data and no function bodies, they avoid many of the traditional problems with multiple inheritance while still providing much of the flexibility.

Pure virtual functions and the virtual table

For consistency, abstract classes still have virtual tables. A constructor or destructor of an abstract class can call a virtual function, and it needs to resolve to the proper function (in the same class, since the derived classes either haven't been constructed yet or have already been destroyed).

The virtual table entry for a pure virtual function will generally either contain a null pointer, or point to a generic function that reports an error (for example, _purecall in Microsoft's runtime, or __cxa_pure_virtual under the Itanium C++ ABI used by GCC and Clang).

Summary

Pure virtual functions: A pure virtual function is a virtual function that is declared with = 0 in place of a body (e.g., virtual int getValue() const = 0;). Pure virtual functions act as placeholders that must be redefined by derived classes. When you add a pure virtual function to a class, you are saying "it is up to the derived classes to implement this function".

Abstract base classes: Any class with one or more pure virtual functions becomes an abstract base class, which means it cannot be instantiated. Abstract base classes are designed to be inherited from, not instantiated directly. Any derived class must provide definitions for all pure virtual functions, or it will also be considered an abstract base class.

Pure virtual functions with definitions: C++ allows you to create pure virtual functions that have definitions. The function is still considered pure virtual (because of the = 0), and the class is still abstract, but derived classes can call the base class implementation if desired. When providing a definition for a pure virtual function, the definition must be provided outside the class definition.

Virtual destructors in abstract classes: Any class with pure virtual functions should also have a virtual destructor. A destructor can be made pure virtual, but must be given a definition so that it can be called when a derived object is destructed.

Interface classes: An interface class is a class that has no member variables and where all functions are pure virtual. Interfaces are useful when you want to define the functionality that derived classes must implement, but leave the implementation details entirely up to the derived class. Interface classes are often named beginning with an I (e.g., ISerializable).

Benefits of interfaces: Interface classes provide flexibility by allowing code to work with any class that implements the interface, without being tied to specific implementations. This makes code more independent, extensible, and maintainable. Modern languages like Java and C# have built-in interface keywords and allow multiple interface inheritance to leverage these benefits.

Pure virtual functions and virtual tables: For consistency, abstract classes still have virtual tables. The virtual table entry for a pure virtual function will generally either contain a null pointer or point to a generic error function (for example, _purecall in Microsoft's runtime, or __cxa_pure_virtual under the Itanium C++ ABI).

Pure virtual functions and abstract base classes are fundamental tools for creating flexible, extensible class hierarchies in C++. They enforce that derived classes provide specific functionality while allowing the base class to define a common interface. Interface classes take this further by providing pure abstraction with no implementation details, enabling highly decoupled and flexible code design.