The virtual table

Consider the following program:

#include <iostream>
#include <string_view>

class Transport
{
public:
    std::string_view getMode() const { return "Transport"; }                // not virtual
    virtual std::string_view getModeVirtual() const { return "Transport"; } // virtual
};

class Train: public Transport
{
public:
    std::string_view getMode() const { return "Train"; }
    std::string_view getModeVirtual() const override { return "Train"; } // override implies virtual
};

int main()
{
    Train train{};
    Transport& transport{ train };

    std::cout << "transport has static type " << transport.getMode() << '\n';
    std::cout << "transport has dynamic type " << transport.getModeVirtual() << '\n';

    return 0;
}

First, let's look at the call to transport.getMode(). Because this is a non-virtual function, the compiler can use the static type of transport (Transport, the type it was declared with) to determine at compile-time that this should resolve to Transport::getMode().

Although it looks almost identical, the call to transport.getModeVirtual() must be resolved differently. Because this is a virtual function call, the compiler must use the dynamic type of transport to resolve the call, and the dynamic type of transport is not knowable until runtime. Therefore, only at runtime will it be determined that this particular call to transport.getModeVirtual() resolves to Train::getModeVirtual(), not Transport::getModeVirtual().

So how do virtual functions actually work?

The virtual table

The C++ standard does not specify how virtual functions should be implemented (this detail is left up to the implementation).

However, C++ implementations typically implement virtual functions using a form of late binding known as the virtual table.

The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner. The virtual table sometimes goes by other names, such as "vtable", "virtual function table", "virtual method table", or "dispatch table". In C++, virtual function resolution is sometimes called dynamic dispatch.

Advanced note: Here's an easier way of thinking about it in C++:

  • Early binding/static dispatch = direct function call overload resolution
  • Late binding = indirect function call resolution
  • Dynamic dispatch = virtual function override resolution

Because knowing how the virtual table works is not necessary to use virtual functions, this section can be considered optional reading.

The virtual table is actually quite simple, though it's a little complex to describe in words. First, every class that uses virtual functions (or is derived from a class that uses virtual functions) has a corresponding virtual table. This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class.

Second, the compiler also adds a hidden pointer as a member of the most base class that uses virtual functions, which we will call *__vptr. *__vptr is set (automatically) when a class object is created so that it points to the virtual table for that class. Unlike the this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer member. Consequently, it makes each class object bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.

By now, you're probably confused as to how these things all fit together, so let's take a look at a simple example:

class Transport
{
public:
    virtual void move() {}
    virtual void stop() {}
};

class Car: public Transport
{
public:
    void move() override {}
};

class Bicycle: public Transport
{
public:
    void stop() override {}
};

Because there are 3 classes here, the compiler will set up 3 virtual tables: one for Transport, one for Car, and one for Bicycle.

The compiler also adds a hidden pointer member to the most base class that uses virtual functions. Although the compiler does this automatically, we'll put it in the next example just to show where it's added:

class Transport
{
public:
    VirtualTable* __vptr; // added by the compiler; shown here for illustration only
    virtual void move() {}
    virtual void stop() {}
};

class Car: public Transport
{
public:
    void move() override {}
};

class Bicycle: public Transport
{
public:
    void stop() override {}
};

When a class object is created, *__vptr is set to point to the virtual table for that class. For example, when an object of type Transport is created, *__vptr is set to point to the virtual table for Transport. When objects of type Car or Bicycle are constructed, *__vptr is set to point to the virtual table for Car or Bicycle respectively.

Now, let's talk about how these virtual tables are filled out. Because there are only two virtual functions here, each virtual table will have two entries (one for move() and one for stop()). Remember that when these virtual tables are filled out, each entry is filled out with the most-derived function an object of that class type can call.

The virtual table for Transport objects is simple. An object of type Transport can only access the members of Transport. Transport has no access to Car or Bicycle functions. Consequently, the entry for move() points to Transport::move() and the entry for stop() points to Transport::stop().

The virtual table for Car is slightly more complex. An object of type Car can access members of both Car and Transport. However, Car has overridden move(), making Car::move() more derived than Transport::move(). Consequently, the entry for move() points to Car::move(). Car hasn't overridden stop(), so the entry for stop() will point to Transport::stop().

The virtual table for Bicycle is similar to Car, except the entry for move() points to Transport::move(), and the entry for stop() points to Bicycle::stop().

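Here's a picture of this graphically:

  Transport: *__vptr --> Transport virtual table
      move() --> Transport::move()
      stop() --> Transport::stop()

  Car: *__vptr --> Car virtual table
      move() --> Car::move()
      stop() --> Transport::stop()

  Bicycle: *__vptr --> Bicycle virtual table
      move() --> Transport::move()
      stop() --> Bicycle::stop()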

Although this diagram is kind of complex looking, it's really quite simple: the *__vptr in each class points to the virtual table for that class. The entries in the virtual table point to the most-derived version of the function that objects of that class are allowed to call.

So consider what happens when we create an object of type Car:

int main()
{
    Car car{};
}

Because car is a Car object, car has its *__vptr set to the Car virtual table.

Now, let's set a base pointer to Car:

int main()
{
    Car car{};
    Transport* tPtr = &car;

    return 0;
}

Note that because tPtr is a base pointer, it only points to the Transport portion of car. However, also note that *__vptr is in the Transport portion of the class, so tPtr has access to this pointer. Finally, note that tPtr->__vptr points to the Car virtual table! Consequently, even though tPtr is of type Transport*, it still has access to Car's virtual table (through __vptr).

So what happens when we try to call tPtr->move()?

int main()
{
    Car car{};
    Transport* tPtr = &car;
    tPtr->move();

    return 0;
}

First, the compiler recognizes that move() is a virtual function, so it generates code that looks the function up at runtime instead of calling Transport::move() directly. Second, when that code runs, the program uses tPtr->__vptr to get to Car's virtual table. Third, it looks up which version of move() to call in Car's virtual table. This has been set to Car::move(). Therefore, tPtr->move() resolves to Car::move()!

Now, you might be saying, "But what if tPtr really pointed to a Transport object instead of a Car object? Would it still call Car::move()?" The answer is no.

int main()
{
    Transport t{};
    Transport* tPtr = &t;
    tPtr->move();

    return 0;
}

In this case, when t is created, t.__vptr points to Transport's virtual table, not Car's virtual table. Since tPtr is pointing to t, tPtr->__vptr points to Transport's virtual table as well. Transport's virtual table entry for move() points to Transport::move(). Thus, tPtr->move() resolves to Transport::move(), which is the most-derived version of move() that a Transport object should be able to call.

By using these tables, the compiler and program are able to ensure function calls resolve to the appropriate virtual function, even if you're only using a pointer or reference to a base class!

Calling a virtual function is slower than calling a non-virtual function for a couple of reasons. First, we have to use the *__vptr to get to the appropriate virtual table. Second, we have to index the virtual table to find the correct function to call. Only then can we call the function. As a result, we have to do 3 operations to call the function, as opposed to 2 operations for a normal indirect function call, or 1 operation for a direct function call. However, with modern computers, this added time is usually fairly insignificant.

Also as a reminder, any class that uses virtual functions has a *__vptr, and thus each object of that class will be bigger by one pointer. Virtual functions are powerful, but they do have a performance cost.

Summary

Virtual table (vtable): The virtual table is a lookup table of function pointers used to resolve function calls in a dynamic/late binding manner. Every class that uses virtual functions (or is derived from a class that uses virtual functions) has a corresponding virtual table. The virtual table is a static array that the compiler sets up at compile time, containing one entry for each virtual function that can be called by objects of the class.

Virtual table pointer (__vptr): The compiler adds a hidden pointer member, called *__vptr, to the most base class that uses virtual functions. It is set automatically when a class object is created, so that it points to the virtual table for that object's class. Unlike the this pointer, *__vptr is a real pointer member that makes each class object bigger by the size of one pointer. It is inherited by derived classes.

Virtual table entries: Each entry in the virtual table is a function pointer that points to the most-derived function accessible by that class. For example, Car's entry for move() points to Car::move(), while its entry for stop() points to Transport::stop(), because Car does not override stop().

Virtual function call process: When a virtual function is called, the program recognizes it's a virtual function, uses the object's *__vptr to get to the correct virtual table, looks up which version of the function to call in that table, and then calls the function. This ensures the correct function is called based on the actual type of the object, not the type of the pointer or reference.

Performance implications: Calling a virtual function is slightly slower than calling a non-virtual function because it requires three operations (accessing *__vptr, indexing the virtual table, and calling the function) instead of one or two for non-virtual calls. Additionally, any class that uses virtual functions has a *__vptr, making each object of that class larger by one pointer. However, with modern computers, this overhead is usually insignificant compared to the benefits of polymorphism.

Understanding the virtual table mechanism provides insight into how C++ implements dynamic polymorphism through virtual functions. While you don't need to know these implementation details to use virtual functions effectively, this knowledge helps explain why virtual functions behave the way they do.