Aggregation

In the previous lesson, we noted that object composition is the process of creating complex objects from simpler ones. We also talked about one type of object composition, called composition. In a composition relationship, the whole object is responsible for the existence of the part.

In this lesson, we'll take a look at the other subtype of object composition, called aggregation.

To qualify as an aggregation, a whole object and its parts must have the following relationship:

  • The part (member) is part of the object (class)
  • The part (member) can (if desired) belong to more than one object (class) at a time
  • The part (member) does not have its existence managed by the object (class)
  • The part (member) does not know about the existence of the object (class)

Like a composition, an aggregation is still a part-whole relationship, where the parts are contained within the whole, and it's a unidirectional relationship. However, unlike a composition, parts can belong to more than one object at a time, and the whole object isn't responsible for the existence and lifespan of the parts. When an aggregation is created, the aggregation isn't responsible for creating the parts. When an aggregation is destroyed, the aggregation isn't responsible for destroying the parts.

For example, consider the relationship between a library and its books. For simplicity, we'll say every library has books. A book can belong to more than one library at a time (for example, if the book is part of a multi-library network or consortium). The book's existence isn't managed by the library -- the book probably existed before the library acquired it, and will exist after the library removes it from its collection. Additionally, a library knows what books it has, but the books don't know what libraries they're part of. Therefore, this is an aggregation.

Alternatively, consider a computer and its monitor. A monitor is part of the computer. And although the monitor belongs to the computer, it can belong to other things as well, like the person who owns the computer. The computer isn't responsible for the creation or destruction of the monitor. And while the computer knows it has a monitor (it has to in order to display anything), the monitor doesn't know it's part of the computer.

When it comes to modeling physical objects, the use of the term "destroyed" can be a little dicey. One might argue, "If a meteor fell out of the sky and destroyed the computer, wouldn't the monitor be destroyed too?" Yes, of course. But that's the fault of the meteor. The important point is that the computer isn't responsible for destruction of its parts (but an external force might be).

We can say that aggregation models "has-a" relationships (a library has books, the computer has a monitor).

Similar to a composition, the parts of an aggregation can be singular or multiple (a whole can have one part of a given kind, or many).

Implementing aggregations

Because aggregations are similar to compositions in that they are both part-whole relationships, they are implemented almost identically, and the difference between them is mostly semantic. In a composition, we typically add our parts to the composition using normal member variables (or pointers where the allocation and deallocation process is handled by the composition class).

In an aggregation, we also add parts as member variables. However, these member variables are typically either references or pointers that are used to point at objects that have been created outside the scope of the class. Consequently, an aggregation usually either takes the objects it's going to point to as constructor parameters, or it begins empty and the subobjects are added later via access functions or operators.

Because these parts exist outside of the scope of the class, when an object of the class is destroyed, its pointer or reference member variable will be destroyed (but the object it points to or references will not be deleted). Consequently, the parts themselves will still exist.

Let's take a look at an Instructor and Course example in more detail. In this example, we're going to make a couple of simplifications: First, the course will only hold one instructor. Second, the instructor will be unaware of what courses they're part of.

#include <iostream>
#include <string>
#include <string_view>

class Instructor
{
private:
    std::string m_name{};

public:
    Instructor(std::string_view name)
        : m_name{name}
    {
    }

    const std::string& getName() const { return m_name; }
};

class Course
{
private:
    const Instructor& m_instructor; // This course holds only one instructor for simplicity, but it could hold many instructors

public:
    Course(const Instructor& instructor)
        : m_instructor{instructor}
    {
    }
};

int main()
{
    // Create an instructor outside the scope of the Course
    Instructor prof{"Dr. Smith"}; // create an instructor

    {
        // Create a course and use the constructor parameter to pass
        // the instructor to it.
        Course math{prof};

    } // course goes out of scope here and is destroyed

    // prof still exists here, but the course doesn't

    std::cout << prof.getName() << " still exists!\n";

    return 0;
}

In this case, prof is created independently of math, and then passed into math's constructor. When math is destroyed, the m_instructor reference is destroyed, but the instructor itself isn't destroyed, so it still exists until it's independently destroyed later in main().

Pick the right relationship for what you're modeling

Although it might seem a little odd in the above example that the Instructors don't know what Courses they're teaching, that may be totally fine in the context of a given program. When you're determining what kind of relationship to implement, implement the simplest relationship that meets your needs, not the one that seems like it would fit best in a real-life context.

For example, if you're writing an academic management system, you may want to implement a course and instructor as an aggregation, so the instructor can teach multiple courses and courses can have multiple instructors. However, if you're writing a simple course registration system, you may want to implement a course and instructor as a composition, since an instructor will only ever teach one specific course in that context.

Best Practice
Implement the simplest relationship type that meets the needs of your program, not what seems right in real life.

Summarizing composition and aggregation

Compositions:

  • Typically use normal member variables
  • Can use pointer members if the class handles object allocation/deallocation itself
  • Responsible for creation/destruction of parts

Aggregations:

  • Typically use pointer or reference members that point to or reference objects that live outside the scope of the aggregate class
  • Not responsible for creating/destroying parts

It's worth noting that the concepts of composition and aggregation can be mixed freely within the same class. It's entirely possible to write a class that's responsible for the creation/destruction of some parts but not others. For example, our Course class could have a name and an Instructor. The name would probably be added to the Course by composition, and would be created and destroyed with the Course. On the other hand, the Instructor would be added to the course by aggregation, and created/destroyed independently.

While aggregations can be extremely useful, they are also potentially more dangerous, because aggregations don't handle deallocation of their parts. Deallocation is left to an external party. If the parts were dynamically allocated and the external party no longer has a pointer or reference to the abandoned parts, or if it simply forgets to do the cleanup (for example, because it assumes the class will handle that), then memory will be leaked.

For this reason, compositions should be favored over aggregations.

A few warnings/errata

For a variety of historical and contextual reasons, unlike a composition, the definition of an aggregation isn't precise -- so you may see other reference material define it differently from the way we do. That's fine, just be aware.

One final note: In an earlier lesson, we defined aggregate data types (such as structs and classes) as data types that group multiple variables together. You may also run across the term aggregate class in your C++ journeys. The exact rules vary between language standards, but roughly, an aggregate class is a struct or class that has no user-provided constructors, no private or protected non-static data members, and no virtual functions -- essentially a simple struct that just holds data. Despite the similarities in naming, aggregates and aggregation are different and should not be confused.

Summary

Aggregation definition: A subtype of object composition where the part is part of the whole, can belong to multiple objects simultaneously, does not have its existence managed by the whole, and doesn't know about the whole (unidirectional).

Key differences from composition: Unlike in a composition, parts in an aggregation can belong to multiple objects at once, and the whole isn't responsible for creating or destroying the parts or for managing their lifetime.

"Has-a" relationship: Aggregation models "has-a" relationships (a library has books, a computer has a monitor) where the part exists independently of the whole.

Implementation: Typically use pointer or reference members that point to or reference objects created outside the aggregate class's scope. Parts are usually passed via constructor parameters or added later via access functions.

Lifetime independence: When the aggregate is destroyed, member pointers/references are destroyed, but the parts themselves continue to exist because they were created independently.

Context-appropriate modeling: Choose the relationship type based on your program's needs, not real-world context. A Course-Instructor relationship might be aggregation in one system and composition in another.

Composition vs aggregation comparison: Compositions use normal members or class-managed pointers and handle creation/destruction. Aggregations use pointers/references to external objects and don't handle creation/destruction.

Mixing relationships: A class can use both composition and aggregation. Some members might be composed (created and destroyed with the class) while others are aggregated (created externally and referenced).

Safety considerations: Aggregations are more dangerous than compositions because they don't handle deallocation. If the external party loses track of the parts or forgets cleanup, memory leaks occur. Favor compositions over aggregations when possible.

std::reference_wrapper: Acts like a reference but can be copied and assigned, so it can be stored in containers like std::vector, which can't hold references directly. Lives in the <functional> header. Use get() to retrieve the referenced object.

std::reference_wrapper

In the Course/Instructor example above, we used a reference in the Course to store the Instructor. This works fine if there's only one Instructor, but what if a Course has multiple Instructors? We'd like to store those Instructors in a container of some kind (e.g., a std::vector), but fixed arrays and the various standard library containers can't hold references (because container elements must be assignable, and references can't be reassigned).

std::vector<const Instructor&> m_instructors{}; // Illegal

Instead of references, we could use pointers, but that would open up the possibility of storing or passing null pointers. In the Course/Instructor example, we don't want to allow null pointers. To solve this, there's std::reference_wrapper.

Essentially, std::reference_wrapper is a class that acts like a reference, but also allows assignment and copying, so it's compatible with containers like std::vector.

The good news is that you don't really need to understand how it works to use it. All you need to know are three things:

  1. std::reference_wrapper lives in the <functional> header.
  2. When you create your std::reference_wrapper wrapped object, the object can't be an anonymous object (since anonymous objects have expression scope, and this would leave the reference dangling).
  3. When you want to get your object back out of std::reference_wrapper, you use the get() member function.

Here's an example using std::reference_wrapper in a std::vector:

#include <functional> // std::reference_wrapper
#include <iostream>
#include <vector>
#include <string>

int main()
{
    std::string alice{"Alice"};
    std::string bob{"Bob"};

    std::vector<std::reference_wrapper<std::string>> names{alice, bob}; // these strings are stored by reference, not value

    std::string charlie{"Charlie"};

    names.emplace_back(charlie);

    for (auto name : names)
    {
        // Use the get() member function to get the referenced string.
        name.get() += " Jones";
    }

    std::cout << charlie << '\n'; // prints Charlie Jones

    return 0;
}

To create a vector of const references, we'd have to add const before the std::string like so:

// Vector of const references to std::string
std::vector<std::reference_wrapper<const std::string>> names{alice, bob};