Advanced 15 min

Thread Pools

Master efficient concurrent task management using thread pools to avoid thread creation overhead

This lesson builds a small thread pool that runs many tasks on a fixed set of worker threads, reusing those threads instead of creating and destroying one per task.

A Simple Example

#include <chrono>              // std::chrono::milliseconds used in main
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
    std::vector<std::thread> workers;
    std::queue<std::function<void()>> tasks;
    std::mutex queueMutex;
    std::condition_variable cv;
    bool stop{false};

public:
    ThreadPool(size_t numThreads) {
        for (size_t i{0}; i < numThreads; ++i) {
            workers.emplace_back([this] {
                while (true) {
                    std::function<void()> task;

                    {
                        std::unique_lock<std::mutex> lock{queueMutex};
                        cv.wait(lock, [this] { return stop || !tasks.empty(); });

                        if (stop && tasks.empty()) {
                            return;
                        }

                        task = std::move(tasks.front());
                        tasks.pop();
                    }

                    task();
                }
            });
        }
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock{queueMutex};
            stop = true;
        }
        cv.notify_all();

        for (auto& worker : workers) {
            worker.join();
        }
    }

    void enqueue(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock{queueMutex};
            tasks.push(std::move(task));
        }
        cv.notify_one();
    }
};

int main() {
    ThreadPool pool{4};  // 4 worker threads

    // Submit 10 tasks
    for (int i{0}; i < 10; ++i) {
        pool.enqueue([i] {
            std::cout << "Task " << i << " running on thread "
                      << std::this_thread::get_id() << "\n";
            std::this_thread::sleep_for(std::chrono::milliseconds(500));
            std::cout << "Task " << i << " completed\n";
        });
    }

    // No sleep is needed: when main returns, the pool's destructor lets the
    // workers finish every queued task and then joins them.

    return 0;
}

Breaking It Down

Thread pool architecture

  • Worker threads: Fixed number of threads that wait for tasks
  • Task queue: Shared queue where tasks are submitted
  • Synchronization: Mutex and condition variable coordinate access
  • Remember: Workers pull tasks from queue and execute them

Why thread pools are efficient

  • Reuse: Threads are created once and reused for many tasks
  • Control: Limit concurrent threads to match CPU cores
  • Lower overhead: Avoid repeated thread creation and destruction costs
  • Remember: Creating threads is expensive, reusing is cheap

Thread pool lifecycle

  • Construction: Create worker threads that wait for tasks
  • Enqueue: Add tasks to queue, notify workers
  • Execution: Workers pick up and execute tasks
  • Destruction: Set stop flag, notify all, join workers

Choosing pool size

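  • CPU-bound: Start from std::thread::hardware_concurrency(), which is only a hint and may return 0
  • I/O-bound: Can use more threads than cores, since they spend much of their time blocked on I/O
  • Balance: Too few threads leave cores idle, too many cause excessive context switching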
  • Remember: Match pool size to workload characteristics
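
The sizing guidance above could be sketched roughly as follows. The helper name choosePoolSize, the ioMultiplier parameter, and the fallback of 2 are all assumptions made for illustration, not recommended constants; real values should come from measuring your workload.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: pick a worker count for a pool.
// ioMultiplier is an illustrative tuning knob, not a standard value.
std::size_t choosePoolSize(bool ioBound, std::size_t ioMultiplier = 2) {
    // hardware_concurrency() is only a hint and may return 0.
    std::size_t cores{std::thread::hardware_concurrency()};
    if (cores == 0) {
        cores = 2;  // assumed fallback when the hint is unavailable
    }
    // CPU-bound: one thread per core. I/O-bound: allow some multiple of that.
    return ioBound ? cores * std::max<std::size_t>(ioMultiplier, 1) : cores;
}
```

A pool for CPU-bound work could then be created with ThreadPool pool{choosePoolSize(false)};.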

Why This Matters

  • Creating threads is expensive. Each thread creation involves system calls, memory allocation, and context setup.
  • Thread pools reuse worker threads, processing many tasks without the overhead of repeated creation and destruction.
  • They prevent over-subscription by capping the number of concurrent threads, typically near the number of CPU cores for CPU-bound work.

Critical Insight

A thread pool is like a restaurant kitchen with a fixed number of chefs. Instead of hiring and firing chefs for each order (creating/destroying threads), you have a team that works through all orders in the queue.

The queue is the ticket system. Chefs pull tickets and cook orders. You never have more chefs than you need, and they are always ready to work. This is far more efficient than hiring a new chef for every single order.

Best Practices

Size appropriately: Use std::thread::hardware_concurrency() as a starting point for CPU-bound tasks, and fall back to a fixed size if it returns 0. I/O-bound tasks can have more threads.

Always join in destructor: Set stop flag, notify all workers, then join all threads before destroying the pool.

Use condition variables: Workers should wait on a condition variable instead of busy-waiting for tasks.

Handle exceptions: Wrap task execution in try-catch. An exception that escapes a worker thread's function calls std::terminate and ends the whole program.

Consider std::packaged_task: Use packaged_task to get return values and exceptions from tasks.
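
One way the std::packaged_task suggestion might look is sketched below. FuturePool and submit are hypothetical names; the worker loop mirrors the lesson's ThreadPool, and only the submission path differs. Treat it as a minimal sketch rather than a complete design.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical variant of the lesson's ThreadPool whose submit() returns a future.
class FuturePool {
    std::vector<std::thread> workers;
    std::queue<std::function<void()>> tasks;
    std::mutex queueMutex;
    std::condition_variable cv;
    bool stop{false};

public:
    explicit FuturePool(std::size_t numThreads) {
        for (std::size_t i{0}; i < numThreads; ++i) {
            workers.emplace_back([this] {
                while (true) {
                    std::function<void()> task;
                    {
                        std::unique_lock<std::mutex> lock{queueMutex};
                        cv.wait(lock, [this] { return stop || !tasks.empty(); });
                        if (stop && tasks.empty()) {
                            return;
                        }
                        task = std::move(tasks.front());
                        tasks.pop();
                    }
                    task();  // packaged_task stores any exception in the future
                }
            });
        }
    }

    ~FuturePool() {
        {
            std::lock_guard<std::mutex> lock{queueMutex};
            stop = true;
        }
        cv.notify_all();
        for (auto& worker : workers) {
            worker.join();
        }
    }

    // Queue a callable that takes no arguments; the future yields its result.
    template <typename F>
    auto submit(F&& f) -> std::future<decltype(f())> {
        using Result = decltype(f());
        // std::function requires copyable callables, so the move-only
        // packaged_task is held through a shared_ptr.
        auto job = std::make_shared<std::packaged_task<Result()>>(std::forward<F>(f));
        std::future<Result> result{job->get_future()};
        {
            std::lock_guard<std::mutex> lock{queueMutex};
            tasks.push([job] { (*job)(); });
        }
        cv.notify_one();
        return result;
    }
};
```

Calling get() on the returned future blocks until the task has run, then either yields the return value or rethrows the exception the task threw, so the caller handles the failure instead of the worker.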

Common Mistakes

Forgetting to join threads: Destroying a std::thread that is still joinable calls std::terminate, so a destructor that skips join() aborts the program.

Over-subscribing: Too many threads causes context switching overhead and reduced performance.

Not handling exceptions: An exception that escapes a worker thread's function calls std::terminate, which ends the whole program, not just that worker.

Deadlock in destructor: Set the stop flag and call notify_all() before joining. Otherwise idle workers stay blocked in wait() and join() never returns.
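
The exception-handling advice above could be sketched as a small wrapper that a worker calls in place of a bare task(). The name runTaskSafely and the choice to log to std::cerr are assumptions for illustration; a real pool might record the failure or forward it to the caller instead.

```cpp
#include <exception>
#include <functional>
#include <iostream>

// Hypothetical helper: a worker would call this instead of invoking task() directly.
// Returns true if the task finished normally, false if it threw.
bool runTaskSafely(const std::function<void()>& task) {
    try {
        task();
        return true;
    } catch (const std::exception& e) {
        // Covers standard exceptions, including std::bad_function_call
        // when the std::function is empty.
        std::cerr << "Task failed: " << e.what() << '\n';
    } catch (...) {
        std::cerr << "Task failed with an unknown exception\n";
    }
    return false;
}
```

With this in place, a throwing task is reported and the worker moves on to the next task instead of bringing down the program.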

Debug Challenge

This thread pool forgets to join worker threads in the destructor. Find the missing step:

class ThreadPool {
    std::vector<std::thread> workers;
    // ... other members

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock{queueMutex};
            stop = true;
        }
        cv.notify_all();
        // Missing something here!
    }
};

Quick Quiz

  1. Why are thread pools more efficient than creating threads for each task?
     a) Thread creation is expensive, reusing threads avoids this overhead
     b) Thread pools use less memory
     c) Thread pools run faster
  2. How should you size a thread pool for CPU-bound tasks?
     a) Use as many threads as possible
     b) Match the number of CPU cores using hardware_concurrency()
     c) Always use exactly 4 threads
  3. What must you do in a thread pool destructor?
     a) Just set a stop flag
     b) Detach all threads
     c) Set stop flag, notify workers, and join all threads
