End-to-End Testing with Playwright: Building Confidence in Complex Features

Why We Need End-to-End Testing

When you're building a learning platform where students execute C++ code in their browser, complete interactive exercises, and track their progress across hundreds of lessons, you can't afford to ship bugs. Unit tests verify individual components work correctly, but they don't tell you if the entire system works together as users expect.

That's where end-to-end (E2E) testing comes in.

At HelloC++, we use Playwright to test complete user workflows in real browsers. This article shares how we've built a comprehensive E2E testing strategy that gives us confidence to ship features quickly while maintaining quality.

What is Playwright?

Playwright is a modern browser automation framework from Microsoft that allows you to write tests that control real browsers (Chromium, Firefox, and WebKit) programmatically. Unlike older tools like Selenium, Playwright is:

Fast: Tests run quickly with parallel execution
Reliable: Auto-waiting and retry mechanisms reduce flaky tests
Cross-browser: One codebase runs against Chromium, Firefox, and WebKit (the engine behind Safari)
Developer-friendly: Excellent debugging tools and trace viewer

Our Testing Philosophy

Before diving into implementation details, here's our testing philosophy:

Test What Users Do, Not Implementation Details

We focus on user workflows, not internal implementation:

import { test, expect } from '@playwright/test';
import { loginAsUniqueUser } from '../utils/backend-helpers.js';

// Good - tests user behavior
test('user completes first lesson', async ({ page }, testInfo) => {
    await loginAsUniqueUser(page, testInfo);
    await page.goto('/courses');
    await page.getByTestId('enroll-course-button').first().click();
    await page.getByTestId('start-lesson-link').first().click();
    await page.getByTestId('mark-complete-button').click();
    await expect(page.getByTestId('lesson-completed-message')).toBeVisible();
});

// Bad - tests implementation details
test('calls lesson completion API', async ({ page }, testInfo) => {
    await loginAsUniqueUser(page, testInfo);
    await page.goto('/courses');
    // ...
    await page.route('**/api/lesson/*/complete', route => {
        // Testing internals, not user experience
    });
});

Prioritize Critical User Journeys

We test the workflows that matter most:

Registration and authentication - Users must be able to sign up and log in
Lesson completion - Core learning flow must work perfectly
Code execution - C++ code compilation and execution is critical
Progress tracking - Users' achievements and progress must be accurate
Guided projects - Complex multi-step projects require thorough testing

Accept Some Flakiness, But Fix It Quickly

E2E tests are inherently flakier than unit tests. We accept this reality, but we:

Run tests multiple times in CI to catch flakiness
Fix flaky tests immediately when discovered
Use Playwright's built-in retry mechanisms
Maintain a "last failed" test suite for quick debugging
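
The retry and re-run habits above map onto a few Playwright settings and CLI flags. A minimal sketch, assuming retries are wanted only in CI; the retry count is an illustrative value, not our actual configuration:

```javascript
// Sketch of flakiness-related settings; the retry count is an illustrative assumption.
// In a real playwright.config.js this object would be part of the exported config.
const isCI = Boolean(process.env.CI);

const flakinessConfig = {
    // Retry failed tests in CI only, so local failures surface immediately.
    retries: isCI ? 2 : 0,
    use: {
        // Keep the recorded trace only for tests that fail.
        trace: 'retain-on-failure',
    },
};

// Related CLI flags:
//   npx playwright test --repeat-each=5   (run every test five times to expose flakiness)
//   npx playwright test --last-failed     (re-run only the tests that failed last time)
```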

Test Structure and Organization

Our Playwright tests are organized by feature in tests/playwright/:

tests/playwright/
├── auth/
│   ├── registration-validation.spec.js
│   ├── password-reset.spec.js
│   └── auth.spec.js
├── courses/
│   ├── course-enrollment.spec.js
│   ├── course-completion.spec.js
│   └── courses.spec.js
├── lessons/
│   ├── lessons.spec.js
│   ├── lesson-access-control.spec.js
│   └── lesson-comments.spec.js
├── exercises/
│   ├── exercise.spec.js
│   └── multi-file-exercise.spec.js
├── guided-projects/
│   ├── guided-project-progress.spec.js
│   ├── guided-project-compilation.spec.js
│   └── guided-project-validation.spec.js
├── quiz/
│   ├── quiz.spec.js
│   └── quiz-achievements.spec.js
└── utils/
    ├── backend-helpers.js
    └── navigation-helpers.js

Each feature has its own directory with focused test files.

Playwright Configuration

Our playwright.config.js sets up multiple test projects for different scenarios:

Local Development Tests

{
    name: "local-chromium",
    use: {
        ...devices["Desktop Chrome"],
        baseURL: "http://localhost:8001"
    },
    testIgnore: [/.*-auth\.spec\.js$/]
}

These tests run against a local development server with a fresh test database.

Authenticated Tests

{
    name: "local-chromium-auth",
    use: {
        ...devices["Desktop Chrome"],
        storageState: "tests/playwright/.auth/user.json"
    },
    dependencies: ["setup-auth"],
    testMatch: /.*-auth\.spec\.js$/
}

Tests requiring authentication use a separate project with stored authentication state.

Production Tests (Safety First)

...(productionURL ? [
    {
        name: "live-chromium",
        use: {
            ...devices["Desktop Chrome"],
            baseURL: productionURL
        }
    }
] : [])

Production tests only run when PRODUCTION_URL is explicitly set, preventing accidental testing against the live site. This is critical for safety.
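
A sketch of how that guard might be wired up, assuming productionURL is read from process.env.PRODUCTION_URL. The buildProjects helper and its parameters are hypothetical names used for illustration:

```javascript
// Hypothetical helper: builds the project list, adding the live project
// only when a production URL is supplied.
function buildProjects(env, desktopChrome) {
    const productionURL = env.PRODUCTION_URL; // undefined unless set explicitly

    return [
        {
            name: 'local-chromium',
            use: { ...desktopChrome, baseURL: 'http://localhost:8001' },
        },
        // Spreading an empty array adds nothing, so the live project is absent by default.
        ...(productionURL
            ? [{ name: 'live-chromium', use: { ...desktopChrome, baseURL: productionURL } }]
            : []),
    ];
}

// In playwright.config.js this could be called as:
//   projects: buildProjects(process.env, devices['Desktop Chrome'])
const withoutProd = buildProjects({}, {});
const withProd = buildProjects({ PRODUCTION_URL: 'https://example.com' }, {});
console.log(withoutProd.map(p => p.name)); // [ 'local-chromium' ]
console.log(withProd.map(p => p.name));    // [ 'local-chromium', 'live-chromium' ]
```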

Cross-Browser Testing

{
    name: "local-firefox",
    use: { ...devices["Desktop Firefox"] }
},
{
    name: "local-webkit",
    use: { ...devices["Desktop Safari"] }
}

We test across all major browser engines to ensure compatibility.

Backend-Playwright Integration

One of our key innovations is integrating Playwright with our backend. We created custom helpers that allow tests to interact with the backend application directly:

Creating Users

import { loginAsUniqueUser } from '../utils/backend-helpers.js';

test('user completes lesson', async ({ page }, testInfo) => {
    // Creates unique user and logs them in via backend API
    const user = await loginAsUniqueUser(page, testInfo);

    // Now test as authenticated user
    await page.goto('/dashboard');
    await expect(page.getByText(user.name)).toBeVisible();
});

Using Factories

import { create } from '../utils/backend-helpers.js';

test('displays user achievements', async ({ page }) => {
    // Create test data using backend factories
    const achievement = await create({
        page,
        model: 'Achievement',
    });

    await page.goto('/achievements');
    await expect(page.getByText('Test Achievement')).toBeVisible();
});

Seeding Related Data

For complex test scenarios, create related records in a single call:

import { loginAsUniqueUser, seed } from '../utils/backend-helpers.js';

test('shows lesson progress with completed exercises', async ({ page }, testInfo) => {
    const user = await loginAsUniqueUser(page, testInfo);

    // Seed a lesson with exercises and mark some as complete
    await seed({
        page,
        scenario: 'lesson-with-progress',
        data: {
            userId: user.id,
            lessonSlug: 'variables-and-types',
            completedExercises: 3,
            totalExercises: 5
        }
    });

    await page.goto('/lessons/variables-and-types');
    await expect(page.getByTestId('progress-bar')).toContainText('3/5');
});

Resetting State

Some tests need to reset specific state without a full database refresh:

import { resetState, loginAsUniqueUser, seed } from '../utils/backend-helpers.js';

test('user can restart course progress', async ({ page }, testInfo) => {
    const user = await loginAsUniqueUser(page, testInfo);

    // Complete some lessons first
    await seed({ page, scenario: 'completed-course', data: { userId: user.id } });

    await page.goto('/courses/cpp-fundamentals');
    await expect(page.getByTestId('course-progress')).toContainText('100%');

    // Reset just this user's course progress
    await resetState({
        page,
        type: 'course-progress',
        userId: user.id
    });

    await page.reload();
    await expect(page.getByTestId('course-progress')).toContainText('0%');
});

Checking Sent Notifications

Verify that actions trigger the expected notifications:

import { loginAsUniqueUser, getNotifications, seed } from '../utils/backend-helpers.js';

test('completing course sends achievement notification', async ({ page }, testInfo) => {
    const user = await loginAsUniqueUser(page, testInfo);

    // Setup: user has completed all but last lesson
    await seed({ page, scenario: 'almost-complete-course', data: { userId: user.id } });

    await page.goto('/lessons/final-lesson');
    await page.getByTestId('mark-complete-button').click();
    await expect(page.getByTestId('course-completed-modal')).toBeVisible();

    // Verify notification was created
    const notifications = await getNotifications({ page, userId: user.id });
    expect(notifications).toContainEqual(
        expect.objectContaining({
            type: 'achievement-unlocked',
            data: expect.objectContaining({ achievement: 'course-complete' })
        })
    );
});

Time Manipulation

Test time-sensitive features by controlling the server's perception of time:

import { loginAsUniqueUser, setTestTime, resetTestTime } from '../utils/backend-helpers.js';

test('daily streak resets after 24 hours', async ({ page }, testInfo) => {
    const user = await loginAsUniqueUser(page, testInfo);

    try {
        // Complete a lesson to start streak
        await page.goto('/lessons/hello-world');
        await page.getByTestId('mark-complete-button').click();
        await expect(page.getByTestId('streak-counter')).toContainText('1 day');

        // Fast forward 25 hours
        await setTestTime({ page, offset: { hours: 25 } });

        await page.goto('/dashboard');
        await expect(page.getByTestId('streak-counter')).toContainText('0 days');
        await expect(page.getByTestId('streak-lost-message')).toBeVisible();
    } finally {
        // Always reset time, even if an assertion above fails
        await resetTestTime({ page });
    }
});

These helpers communicate with protected API endpoints that only exist in the test environment, keeping your production code clean and secure.
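
To make the pattern concrete, here is a minimal sketch of what a factory helper of this kind could look like. The '/__playwright__/factory' endpoint and its payload shape are assumptions for illustration, not our actual backend API:

```javascript
// Hypothetical sketch of a factory helper. The '/__playwright__/factory'
// endpoint and its payload shape are assumptions, not the real backend API.
async function create({ page, model, attributes = {} }) {
    // page.request shares cookies with the browser context, so the call
    // runs with the same session as the page under test.
    const response = await page.request.post('/__playwright__/factory', {
        headers: { Accept: 'application/json' },
        data: { model, attributes },
    });

    // Fail loudly so a broken helper is not mistaken for a failing feature.
    if (!response.ok()) {
        throw new Error(`Factory request for ${model} failed with status ${response.status()}`);
    }

    return response.json();
}
```

Checking the response status inside the helper keeps setup failures distinct from genuine test failures, which makes traces easier to read.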

Testing Complex Features

Let's look at how we test some of HelloC++'s most complex features.

Code Execution Testing

Testing code execution requires verifying the entire pipeline: the user types code and submits it, then Docker compiles it, executes it, and returns the output.

test('executes C++ code successfully', async ({ page }, testInfo) => {
    await loginAsUniqueUser(page, testInfo);
    await page.goto('/exercises/hello-world');

    // Type code in Monaco editor
    const editor = page.getByTestId('code-editor');
    await editor.click();
    await page.keyboard.insertText(`#include <iostream>
int main() {
    std::cout << "Hello, World!\\n";
    return 0;
}`);

    // Run code
    await page.getByTestId('run-code-button').click();

    // Verify execution
    await expect(page.getByTestId('execution-status')).toBeVisible();
    await expect(page.getByTestId('code-output')).toContainText('Hello, World!', { timeout: 30000 });
    await expect(page.getByTestId('execution-success')).toBeVisible();
});

Multi-File Exercise Testing

Testing exercises with multiple files requires interacting with file tabs and editors:

test('compiles multi-file exercise', async ({ page }, testInfo) => {
    await loginAsUniqueUser(page, testInfo);
    await page.goto('/exercises/classes-intro');

    // Switch to header file
    await page.getByTestId('file-tab-person.h').click();
    await expect(page.getByTestId('code-editor')).toContainText('class Person');

    // Switch to implementation file
    await page.getByTestId('file-tab-person.cpp').click();
    await expect(page.getByTestId('code-editor')).toContainText('Person::Person');

    // Compile and run
    await page.getByTestId('compile-run-button').click();

    // Verify all files compiled together
    await expect(page.getByTestId('compilation-success')).toBeVisible();
});

Guided Project Testing

Guided projects are our most complex feature - multi-step tutorials with validation, progress tracking, and file management:

import { setupGuidedProjectTest, waitForWorkspaceReady } from '../utils/backend-helpers.js';

test('completes guided project step', async ({ page }, testInfo) => {
    // Setup helper logs in user and initializes project
    await setupGuidedProjectTest(page, testInfo, {
        project_slug: 'opengl-blackhole-2d',
        mode: 'guided'
    });

    await page.goto('/projects/opengl-blackhole-2d/step/1');
    await waitForWorkspaceReady(page);

    // Complete validation requirements
    await page.getByTestId('guided-section-1').click();
    const editor = page.getByTestId('code-editor');
    await editor.click();

    // Make required changes
    await page.keyboard.insertText('glfwInit();');

    // Validate section
    await page.getByTestId('validate-section-button').click();
    await expect(page.getByTestId('validation-passed')).toBeVisible();

    // Mark step complete
    await page.getByTestId('complete-step-button').click();
    await expect(page.getByTestId('step-completed-message')).toBeVisible();
});

Test Isolation and Data Management

One of the biggest challenges in E2E testing is test isolation - ensuring tests don't interfere with each other.

Fresh Database Per Test Run

Our webServer configuration in playwright.config.js starts each test run with a fresh database:

// playwright.config.js
webServer: {
    command: `npm run test:server`,
    url: "http://localhost:8001",
    reuseExistingServer: !process.env.CI
}

The test:server script handles database setup and server startup:

// package.json
{
    "scripts": {
        "test:server": "npm run db:reset && npm run server:test",
        "db:reset": "DB_DATABASE=test_db npm run db:refresh -- --with-seed",
        "server:test": "DB_DATABASE=test_db npm run serve -- --port=8001"
    }
}

This approach:

Drops and recreates all tables (db:refresh) for a clean slate
Seeds essential test data (courses, lessons, achievements)
Uses a separate test database to avoid touching development data
Starts the server on a dedicated test port

In CI, we always use a fresh database. Locally, we can reuse the server for faster iteration with reuseExistingServer.

Unique Users Per Test

Instead of using shared test users, each test creates its own unique user:

export async function loginAsUniqueUser(page, testInfo) {
    const testId = Date.now().toString() + Math.random().toString(36).substring(2, 7);

    const response = await page.request.post('/__playwright__/login', {
        headers: { Accept: 'application/json' },
        data: {
            attributes: {
                name: `Test User ${testId}`,
                email: `test_${testId}@example.com`
            },
            state: ['betaTester']
        }
    });

    return await response.json();
}

This eliminates race conditions and test interdependencies.
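
One possible refinement, sketched here as an assumption rather than what our helper does: fold Playwright's worker index and retry count from testInfo into the ID, so a leftover user can be traced back to the run that created it. The uniqueTestId name is hypothetical:

```javascript
// Hypothetical variant of the ID generation: includes the worker index and
// retry count so a user record can be traced back to its test run.
function uniqueTestId(testInfo) {
    const random = Math.random().toString(36).substring(2, 7);
    return `w${testInfo.workerIndex}-r${testInfo.retry}-${Date.now()}-${random}`;
}

// Example with a stand-in for the testInfo object Playwright passes to each test.
const exampleId = uniqueTestId({ workerIndex: 3, retry: 0 });
console.log(`test_${exampleId}@example.com`);
```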

Test Data Cleanup

While we use a fresh database, we still clean up within tests when needed:

test('deletes user account', async ({ page }, testInfo) => {
    const user = await loginAsUniqueUser(page, testInfo);

    await page.goto('/profile/settings');
    await page.getByTestId('delete-account-button').click();
    await page.getByTestId('confirm-deletion-button').click();

    // Verify user logged out and redirected
    await expect(page).toHaveURL('/');

    // Verify can't access authenticated routes
    await page.goto('/dashboard');
    await expect(page).toHaveURL('/login');
});

Debugging Failing Tests

Playwright provides excellent debugging tools that we use extensively.

Trace Viewer

With trace: "retain-on-failure" in our config, Playwright records a trace for every test and keeps it only when the test fails. The trace includes:

Screenshots at each step
Network requests and responses
Console logs
DOM snapshots
Timeline of actions

# View trace for failed test
npx playwright show-trace test-results/lessons-completes-first-lesson/trace.zip

Screenshots on Failure

Failed tests automatically capture screenshots:

use: {
    screenshot: "only-on-failure"
}

Debug Mode

For interactive debugging:

npm run test:ui:debug

This opens the Playwright Inspector, allowing you to step through the test, inspect the page, and run commands interactively.

Headed Mode

Watch tests run in a real browser:

npm run test:ui:headed

Useful for understanding what's happening visually.

Best Practices We've Learned

1. Use Test IDs for Critical Elements

Prefer data-testid attributes over text selectors:

// Good - stable selector
await page.getByTestId('submit-exercise-button').click();

// Bad - breaks when text changes
await page.getByText('Submit Exercise').click();

In our HTML templates:

<button data-testid="submit-exercise-button">
    Submit Exercise
</button>

2. Wait for Network Idle

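For pages with async data loading, we sometimes wait for the network to settle. Playwright's documentation discourages relying on networkidle in tests, so treat it as a fallback and always follow it with an assertion on the element you actually need: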

await page.goto('/dashboard');
await page.waitForLoadState('networkidle');
await expect(page.getByTestId('user-stats')).toBeVisible();
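
An alternative to networkidle is to wait for the specific element the test depends on. A sketch, with gotoAndWaitFor as a hypothetical helper name and the 10-second default timeout as an assumed value:

```javascript
// Hypothetical helper: navigate, then wait for one element that signals the
// page's data has loaded, instead of waiting for all network traffic to stop.
async function gotoAndWaitFor(page, url, testId, timeout = 10000) {
    await page.goto(url);
    const readyMarker = page.getByTestId(testId);
    await readyMarker.waitFor({ state: 'visible', timeout });
    return readyMarker;
}

// Usage inside a test:
//   const stats = await gotoAndWaitFor(page, '/dashboard', 'user-stats');
//   await expect(stats).toBeVisible();
```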

3. Avoid Fixed Timeouts

Let Playwright's auto-waiting work:

// Good - waits until the element is visible
await expect(page.getByTestId('success-message')).toBeVisible();

// Bad - arbitrary timeout
await page.waitForTimeout(2000);
await expect(page.getByTestId('success-message')).toBeVisible();
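
Conceptually, an auto-waiting assertion polls a condition until it holds or a deadline passes, so it returns as soon as the page is ready rather than after a fixed delay. The following is a simplified illustration of that idea in plain JavaScript, not Playwright's actual implementation; pollUntil is a hypothetical name:

```javascript
// Simplified illustration of polling with a deadline.
// This is NOT how Playwright is implemented; it only shows the idea.
async function pollUntil(check, { timeout = 5000, interval = 100 } = {}) {
    const deadline = Date.now() + timeout;
    while (true) {
        // Return as soon as the condition holds, with no fixed delay.
        if (await check()) return;
        if (Date.now() >= deadline) {
            throw new Error(`Condition not met within ${timeout}ms`);
        }
        await new Promise(resolve => setTimeout(resolve, interval));
    }
}
```

A fixed waitForTimeout(2000) always costs two seconds and still fails if the page needs three. Polling costs only as long as the page actually takes. Playwright exposes the same idea for arbitrary conditions through expect.poll and expect(...).toPass().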

4. Test Error States Too

Don't just test happy paths:

test('shows error for invalid code', async ({ page }, testInfo) => {
    await loginAsUniqueUser(page, testInfo);
    await page.goto('/exercises/hello-world');

    // Submit invalid code
    await page.getByTestId('code-editor').click();
    await page.keyboard.insertText('invalid C++ code!!!');
    await page.getByTestId('run-code-button').click();

    // Verify error handling
    await expect(page.getByTestId('compilation-error')).toBeVisible();
    await expect(page.getByTestId('error-message')).toContainText('expected');
});

5. Group Related Tests

Use test.describe for organization:

test.describe('Lesson Completion Flow', () => {
    test('completes lesson with exercises', async ({ page }) => {
        // Test implementation
    });

    test('completes lesson with quiz', async ({ page }) => {
        // Test implementation
    });

    test('unlocks achievement on completion', async ({ page }) => {
        // Test implementation
    });
});

Running Tests in CI/CD

Our GitHub Actions workflow runs Playwright tests on every push and pull request:

name: Playwright Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright Browsers
        run: npx playwright install --with-deps

      - name: Run Playwright tests
        run: npm run test:ui:all

      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

Tests must pass before PRs can merge.

Measuring Test Effectiveness

We track several metrics:

Test coverage: What percentage of critical workflows are tested?
Test reliability: How often do tests pass on first run?
Test speed: How quickly do tests execute?
Bug catch rate: What percentage of bugs are caught before production?

We aim for:

Comprehensive coverage of critical user workflows
High pass rate on first run (90%+ target)
Fast execution (under 5 minutes for full suite)
Early bug detection before production deployment
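
As an illustration of the reliability metric, here is a sketch that computes the first-run pass rate. The result shape (one entry per test with an ordered list of attempt statuses) is a hypothetical format, and the sample data is made up for the example:

```javascript
// Hypothetical result shape: one entry per test, listing the status of each
// attempt in order (first run first, then retries).
function firstRunPassRate(results) {
    if (results.length === 0) return 0;
    const passedFirstTime = results.filter(r => r.attempts[0] === 'passed').length;
    return passedFirstTime / results.length;
}

// Made-up sample data, not real results from our suite.
const sampleResults = [
    { title: 'user completes first lesson', attempts: ['passed'] },
    { title: 'executes C++ code successfully', attempts: ['failed', 'passed'] },
    { title: 'compiles multi-file exercise', attempts: ['passed'] },
    { title: 'completes guided project step', attempts: ['passed'] },
];
console.log(firstRunPassRate(sampleResults)); // 0.75
```

A test that passes only on retry counts against this metric, which is what makes it useful for spotting flakiness that retries would otherwise hide.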

Limitations and Trade-offs

Playwright testing isn't perfect. Here are the trade-offs we've accepted:

Slower Than Unit Tests

E2E tests take minutes while unit tests take seconds. We accept this because:

E2E tests catch integration bugs unit tests miss
We run unit tests more frequently than E2E tests
E2E tests run in CI while we continue coding

Flakiness Exists

E2E tests are more flaky than unit tests due to:

Network timing
Async operations
Browser differences

We mitigate this with retries and immediate fixes.

Maintenance Overhead

UI changes require test updates. We minimize this by:

Using semantic selectors (roles, labels, test IDs)
Testing behavior, not implementation
Grouping related assertions

Conclusion

Playwright has transformed how we build features at HelloC++. By testing complete user workflows in real browsers, we catch bugs that unit tests miss and deploy with confidence.

Key takeaways:

Test user workflows, not implementation details
Integrate with your backend for powerful test helpers
Isolate tests with unique users and fresh databases
Use Playwright's debugging tools extensively
Accept some flakiness but fix it immediately
Run tests in CI to catch regressions

If you're building a complex web application, invest in E2E testing. The initial setup takes time, but the confidence and velocity gains are worth it.

Questions or Want to Chat About Testing?

Testing strategies are always evolving. If you have questions about our approach or want to share your own experiences, reach out - I'd love to hear from you.

Part of the Building Software at Scale series.