E2E Testing Guide

End-to-end tests live in dashboard/frontend/e2e/ and run with Playwright. They exercise the full stack — PostgreSQL, Rust backend, and React frontend — through a real Chromium browser.

Quick Start

Prerequisites

Tool	Version	Notes
Docker	any	For the local PostgreSQL container
Rust	stable	Backend compilation
Node.js	22+	Frontend tooling
PostgreSQL	16	Via Docker (managed by `make`)

First-time setup

# 1. Install dependencies and start the local PostgreSQL container
make deps

# 2. Create the test database (make deps creates "dashboard", not "tekton_test")
docker exec -i tekton-postgres psql -U tekton -d postgres -c "CREATE DATABASE tekton_test;"

# 3. Install Playwright browsers
cd dashboard/frontend
npx playwright install --with-deps chromium

Running tests

cd dashboard/frontend

# Set DATABASE_URL for the Docker-based PostgreSQL (the default in playwright.config.ts
# uses the OS user, which won't match the Docker container credentials)
export DATABASE_URL="postgres://tekton:tekton@localhost:5432/tekton_test"

# Run the full suite (Playwright auto-starts the backend via its webServer config)
npm run test:e2e

# Interactive UI mode — great for debugging
npm run test:e2e:ui

# Single file
npx playwright test e2e/tasks-list.spec.ts

# View the HTML report after a run
npx playwright show-report

Tip: The Playwright webServer config runs cargo run --release on port 3200 and reuses an already-running server if one is detected.

Architecture & Project Layout

dashboard/frontend/e2e/
├── fixtures.ts              # Custom test fixtures (role pages, TEST_IDS, coverage)
├── global-setup.ts          # Seed DB, generate JWTs, save auth storage states
├── global-teardown.ts       # Drop all tables, remove .auth/ directory
├── coverage.ts              # Collect window.__coverage__ from instrumented builds
├── seed.sql                 # Full seed data (users, tasks, policies, intake, etc.)
│
├── activity-sidebar.spec.ts
├── admin-intake.spec.ts
├── admin-policies.spec.ts
├── admin-secrets.spec.ts
├── admin-users.spec.ts
├── audit-log.spec.ts
├── auth.spec.ts
├── command-palette.spec.ts
├── cost-dashboard.spec.ts
├── intake-board.spec.ts
├── navigation.spec.ts
├── preview-detail.spec.ts
├── previews.spec.ts
├── responsive.spec.ts
├── settings.spec.ts
├── task-chat.spec.ts
├── task-create.spec.ts
├── task-detail.spec.ts
├── tasks-list.spec.ts
├── theme-toggle.spec.ts
│
└── lighthouse/
    ├── lighthouse-config.ts  # Thresholds, Chrome launcher, audit runner
    └── lighthouse.spec.ts    # Performance/a11y audits for key pages

Test lifecycle

global-setup.ts                      global-teardown.ts
  ┌─────────────────────┐              ┌──────────────────────┐
  │ 1. Run seed.sql      │              │ 1. DROP all tables    │
  │ 2. Generate JWTs     │  ──tests──▶  │ 2. Remove .auth/      │
  │ 3. Save auth states  │              │ 3. Clean .nyc_output  │
  └─────────────────────┘              └──────────────────────┘

global-setup seeds the tekton_test database via psql, generates HS256 JWTs for three test users (admin, member, viewer), and writes Playwright storage states to .auth/*.json with dashboard_session cookies.
Tests run fully parallel across spec files. Each spec uses role-based fixtures that load the correct auth state.
global-teardown drops all tables and removes .auth/ storage files.

Test Patterns & Conventions

No mocks

Tests hit the real stack: a real PostgreSQL database, the real Rust backend, and a real Chromium browser. Do not mock the database, API responses, or any backend behavior. The goal is to catch the bugs that only surface when every layer is wired together. If a test is hard to write without mocking, that's a signal the feature needs better seed data or a test-specific API state — not a mock.

Fixtures

Always import from ./fixtures, not from @playwright/test directly:

import { test, expect, TEST_IDS } from './fixtures';

The custom fixtures provide role-based browser contexts with pre-loaded auth:

Fixture	Role	Storage State
`authenticatedPage`	admin	`.auth/admin.json`
`adminPage`	admin	`.auth/admin.json`
`memberPage`	member	`.auth/member.json`
`viewerPage`	viewer	`.auth/viewer.json`

Every page fixture automatically calls collectCoverage() after the test completes, so code coverage collection requires no manual effort.

TEST_IDS

TEST_IDS is a constant map of identifiers that correspond to entities in seed.sql. Use these instead of hardcoded strings:

// Good
await page.goto(`/tasks/${TEST_IDS.tasks.completed}`);

// Bad — breaks if seed data changes
await page.goto('/tasks/task-completed-1');

Selectors

Prefer accessibility-first selectors:

// Preferred
page.getByRole('button', { name: 'Create Task' })
page.getByLabel('Task title')
page.getByText('Successfully created')

// Avoid
page.locator('.btn-primary')
page.locator('#task-title-input')

Serial vs parallel

Use test.describe.serial only for CRUD flows where test order matters (create -> verify -> delete). Default to test.describe for everything else — tests run fully parallel.

// Use serial when tests depend on prior test state
test.describe.serial('Task CRUD', () => {
  test('create a task', async ({ adminPage }) => { /* ... */ });
  test('verify task appears in list', async ({ adminPage }) => { /* ... */ });
  test('delete the task', async ({ adminPage }) => { /* ... */ });
});

How to Write a New Test

1. Create the spec file

// e2e/my-feature.spec.ts
import { test, expect, TEST_IDS } from './fixtures';

test.describe('My Feature', () => {
  test('renders correctly for admin', async ({ adminPage }) => {
    await adminPage.goto('/my-feature');
    await expect(adminPage.getByRole('heading', { name: 'My Feature' })).toBeVisible();
  });

  test('is read-only for viewers', async ({ viewerPage }) => {
    await viewerPage.goto('/my-feature');
    await expect(viewerPage.getByRole('button', { name: 'Edit' })).toBeDisabled();
  });
});

2. Add seed data (if needed)

If your feature requires data that doesn't exist yet:

Add INSERT statements to e2e/seed.sql
Add corresponding IDs to TEST_IDS in e2e/fixtures.ts
Add the table to the DROP TABLE list in e2e/global-teardown.ts

3. Run and verify

npx playwright test e2e/my-feature.spec.ts

Seed Data

seed.sql creates the full schema and populates it with deterministic test data.

What's seeded

Entity	Count	Examples
Users	3	admin, member, viewer
Tasks	10	Various states: pending, running, completed, failed, awaiting
Task logs	5	Execution logs for completed task
Task messages	5	Chat messages for completed and awaiting tasks
Task actions	10	Agent actions including policy violations
State transitions	3	pending -> running -> completed
Repo policies	2	Tool and cost limits per repo
Org policies	1	Organization-wide defaults
Secrets	3	Encrypted credential entries
Budgets	2	User and org budget limits
Audit log entries	27+	Auth, task, and admin events
Intake sources	2	GitHub and Linear integrations
Intake issues	8	Various statuses: backlog, pending, done, failed
Intake poll log	6	Polling history with durations

TEST_IDS mapping

TEST_IDS.tasks.pending      // → "task-pending-1"     (seed.sql row)
TEST_IDS.tasks.completed    // → "task-completed-1"
TEST_IDS.users.admin        // → "testadmin"
TEST_IDS.repos.main         // → "testorg/testrepo"
TEST_IDS.intake.issues.backlogAuth  // → "Fix null pointer in auth module"
// ... see fixtures.ts for the full map

Adding new seed data

Add your INSERT to seed.sql — use a stable, descriptive ID (e.g., my-feature-entity-1)
Add the ID to TEST_IDS in fixtures.ts
Add DROP TABLE IF EXISTS your_table CASCADE; to global-teardown.ts (if it's a new table)

CI Pipeline

The GitHub Actions workflow (.github/workflows/ci.yml) has three jobs:

┌────────┐     ┌────────────┐     ┌──────────────┐
│  Rust  │     │  Frontend  │     │  E2E Tests   │
│ format │     │   lint     │     │ (depends on  │
│ clippy │────▶│   build    │────▶│  both jobs)  │
│  test  │     │            │     │              │
└────────┘     └────────────┘     └──────────────┘

E2E job details

Setting	Value
Runner	`ubuntu-latest`
Timeout	30 minutes
Workers	4
Retries	2
Browser	Chromium only
PostgreSQL	16 (service container)
DB name	`tekton_test`
DB credentials	`tekton` / `tekton_test_password`

E2E job steps

Build backend (cargo build --release)
Install frontend deps (npm ci)
Build instrumented frontend (INSTRUMENT_COVERAGE=true npm run build)
Install Playwright (npx playwright install --with-deps chromium)
Start backend (serves frontend/dist as static files)
Health check (curl http://localhost:3200/api/config)
Run tests (npx playwright test --project=chromium)
Generate coverage report (npx nyc report --reporter=text-summary)
Enforce coverage thresholds (npx nyc check-coverage)

Artifacts

Both are uploaded with 30-day retention:

playwright-report/ — HTML test report with traces and screenshots
coverage/ — Istanbul code coverage report

Code Coverage

How it works

vite-plugin-istanbul instruments the production build when INSTRUMENT_COVERAGE=true
Each test fixture collects window.__coverage__ from the browser after test completion
Coverage JSON files are saved to .nyc_output/
NYC merges all coverage data and generates reports

Running locally

# Run everything in one command (build, test, report, threshold check)
make e2e.coverage

# Open the HTML report to see uncovered lines per file
open dashboard/frontend/coverage/index.html

Or step by step:

cd dashboard/frontend

# 1. Build with instrumentation
npm run build:coverage

# 2. Run E2E tests (collects coverage automatically)
npm run test:e2e

# 3. Generate reports
npm run coverage:report

# 4. Check thresholds
npm run coverage:check

Thresholds

All thresholds are 70%, configured in .nycrc.json:

Metric	Threshold
Branches	70%
Lines	70%
Functions	70%
Statements	70%

CI enforces these via npx nyc check-coverage — the job fails if any metric drops below 70%.

Exclusions

The following are excluded from coverage (.nycrc.json):

src/components/ui/** — shadcn/ui primitives
src/components/VoiceInput.tsx — browser speech API dependent
src/components/DiffViewer.tsx — complex rendering component
src/components/LogViewer.tsx — complex rendering component
src/components/TaskChat.tsx — streaming/WebSocket dependent
src/components/BranchCombobox.tsx — complex combobox
src/hooks/use-mobile.ts — media query hook

Lighthouse / Performance Tests

A separate Playwright project runs Lighthouse audits against key pages.

npm run test:lighthouse

Thresholds

Category	Minimum Score
Performance	70
Accessibility	85
Best Practices	85
SEO	70

Pages audited

Home page (/)
Tasks list (/tasks)
Task detail (/tasks/{id})
Admin panel (/admin)
Cost dashboard (/cost)

How it works

Tests run serialized (test.describe.serial) with a single shared Chrome instance launched via chrome-launcher. Each page is audited with the admin auth cookie injected as an HTTP header. The desktop config uses simulated throttling (40ms RTT, 10Mbps throughput, no CPU slowdown).

Note: Lighthouse tests have a 60-second timeout per page (vs 30 seconds for regular tests).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E2E Testing Guide

Quick Start

Prerequisites

First-time setup

Running tests

Architecture & Project Layout

Test lifecycle

Test Patterns & Conventions

No mocks

Fixtures

TEST_IDS

Selectors

Serial vs parallel

How to Write a New Test

1. Create the spec file

2. Add seed data (if needed)

3. Run and verify

Seed Data

What's seeded

TEST_IDS mapping

Adding new seed data

CI Pipeline

E2E job details

E2E job steps

Artifacts

Code Coverage

How it works

Running locally

Thresholds

Exclusions

Lighthouse / Performance Tests

Thresholds

Pages audited

How it works

FilesExpand file tree

e2e-testing.md

Latest commit

History

e2e-testing.md

File metadata and controls

E2E Testing Guide

Quick Start

Prerequisites

First-time setup

Running tests

Architecture & Project Layout

Test lifecycle

Test Patterns & Conventions

No mocks

Fixtures

TEST_IDS

Selectors

Serial vs parallel

How to Write a New Test

1. Create the spec file

2. Add seed data (if needed)

3. Run and verify

Seed Data

What's seeded

TEST_IDS mapping

Adding new seed data

CI Pipeline

E2E job details

E2E job steps

Artifacts

Code Coverage

How it works

Running locally

Thresholds

Exclusions

Lighthouse / Performance Tests

Thresholds

Pages audited

How it works