End-to-end tests live in dashboard/frontend/e2e/ and run with Playwright. They exercise the full stack — PostgreSQL, Rust backend, and React frontend — through a real Chromium browser.
| Tool | Version | Notes |
|---|---|---|
| Docker | any | For the local PostgreSQL container |
| Rust | stable | Backend compilation |
| Node.js | 22+ | Frontend tooling |
| PostgreSQL | 16 | Via Docker (managed by make) |
# 1. Install dependencies and start the local PostgreSQL container
make deps
# 2. Create the test database (make deps creates "dashboard", not "tekton_test")
docker exec -i tekton-postgres psql -U tekton -d postgres -c "CREATE DATABASE tekton_test;"
# 3. Install Playwright browsers
cd dashboard/frontend
npx playwright install --with-deps chromium

cd dashboard/frontend
# Set DATABASE_URL for the Docker-based PostgreSQL (the default in playwright.config.ts
# uses the OS user, which won't match the Docker container credentials)
export DATABASE_URL="postgres://tekton:tekton@localhost:5432/tekton_test"
# Run the full suite (Playwright auto-starts the backend via its webServer config)
npm run test:e2e
# Interactive UI mode — great for debugging
npm run test:e2e:ui
# Single file
npx playwright test e2e/tasks-list.spec.ts
# View the HTML report after a run
npx playwright show-report

Tip: The Playwright `webServer` config runs `cargo run --release` on port 3200 and reuses an already-running server if one is detected.
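
The two settings described above, an explicit `DATABASE_URL` and an auto-started backend, can be pictured with a short sketch. This is not the project's `playwright.config.ts`: the helper name `resolveDatabaseUrl`, the shape of the OS-user fallback URL, and the use of `/api/config` as the readiness URL are assumptions made for illustration. `command`, `url`, `reuseExistingServer`, and `env` are standard Playwright `webServer` options.

```typescript
// Hypothetical helper: prefer an explicit DATABASE_URL, otherwise fall back
// to a URL built from the OS user (assumed shape of the config default).
function resolveDatabaseUrl(
  env: Record<string, string | undefined>,
  osUser: string,
): string {
  return env.DATABASE_URL ?? `postgres://${osUser}@localhost:5432/tekton_test`;
}

// Plain-object version of a Playwright webServer entry.
const webServer = {
  command: 'cargo run --release',
  url: 'http://localhost:3200/api/config', // assumed readiness URL
  reuseExistingServer: true, // skip startup if the port already answers
  env: {
    DATABASE_URL: resolveDatabaseUrl(
      { DATABASE_URL: 'postgres://tekton:tekton@localhost:5432/tekton_test' },
      'alice',
    ),
  },
};
```

With no `DATABASE_URL` set, the helper would return a URL for the OS user, which is why the export above is needed when PostgreSQL runs in Docker.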
dashboard/frontend/e2e/
├── fixtures.ts # Custom test fixtures (role pages, TEST_IDS, coverage)
├── global-setup.ts # Seed DB, generate JWTs, save auth storage states
├── global-teardown.ts # Drop all tables, remove .auth/ directory
├── coverage.ts # Collect window.__coverage__ from instrumented builds
├── seed.sql # Full seed data (users, tasks, policies, intake, etc.)
│
├── activity-sidebar.spec.ts
├── admin-intake.spec.ts
├── admin-policies.spec.ts
├── admin-secrets.spec.ts
├── admin-users.spec.ts
├── audit-log.spec.ts
├── auth.spec.ts
├── command-palette.spec.ts
├── cost-dashboard.spec.ts
├── intake-board.spec.ts
├── navigation.spec.ts
├── preview-detail.spec.ts
├── previews.spec.ts
├── responsive.spec.ts
├── settings.spec.ts
├── task-chat.spec.ts
├── task-create.spec.ts
├── task-detail.spec.ts
├── tasks-list.spec.ts
├── theme-toggle.spec.ts
│
└── lighthouse/
├── lighthouse-config.ts # Thresholds, Chrome launcher, audit runner
└── lighthouse.spec.ts # Performance/a11y audits for key pages
global-setup.ts global-teardown.ts
┌─────────────────────┐ ┌──────────────────────┐
│ 1. Run seed.sql │ │ 1. DROP all tables │
│ 2. Generate JWTs │ ──tests──▶ │ 2. Remove .auth/ │
│ 3. Save auth states │ │ 3. Clean .nyc_output │
└─────────────────────┘ └──────────────────────┘
- global-setup seeds the `tekton_test` database via `psql`, generates HS256 JWTs for three test users (admin, member, viewer), and writes Playwright storage states to `.auth/*.json` with `dashboard_session` cookies.
- Tests run fully parallel across spec files. Each spec uses role-based fixtures that load the correct auth state.
- global-teardown drops all tables and removes `.auth/` storage files.
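
The JWT and storage-state steps of global-setup could look roughly like the sketch below. It is not the project's `global-setup.ts`: the claim names (`sub`, `role`, `exp`), the `test-secret` value, the `test<role>` user names, and the cookie attributes other than the `dashboard_session` name are assumptions. The object returned by `buildStorageState` follows Playwright's storage state format (`cookies` plus `origins`).

```typescript
import { createHmac } from 'node:crypto';

type Role = 'admin' | 'member' | 'viewer';

// Base64url-encode a JSON value, as required for JWT segments.
function encodeSegment(value: object): string {
  return Buffer.from(JSON.stringify(value)).toString('base64url');
}

// Sign a payload as an HS256 JWT: header.payload.signature
function signHs256(payload: object, secret: string): string {
  const signingInput = `${encodeSegment({ alg: 'HS256', typ: 'JWT' })}.${encodeSegment(payload)}`;
  const signature = createHmac('sha256', secret).update(signingInput).digest('base64url');
  return `${signingInput}.${signature}`;
}

// Build a Playwright storage state holding the session cookie.
function buildStorageState(token: string, expiresAt: number) {
  return {
    cookies: [
      {
        name: 'dashboard_session',
        value: token,
        domain: 'localhost', // assumed host
        path: '/',
        expires: expiresAt, // Unix time in seconds
        httpOnly: true,
        secure: false,
        sameSite: 'Lax' as const,
      },
    ],
    origins: [],
  };
}

// One storage state per role (claims and secret are placeholders).
const expiresAt = Math.floor(Date.now() / 1000) + 3600;
const roles: Role[] = ['admin', 'member', 'viewer'];
const states = roles.map((role) => ({
  file: `.auth/${role}.json`,
  state: buildStorageState(
    signHs256({ sub: `test${role}`, role, exp: expiresAt }, 'test-secret'),
    expiresAt,
  ),
}));
```

Each `state` would then be serialized to its `file`, and a fixture can hand that path to Playwright as the context's `storageState`.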
Tests hit the real stack: a real PostgreSQL database, the real Rust backend, and a real Chromium browser. Do not mock the database, API responses, or any backend behavior. The goal is to catch the bugs that only surface when every layer is wired together. If a test is hard to write without mocking, that's a signal the feature needs better seed data or a test-specific API state — not a mock.
Always import from ./fixtures, not from @playwright/test directly:
import { test, expect, TEST_IDS } from './fixtures';

The custom fixtures provide role-based browser contexts with pre-loaded auth:
| Fixture | Role | Storage State |
|---|---|---|
| `authenticatedPage` | admin | `.auth/admin.json` |
| `adminPage` | admin | `.auth/admin.json` |
| `memberPage` | member | `.auth/member.json` |
| `viewerPage` | viewer | `.auth/viewer.json` |
Every page fixture automatically calls collectCoverage() after the test completes, so code coverage collection requires no manual effort.
TEST_IDS is a constant map of identifiers that correspond to entities in seed.sql. Use these instead of hardcoded strings:
// Good
await page.goto(`/tasks/${TEST_IDS.tasks.completed}`);
// Bad — breaks if seed data changes
await page.goto('/tasks/task-completed-1');

Prefer accessibility-first selectors:
// Preferred
page.getByRole('button', { name: 'Create Task' })
page.getByLabel('Task title')
page.getByText('Successfully created')
// Avoid
page.locator('.btn-primary')
page.locator('#task-title-input')

Use `test.describe.serial` only for CRUD flows where test order matters (create -> verify -> delete). Default to `test.describe` for everything else, since tests run fully parallel.
// Use serial when tests depend on prior test state
test.describe.serial('Task CRUD', () => {
test('create a task', async ({ adminPage }) => { /* ... */ });
test('verify task appears in list', async ({ adminPage }) => { /* ... */ });
test('delete the task', async ({ adminPage }) => { /* ... */ });
});

// e2e/my-feature.spec.ts
import { test, expect, TEST_IDS } from './fixtures';
test.describe('My Feature', () => {
test('renders correctly for admin', async ({ adminPage }) => {
await adminPage.goto('/my-feature');
await expect(adminPage.getByRole('heading', { name: 'My Feature' })).toBeVisible();
});
test('is read-only for viewers', async ({ viewerPage }) => {
await viewerPage.goto('/my-feature');
await expect(viewerPage.getByRole('button', { name: 'Edit' })).toBeDisabled();
});
});

If your feature requires data that doesn't exist yet:
- Add `INSERT` statements to `e2e/seed.sql`
- Add corresponding IDs to `TEST_IDS` in `e2e/fixtures.ts`
- Add the table to the `DROP TABLE` list in `e2e/global-teardown.ts`

npx playwright test e2e/my-feature.spec.ts

`seed.sql` creates the full schema and populates it with deterministic test data.
| Entity | Count | Examples |
|---|---|---|
| Users | 3 | admin, member, viewer |
| Tasks | 10 | Various states: pending, running, completed, failed, awaiting |
| Task logs | 5 | Execution logs for completed task |
| Task messages | 5 | Chat messages for completed and awaiting tasks |
| Task actions | 10 | Agent actions including policy violations |
| State transitions | 3 | pending -> running -> completed |
| Repo policies | 2 | Tool and cost limits per repo |
| Org policies | 1 | Organization-wide defaults |
| Secrets | 3 | Encrypted credential entries |
| Budgets | 2 | User and org budget limits |
| Audit log entries | 27+ | Auth, task, and admin events |
| Intake sources | 2 | GitHub and Linear integrations |
| Intake issues | 8 | Various statuses: backlog, pending, done, failed |
| Intake poll log | 6 | Polling history with durations |
TEST_IDS.tasks.pending // → "task-pending-1" (seed.sql row)
TEST_IDS.tasks.completed // → "task-completed-1"
TEST_IDS.users.admin // → "testadmin"
TEST_IDS.repos.main // → "testorg/testrepo"
TEST_IDS.intake.issues.backlogAuth // → "Fix null pointer in auth module"
// ... see fixtures.ts for the full map

- Add your `INSERT` to `seed.sql`, using a stable, descriptive ID (e.g., `my-feature-entity-1`)
- Add the ID to `TEST_IDS` in `fixtures.ts`
- Add `DROP TABLE IF EXISTS your_table CASCADE;` to `global-teardown.ts` (if it's a new table)
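
As an illustration of the last two steps, the sketch below adds a hypothetical `myFeature` entry to a `TEST_IDS`-style map and builds the matching teardown statements. The `myFeature` key, the `TABLES` list, and the `dropStatements` helper are invented for this example; the real map lives in `fixtures.ts` and the real teardown may be structured differently.

```typescript
// Hypothetical excerpt of a TEST_IDS-style map; `myFeature` is the new entry.
const TEST_IDS = {
  tasks: { pending: 'task-pending-1', completed: 'task-completed-1' },
  myFeature: { entity: 'my-feature-entity-1' },
} as const;

// Hypothetical list of tables the teardown should drop (names are assumed).
const TABLES = ['tasks', 'your_table'];

// Build one DROP statement per table, in the form shown in the steps above.
function dropStatements(tables: readonly string[]): string[] {
  return tables.map((table) => `DROP TABLE IF EXISTS ${table} CASCADE;`);
}

const teardownSql = dropStatements(TABLES).join('\n');
```

Declaring the map `as const` keeps each ID a literal type, so a typo such as `TEST_IDS.myFeature.entitty` fails at compile time instead of producing a broken URL at run time.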
The GitHub Actions workflow (.github/workflows/ci.yml) has three jobs:
┌────────┐ ┌────────────┐ ┌──────────────┐
│ Rust │ │ Frontend │ │ E2E Tests │
│ format │ │ lint │ │ (depends on │
│ clippy │────▶│ build │────▶│ both jobs) │
│ test │ │ │ │ │
└────────┘ └────────────┘ └──────────────┘
| Setting | Value |
|---|---|
| Runner | ubuntu-latest |
| Timeout | 30 minutes |
| Workers | 4 |
| Retries | 2 |
| Browser | Chromium only |
| PostgreSQL | 16 (service container) |
| DB name | tekton_test |
| DB credentials | tekton / tekton_test_password |
- Build backend (`cargo build --release`)
- Install frontend deps (`npm ci`)
- Build instrumented frontend (`INSTRUMENT_COVERAGE=true npm run build`)
- Install Playwright (`npx playwright install --with-deps chromium`)
- Start backend (serves `frontend/dist` as static files)
- Health check (`curl http://localhost:3200/api/config`)
- Run tests (`npx playwright test --project=chromium`)
- Generate coverage report (`npx nyc report --reporter=text-summary`)
- Enforce coverage thresholds (`npx nyc check-coverage`)
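
The health check step exists because the backend needs a moment to start before tests can run. A retrying version of that check might look like the sketch below; the function name `waitForHealthy`, the attempt count, and the delay are assumptions, and the workflow itself uses `curl` rather than this code.

```typescript
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Poll `url` until it answers with a 2xx status or the attempts run out.
// Resolves with the number of attempts that were needed.
async function waitForHealthy(
  url: string,
  fetchFn: FetchLike,
  attempts = 30,
  delayMs = 1000,
): Promise<number> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetchFn(url);
      if (response.ok) return attempt;
    } catch {
      // Connection refused while the server is still starting; retry.
    }
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`${url} not healthy after ${attempts} attempts`);
}
```

On Node.js 18 or later, the global `fetch` can be passed as `fetchFn`, for example `await waitForHealthy('http://localhost:3200/api/config', fetch)`.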
Both artifacts are uploaded with 30-day retention:
- `playwright-report/`: HTML test report with traces and screenshots
- `coverage/`: Istanbul code coverage report
- `vite-plugin-istanbul` instruments the production build when `INSTRUMENT_COVERAGE=true`
- Each test fixture collects `window.__coverage__` from the browser after test completion
- Coverage JSON files are saved to `.nyc_output/`
- NYC merges all coverage data and generates reports
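
The collection step in the list above can be sketched as follows. This is not the project's `coverage.ts`: the `PageLike` interface, the file naming scheme, and the `null` return for uninstrumented builds are assumptions. `PageLike` only models the `evaluate` call that Playwright pages provide.

```typescript
import { mkdirSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { randomUUID } from 'node:crypto';

// Minimal slice of a Playwright page that this sketch relies on.
interface PageLike {
  evaluate<T>(pageFunction: () => T): Promise<T>;
}

// Read the Istanbul coverage object from the page and write it where NYC
// looks for raw data. Returns the file path, or null if the build was not
// instrumented and no coverage object exists.
async function collectCoverage(
  page: PageLike,
  outputDir = '.nyc_output',
): Promise<string | null> {
  // In the browser, globalThis is window, so this reads window.__coverage__.
  const coverage = await page.evaluate(() => (globalThis as any).__coverage__ ?? null);
  if (!coverage) return null;

  mkdirSync(outputDir, { recursive: true });
  // A unique name per test keeps parallel workers from overwriting each other.
  const file = join(outputDir, `coverage-${randomUUID()}.json`);
  writeFileSync(file, JSON.stringify(coverage));
  return file;
}
```

Writing one file per test is what lets NYC merge results from parallel workers afterwards.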
# Run everything in one command (build, test, report, threshold check)
make e2e.coverage
# Open the HTML report to see uncovered lines per file
open dashboard/frontend/coverage/index.html

Or step by step:
cd dashboard/frontend
# 1. Build with instrumentation
npm run build:coverage
# 2. Run E2E tests (collects coverage automatically)
npm run test:e2e
# 3. Generate reports
npm run coverage:report
# 4. Check thresholds
npm run coverage:check

All thresholds are 70%, configured in `.nycrc.json`:
| Metric | Threshold |
|---|---|
| Branches | 70% |
| Lines | 70% |
| Functions | 70% |
| Statements | 70% |
CI enforces these via npx nyc check-coverage — the job fails if any metric drops below 70%.
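
To make the pass/fail rule concrete, the sketch below applies the same comparison to a set of measured percentages. It only illustrates the rule; `failingMetrics` is a hypothetical helper and not part of NYC, which performs this check internally.

```typescript
type Metric = 'branches' | 'lines' | 'functions' | 'statements';
type Percentages = Record<Metric, number>;

// Thresholds from the table above.
const THRESHOLDS: Percentages = { branches: 70, lines: 70, functions: 70, statements: 70 };

// Return the metrics whose measured percentage is below its threshold.
function failingMetrics(
  measured: Percentages,
  thresholds: Percentages = THRESHOLDS,
): Metric[] {
  return (Object.keys(thresholds) as Metric[]).filter(
    (metric) => measured[metric] < thresholds[metric],
  );
}
```

A run with 69.9% branch coverage and every other metric at 70% or above would report only `branches`, and that alone is enough to fail the job.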
The following are excluded from coverage (.nycrc.json):
- `src/components/ui/**`: shadcn/ui primitives
- `src/components/VoiceInput.tsx`: browser speech API dependent
- `src/components/DiffViewer.tsx`: complex rendering component
- `src/components/LogViewer.tsx`: complex rendering component
- `src/components/TaskChat.tsx`: streaming/WebSocket dependent
- `src/components/BranchCombobox.tsx`: complex combobox
- `src/hooks/use-mobile.ts`: media query hook
A separate Playwright project runs Lighthouse audits against key pages.
npm run test:lighthouse

| Category | Minimum Score |
|---|---|
| Performance | 70 |
| Accessibility | 85 |
| Best Practices | 85 |
| SEO | 70 |
- Home page (`/`)
- Tasks list (`/tasks`)
- Task detail (`/tasks/{id}`)
- Admin panel (`/admin`)
- Cost dashboard (`/cost`)
Tests run serialized (test.describe.serial) with a single shared Chrome instance launched via chrome-launcher. Each page is audited with the admin auth cookie injected as an HTTP header. The desktop config uses simulated throttling (40ms RTT, 10Mbps throughput, no CPU slowdown).
Note: Lighthouse tests have a 60-second timeout per page (vs 30 seconds for regular tests).
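
The score thresholds and throttling values above could be expressed as in the sketch below. It is not the project's `lighthouse-config.ts`: `MIN_SCORES`, `DESKTOP_THROTTLING`, and `lighthouseFailures` are hypothetical names. The category ids and the `rttMs`, `throughputKbps`, and `cpuSlowdownMultiplier` setting names are Lighthouse's own, and Lighthouse reports category scores on a 0 to 1 scale.

```typescript
// Minimum scores from the table above, keyed by Lighthouse category id.
const MIN_SCORES: Record<string, number> = {
  performance: 70,
  accessibility: 85,
  'best-practices': 85,
  seo: 70,
};

// Desktop throttling described above, in Lighthouse's setting names.
const DESKTOP_THROTTLING = {
  rttMs: 40,
  throughputKbps: 10 * 1024, // 10 Mbps
  cpuSlowdownMultiplier: 1, // no CPU slowdown
};

// Lighthouse reports each category score in the range 0..1, or null on error.
type Categories = Record<string, { score: number | null }>;

// Return a message for each category that is missing or below its minimum.
function lighthouseFailures(categories: Categories): string[] {
  const failures: string[] = [];
  for (const id of Object.keys(MIN_SCORES)) {
    const minimum = MIN_SCORES[id];
    const score = categories[id]?.score;
    if (score == null) {
      failures.push(`${id}: no score`);
      continue;
    }
    const percent = Math.round(score * 100);
    if (percent < minimum) failures.push(`${id}: ${percent} < ${minimum}`);
  }
  return failures;
}
```

A test can then assert that `lighthouseFailures(result.categories)` is empty, which reports every failing category at once instead of stopping at the first.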