feat: Add Workflow Analytics Dashboards with OpenSearch integration by LuD1161 · Pull Request #229 · ShipSecAI/studio

LuD1161 · 2026-01-22T21:08:37Z

Summary

This PR adds a Security Analytics platform to ShipSec Studio that enables users to index workflow output data into OpenSearch and visualize it through dashboards. It also includes multi-tenant security, a unified developer experience, and component SDK improvements.

Key Features

Analytics Sink Component: New workflow node (core.analytics.sink) that indexes output data from any upstream node to OpenSearch
- Supports array and object inputs with automatic bulk indexing
- Auto-detects asset correlation keys (host, domain, subdomain, url, ip, etc.)
- Configurable index suffix and fail-on-error modes
- Fire-and-forget by default with retry logic (3 attempts with exponential backoff)
OpenSearch Integration:
- Daily index rotation pattern: security-findings-{orgId}-{YYYY.MM.DD}
- Index template with standard metadata fields
- Multi-tenant data isolation per organization
Multi-Tenant OpenSearch Security:
- TLS encryption for OpenSearch transport and HTTP layers
- Security plugin with role-based access control
- SaaS multitenancy with per-customer tenant isolation via proxy auth
- Dynamic tenant provisioning with seed indices for Dashboards
- Dashboards locked down for SaaS tenants (read-only, scoped to org data)
- ISM policy permissions for automated index lifecycle management
Analytics API:
- POST /api/v1/analytics/query endpoint supporting OpenSearch DSL
- Auto-scopes queries to organization's index pattern
- Rate limiting: 100 requests/minute per user
UI Integration:
- "Dashboards" link in sidebar (opens OpenSearch Dashboards in new tab)
- "Analytics Settings" page for tier-based retention configuration
- "View Analytics" button on workflow detail page
Nginx Reverse Proxy:
- Unified entry point at http://localhost
- Routes: / (frontend), /api (backend), /analytics (OpenSearch Dashboards)
Unified just dev command:
- Auto-detects auth mode from CLERK_SECRET_KEY in backend/.env
- If Clerk creds present → secure mode (Clerk auth + OpenSearch Security + nginx)
- If no Clerk creds → local auth mode (simpler, faster startup, analytics still work)
- No need for separate dev-insecure command
Workflow Status Improvements:
- New STALE status for orphaned run records (DB/Temporal mismatch)
- Improved status inference from trace events
Component SDK Extensions:
- generateFindingHash() utility for deduplication
- Workflow context (workflowId, workflowName, organizationId) passed to components
- Results output port added to nuclei, trufflehog, and supabase-scanner
- Support for optional inputs in components
- Dynamic Args Pattern for amass and subfinder
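
The "3 attempts with exponential backoff" behaviour listed for the Analytics Sink can be pictured with a small sketch. This is not the PR's implementation: the helper names `backoffDelays` and `withRetry` and the 500 ms base delay are assumptions made for illustration.

```typescript
// Hypothetical helpers; names and the 500 ms base delay are assumptions, not from the PR.

// Delays between consecutive attempts: base, 2*base, 4*base, ...
function backoffDelays(attempts: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts - 1; i++) {
    delays.push(baseMs * Math.pow(2, i));
  }
  return delays;
}

// Runs `operation` up to `attempts` times, sleeping between failures.
// The last error is rethrown if every attempt fails.
async function withRetry<T>(
  operation: () => Promise<T>,
  attempts: number = 3,
  baseMs: number = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const delays = backoffDelays(attempts, baseMs);
  let lastError: unknown = new Error("withRetry called with attempts < 1");
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay !== undefined) {
        await sleep(delay);
      }
    }
  }
  throw lastError;
}
```

In a fire-and-forget setup the caller would log a rejected `withRetry(...)` promise rather than await it, so an indexing failure does not fail the workflow unless fail-on-error is enabled.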

Commands

just dev              # Start dev (auto-detects: Clerk creds → secure mode, otherwise local auth)
just dev stop         # Stop everything
just dev clean        # Stop and remove all data
just prod             # Start production (auto-detects: TLS certs → secure mode)
just prod stop        # Stop production
just generate-certs   # Generate TLS certificates for production

Test Results

Justfile + OSD Verification Matrix

#	Scenario	Justfile	OSD Access	OSD Result	Notes
1	`just dev` — local auth (no Clerk)	PASS	`http://localhost/analytics` (nginx)	PASS	Session cookie auth via nginx. Index pattern `security-findings-*` pre-created.
2	`just dev` — secure mode (Clerk)	PASS	`http://localhost/analytics` (nginx)	PASS	Proxy auth + tenant isolation. 32 hits in Discover after workflow run.
3	`just prod` — standard (no certs)	PASS	`http://localhost/analytics` (nginx)	403 (known)	Login endpoint not in pre-built Docker image yet.
4	`just prod` — secure (with certs)	PASS	OSD returns 401	Expected	Infra-only mode. Needs app layer for proxy auth.
5	`just dev stop`	PASS	—	—	Clean PM2 + Docker shutdown.
6	`just prod stop`	PASS	—	—	Clean Docker shutdown.

End-to-End Workflow Analytics Test (Secure Mode)

Step	Result
Clerk auth via sign_in_tokens API	PASS
Import Subfinder workflow (3 nodes)	PASS
Run workflow with `hackerone.com`	PASS (9.9s)
Analytics Sink indexes 32 docs	PASS
OSD Discover shows 32 hits with filters	PASS
Sidebar "Dashboards" link opens OSD	PASS

Automated Checks

Check	Result
`bun typecheck`	PASS (0 errors)
`bun lint`	PASS
Unit tests	PASS
DCO (Signed-off-by)	PASS

Bug Found & Fixed During Testing

Bug	Root Cause	Fix
`just dev` crashes when `CLERK_SECRET_KEY` is commented out	`grep` returns exit code 1 under `set -euo pipefail`	Added `|| true` to the grep pipeline (justfile line 50)

PR #229 — Workflow Analytics Dashboards: File Journey Walkthrough

A reviewer's guide to how every file in this PR connects — told as the story of a user going from just dev to seeing security findings in OpenSearch Dashboards.

Stage 1: Developer Starts the App (`just dev`)

Auth Mode Auto-Detection

When a developer runs just dev, the justfile is the entry point. It reads backend/.env and checks whether CLERK_SECRET_KEY is set:

CLERK_KEY=$(grep -E '^CLERK_SECRET_KEY=' backend/.env | cut -d= -f2- ... || true)

Clerk key present → Secure mode: Clerk auth + OpenSearch Security plugin + TLS + multi-tenant isolation
Clerk key absent → Local auth mode: simpler startup, admin/password login, no multitenancy
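
The detection rule can also be expressed in TypeScript for readers who prefer it to shell. This is a sketch of the rule as described above, not code from the justfile. The function name `detectAuthMode` is hypothetical, and stripping surrounding quotes is an assumption about the elided part of the pipeline.

```typescript
type AuthMode = "secure" | "local";

// Mirrors the grep above: only an uncommented `CLERK_SECRET_KEY=` line counts.
// Quote stripping is an assumption about what the elided pipeline steps do.
function detectAuthMode(envFileContents: string): AuthMode {
  for (const line of envFileContents.split(/\r?\n/)) {
    const match = /^CLERK_SECRET_KEY=(.*)$/.exec(line);
    if (match === null) continue;
    const value = (match[1] ?? "").trim().replace(/^["']|["']$/g, "");
    if (value.length > 0) return "secure";
  }
  // No key, an empty key, or a commented-out key all fall back to local auth.
  return "local";
}
```

A commented-out key (`# CLERK_SECRET_KEY=...`) does not match the anchored pattern, which is the case that made the shell `grep` exit non-zero before the `|| true` fix.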

Files involved:

File	Role
`justfile`	Orchestrates everything — detects auth mode, composes Docker files, starts PM2
`pm2.config.cjs`	Defines backend/frontend/worker processes; passes `OPENSEARCH_SECURITY_ENABLED=true|false` from the shell env set by justfile
`backend/.env.example`	Documents all env vars (OpenSearch creds, Clerk keys, session secret)
`worker/.env.example`	Documents worker-specific vars (OpenSearch URL, internal service token)
`frontend/.env.example`	Documents `VITE_CLERK_PUBLISHABLE_KEY`, `VITE_OPENSEARCH_DASHBOARDS_URL`

Infrastructure Boots Up

The justfile composes Docker files depending on the mode:

Always: docker-compose.infra.yml (base services) + docker-compose.dev-ports.yml (expose ports for host-based PM2)
Secure mode adds: docker-compose.dev-secure.yml (TLS, security plugin, proxy auth)

Files involved:

File	Role
`docker/docker-compose.infra.yml`	Core services: PostgreSQL, Temporal, Redis, OpenSearch (security disabled), Dashboards (custom Dockerfile), nginx (port 80)
`docker/docker-compose.dev-ports.yml`	Exposes container ports to host so PM2-based backend/frontend/worker can reach them
`docker/docker-compose.dev-secure.yml`	Secure overlay — enables OpenSearch security plugin, mounts TLS certs, mounts security config, overrides Dashboards to use `opensearch-dashboards.prod.yml` (with proxy auth settings), swaps OpenSearch entrypoint to `docker-entrypoint-security.sh`
`docker/docker-compose.full.yml`	Production all-in-one: runs backend/frontend/worker as containers too (not used in dev, but shares the same nginx/OpenSearch patterns)
`docker/docker-compose.prod.yml`	Production overlay with TLS termination
`docker/certs/.gitignore`	Keeps generated TLS certs out of version control
`docker/scripts/generate-certs.sh`	Generates root CA + node + admin TLS certificates for OpenSearch

OpenSearch Security Bootstrap (Secure Mode Only)

When the security overlay is active, OpenSearch starts with a custom entrypoint that templates the proxy auth config and runs securityadmin.sh:

File	Role
`docker/opensearch-security/docker-entrypoint-security.sh`	First-boot bootstrap: replaces the `__INTERNAL_PROXIES__` placeholder in `config.yml` with the actual Docker network CIDR regex `(172|192|10)\.\d+\.\d+\.\d+`, then runs `securityadmin.sh` in the background (with a marker file to skip on restarts), and finally execs the real OpenSearch entrypoint
`docker/opensearch-security/config.yml`	Configures the security plugin: enables XFF (X-Forwarded-For) parsing, defines proxy auth domain (reads `x-proxy-user` and `x-proxy-roles` headers from nginx), and a basic auth fallback for admin API access
`docker/opensearch-security/roles.yml`	Defines RBAC roles: `admin`, `dashboards_readwrite`, and `customer_template_ro` (a template for per-tenant read-only roles that include `indices:data/write/bulk` — critical for Dashboards saved objects)
`docker/opensearch-security/roles_mapping.yml`	Maps roles to users/backend_roles: `admin` → admin user, `dashboards_readwrite` → dashboards_server
`docker/opensearch-security/internal_users.yml`	Defines built-in users: `admin` (full access) and `dashboards_server` (for Dashboards → OpenSearch communication)
`docker/opensearch-security/tenants.yml`	Declares the `global_tenant` (base); per-customer tenants are created dynamically at runtime
`docker/opensearch-security/action_groups.yml`	Custom action groups for fine-grained permissions
`docker/opensearch-security/audit.yml`	Audit logging config (disabled in dev for performance)
`docker/opensearch-security/allowlist.yml`	API allowlist config
`docker/opensearch-security/whitelist.yml`	Legacy whitelist (kept for compatibility)
`docker/opensearch-security/nodes_dn.yml`	Node distinguished names for TLS certificate validation
`docker/scripts/security-init.sh`	Standalone security init script (alternative to entrypoint-based init)
`docker/scripts/hash-password.sh`	Utility to hash passwords for `internal_users.yml`

OpenSearch Dashboards Custom Image

Dashboards uses a custom Dockerfile to remove plugins that could let SaaS tenants escape their sandbox:

File	Role
`docker/opensearch-dashboards.Dockerfile`	Removes 9 plugins: queryWorkbench, reports, anomalyDetection, customImportMap, securityAnalytics, searchRelevance, mlCommons, indexManagement, observability. Why not config? OSD 2.x plugins don't register an `enabled` config schema — setting `plugin.enabled: false` causes a fatal `Unknown configuration key` error. Must physically remove them.
`docker/opensearch-dashboards.yml`	Base Dashboards config: `server.basePath: "/analytics"`, `rewriteBasePath: true`, default route to Discover
`docker/opensearch-dashboards.prod.yml`	Secure Dashboards config: adds `requestHeadersAllowlist: ["securitytenant", "Authorization", "x-forwarded-for"]` (x-forwarded-for is critical for proxy auth), disables security UI (`opensearch_security.readonly_mode.roles: [customer_*]`)
`docker/opensearch-init.sh`	Post-boot initialization: In insecure mode, creates a global `security-findings-*` index pattern in Dashboards so Discover works immediately. In secure mode, skips this — index patterns are created per-tenant on first access.

Stage 2: User Logs In

Frontend Auth Provider Selection

When the browser loads http://localhost (via nginx), the frontend determines which auth mode to use:

File	Role
`frontend/src/auth/AuthProvider.tsx`	Auth mode selection: checks `VITE_AUTH_PROVIDER` env var → Clerk key availability → defaults. In dev mode defaults to `local` unless Clerk is explicitly configured. `LocalAuthProvider` stores admin credentials in Zustand, creates a `basic-{base64}` token, and sets `shipsec_session` cookie via the backend login endpoint. `ClerkAuthProvider` delegates to Clerk's session management.
`frontend/src/components/auth/AdminLoginForm.tsx`	The login form for local auth mode — username/password fields that POST to `/api/v1/auth/login`
`frontend/src/config/env.ts`	Exposes `VITE_OPENSEARCH_DASHBOARDS_URL` (controls whether the Dashboards sidebar link appears) and `VITE_AUTH_PROVIDER`

Backend Auth Validation

File	Role
`backend/src/auth/providers/clerk-auth.provider.ts`	Clerk auth: validates Clerk session tokens, extracts `organizationId` from Clerk's org membership. The org ID becomes the tenant key for all analytics scoping.
`backend/src/auth/providers/local-auth.provider.ts`	Local auth: validates Basic auth credentials against configured admin user, returns a synthetic `AuthContext` with `organizationId: 'local-dev'`
`backend/src/auth/session.utils.ts`	Session token management: `createSessionToken()` → HMAC-SHA256 signed `{username, ts}.signature` encoded as base64. `verifySessionToken()` → timing-safe comparison to prevent timing attacks, 7-day TTL.
`backend/src/app.controller.ts`	Login endpoint (`POST /auth/login`): validates credentials, calls `createSessionToken()`, sets HTTP-only `shipsec_session` cookie. Logout endpoint (`POST /auth/logout`): clears cookie.
`backend/src/main.ts`	Enables cookie-parser middleware (required for session cookie handling), configures CORS
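
To make the session token description concrete, here is a minimal sketch of an HMAC-SHA256 signed token with a timing-safe check and a 7-day TTL. It follows the shape described for `session.utils.ts` but is not that file's code: the exact payload encoding, the hex signature, and the extra `secret` and `now` parameters are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Token layout (assumed): base64("<json payload>.<hex signature>").
function createSessionToken(username: string, secret: string, now: number = Date.now()): string {
  const payload = JSON.stringify({ username, ts: now });
  return Buffer.from(`${payload}.${sign(payload, secret)}`, "utf8").toString("base64");
}

// Returns the username for a valid, unexpired token, otherwise null.
function verifySessionToken(token: string, secret: string, now: number = Date.now()): string | null {
  const decoded = Buffer.from(token, "base64").toString("utf8");
  const separator = decoded.lastIndexOf(".");
  if (separator < 0) return null;
  const payload = decoded.slice(0, separator);
  const given = Buffer.from(decoded.slice(separator + 1), "utf8");
  const expected = Buffer.from(sign(payload, secret), "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  try {
    const parsed: unknown = JSON.parse(payload);
    if (typeof parsed !== "object" || parsed === null) return null;
    const { username, ts } = parsed as { username?: unknown; ts?: unknown };
    if (typeof username !== "string" || typeof ts !== "number") return null;
    if (now - ts > SEVEN_DAYS_MS) return null; // expired
    return username;
  } catch {
    return null;
  }
}
```

The signature is checked before the payload is parsed, so unauthenticated input never reaches `JSON.parse` with any trust attached.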

Stage 3: User Sees the Dashboard Sidebar

Once authenticated, the user lands on the main app. The sidebar shows a "Dashboards" link if the env var is configured:

File	Role
`frontend/src/components/layout/AppLayout.tsx`	Sidebar: conditionally renders a "Dashboards" link (BarChart3 icon) pointing to `VITE_OPENSEARCH_DASHBOARDS_URL` (typically `/analytics/app/discover`). Marked as `external: true` so it opens in the same window but via the nginx proxy. Also adds an "Analytics Settings" link under the Settings section.
`frontend/src/components/layout/AppTopBar.tsx`	Top bar — shows org context
`frontend/src/components/layout/TopBar.tsx`	Workflow-level top bar — adds a "View Analytics" button that links to Dashboards filtered by the current workflow
`frontend/src/pages/AnalyticsSettingsPage.tsx`	Analytics Settings page: shows retention period configuration (30/90/180/365 days) based on subscription tier. Currently a UI scaffold — API integration planned for a future ticket.
`frontend/src/App.tsx`	Registers the `/settings/analytics` route pointing to `AnalyticsSettingsPage`

Stage 4: User Builds a Workflow with Analytics

The Analytics Sink Component

Users can add an "Analytics Sink" node to any workflow. It collects output from upstream scanner nodes and indexes it to OpenSearch.

File	Role
`worker/src/components/core/analytics-sink.ts`	The core analytics component (`core.analytics.sink`): accepts multiple configurable data inputs (each with a label and sourceTag), aggregates documents from all inputs, validates workflow context (orgId, workflowId, workflowName are required), then calls the indexer. Supports two modes: lenient (default, fire-and-forget, skips missing inputs) and strict (fails on any error).
`frontend/src/components/workflow/AnalyticsInputsEditor.tsx`	Config panel UI for the Analytics Sink: lets users add/remove/rename data inputs dynamically, auto-generates `sourceTag` from label names for filtering in Dashboards.
`worker/src/components/index.ts`	Registers `analytics-sink` in the component registry
`backend/src/dsl/validator.ts`	Validates workflow DSL — updated to allow the analytics sink's dynamic input ports
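
The Key Features list says the sink auto-detects asset correlation keys (host, domain, subdomain, url, ip, etc.). One plausible way to do that is a first-match scan over candidate fields. The sketch below assumes that approach; the function name `detectAssetKey` and the priority order are illustrative, not taken from `analytics-sink.ts`.

```typescript
// Candidate fields, checked in order. The order is an assumption for this sketch.
const ASSET_KEY_FIELDS = ["host", "domain", "subdomain", "url", "ip"] as const;

// Returns the first non-empty string found among the candidate fields.
function detectAssetKey(doc: Record<string, unknown>): string | undefined {
  for (const field of ASSET_KEY_FIELDS) {
    const value = doc[field];
    if (typeof value === "string" && value.trim().length > 0) {
      return value.trim();
    }
  }
  return undefined;
}
```

Documents with no recognisable field would simply be indexed without an asset key under this scheme.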

Component SDK Extensions

The component SDK was extended to support analytics:

File	Role
`packages/component-sdk/src/analytics.ts`	`analyticsResultSchema()` — Zod schema contract for indexed documents (scanner, finding_hash, severity, asset_key). `generateFindingHash()` — creates stable 16-char SHA-256 dedup keys from field values.
`packages/component-sdk/src/context.ts`	`ExecutionContext` now includes `workflowId`, `workflowName`, and `organizationId` — these are passed from the backend through Temporal to every component execution.
`packages/component-sdk/src/types.ts`	Updated `ExecutionContext` interface with the new fields
`packages/component-sdk/src/index.ts`	Re-exports the new analytics utilities
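
As an illustration of what a stable 16-character SHA-256 dedup key can look like, here is a sketch. It is not the SDK's `generateFindingHash()`: the name `findingHashSketch`, the lowercasing, and the `|` separator are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for generateFindingHash(); normalisation rules are assumed.
function findingHashSketch(...fields: Array<string | number | null | undefined>): string {
  // Normalise so that casing and stray whitespace do not change the key.
  const normalized = fields
    .map((field) => String(field ?? "").trim().toLowerCase())
    .join("|");
  // Keep the first 16 hex characters of the SHA-256 digest.
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}
```

For example, `findingHashSketch("nuclei", "CVE-2024-0001", "example.com")` yields the same key on every run, so repeated scans of the same asset can be deduplicated by `finding_hash`.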

Workflow Context Injection

For analytics to work, the backend must pass org/workflow identity through to the worker:

File	Role
`backend/src/workflows/workflows.service.ts`	When starting a workflow run, extracts `organizationId` from the authenticated user's context and passes it (along with `workflowId`, `workflowName`) to the Temporal workflow input. This is how the worker knows which org's index to write to.
`backend/src/workflows/workflows.controller.ts`	Passes `AuthContext` to the service layer so org ID is available

Stage 5: Workflow Runs → Data Gets Indexed

When a workflow runs, the Analytics Sink component calls the OpenSearch indexer:

File	Role
`worker/src/utils/opensearch-indexer.ts` (not in PR file list but referenced)	The indexing engine — singleton OpenSearch client in the worker. `bulkIndex()`: (1) ensures tenant is provisioned by calling `POST /api/v1/analytics/ensure-tenant` with the internal service token, (2) builds index name `security-findings-{orgId}-{YYYY.MM.DD}`, (3) enriches each document with `@timestamp` and `shipsec` metadata block (org_id, workflow_id, run_id, component_id, asset_key), (4) sends bulk request with 3x retry + exponential backoff, (5) reports partial failures.
`worker/package.json`	Adds `@opensearch-project/opensearch` dependency
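
Steps (2) and (3) of `bulkIndex()` can be sketched as two small pure functions. The index pattern and the metadata field names come from the description above; the function names, the use of UTC for the date, and the ISO timestamp format are assumptions.

```typescript
interface ShipsecMeta {
  org_id: string;
  workflow_id: string;
  run_id: string;
  component_id: string;
  asset_key?: string;
}

// Builds security-findings-{orgId}-{YYYY.MM.DD}. Using UTC is an assumption.
function dailyIndexName(orgId: string, date: Date): string {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `security-findings-${orgId}-${yyyy}.${mm}.${dd}`;
}

// Adds @timestamp and the shipsec metadata block without mutating the input.
function enrichDocument(
  doc: Record<string, unknown>,
  meta: ShipsecMeta,
  now: Date,
): Record<string, unknown> {
  return { ...doc, "@timestamp": now.toISOString(), shipsec: meta };
}
```

Keeping both steps pure makes them easy to unit test separately from the OpenSearch client and the retry logic.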

Backend Analytics API

The backend exposes endpoints for both the worker (internal) and the frontend (user-facing):

File	Role
`backend/src/analytics/analytics.module.ts`	NestJS module wiring: imports `OpenSearchModule`, provides `SecurityAnalyticsService`, `OpenSearchTenantService`, `OrganizationSettingsService`
`backend/src/analytics/analytics.controller.ts`	Endpoints: `POST /analytics/query` (user-facing, auto-scoped to org's index pattern, rate-limited 100 req/min), `GET /analytics/settings` + `PUT /analytics/settings` (retention config), `POST /analytics/ensure-tenant` (internal, validates `X-Internal-Token`, idempotent tenant provisioning)
`backend/src/analytics/security-analytics.service.ts`	`query()` — builds OpenSearch query scoped to `security-findings-{orgId}-*`, preventing cross-tenant data access at the application layer
`backend/src/analytics/dto/analytics-query.dto.ts`	Request/response DTOs for the query endpoint (supports OpenSearch DSL passthrough)
`backend/src/analytics/dto/analytics-settings.dto.ts`	DTOs for analytics settings (tier, retention days)
`backend/src/analytics/organization-settings.service.ts`	Manages per-org settings in PostgreSQL; triggers `ensureTenantExists()` on first access
`backend/src/app.module.ts`	Registers the `AnalyticsModule` in the app
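
The application-layer scoping described for `query()` comes down to one rule: the index pattern is derived from the authenticated org ID and never from the request body. A minimal sketch of that rule follows. The function name `scopeQueryToOrg` and the org ID character check are assumptions, not the service's code.

```typescript
interface ScopedSearch {
  index: string;
  body: Record<string, unknown>;
}

// `orgId` must come from the auth context, never from user-supplied DSL.
function scopeQueryToOrg(orgId: string, dsl: Record<string, unknown>): ScopedSearch {
  // Reject characters that could widen the pattern (wildcards, commas, whitespace).
  if (!/^[A-Za-z0-9_-]+$/.test(orgId)) {
    throw new Error(`invalid organization id: ${orgId}`);
  }
  return { index: `security-findings-${orgId}-*`, body: dsl };
}
```

With this shape, a caller passing arbitrary DSL can filter and aggregate freely but cannot name another tenant's indices.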

OpenSearch Configuration

File	Role
`backend/src/config/opensearch.config.ts`	NestJS config factory: reads `OPENSEARCH_URL`, `OPENSEARCH_USERNAME`, `OPENSEARCH_PASSWORD` from env
`backend/src/config/opensearch.client.ts`	Injectable OpenSearch client wrapper — initializes with auth if credentials provided, skips TLS verification in dev
`backend/src/config/opensearch.module.ts`	Global module that provides `OpenSearchClient` across the app
`backend/scripts/setup-opensearch.ts`	Standalone script to manually bootstrap OpenSearch (index templates, seed data) — useful for debugging

Database Schema

File	Role
`backend/src/database/schema/organization-settings.ts`	Drizzle schema for `organization_settings` table: org_id (PK), subscription_tier, retention_days, timestamps
`backend/src/database/schema/index.ts`	Re-exports the new schema
`backend/src/database/migration.guard.ts`	Updated to handle the new table

Stage 6: User Clicks "Dashboards" → nginx Auth Gateway

This is where it all comes together. When a user clicks the "Dashboards" link, the browser navigates to /analytics/app/discover. Here's the request flow:

Step 1: nginx intercepts `/analytics/*`

Browser → nginx (port 80) → /analytics/app/discover

nginx's auth_request directive fires an internal subrequest:

nginx → /_auth → backend /api/v1/auth/validate

Step 2: Backend validates the session

The backend reads the shipsec_session cookie (or Clerk token), verifies it, and returns org identity in response headers:

X-Auth-Organization-Id: acme-corp
X-Auth-User-Id: user-123

Step 3: nginx injects tenant isolation headers

nginx captures these headers and sets proxy auth headers before forwarding to Dashboards:

x-proxy-user: acme-corp
x-proxy-roles: customer_acme-corp_ro
securitytenant: acme-corp

Step 4: OpenSearch Security enforces isolation

The security plugin reads these headers (via proxy auth config), maps the role customer_acme-corp_ro to index pattern security-findings-acme-corp-*, and restricts all queries to that namespace.
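
The header mapping in Steps 3 and 4 is done in nginx config, but the logic is simple enough to state as a function. The sketch below models it in TypeScript purely for illustration. The function name is hypothetical, and returning `null` stands in for the fail-closed 403 that the production config is described as using when the org ID is empty.

```typescript
// Models what nginx does with X-Auth-Organization-Id; not actual nginx config.
function proxyAuthHeaders(orgId: string): Record<string, string> | null {
  // Fail closed: no org identity means no access to Dashboards.
  if (orgId === "") return null;
  return {
    "x-proxy-user": orgId,
    "x-proxy-roles": `customer_${orgId}_ro`,
    securitytenant: orgId,
  };
}
```

For `acme-corp` this produces exactly the three headers shown in Step 3.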

Files involved:

File	Role
`docker/nginx/nginx.dev.conf`	Dev routing: `/_auth` internal location proxies to backend `/api/v1/auth/validate`. `/analytics/` location uses `auth_request /_auth`, captures `$auth_org_id` from response headers, sets `x-proxy-user`, `x-proxy-roles`, `securitytenant` headers, proxies to `opensearch-dashboards:5601`. Critical gotcha: `proxy_set_header` in a location block OVERRIDES all parent-level headers — must repeat `Host`, `X-Forwarded-For` etc.
`docker/nginx/nginx.full.conf`	Production routing: same auth_request pattern, fail-closed (`if ($auth_org_id = "") { return 403; }`), upstreams point to container names instead of `host.docker.internal`
`docker/nginx/nginx.prod.conf`	Standalone production routing with TLS termination
`backend/src/app.controller.ts`	`/auth/validate` endpoint: validates session, sets `X-Auth-Organization-Id` header, triggers fire-and-forget tenant provisioning via `Map<string, Promise<boolean>>` (concurrent requests share the same in-flight promise; failed provisioning is removed from cache to allow retry)
`backend/src/analytics/opensearch-tenant.service.ts`	`ensureTenantExists()` — the 6-step provisioning sequence: (1) create OpenSearch Security tenant, (2) create customer read-only role with `indices:data/write/bulk` for saved objects, (3) create role mapping, (4) create index template with field mappings, (5) create seed index (so Dashboards can resolve fields before real data arrives), (6) create index pattern in Dashboards API. All steps are idempotent with 3x retry + exponential backoff.
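
The `Map<string, Promise<boolean>>` pattern mentioned for `/auth/validate` can be sketched as follows. The class name `TenantProvisioner` and the injected `provision` callback are hypothetical. The sketch shows only the sharing and evict-on-failure behaviour, not the six provisioning steps.

```typescript
class TenantProvisioner {
  private readonly inFlight = new Map<string, Promise<boolean>>();
  private readonly provision: (orgId: string) => Promise<boolean>;

  constructor(provision: (orgId: string) => Promise<boolean>) {
    this.provision = provision;
  }

  // Concurrent callers for the same org share one promise.
  ensure(orgId: string): Promise<boolean> {
    const existing = this.inFlight.get(orgId);
    if (existing !== undefined) return existing;

    const pending = this.provision(orgId).then(
      (ok) => {
        // A failed run is evicted so the next request can retry.
        if (!ok) this.inFlight.delete(orgId);
        return ok;
      },
      () => {
        this.inFlight.delete(orgId);
        return false;
      },
    );
    this.inFlight.set(orgId, pending);
    return pending;
  }
}
```

A fire-and-forget caller would invoke `ensure(orgId)` without awaiting it, so the auth response is not delayed by provisioning.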

Stage 7: Workflow Status & Execution Tracking

File	Role
`packages/shared/src/execution.ts`	Adds `STALE` workflow status for orphaned run records (DB says running but Temporal has no matching workflow — detected during status sync)
`frontend/src/store/runStore.ts`	Handles the new `STALE` status in the run store
`frontend/src/utils/statusBadgeStyles.ts`	Adds badge styling for `STALE` status (grey/warning appearance)
`frontend/src/features/workflow-builder/WorkflowBuilder.tsx`	Workflow builder updates to support analytics sink node configuration
`frontend/src/features/workflow-builder/hooks/useWorkflowImportExport.ts`	Import/export handles the new analytics sink component
`frontend/src/vite.config.ts`	Adds proxy rules for `/api` and `/analytics` in dev mode so the Vite dev server forwards correctly
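
The `STALE` rule (the DB says running but Temporal has no matching workflow) reduces to a small reconciliation check. The sketch below assumes that reading. The status names other than `STALE` and the function name are illustrative, and the real status sync also infers status from trace events, which is omitted here.

```typescript
// Only STALE is taken from the PR; the other status names are assumed.
type RunStatus = "RUNNING" | "COMPLETED" | "FAILED" | "STALE";

function reconcileStatus(dbStatus: RunStatus, temporalHasWorkflow: boolean): RunStatus {
  // An orphaned record: the DB believes the run is active but Temporal has no trace of it.
  if (dbStatus === "RUNNING" && !temporalHasWorkflow) return "STALE";
  return dbStatus;
}
```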

Stage 8: Testing & Documentation

E2E Test

File	Role
`e2e-tests/analytics.test.ts`	Full end-to-end test: authenticates via Clerk `sign_in_tokens` API → imports a Subfinder workflow with Analytics Sink → runs it against `hackerone.com` → waits for 32 docs to appear in OpenSearch → verifies `shipsec` metadata fields → checks Dashboards index pattern exists

Documentation

File	Role
`docs/analytics.md`	Architecture overview: multi-tenant model, index naming, provisioning flow, troubleshooting
`docs/development/workflow-analytics.mdx`	Developer guide: how to add Analytics Sink to workflows, index patterns, querying
`docs/development/analytics.mdx`	Analytics development reference
`docs/development/component-development.mdx`	Updated with analytics SDK utilities
`docs/components/core.mdx`	Component catalog — documents `core.analytics.sink`
`docs/installation.mdx`	Updated install docs with `just dev` modes
`docs/workflows/execution-status.md`	Documents the new `STALE` status
`docs/docs.json`	Docs navigation — adds analytics section
`docs/media/clerk-user-local-org.png`	Screenshot: Clerk user with local org
`docs/media/clerk-user-test-org.png`	Screenshot: Clerk user with test org
`docs/media/opensearch-tenant-org-id.png`	Screenshot: OpenSearch tenant using org ID
`docs/media/opensearch-tenant-workspace-fallback.png`	Screenshot: workspace fallback
`docker/README.md`	Docker setup documentation
`docker/PRODUCTION.md`	Production deployment guide
`docker/SECURE-DEV-MODE.md`	Secure dev mode setup guide
`.ai/analytics-output-port-design.md`	Design doc: how the analytics output port pattern was designed

Security Component Test Fixes

File	Role
`worker/src/components/security/__tests__/dnsx.test.ts`	Test fixture updated for new component SDK context fields
`worker/src/components/security/__tests__/httpx.test.ts`	Same — test fixtures updated

Other

File	Role
`Dockerfile`	Updated for production build — includes OpenSearch client dependency
`backend/package.json`	Adds `@opensearch-project/opensearch`, `cookie-parser`
`bun.lock`	Lockfile updated with new dependencies

Architecture Summary

┌─────────────────────────────────────────────────────────────────┐
│                        just dev                                  │
│  Detects CLERK_SECRET_KEY → secure mode or local auth            │
│  Composes: infra.yml [+ dev-secure.yml] + dev-ports.yml          │
│  Starts PM2: frontend, backend, worker                           │
└──────────┬──────────────────────────────────────────────────────┘
           │
           ▼
┌──────────────────────┐     ┌──────────────────────────────────┐
│   nginx (port 80)    │     │     Frontend (Vite, port 5173)   │
│                      │     │                                  │
│  / → frontend        │◄────│  AuthProvider: Clerk or local    │
│  /api → backend      │     │  Sidebar: "Dashboards" link      │
│  /analytics → OSD    │     │  AnalyticsInputsEditor           │
│    ↓ auth_request    │     │  AnalyticsSettingsPage            │
│    → /_auth          │     └──────────────────────────────────┘
│    → backend/validate│
│    → set proxy hdrs  │     ┌──────────────────────────────────┐
│    → forward to OSD  │     │     Backend (NestJS, port 3211)  │
└──────────────────────┘     │                                  │
                              │  /auth/login, /auth/validate     │
                              │  /analytics/query (org-scoped)   │
                              │  /analytics/ensure-tenant        │
                              │  OpenSearchTenantService          │
                              │    → 6-step provisioning          │
                              └──────────────────────────────────┘

┌──────────────────────┐     ┌──────────────────────────────────┐
│  OpenSearch + OSD    │     │   Worker (Temporal)               │
│                      │     │                                  │
│  Security plugin:    │◄────│  analytics-sink component         │
│    proxy auth        │     │    → aggregates scanner data     │
│    per-tenant roles  │     │    → opensearch-indexer           │
│    index isolation   │     │      → bulkIndex() with retry    │
│                      │     │      → document enrichment       │
│  Index pattern:      │     │        (@timestamp, shipsec.*)   │
│  security-findings-  │     │      → tenant provisioning       │
│    {orgId}-{date}    │     │        (1-hour cache)            │
└──────────────────────┘     └──────────────────────────────────┘

Security Model (Defense in Depth)

Layer	Mechanism	Files
Image level	Remove dangerous Dashboards plugins	`opensearch-dashboards.Dockerfile`
Proxy level	nginx auth_request + tenant header injection	`nginx.dev.conf`, `nginx.full.conf`
Auth level	Session token verification (HMAC-SHA256)	`session.utils.ts`, `app.controller.ts`
Data level	OpenSearch Security: per-tenant roles, index isolation	`roles.yml`, `config.yml`, `opensearch-tenant.service.ts`
Application level	Query endpoint scopes to org's index pattern	`security-analytics.service.ts`

Key Design Decisions

Fire-and-forget provisioning — Tenant setup happens async after auth validation returns 200. Uses Map<string, Promise<boolean>> so concurrent requests share the same in-flight promise. Failed provisioning is removed from cache to allow retry.
Seed indices — Index patterns in Dashboards need at least one backing index to resolve field types. A seed index with explicit mappings is created during provisioning so @timestamp column is available before any real data arrives.
indices:data/write/bulk at cluster level — The cluster_composite_ops_ro action group does NOT include bulk write. Without explicit indices:data/write/bulk in cluster_permissions, the multitenancy plugin's kibana_all_write index-level grant is never reached, causing 403 on all .kibana_* saves (column preferences, default index pattern, etc.).
Plugin removal via Dockerfile — OSD 2.x plugins that don't register an enabled config schema cause fatal errors when you try pluginId.enabled: false. The only safe path is physical removal at the Docker image level.
nginx header inheritance — A proxy_set_header in any location block OVERRIDES ALL parent-level proxy_set_header directives. The /analytics/ block must repeat standard headers (Host, X-Forwarded-For) alongside the custom proxy auth headers.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42044b8c24

worker/src/utils/opensearch-indexer.ts

betterclever · 2026-01-29T14:00:15Z

Question: Scope of User Stories

Looking at tasks/prd.json, I see all user stories (US-001 through US-015) are marked with "passes": true, including:

US-012: Analytics settings page UI ✅
US-013: Retention settings API endpoints ✅

However, frontend/src/pages/AnalyticsSettingsPage.tsx still contains:

Mock data for current tier and retention (line 42)
TODO comments referencing US-013 for API integration (lines 58, 70)

Question: Are all 15 user stories expected to be completed in this PR, or is US-013 (the backend API) intentionally deferred to a later PR? If it's expected to be complete, the frontend may need to be wired up to the backend endpoints that were reportedly implemented.

LuD1161 · 2026-01-29T19:57:48Z

This was a relic of the hottest trend in AI, ralph. It has since been removed :)

- Add nginx reverse proxy for unified entry point at http://localhost
- Routes: / (frontend), /api (backend), /analytics (OpenSearch Dashboards)
- Configure OpenSearch Dashboards with /analytics base path
- Add production deployment with TLS and security plugin
- SaaS multitenancy with per-customer tenant isolation
- Certificate generation script (just generate-certs)
- New commands: just dev, just prod-secure

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

- Add STALE status for orphaned run records (DB/Temporal mismatch)
- Improve status inference from trace events when Temporal not found
- Use correct TraceEventType values for status detection
- Add amber badge color for STALE status
- Extract WorkflowNode into modular directory structure
- Document all execution statuses with transition diagram

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

…gration

Analytics Sink Component (core.analytics.sink):
- Index output data from any upstream node to OpenSearch
- Auto-detect asset correlation keys (host, domain, url, ip, etc.)
- Fire-and-forget with retry logic (3 attempts, exponential backoff)
- Configurable index suffix and fail-on-error modes
OpenSearch Integration:
- Daily index rotation: security-findings-{orgId}-{YYYY.MM.DD}
- Index template with standard metadata fields
- Multi-tenant data isolation per organization
Analytics API:
- POST /api/v1/analytics/query with OpenSearch DSL support
- Auto-scope queries to organization's index pattern
- Rate limiting: 100 req/min per user
- Protected routes require authentication
- Session cookie support for analytics route auth
UI Integration:
- Analytics Settings page with tier-based retention
- Dashboards link in sidebar (opens in new tab)
- View Analytics button uses Discover app with proper URL state
- Uses .keyword fields for exact match filtering
Component SDK Extensions:
- generateFindingHash() for deduplication
- Workflow context (workflowId, workflowName, organizationId)
- Results output port on nuclei, trufflehog, supabase-scanner
- Support for optional inputs in components
Bug fixes:
- Fix webhook URLs to include global API prefix (ENG-115)
- Add proper connectionType for list variable types
- Handle invalid_value errors for placeholder fields

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

…ovisioning Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

Document the OpenSearch tenant identity resolution flow, Clerk active org session vs membership distinction, tenant provisioning details, and security guarantees. Add troubleshooting entry for workspace-user fallback with screenshots and diagnostic commands.

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

…objects

Two-layer SaaS lockdown for OpenSearch Dashboards:

1. nginx whitelist: PCRE negative lookahead blocks non-whitelisted /analytics/app/* routes (returns 403). Allowed: Discover, Visualize, Dashboards, Alerting, Dev Tools, Data Explorer, Home. Blocked: ISM, Security, Management, Anomaly Detection, Maps, etc. Admin retains full access via direct Dashboards port (5601).
2. Role permissions: Replace ISM cluster permissions with Alerting permissions (monitor CRUD, alerts, destinations) for tenant roles. Add indices:data/write/bulk cluster permission required for Dashboards saved objects (visualizations, dashboards, saved searches). Without this, multitenancy's kibana_all_write grant is never reached.
3. Default landing page set to Discover instead of Home (which exposes all plugin links including blocked ones).

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
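The negative-lookahead allowlist in item 1 can be illustrated with an equivalent JavaScript regular expression. This is a sketch, not the nginx rule from the PR: the app route IDs in `ALLOWED_APPS` are assumptions (the commit message names the apps by display name only), and `isBlockedAppRoute` is a hypothetical helper.

```typescript
// Assumed route IDs for the allowed apps; the PR lists display names only.
const ALLOWED_APPS: string[] = [
  "discover",
  "visualize",
  "dashboards",
  "alerting",
  "dev_tools",
  "data-explorer",
  "home",
];

// Matches /analytics/app/<id> only when <id> is NOT an allowed app.
// The trailing (?:[/?#]|$) keeps "dashboards-foo" from passing as "dashboards".
const BLOCKED_APP_ROUTE = new RegExp(
  "^/analytics/app/(?!(?:" + ALLOWED_APPS.join("|") + ")(?:[/?#]|$))",
);

// True when the proxy should answer 403 for this path.
function isBlockedAppRoute(path: string): boolean {
  return BLOCKED_APP_ROUTE.test(path);
}
```

Paths outside `/analytics/app/` never match the pattern, so static assets and API calls under `/analytics` are unaffected by this rule and have to be covered by the role permissions in item 2.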

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

…e in prod

Base compose configs (infra.yml, full.yml) now use `expose` instead of `ports` for all internal services. Dev-ports overlay binds everything to 127.0.0.1. Only nginx port 80 remains publicly accessible.

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

…ermission Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

- Merge nginx.full.conf into nginx.prod.conf (95% identical, prod has better proxy_redirect)
- Consolidate DB init scripts: merge temporal DB creation into 01-create-instance-databases.sh
- Remove orphaned scripts: dev-instance-manager.sh, instance-bootstrap.sh (unreferenced)
- Remove deprecated opensearch-security/whitelist.yml (superseded by allowlist.yml)
- Update docker-compose.full.yml and docs to reference nginx.prod.conf

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

The AnalyticsModule's controller and services depend on ConfigService and OpenSearchClient, which aren't available in the MCP test module. Use overrideModule to replace the entire AnalyticsModule with mocks. Also add an explicit ConfigModule import to AnalyticsModule.

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

- Fix PM2 --only filter to use instance-suffixed names (shipsec-backend-0)
- Fix Kafka broker port from 19092 to 9092 (matches single-listener Redpanda)
- Add whitelist.yml required by securityadmin.sh alongside allowlist.yml

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

Keep deletion of scripts/dev-instance-manager.sh from the feature branch. The script was removed as orphaned in 283d37a, so main's bash-compat fix (f100991) is no longer needed.

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

chatgpt-codex-connector bot reviewed Jan 22, 2026

worker/src/utils/opensearch-indexer.ts (resolved)

worker/src/utils/opensearch-indexer.ts (outdated, resolved)

LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 12 times, most recently from 0284482 to 8c83d0b Compare January 23, 2026 02:39

LuD1161 requested a review from betterclever January 23, 2026 02:44

LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 9 times, most recently from 7afae76 to bd71e89 Compare January 27, 2026 04:49

LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from 30b1504 to 5d92c8d Compare January 29, 2026 19:41

LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch 2 times, most recently from b8d9c3a to bd98d61 Compare January 30, 2026 04:59

LuD1161 mentioned this pull request Feb 5, 2026

test(security): analytics multi-tenant data isolation e2e tests #266

Closed

5 tasks

LuD1161 added 9 commits February 6, 2026 14:51

feat(analytics): add multi-tenant OpenSearch Security with dynamic pr…

82ea105

…ovisioning Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

refactor(dev): unify justfile commands and harden Dashboards lockdown

3378d22

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

fix(analytics): harden tenant security roles and restore bulk write p…

875231f

…ermission Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

LuD1161 added the enhancement, core platform, and security engineering labels Feb 6, 2026

LuD1161 force-pushed the eng-42/workflow-analytics-dashboards branch from 000fe03 to 2918958 Compare February 6, 2026 21:06

LuD1161 added 3 commits February 6, 2026 17:16

refactor(ui): move Analytics Settings under Manage sidebar section

ff8e459

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>

LuD1161 wants to merge 14 commits into main from eng-42/workflow-analytics-dashboards

betterclever commented Jan 29, 2026

LuD1161 commented Jan 29, 2026

Question: Scope of User Stories

Summary

Key Features

Commands

Test Results

Justfile + OSD Verification Matrix

End-to-End Workflow Analytics Test (Secure Mode)

Automated Checks

Bug Found & Fixed During Testing

Screenshots

PR #229 — Workflow Analytics Dashboards: File Journey Walkthrough

Stage 1: Developer Starts the App (just dev)

Auth Mode Auto-Detection

Infrastructure Boots Up

OpenSearch Security Bootstrap (Secure Mode Only)

OpenSearch Dashboards Custom Image

Stage 2: User Logs In

Frontend Auth Provider Selection

Backend Auth Validation

Stage 3: User Sees the Dashboard Sidebar

Stage 4: User Builds a Workflow with Analytics

The Analytics Sink Component

Component SDK Extensions

Workflow Context Injection

Stage 5: Workflow Runs → Data Gets Indexed

Backend Analytics API

OpenSearch Configuration

Database Schema

Stage 6: User Clicks "Dashboards" → nginx Auth Gateway

Step 1: nginx intercepts /analytics/*

Step 2: Backend validates the session

Step 3: nginx injects tenant isolation headers

Step 4: OpenSearch Security enforces isolation

Stage 7: Workflow Status & Execution Tracking

Stage 8: Testing & Documentation

E2E Test

Documentation

Security Component Test Fixes

Other

Architecture Summary

Security Model (Defense in Depth)

Key Design Decisions
