test(security): analytics multi-tenant data isolation e2e tests #266
Fixed the ui-ux changes.
- Replace embedded shell scripts with clean shell wrapper pattern
- Add buildAmassArgs() and buildSubfinderArgs() TypeScript functions
- Use IsolatedContainerVolume for secure file I/O in both components
- Add -silent flag to amass to prevent progress bar spam
- Add passive mode parameter to amass (default: true for quick scans)
- Add new parameters to subfinder: threads, timeout, rateLimit, etc.
- Mount provider config as file instead of base64 env var in subfinder
- Move output parsing from shell to TypeScript for both components
- Update subfinder image to v2.12.0
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
- Add default 15-minute timeout to prevent runaway scans
- Add configurable DNS resolvers (Cloudflare, Google, Quad9 defaults)
- Add configurable data sources, default to lightweight sources only
- Exclude wayback/commoncrawl by default (can download 1GB+ per domain)
- Disable recursive brute force by default for faster scans
- Fix -src flag to -include (correct amass v5 syntax)
These optimizations prevent system overload from excessive network I/O while maintaining useful subdomain enumeration capabilities.
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
Security tools like amass and subfinder can exit non-zero when some data sources fail or rate-limit, but still produce valid partial results. Previously, this would throw ContainerError and lose all output.
Changes:
- Include stdout in ContainerError details (runner.ts)
- Catch ContainerError in amass/subfinder and extract partial output
- Log warning when preserving partial results
This restores the prior behavior where partial results were returned instead of failing the entire workflow.
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
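The partial-result handling described in this commit could look roughly like the sketch below. The `ContainerError` shape (an `exitCode` and optional `stdout` inside `details`) and the helper names `runPreservingPartialOutput` and `parseSubdomains` are assumptions made for illustration; the commit only states that stdout is included in the error details and that amass/subfinder catch the error.

```typescript
// Hypothetical error shape: field names are assumptions, not the real runner.ts API.
class ContainerError extends Error {
  readonly details: { exitCode: number; stdout?: string };

  constructor(message: string, details: { exitCode: number; stdout?: string }) {
    super(message);
    this.name = "ContainerError";
    this.details = details;
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Runs a scan; if the tool exited non-zero but still wrote output,
// return that output and log a warning instead of failing.
async function runPreservingPartialOutput(
  run: () => Promise<string>,
  warn: (msg: string) => void = console.warn,
): Promise<string> {
  try {
    return await run();
  } catch (err) {
    if (err instanceof ContainerError) {
      const partial = err.details.stdout;
      if (partial !== undefined && partial.trim().length > 0) {
        warn(`tool exited with code ${err.details.exitCode}; preserving partial results`);
        return partial;
      }
    }
    throw err; // no usable output: surface the original failure
  }
}

// Output parsing done in TypeScript: one hostname per line, de-duplicated.
function parseSubdomains(raw: string): string[] {
  const lines = raw
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return [...new Set(lines)];
}
```

Rethrowing when stdout is empty keeps genuine failures visible, so only runs that actually produced data are treated as partial successes.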
…s-pattern refactor(worker): migrate amass/subfinder to Dynamic Args Pattern with perf optimizations
- Add nginx reverse proxy for unified entry point at http://localhost
- Routes: / (frontend), /api (backend), /analytics (OpenSearch Dashboards)
- Configure OpenSearch Dashboards with /analytics base path
- Add production deployment with TLS and security plugin
- SaaS multitenancy with per-customer tenant isolation
- Certificate generation script (just generate-certs)
- New commands: just dev, just prod-secure
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
- Add STALE status for orphaned run records (DB/Temporal mismatch)
- Improve status inference from trace events when Temporal not found
- Use correct TraceEventType values for status detection
- Add amber badge color for STALE status
- Extract WorkflowNode into modular directory structure
- Document all execution statuses with transition diagram
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
…gration
Analytics Sink Component (core.analytics.sink):
- Index output data from any upstream node to OpenSearch
- Auto-detect asset correlation keys (host, domain, url, ip, etc.)
- Fire-and-forget with retry logic (3 attempts, exponential backoff)
- Configurable index suffix and fail-on-error modes
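The retry behaviour listed above (3 attempts, exponential backoff, optional fail-on-error) can be sketched as follows. This is a minimal illustration, not the component's code: the names `withRetry`, `indexWithPolicy`, and `RetryOptions`, and the 1000 ms base delay, are assumptions.

```typescript
// Hypothetical options; only "3 attempts" and "exponential backoff" come from the commit message.
interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number; // assumed default of 1000 ms
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries op, waiting baseDelayMs, 2*baseDelayMs, 4*baseDelayMs, ... between attempts.
async function withRetry<T>(op: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const maxAttempts = options.maxAttempts ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 1000;
  const sleep = options.sleep ?? defaultSleep;

  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}

// Fail-on-error mode: either propagate the final error or swallow it and report false.
async function indexWithPolicy(
  op: () => Promise<void>,
  failOnError: boolean,
  options: RetryOptions = {},
): Promise<boolean> {
  try {
    await withRetry(op, options);
    return true;
  } catch (err) {
    if (failOnError) throw err;
    console.warn("analytics indexing failed after retries; continuing workflow");
    return false;
  }
}
```

With `failOnError` set to false, an indexing outage degrades analytics without failing the workflow, which matches the fire-and-forget intent.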
OpenSearch Integration:
- Daily index rotation: security-findings-{orgId}-{YYYY.MM.DD}
- Index template with standard metadata fields
- Multi-tenant data isolation per organization
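As a rough illustration of the naming scheme above, the daily index name and the per-organization wildcard pattern might be built like this. The function names and the use of UTC dates are assumptions; the commit only specifies the `security-findings-{orgId}-{YYYY.MM.DD}` format.

```typescript
// Builds security-findings-{orgId}-{YYYY.MM.DD}. Using UTC for the date is an assumption.
function dailyIndexName(orgId: string, date: Date = new Date()): string {
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `security-findings-${orgId}-${yyyy}.${mm}.${dd}`;
}

// Wildcard pattern covering every daily index of one organization.
function orgIndexPattern(orgId: string): string {
  return `security-findings-${orgId}-*`;
}

// Example: dailyIndexName("org1", new Date(Date.UTC(2025, 0, 5)))
// yields "security-findings-org1-2025.01.05".
```

One design point worth checking under this scheme: if organization IDs may contain hyphens, the pattern for `acme` would also match indices belonging to an organization named `acme-2`, so ID format constraints matter for isolation.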
Analytics API:
- POST /api/v1/analytics/query with OpenSearch DSL support
- Auto-scope queries to organization's index pattern
- Rate limiting: 100 req/min per user
- Protected routes require authentication
- Session cookie support for analytics route auth
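The 100 requests per minute per user limit could be enforced in several ways; the commit does not say which. The sketch below assumes a simple in-memory fixed-window counter, with the class name `FixedWindowRateLimiter` invented for illustration.

```typescript
// In-memory fixed-window limiter. The algorithm and storage are assumptions;
// only the "100 req/min per user" figure comes from the commit message.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit = 100, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the user is over the limit.
  allow(userId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(userId);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) {
      return false;
    }
    current.count += 1;
    return true;
  }
}
```

A fixed window allows short bursts at window boundaries; a sliding window or token bucket would smooth that at the cost of more state per user.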
UI Integration:
- Analytics Settings page with tier-based retention
- Dashboards link in sidebar (opens in new tab)
- View Analytics button uses Discover app with proper URL state
- Uses .keyword fields for exact match filtering
Component SDK Extensions:
- generateFindingHash() for deduplication
- Workflow context (workflowId, workflowName, organizationId)
- Results output port on nuclei, trufflehog, supabase-scanner
- Support for optional inputs in components
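The signature of `generateFindingHash()` is not shown in this PR, so the following is only a sketch of how a deduplication hash of this kind might work: normalize the identifying fields, then hash them. The function name `findingHashSketch`, the choice of SHA-256, and the normalization rules are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for a finding hash: not the SDK's generateFindingHash().
// Normalizes each part (trim + lowercase) and hashes the JSON-encoded list so that
// ("ab", "c") and ("a", "bc") cannot collide through simple concatenation.
function findingHashSketch(...parts: Array<string | number | null | undefined>): string {
  const normalized = parts.map((part) =>
    part === null || part === undefined ? "" : String(part).trim().toLowerCase(),
  );
  return createHash("sha256").update(JSON.stringify(normalized)).digest("hex");
}

// The same finding reported twice with cosmetic differences maps to one hash.
const first = findingHashSketch("nuclei", "CVE-2021-44228", "Example.com ");
const second = findingHashSketch("nuclei", "cve-2021-44228", "example.com");
console.log(first === second);
```

Which fields identify a finding (scanner, rule ID, asset, and so on) is a per-component decision; the inputs used here are placeholders.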
Bug fixes:
- Fix webhook URLs to include global API prefix (ENG-115)
- Add proper connectionType for list variable types
- Handle invalid_value errors for placeholder fields
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
…uto-refresh
- Add dynamic inputs editor with auto-populated source tags from workflow
- Add results port to all security components for analytics output
- Fix Data Explorer URL format to preserve time filter
- Hide View Analytics button during running workflows
- Auto-refresh OpenSearch index patterns after bulk indexing
- Add OPENSEARCH_DASHBOARDS_URL env var for worker configuration
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
Keep analytics support (generateFindingHash, analyticsResultSchema, results) in amass.ts and subfinder.ts from the workflow-analytics-dashboards branch.
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
…fixture
Auto-fix ESLint/Prettier formatting issues in security components and add required allowAny metadata to test analytics fixture.
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
- Add 'analytics-inputs' to ComponentParameterType union
- Fix analytics-fixture to use no-parameters overload
- Add type assertions for OpenSearch indexer API responses
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
- Cast defineComponent to any to bypass strict overload matching
- Add explicit type annotations to execute function parameters
- Import ExecutionContext and ExecutionPayload types for type safety
- Update subfinder test to verify analytics results generation
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
Add comprehensive security tests verifying that one organization cannot access another organization's analytics data through the dashboard API.
Tests cover:
- Cross-org data isolation (6 tests)
- Aggregation isolation across tenants
- Authentication/authorization enforcement
- Settings isolation between orgs
- Query input validation
- Non-existent org handling
- Direct OpenSearch index pattern verification
Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b21d20a294
private serializeNestedFields(document: Record<string, any>): Record<string, any> {
  // Pass through as-is - let OpenSearch handle dynamic mapping
  return { ...document };
}
Actually serialize nested fields before indexing
The serializeNestedFields helper is documented as preventing mapping explosions by JSON-stringifying nested objects/arrays, but it currently just returns a shallow copy of the document. That means any nested findings (e.g., scanner outputs with deeply nested metadata) will still be indexed as full object trees, which is exactly the scenario that triggers OpenSearch’s field-limit failures you describe in the design doc. In those cases, bulk indexing will error and analytics data is silently dropped. This should walk the document and stringify non-primitive values (or otherwise enforce the intended flattening) before indexing.
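One way to implement what this review comment asks for is sketched below: keep top-level primitives as-is and JSON-stringify anything nested, so OpenSearch maps each top-level key to a single field. This is an illustration under assumptions, not the project's fix; it is written as a standalone function rather than the private method, and the handling of `Date`, `undefined`, and non-JSON values is a choice made for the example.

```typescript
type FlatValue = string | number | boolean | null;

// Flattens a document for indexing: nested objects and arrays become JSON strings,
// so dynamic mapping cannot create one field per nested key.
function serializeNestedFields(document: Record<string, unknown>): Record<string, FlatValue> {
  const flat: Record<string, FlatValue> = {};
  for (const [key, value] of Object.entries(document)) {
    if (value === undefined) {
      continue; // omit missing values entirely
    }
    if (
      value === null ||
      typeof value === "string" ||
      typeof value === "number" ||
      typeof value === "boolean"
    ) {
      flat[key] = value;
    } else if (value instanceof Date) {
      flat[key] = value.toISOString(); // keep dates readable as timestamps
    } else if (typeof value === "object") {
      flat[key] = JSON.stringify(value); // nested object or array
    } else {
      flat[key] = String(value); // bigint, symbol, function: not JSON-serializable
    }
  }
  return flat;
}

// Example: a scanner finding with deeply nested metadata.
const flattened = serializeNestedFields({
  host: "example.com",
  severity: 3,
  metadata: { request: { headers: { accept: "*/*" } } },
  tags: ["cve", "rce"],
});
console.log(flattened.metadata); // {"request":{"headers":{"accept":"*/*"}}}
```

The trade-off is that stringified fields are no longer individually queryable, so any nested keys needed for filtering would have to be promoted to top-level fields before this step.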
Summary
Adds comprehensive e2e security tests to verify multi-tenant data isolation in the analytics dashboard (follow-up to PR #229). These tests ensure one organization cannot access another organization's analytics data through the API.
Test categories (7 suites, 19 tests):
How it works: Seeds test data directly into OpenSearch under two separate org index patterns (security-findings-{orgId}-*), then verifies the API layer correctly scopes all queries, aggregations, and settings to the authenticated organization. Cleans up test indices after completion.
Testing
- bun run test
- bun run lint
- bun run typecheck
- RUN_E2E=true and OpenSearch + backend services available
Documentation