feat(router): add Code Mode MCP server #2825
Adds a second MCP server next to the existing per-operation one, exposing two generic tools instead of one-tool-per-operation:

- code_mode_search_tools: takes prompts, generates GraphQL ops, registers them in the session catalog, returns their TS signatures.
- code_mode_run_js: runs an async arrow against the session catalog inside a JS sandbox (V8 isolate), with tools.<name>(vars) bound to the registered ops.

Includes:

- router/internal/codemode: harness, sandbox, server, storage, tsgen, yoko (query-gen client), calltrace, observability.
- proto/wg/cosmo/code_mode/yoko/v1: Yoko query-gen Connect API.
- router/pkg/config: code_mode config block (auth, sandbox limits, query-generation endpoint, named-ops, mutation approval).
- demo/code-mode: local federation + Yoko mock + start scripts + MCP client config snippets (Claude Code, Claude Desktop, Codex).
- demo/code-mode-connect: alternate demo against an external yoko Connect supergraph (set YOKO_DIR).
- router-tests: end-to-end named-ops integration test.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
No actionable comments were generated in the recent review. 🎉

📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
Walkthrough

Adds Code Mode: proto + Yoko client & mock, Code Mode MCP server and lifecycle, QuickJS sandbox and pipeline (transpile/shape-check/validation), named-ops storage (memory + Redis), TypeScript bundle generation, observability/calltrace, demo scripts/configs, router integration, and extensive tests.
Router-nonroot image scan passed

✅ No security vulnerabilities found in image:
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## main #2825 +/- ##
===========================================
- Coverage 46.24% 44.21% -2.03%
===========================================
Files 1084 272 -812
Lines 145014 31002 -114012
Branches 9247 0 -9247
===========================================
- Hits 67055 13708 -53347
+ Misses 76179 15867 -60312
+ Partials 1780 1427 -353
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router-tests/go.mod (1)
35-37: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Remove the `replace` directive downgrading `go.opentelemetry.io/otel/sdk` to `v1.28.0`, or upgrade to `v1.40.0+`

This module requires `go.opentelemetry.io/otel/sdk v1.39.0`, but a `replace` directive at line 213 downgrades it to `v1.28.0`. Version 1.28.0 is vulnerable to CVE-2026-24051, a PATH hijacking vulnerability in the resource detection code that allows arbitrary code execution on macOS/Darwin systems. The vulnerability is fixed in v1.40.0.

Also applies to: lines 44-45, 213-214
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router-tests/go.mod` around lines 35 - 37, The go.mod contains a replace that forces go.opentelemetry.io/otel/sdk down to v1.28.0 which is vulnerable; remove that replace or change it to at least v1.40.0 so the required module go.opentelemetry.io/otel/sdk v1.39.0 (and other otel modules) are not downgraded — update or delete the replace directive(s) referencing go.opentelemetry.io/otel/sdk (and any matching replace entries on lines that reference the same module) to point to v1.40.0+ or remove them entirely.
🧹 Nitpick comments (19)
router/internal/codemode/sandbox/sandbox_preamble.js (1)
45-55: ⚡ Quick win

Use explicit null/undefined checks in these helpers.

This block uses `== null` / `!= null`, which goes against the repository's JS rules and makes the coercion behavior implicit in a security-sensitive preamble. As per coding guidelines, "Always use strict equality (`===` and `!==`) instead of loose equality (`==` and `!=`)."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_preamble.js` around lines 45 - 55, Replace loose-equality null checks in the sandbox helpers with explicit null/undefined checks: update globalThis.notNull to throw when v === null || v === undefined (instead of v == null), and in globalThis.compact use Array.isArray(v) branch to map and filter with x !== null && x !== undefined, use the object branch only when v !== null && v !== undefined && typeof v === "object", and test c !== null && c !== undefined before assigning to out; keep the same function names (notNull, compact) so the behavior is identical but uses strict null/undefined comparisons.

router/pkg/config/config.schema.json (1)
2459-2499: ⚡ Quick win

Mirror the query-generation runtime requirements in the JSON schema.

This block currently accepts `query_generation.enabled: true` without an `endpoint`, and `auth.type` can be any string with no mode-specific required fields. That means schema-based validation and editor completion will accept configs that can only fail later at startup. An `if/then` on `enabled`, an enum for `auth.type`, and per-mode `required` fields would catch those earlier.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/pkg/config/config.schema.json` around lines 2459 - 2499, Update the query_generation schema so runtime constraints are enforced: add an if/then that when query_generation.enabled is true requires query_generation.endpoint; change auth.type to an enum (e.g. "static" and "oauth") instead of free string; and add conditional sub-schemas for auth using if auth.type == "static" then require "static_token", and if auth.type == "oauth" then require "token_endpoint", "client_id", and "client_secret" (keep additionalProperties: false). Reference the existing query_generation object and its properties enabled, endpoint, and auth (auth.type, static_token, token_endpoint, client_id, client_secret) when adding these if/then/required clauses.

router/internal/codemode/sandbox/host.go (1)
187-199: 💤 Low value

Potential redundant message prefix.

When `reason` is empty, line 189 sets it to "Mutation declined by operator", then line 194 prepends "Mutation declined by operator: ", resulting in "Mutation declined by operator: Mutation declined by operator". Consider using the reason directly without the prefix, or adjusting the default to avoid duplication.

Suggested fix

```diff
 func mutationDeclinedResponse(reason string) json.RawMessage {
 	if reason == "" {
 		reason = "Mutation declined by operator"
 	}
 	body, _ := json.Marshal(map[string]any{
 		"data": nil,
 		"errors": []map[string]string{{
-			"message": "Mutation declined by operator: " + reason,
+			"message": reason,
 		}},
 		"declined": map[string]string{"reason": reason},
 	})
 	return body
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/host.go` around lines 187 - 199, The function mutationDeclinedResponse duplicates the phrase "Mutation declined by operator" when reason is empty because reason is set to that exact string and then the errors.message prepends the same prefix; update mutationDeclinedResponse to avoid duplication by either setting the default reason to an empty string and letting the errors.message build the full text, or by using the reason directly in errors.message without adding the hardcoded prefix when reason equals the default; modify the body construction in mutationDeclinedResponse so the "message" and the "declined.reason" are consistent and non-redundant.

router/internal/codemode/storage/memory_backend_test.go (1)
218-230: 💤 Low value

Consider using `sync.WaitGroup.Go()` for cleaner goroutine management.

Based on learnings for this repository (Go 1.25+), prefer `wg.Go(func() {...})` over the manual `wg.Add(1)` + `go func() { defer wg.Done(); ... }()` pattern.

Suggested refactor

```diff
 for i := range goroutines {
-	wg.Add(1)
-	go func(i int) {
-		defer wg.Done()
+	wg.Go(func() {
 		_, err := backend.Append(ctx, "shared", []SessionOp{{
 			Name:        "shared-op",
 			Body:        fmt.Sprintf("query Shared%d { shared%d }", i, i),
 			Kind:        OperationKindQuery,
 			Description: fmt.Sprintf("Shared %d", i),
 		}})
 		errs <- err
-	}(i)
+	})
 }
```

Note: This refactor removes the closure parameter `i` since `wg.Go` doesn't support parameters. The loop variable `i` is captured correctly in Go 1.22+ due to the loop variable semantic change.

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/memory_backend_test.go` around lines 218 - 230, Replace the manual goroutine management inside the loop (where goroutines is iterated and you call wg.Add(1); go func(i int){ defer wg.Done(); ... }(i)) with sync.WaitGroup.Go by calling wg.Go(func() { ... }) so you no longer call wg.Add or defer wg.Done; remove the closure parameter since Go 1.22+ loop variable capture is safe and let the body call backend.Append(ctx, "shared", []SessionOp{...}) and send its error to errs as before; update references to wg, backend.Append, errs and the loop variable usage accordingly.

router/core/router_config.go (1)
117-117: ⚡ Quick win

Consider adding Code Mode-specific keys in `Config.Usage()`.

The new feature is wired into config state, but usage telemetry still only reports generic MCP toggles. Adding Code Mode keys would keep adoption visibility complete.

Possible follow-up in `Usage()`:

```diff
 usage["mcp"] = c.mcp.Enabled
+usage["mcp_code_mode_enabled"] = c.mcp.CodeMode.Enabled
+usage["mcp_code_mode_named_ops_enabled"] = c.mcp.CodeMode.NamedOps.Enabled
+usage["mcp_code_mode_query_generation_enabled"] = c.mcp.CodeMode.QueryGeneration.Enabled
 usage["mcp_enable_arbitrary_operations"] = c.mcp.EnableArbitraryOperations
 usage["mcp_exclude_mutations"] = c.mcp.ExcludeMutations
 usage["mcp_expose_schema"] = c.mcp.ExposeSchema
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/core/router_config.go` at line 117, Config.Usage() currently emits only generic MCP toggle keys and misses Code Mode telemetry; update the Usage method on the Config type to add Code Mode-specific usage keys (e.g., "code_mode.enabled", "code_mode.server_present", or similar) derived from the fields controlling the feature (reference the Config struct and the codeModeServer field / any Code Mode-related flags) so adoption and enablement are reported; ensure keys follow the existing naming/format used by other MCP toggles and are conditionally emitted based on the same presence/enable checks used elsewhere in the code.

router/internal/codemode/server/search_handler_test.go (1)
300-346: ⚡ Quick win

Use `sync.WaitGroup.Go()` instead of manual `Add(1)` and `defer Done()`.

This simplifies the code and removes boilerplate. Applies to lines 300-346, 358-377, and 410-435.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler_test.go` around lines 300 - 346, Replace manual wg.Add(1) / go func() { defer wg.Done() ... } blocks with the WaitGroup's Go method to reduce boilerplate: locate the goroutine launches around handleSearch/searchToolRequest in search_handler_test.go (the anonymous goroutines that call srv.handleSearch and populate results) and change each pattern from "wg.Add(1); go func(...) { defer wg.Done(); ... }(...)" to "wg.Go(func() { ... })" (remove the Add and defer Done). Apply the same replacement for the other occurrences noted (the blocks iterating prompts and other goroutines mentioned in the review).

router/internal/codemode/server/search_handler.go (2)
224-248: ⚖️ Poor tradeoff

`extractGraphQLVariablesBlock` may misparse parentheses inside strings.

The parser counts `(` and `)` without considering GraphQL string literals. A variable definition like `query Foo($name: String = "(test)")` would incorrectly parse. This edge case is likely rare in generated operations, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler.go` around lines 224 - 248, The function extractGraphQLVariablesBlock miscounts parentheses inside string literals; update its loop to track string state (single-quote, double-quote, and GraphQL block string triple-quotes) and escape sequences so '(' and ')' are ignored when inside any string literal. Specifically, in the for-loop in extractGraphQLVariablesBlock maintain flags like inSingle, inDouble, inBlockString and an escape flag to handle backslashes and detect triple-quote entry/exit, and only increment/decrement depth when none of those in-* flags are true; keep the same return behavior (trimmed substring pointer) and function signature.
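The string-aware scan described in the prompt above can be sketched as a small state machine. This is a runnable sketch, not the router's actual implementation: the function name, return shape, and handling of `"""` block strings are illustrative assumptions.

```go
package main

import "fmt"

// extractVariablesBlock returns the text between the first balanced pair of
// parentheses, ignoring '(' and ')' that appear inside GraphQL string
// literals ("..." and """...""") so a default like $name: String = "(test)"
// does not break the depth counting.
func extractVariablesBlock(op string) (string, bool) {
	start := -1
	depth := 0
	inString := false      // inside a "..." literal
	inBlockString := false // inside a """...""" literal
	escaped := false
	for i := 0; i < len(op); i++ {
		c := op[i]
		switch {
		case inBlockString:
			if c == '"' && i+2 < len(op) && op[i+1] == '"' && op[i+2] == '"' {
				inBlockString = false
				i += 2
			}
		case inString:
			if escaped {
				escaped = false
			} else if c == '\\' {
				escaped = true
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			if i+2 < len(op) && op[i+1] == '"' && op[i+2] == '"' {
				inBlockString = true
				i += 2
			} else {
				inString = true
			}
		case c == '(':
			if depth == 0 {
				start = i + 1
			}
			depth++
		case c == ')':
			depth--
			if depth == 0 && start >= 0 {
				return op[start:i], true
			}
		}
	}
	return "", false
}

func main() {
	block, ok := extractVariablesBlock(`query Foo($name: String = "(test)") { bar }`)
	fmt.Println(ok, block)
}
```

The key point is that parenthesis depth only changes when no string-literal flag is set; everything else stays as in a plain depth counter.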
21-24: 💤 Low value

Hardcoded enum value may drift from proto definition.

`yokoOperationKindSubscription = 3` assumes the proto enum value. If the proto definition changes, this constant will silently become incorrect. Consider adding a comment referencing the proto definition or using a build-time check.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler.go` around lines 21 - 24, The file defines yokoOperationKindSubscription as a hardcoded numeric constant (yokoOperationKindSubscription yokov1.OperationKind = 3) which can drift if the proto enum changes; replace the magic number by referencing the generated proto enum symbol (use the enum value from the yokov1 package instead of 3) or add a compile-time assertion/readme: add a comment pointing to the exact proto enum name and location and/or add a build-time check (e.g., a const compile-time comparison between yokoOperationKindSubscription and the generated enum value) so the code fails to compile if the proto value changes; update references to yokoOperationKindSubscription accordingly.

demo/code-mode/yoko-mock/main_test.go (1)
153-153: 💤 Low value

Orphan interface compliance check serves no purpose.

This line verifies that `*http.ServeMux` implements `http.Handler`, which is always true and tested by the standard library. Consider removing it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/yoko-mock/main_test.go` at line 153, Remove the redundant compile-time interface assertion `var _ http.Handler = (*http.ServeMux)(nil)` from the test — it serves no purpose because `*http.ServeMux` already implements `http.Handler`; delete that line (the orphan assertion referencing http.ServeMux and http.Handler) to clean up the test.

router/internal/codemode/server/server.go (1)
289-309: 💤 Low value

Consider documenting the cross-origin bypass intent.

The `AddInsecureBypassPattern("/{path...}")` disables cross-origin protection for all paths under the MCP endpoint. While this appears intentional for MCP streaming compatibility, adding a brief comment explaining why this bypass is safe/necessary would help future maintainers understand the security decision.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/server.go` around lines 289 - 309, The cross-origin bypass call AddInsecureBypassPattern("/{path...}") in handler() (where cop is a CrossOriginProtection used in the mcp.NewStreamableHTTPHandler options) needs an inline comment documenting why bypassing CORS for MCP streaming is necessary and why it is safe; add a concise comment above the cop.AddInsecureBypassPattern call referencing CrossOriginProtection/AddInsecureBypassPattern and mcp streaming behavior, explaining the compatibility requirement (e.g., long-lived stream/handshake constraints), any mitigations (endpoint is internal/ authenticated via session ID), and that this is an intentional, reviewed decision for maintainers.

demo/code-mode/yoko-mock/main.go (1)
295-303: 💤 Low value

`filterNonNil` mutates the input slice in place.

The function reuses the input slice's backing array (`out := ops[:0]`), which mutates the original `results` slice. While this works correctly here since `results` isn't used after the call, this pattern can cause subtle bugs if the caller expects the original slice to remain unchanged.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/yoko-mock/main.go` around lines 295 - 303, The function filterNonNil currently reuses and mutates the input slice backing array (out := ops[:0]); change it to allocate a new slice to avoid mutating the caller's slice by creating a fresh slice (e.g., make([]*yokov1.GeneratedOperation, 0, len(ops))) and append non-nil ops into it, returning that new slice so callers' original results/ops remain unchanged; update references inside filterNonNil to use the new local slice variable instead of ops[:0].

router/internal/codemode/storage/redis_backend_test.go (1)
182-196: 💤 Low value

Consider using `sync.WaitGroup.Go` for cleaner concurrency.

The manual `wg.Add(1)` followed by `go func() { defer wg.Done() }` pattern can be simplified using `wg.Go(func())`, which manages Add/Done automatically.

♻️ Simplified with WaitGroup.Go

```diff
 var wg sync.WaitGroup
 errs := make(chan error, goroutines)
 for i := 0; i < goroutines; i++ {
-	wg.Add(1)
-	go func(worker int) {
-		defer wg.Done()
+	wg.Go(func() {
+		worker := i
 		ops := make([]SessionOp, 0, opsPerGoroutine)
 		for j := 0; j < opsPerGoroutine; j++ {
 			ops = append(ops, SessionOp{Name: fmt.Sprintf("op_%02d_%02d", worker, j), Body: "query { ok }", Kind: OperationKindQuery})
 		}
 		_, err := backend.Append(ctx, "session-1", ops)
 		errs <- err
-	}(i)
+	})
 }
```

Based on learnings: In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/redis_backend_test.go` around lines 182 - 196, Replace the manual Add/Done pattern with WaitGroup.Go to simplify the concurrent worker launch: use wg.Go to start each worker instead of calling wg.Add(1) then launching a goroutine that defers wg.Done(); keep the same body that builds ops (SessionOp creation loop) and calls backend.Append(ctx, "session-1", ops), and still send the result into errs channel; ensure wg variable remains the sync.WaitGroup used and that errs and opsPerGoroutine/goroutines are unchanged.

router/internal/codemode/sandbox/execute.go (1)
192-202: 💤 Low value

Redundant condition in memory error detection.

Line 198: `strings.Contains(lower, "out of memory")` is redundant since any string containing "out of memory" will also contain "memory", which is already checked by the first condition.

♻️ Simplified condition

```diff
 func classifyRuntimeError(err error, ctx context.Context) *ErrorEnvelope {
 	if ctx.Err() != nil {
 		return &ErrorEnvelope{Name: "Timeout", Message: ctx.Err().Error(), Stack: ""}
 	}
 	msg := err.Error()
 	lower := strings.ToLower(msg)
-	if strings.Contains(lower, "memory") || strings.Contains(lower, "out of memory") {
+	if strings.Contains(lower, "memory") {
 		return &ErrorEnvelope{Name: "MemoryLimit", Message: msg, Stack: ""}
 	}
 	return &ErrorEnvelope{Name: "Error", Message: msg, Stack: ""}
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/execute.go` around lines 192 - 202, In classifyRuntimeError, remove the redundant strings.Contains(lower, "out of memory") check and simplify the memory detection to a single condition (e.g., if strings.Contains(lower, "memory") { ... }) so the function only checks for "memory" once; update the if in classifyRuntimeError to use the single contains-check and keep the returned ErrorEnvelope.Name as "MemoryLimit".

router/internal/codemode/server/execute_handler_test.go (1)
328-332: 💤 Low value

Unused method `lastSpanContext`.

This method is defined but never called in any test. Consider removing it or adding a test that validates span context propagation if that's the intent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/execute_handler_test.go` around lines 328 - 332, The method recordingPipeline.lastSpanContext is defined but unused; either remove this dead code or add a test that asserts span context propagation by invoking recordingPipeline.lastSpanContext after running a pipeline operation that creates a span. Locate the recordingPipeline type and the lastSpan field alongside the lastSpanContext() method, then either delete the lastSpanContext method and its mutex use, or add a new test that runs the pipeline (e.g., triggers Execute/Handle) and calls recordingPipeline.lastSpanContext() to verify the returned trace.SpanContext matches the expected span.

router/internal/codemode/yoko/client_test.go (2)
321-336: 💤 Low value

Prefer `sync.WaitGroup.Go` for goroutine management.

The manual `wg.Add(2)` followed by `go func() { defer wg.Done(); ... }()` pattern should be replaced with the idiomatic `wg.Go(func())` pattern available in Go 1.25+.

♻️ Suggested refactor

```diff
 var wg sync.WaitGroup
-wg.Add(2)
 results := make([]*yokov1.SearchResponse, 2)
 errs := make([]error, 2)
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[0], errs[0] = client.Search(context.Background(), "session-1", []string{"first"})
-}()
+})
 <-indexStarted
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[1], errs[1] = client.Search(context.Background(), "session-2", []string{"second"})
-}()
+})
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine, letting the WaitGroup manage Add/Done automatically."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/yoko/client_test.go` around lines 321 - 336, Replace the manual wg.Add(2) + two go func() { defer wg.Done(); ... }() blocks with the Go 1.25+ idiom using wg.Go(func() { ... }) for each goroutine: keep the existing variables (wg, results, errs), preserve the call sites client.Search(context.Background(), "session-1", ... ) and client.Search(... "session-2", ... ), and retain the synchronization sequence around indexStarted, time.Sleep(25*time.Millisecond), close(releaseIndex) and wg.Wait(); this removes the explicit wg.Add and uses wg.Go to manage Add/Done automatically while preserving behavior.
364-379: 💤 Low value

Same WaitGroup pattern issue as above.

Apply the same `wg.Go(func())` refactor here.

♻️ Suggested refactor

```diff
 var wg sync.WaitGroup
-wg.Add(2)
 results := make([]*yokov1.SearchResponse, 2)
 errs := make([]error, 2)
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[0], errs[0] = client.Search(context.Background(), "session-1", []string{"first"})
-}()
+})
 <-indexStarted
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[1], errs[1] = client.Search(context.Background(), "session-2", []string{"second"})
-}()
+})
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/yoko/client_test.go` around lines 364 - 379, The test uses manual goroutines with wg.Add and defer wg.Done; replace those with sync.WaitGroup.Go calls to avoid the pattern issue: remove explicit wg.Add(2) and the two go func() blocks and instead call wg.Go(func() { results[0], errs[0] = client.Search(context.Background(), "session-1", []string{"first"}) }) and similarly wg.Go for the "session-2" call (keeping the <-indexStarted, time.Sleep and close(releaseIndex) logic intact); ensure you reference the same results, errs slices and client.Search invocations and keep wg.Wait() at the end.

router/internal/codemode/storage/redis_backend.go (2)
345-349: ⚡ Quick win

Non-atomic SET + EXPIRE could leave keys without TTL.

If `Set` succeeds but `Expire` fails, the key will persist indefinitely. Consider using `Set` with the TTL option directly.

♻️ Suggested fix

```diff
 func (b *RedisBackend) setWithTTL(ctx context.Context, key string, value []byte) error {
-	if err := b.client.Set(ctx, key, value, 0).Err(); err != nil {
-		return err
-	}
-	return b.client.Expire(ctx, key, b.sessionTTL).Err()
+	return b.client.Set(ctx, key, value, b.sessionTTL).Err()
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/redis_backend.go` around lines 345 - 349, setWithTTL performs a non-atomic Set followed by Expire which can leave keys without TTL if Expire fails; change RedisBackend.setWithTTL to call the client's Set with the expiration argument directly (use b.client.Set(ctx, key, value, b.sessionTTL).Err()) and remove the separate b.client.Expire call so the value is written with TTL atomically (refer to RedisBackend.setWithTTL, b.client.Set, b.client.Expire, and b.sessionTTL).
91-147: ⚖️ Poor tradeoff

Unbounded retry loop in Append may cause indefinite blocking.

The retry loop has no maximum retry count. While context cancellation provides an escape hatch, persistent Redis failures (e.g., connection issues, cluster failover) could cause requests to block indefinitely until the context times out. Consider adding a max retry limit.

♻️ Suggested approach

```diff
 backoff := 5 * time.Millisecond
+const maxRetries = 10
 var appended []SessionOp
-for {
+for attempt := 0; attempt < maxRetries; attempt++ {
 	if err := ctx.Err(); err != nil {
 		return nil, err
 	}
 	// ... existing logic ...
 }
+return nil, fmt.Errorf("code mode redis append: max retries (%d) exceeded", maxRetries)
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/redis_backend.go` around lines 91 - 147, The Append retry loop currently retries indefinitely; add a max retry limit (e.g., maxRetries constant) and an attempts counter inside the loop (or use a for attempts := 0; attempts < maxRetries; attempts++ pattern) so that after maxRetries failed Watch attempts you stop retrying, log and return a clear error (e.g., ErrMaxAppendRetries or a wrapped error) instead of looping forever; ensure you still check ctx.Err() each iteration, keep the existing backoff logic with sleepWithContext(ctx, backoff), and reference the same symbols (opsKey/bundleKey computation, b.readOps, tx.TxPipelined, backoff, sleepWithContext) when implementing the early exit and error return.

router/internal/codemode/sandbox/sandbox_test.go (1)
580-592: 💤 Low value

Prefer `sync.WaitGroup.Go` for goroutine management.

The loop with `wg.Add(1)` followed by `go func()` should use the idiomatic `wg.Go(func())` pattern.

♻️ Suggested refactor

```diff
 var wg sync.WaitGroup
 for range 5 {
-	wg.Add(1)
-	go func() {
-		defer wg.Done()
+	wg.Go(func() {
 		_, err := s.Execute(context.Background(), ExecuteRequest{
 			SessionID: "s1",
 			ToolNames: []string{"ping"},
 			WrappedJS: `async () => await tools.ping()`,
 		})
 		assert.NoError(t, err)
-	}()
+	})
 }
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_test.go` around lines 580 - 592, Replace the manual wg.Add(1)/go func(){ defer wg.Done(); ... } pattern with the Go 1.25+ idiom wg.Go(func() { ... }) for goroutine management: remove the explicit wg.Add and defer wg.Done lines and call wg.Go(func() { _, err := s.Execute(context.Background(), ExecuteRequest{ SessionID: "s1", ToolNames: []string{"ping"}, WrappedJS: `async () => await tools.ping()`, }); assert.NoError(t, err) }) so the WaitGroup handles starting the goroutine; keep references to wg, s.Execute and ExecuteRequest unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@demo/code-mode/mcp-configs/codex.toml`:
- Line 2: Update the MCP URL in codex.toml to use an IPv4 loopback address:
change the url value (the "url" key in demo/code-mode/mcp-configs/codex.toml)
from "http://localhost:5027/mcp" to use "127.0.0.1" so the demo connects over
IPv4 rather than resolving to ::1.
In `@demo/code-mode/yoko-mock/main.go`:
- Around line 153-190: The Index method has a race where s.schemas[id] is
checked, unlocked, and later written, allowing duplicate work and temp-dir
leaks; fix by applying a double-check: after obtaining sessionID from
s.runCodexIndex and before assigning s.schemas[id], re-acquire s.mu, check if an
entry for id now exists (use existing := s.schemas[id] / schemaEntry), and if it
does, discard (remove) the newly created dir and use the existing
entry/sessionID for the response; otherwise insert the new schemaEntry;
alternatively replace with a singleflight.Group keyed by schema id to dedupe
concurrent runs while keeping parallelism for different schemas.
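The double-check pattern the comment above asks for can be sketched in isolation. This is an illustrative model only: `schemaStore`, `buildEntry`, and the string "session" values are hypothetical stand-ins for the mock's `schemas` map, `runCodexIndex`, and temp-dir handling.

```go
package main

import (
	"fmt"
	"sync"
)

type schemaStore struct {
	mu      sync.Mutex
	schemas map[string]string // schema id -> session id
}

// buildEntry stands in for the expensive runCodexIndex call plus
// temp-dir creation; it runs outside the lock.
func (s *schemaStore) buildEntry(id string) string {
	return "session-for-" + id
}

// Index checks the map under the lock, does the slow work unlocked,
// then re-checks before storing so a concurrent winner's entry is
// kept and the loser's copy is discarded (its temp dir would be
// removed here in the real mock).
func (s *schemaStore) Index(id string) string {
	s.mu.Lock()
	if existing, ok := s.schemas[id]; ok {
		s.mu.Unlock()
		return existing
	}
	s.mu.Unlock()

	entry := s.buildEntry(id) // slow path, outside the lock

	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, ok := s.schemas[id]; ok {
		// Lost the race: another goroutine indexed this schema first.
		return existing
	}
	s.schemas[id] = entry
	return entry
}

func main() {
	s := &schemaStore{schemas: map[string]string{}}
	fmt.Println(s.Index("a"))
	fmt.Println(s.Index("a")) // cached on the second call
}
```

The `singleflight.Group` alternative mentioned in the comment would collapse the two lock sections into one keyed call, at the cost of an extra dependency.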
In `@router/core/graph_server.go`:
- Around line 1403-1410: The SDL print and codeModeServer.Reload failures are
currently only logged (via s.logger.Error) so the router starts with a broken
Code Mode; change this to hard-fail the mux build just like the MCP block: when
astprinter.PrintString(executor.ClientSchema) returns printErr or when
s.codeModeServer.Reload(executor.ClientSchema, sdl) returns mErr,
propagate/return the error (or call the same fatal/exit path used by the MCP
reload) instead of only logging it so startup aborts; update the block that
checks opts.IsBaseGraph() && s.codeModeServer != nil to return the error
(wrapping printErr/mErr with context mentioning Code Mode SDL/Reload and
executor.ClientSchema) so the build fails deterministically.
In `@router/internal/codemode/sandbox/errors.go`:
- Around line 144-160: The lookup function in type sourceMap seeds best with
line[0], causing columns before the first mapped segment to return a
false-positive mapping; change the logic in lookup (function name: lookup,
receiver: *sourceMap) to first check whether the earliest segment's
generatedColumn is greater than column0 and return (mapping{}, false) if so,
otherwise iterate and update best only when candidate.generatedColumn <= column0
(do not pre-seed best with line[0]); this ensures columns before the first
segment correctly return no mapping.
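A sketch of the corrected lookup, with the `mapping` type reduced to the two fields the logic needs (the real type in errors.go carries more source-map state):

```go
package main

import "fmt"

// mapping is a simplified stand-in for the real source-map segment type.
type mapping struct {
	generatedColumn int
	sourceLine      int
}

// lookup returns the last segment whose generatedColumn is <= column0,
// or ok=false when column0 falls before the first mapped segment —
// instead of falsely returning line[0] for those columns.
func lookup(line []mapping, column0 int) (mapping, bool) {
	if len(line) == 0 || line[0].generatedColumn > column0 {
		return mapping{}, false // no segment covers this column
	}
	best := line[0]
	for _, candidate := range line[1:] {
		if candidate.generatedColumn <= column0 {
			best = candidate
		}
	}
	return best, true
}

func main() {
	line := []mapping{
		{generatedColumn: 4, sourceLine: 10},
		{generatedColumn: 9, sourceLine: 11},
	}
	_, ok := lookup(line, 2) // column before the first segment
	fmt.Println(ok)          // false: no false-positive mapping
	m, _ := lookup(line, 9)
	fmt.Println(m.sourceLine) // 11
}
```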
In `@router/internal/codemode/sandbox/headers.go`:
- Around line 31-43: The copyAllowedHeaders function currently only checks
static hopByHopHeaders but ignores dynamic hop-by-hop names declared in the
Connection header; update copyAllowedHeaders to parse src.Get("Connection") (or
src.Values("Connection")), split on commas, canonicalize each token (use
http.CanonicalHeaderKey then strings.ToLower) and treat those names as
additional hop-by-hop headers (merge into a temporary set or check before
copying). Keep existing checks for hopByHopHeaders and allow map; ensure any
header named in Connection is skipped just like entries in hopByHopHeaders
before dst.Add is called.
In `@router/internal/codemode/sandbox/sandbox_preamble.js`:
- Around line 62-66: The sandbox deletion of globalThis.eval and
globalThis.Function is insufficient because constructors like
AsyncFunction/GeneratorFunction remain reachable via constructors on real
functions; update the hardening in sandbox_preamble.js to also remove or
neutralize constructor access on function prototypes (e.g., remove/overwrite
Function.prototype.constructor and related prototype constructor properties)
and/or freeze Function.prototype so that expressions like (async
()=>{}).constructor, (function*(){}).constructor and (async
function*(){}).constructor cannot be used to obtain executable constructors;
target globalThis.eval, globalThis.Function, Function.prototype.constructor and
any prototype constructor lookups and ensure they are deleted or replaced with
non-callable values or made non-configurable to block constructor-based code
evaluation.
In `@router/internal/codemode/server/lifecycle.go`:
- Around line 122-140: When storage.NewRedisBackend(storage.RedisConfig{...})
fails after creating the Redis client via
factory(&rediscloser.RedisCloserOptions{...}), close the client to avoid leaking
connections: after err != nil from NewRedisBackend, call the client's
close/shutdown method (e.g., client.Close() or client.Shutdown() depending on
the client type) and handle/log any close error, then return the formatted
error; update the error path in the function that calls factory and
storage.NewRedisBackend to ensure client is always closed on backend creation
failure.
In `@router/internal/codemode/server/search_handler_test.go`:
- Around line 303-320: The test spawns goroutines that call srv.handleSearch via
searchToolRequest and currently uses require.* inside those worker goroutines;
replace in-goroutine assertions with collecting the returned error/results into
shared slices or an error channel (e.g., errs []error or errCh) alongside
results[], and let each goroutine set results[i] and errs[i] before returning;
keep require.Eventually(parent) checks like require.Eventually(...
searcher.callCount()) in the main goroutine, call wg.Wait(), then in the parent
test goroutine iterate over errs/results and perform require.NoError/assertions
there (referencing wg, results, errs/errCh, srv.handleSearch, searchToolRequest,
and searcher.callCount to locate spots).
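The collect-then-assert pattern can be sketched like this, with `doWork` standing in for the `srv.handleSearch` call made by each worker:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// doWork is a stand-in for the srv.handleSearch call each worker makes.
func doWork(i int) (string, error) {
	if i == 2 {
		return "", errors.New("boom")
	}
	return fmt.Sprintf("result-%d", i), nil
}

func main() {
	const n = 4
	results := make([]string, n)
	errs := make([]error, n) // workers only record; parent asserts
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// No require.* here: failing assertions on a non-test
			// goroutine can't stop the test cleanly.
			results[i], errs[i] = doWork(i)
		}(i)
	}
	wg.Wait()
	// All assertions happen on the parent (test) goroutine.
	for i := range errs {
		fmt.Println(i, results[i], errs[i])
	}
}
```

In the real test the loop after `wg.Wait()` becomes `require.NoError(t, errs[i])` plus assertions on `results[i]`, keeping every `require` call on the test goroutine.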
In `@router/internal/codemode/storage/memory_backend.go`:
- Around line 159-191: In Bundle, when the session is missing you call
renderCapped before verifying b.renderer, which can panic; modify Bundle
(function Bundle on MemoryBackend) to check for b.renderer == nil and return the
configured error ("code mode storage renderer is not configured") before any
call to b.renderCapped (including the early path that handles a missing
session), ensuring all paths validate the renderer prior to invoking
renderCapped and leaving other logic (session locking, bundle caching on
memSession, setting lastUsed) unchanged.
In `@router/internal/codemode/storage/naming.go`:
- Around line 144-151: The lowerFirst function currently only lowercases the
first rune which produces mixed-case artifacts for inputs like "GETValue";
update lowerFirst to lowercase the entire initial token (the contiguous run of
identifier characters) rather than a single rune: in lowerFirst, iterate runes
from the start to find the token end (use unicode.IsLetter/unicode.IsDigit or
unicode.IsMark as appropriate), lowercase the substring comprising that token
(e.g., via strings.ToLower on that slice), then append the remainder and return;
keep the function name lowerFirst and preserve behavior for empty strings.
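Read literally, "lowercase the entire initial token" would turn "GETValue" into "getvalue"; a common alternative, sketched below as an assumption about the intended behavior, lowercases the leading uppercase run but keeps the start of the following word, yielding "getValue":

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// lowerFirst lowercases the leading run of uppercase letters rather than a
// single rune, so acronym prefixes collapse cleanly: "GETValue" becomes
// "getValue" instead of "gETValue".
func lowerFirst(s string) string {
	if s == "" {
		return s
	}
	runes := []rune(s)
	// Find the end of the leading uppercase run.
	i := 0
	for i < len(runes) && unicode.IsUpper(runes[i]) {
		i++
	}
	if i == 0 {
		return s // already starts lowercase
	}
	// If the run is followed by a lowercase letter, the run's last rune
	// starts the next word ("GETValue" -> "get" + "Value").
	if i < len(runes) && unicode.IsLower(runes[i]) && i > 1 {
		i--
	}
	return strings.ToLower(string(runes[:i])) + string(runes[i:])
}

func main() {
	fmt.Println(lowerFirst("GETValue"), lowerFirst("Name"), lowerFirst("URL"))
	// getValue name url
}
```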
In `@router/internal/codemode/tsgen/graphql.go`:
- Around line 17-30: The renderOperation function currently only checks parse
errors; after parsing opDoc and before calling singleOperationRef you must
validate the parsed document against r.schema and return a clear error if
validation fails. Add a schema validation step (e.g., call the project’s GraphQL
document validator with r.schema and opDoc — place it after
astparser.ParseGraphqlDocumentString and before singleOperationRef in
renderOperation) and on validation errors return fmt.Errorf("render op %q:
validate GraphQL operation: %s", op.Name, <validation error details>); ensure
you import/use the repository's validator API and include the validation error
details in the log/returned error.
- Around line 150-169: The inputObjectType function currently always inlines
input objects and will infinitely recurse on recursive input types (e.g.,
InputObjectTypeDefinitions with self-references). Add a visited/alias strategy:
maintain a visited map from inputObjectRef (int) to a generated type alias name
(string) on the operationRenderer (or pass a context map into inputObjectType);
when first visiting an inputObjectRef generate and record a unique alias (e.g.,
UserFilterInput), emit/collect a top-level alias declaration instead of fully
inlining nested structure, and for subsequent visits return the alias name
(respecting writeNullable) instead of recursing; update any call sites that rely
on inputObjectType to ensure collected aliases are emitted (use symbols:
inputObjectType, writeInlineObject, writeNullable, tsProperty,
InputObjectTypeDefinitions, InputFieldsDefinition).
- Around line 61-87: variablesType currently treats any NonNull variable as
required; instead detect whether a variable has a default and treat defaulted
variables as optional. Inside variablesType (and where you construct tsProperty
for each varRef) call the document helper that exposes a variable default (e.g.
the same area that used VariableDefinitionNameString/VariableDefinitionType —
use VariableDefinitionDefaultValue or HasVariableDefinitionDefaultValue
equivalent) and then: 1) compute requiredOnly := (type is NonNull AND no default
present); 2) set tsProperty.optional = !requiredOnly; and 3) only flip
varsOptional to false when requiredOnly is true. Update the code paths in
variablesType and the tsProperty construction to use this default-aware logic.
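The default-aware optionality rule boils down to a small predicate; this sketch uses simplified stand-ins for the AST-backed variable definitions and the generated `tsProperty`:

```go
package main

import "fmt"

// varDef is a simplified view of a GraphQL variable definition.
type varDef struct {
	name       string
	nonNull    bool
	hasDefault bool
}

// tsProperty mirrors the generated TypeScript property.
type tsProperty struct {
	name     string
	optional bool
}

// variablesType marks a variable as required only when it is NonNull AND
// has no default; defaulted variables may be omitted by the caller, and the
// whole variables object stays optional unless some variable is required.
func variablesType(defs []varDef) ([]tsProperty, bool) {
	varsOptional := true
	props := make([]tsProperty, 0, len(defs))
	for _, d := range defs {
		requiredOnly := d.nonNull && !d.hasDefault
		props = append(props, tsProperty{name: d.name, optional: !requiredOnly})
		if requiredOnly {
			varsOptional = false
		}
	}
	return props, varsOptional
}

func main() {
	props, varsOptional := variablesType([]varDef{
		{name: "id", nonNull: true},                      // NonNull, no default: required
		{name: "limit", nonNull: true, hasDefault: true}, // NonNull with default: optional
	})
	fmt.Println(props[0].optional, props[1].optional, varsOptional)
	// false true false
}
```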
In `@router/pkg/config/code_mode_validation.go`:
- Around line 3-22: The validator ValidateMCPCodeMode currently documents but
does not reject the unsupported combination of cfg.NamedOps.Enabled with
sessionStateless=true; change it to return a clear error when cfg.Enabled &&
cfg.NamedOps.Enabled && sessionStateless are true (e.g., return
fmt.Errorf("mcp.code_mode.named_ops requires stateful MCP sessions but sessions
are configured stateless")), and add the fmt import if missing so the config
load fails fast instead of deferring to runtime; reference ValidateMCPCodeMode,
cfg.NamedOps.Enabled and sessionStateless to locate the change.
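The validator change is a one-condition guard; the config structs below are minimal stand-ins for the real types in router/pkg/config:

```go
package main

import "fmt"

// Minimal stand-ins for the real config types.
type NamedOpsConfig struct{ Enabled bool }
type CodeModeConfig struct {
	Enabled  bool
	NamedOps NamedOpsConfig
}

// ValidateMCPCodeMode rejects the unsupported combination at config-load
// time instead of deferring the failure to runtime.
func ValidateMCPCodeMode(cfg CodeModeConfig, sessionStateless bool) error {
	if cfg.Enabled && cfg.NamedOps.Enabled && sessionStateless {
		return fmt.Errorf("mcp.code_mode.named_ops requires stateful MCP sessions but sessions are configured stateless")
	}
	return nil
}

func main() {
	cfg := CodeModeConfig{Enabled: true, NamedOps: NamedOpsConfig{Enabled: true}}
	fmt.Println(ValidateMCPCodeMode(cfg, true) != nil)  // true: rejected
	fmt.Println(ValidateMCPCodeMode(cfg, false) == nil) // true: accepted
}
```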
---
Outside diff comments:
In `@router-tests/go.mod`:
- Around line 35-37: The go.mod contains a replace that forces
go.opentelemetry.io/otel/sdk down to v1.28.0 which is vulnerable; remove that
replace or change it to at least v1.40.0 so the required module
go.opentelemetry.io/otel/sdk v1.39.0 (and other otel modules) are not downgraded
— update or delete the replace directive(s) referencing
go.opentelemetry.io/otel/sdk (and any matching replace entries on lines that
reference the same module) to point to v1.40.0+ or remove them entirely.
---
Nitpick comments:
In `@demo/code-mode/yoko-mock/main_test.go`:
- Line 153: Remove the redundant compile-time interface assertion `var _
http.Handler = (*http.ServeMux)(nil)` from the test — it serves no purpose
because `*http.ServeMux` already implements `http.Handler`; delete that line
(the orphan assertion referencing http.ServeMux and http.Handler) to clean up
the test.
In `@demo/code-mode/yoko-mock/main.go`:
- Around line 295-303: The function filterNonNil currently reuses and mutates
the input slice backing array (out := ops[:0]); change it to allocate a new
slice to avoid mutating the caller's slice by creating a fresh slice (e.g.,
make([]*yokov1.GeneratedOperation, 0, len(ops))) and append non-nil ops into it,
returning that new slice so callers' original results/ops remain unchanged;
update references inside filterNonNil to use the new local slice variable
instead of ops[:0].
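A generic stand-in for the fix (the real function is typed to `*yokov1.GeneratedOperation`); the point is that a fresh slice leaves the caller's backing array untouched:

```go
package main

import "fmt"

// filterNonNil returns a fresh slice instead of reusing ops[:0], so the
// caller's backing array is never mutated.
func filterNonNil[T any](ops []*T) []*T {
	out := make([]*T, 0, len(ops))
	for _, op := range ops {
		if op != nil {
			out = append(out, op)
		}
	}
	return out
}

func main() {
	a, b := 1, 2
	ops := []*int{&a, nil, &b}
	filtered := filterNonNil(ops)
	// The original slice is untouched: ops[1] is still nil.
	fmt.Println(len(filtered), ops[1] == nil)
}
```

With the `ops[:0]` variant, appending compacts elements into the shared backing array, so a caller still holding the original slice would observe its elements shifted.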
In `@router/core/router_config.go`:
- Line 117: Config.Usage() currently emits only generic MCP toggle keys and
misses Code Mode telemetry; update the Usage method on the Config type to add
Code Mode-specific usage keys (e.g., "code_mode.enabled",
"code_mode.server_present", or similar) derived from the fields controlling the
feature (reference the Config struct and the codeModeServer field / any Code
Mode-related flags) so adoption and enablement are reported; ensure keys follow
the existing naming/format used by other MCP toggles and are conditionally
emitted based on the same presence/enable checks used elsewhere in the code.
In `@router/internal/codemode/sandbox/execute.go`:
- Around line 192-202: In classifyRuntimeError, remove the redundant
strings.Contains(lower, "out of memory") check and simplify the memory detection
to a single condition (e.g., if strings.Contains(lower, "memory") { ... }) so
the function only checks for "memory" once; update the if in
classifyRuntimeError to use the single contains-check and keep the returned
ErrorEnvelope.Name as "MemoryLimit".
In `@router/internal/codemode/sandbox/host.go`:
- Around line 187-199: The function mutationDeclinedResponse duplicates the
phrase "Mutation declined by operator" when reason is empty because reason is
set to that exact string and then the errors.message prepends the same prefix;
update mutationDeclinedResponse to avoid duplication by either setting the
default reason to an empty string and letting the errors.message build the full
text, or by using the reason directly in errors.message without adding the
hardcoded prefix when reason equals the default; modify the body construction in
mutationDeclinedResponse so the "message" and the "declined.reason" are
consistent and non-redundant.
In `@router/internal/codemode/sandbox/sandbox_preamble.js`:
- Around line 45-55: Replace loose-equality null checks in the sandbox helpers
with explicit null/undefined checks: update globalThis.notNull to throw when v
=== null || v === undefined (instead of v == null), and in globalThis.compact
use Array.isArray(v) branch to map and filter with x !== null && x !==
undefined, use the object branch only when v !== null && v !== undefined &&
typeof v === "object", and test c !== null && c !== undefined before assigning
to out; keep the same function names (notNull, compact) so the behavior is
identical but uses strict null/undefined comparisons.
In `@router/internal/codemode/sandbox/sandbox_test.go`:
- Around line 580-592: Replace the manual wg.Add(1)/go func(){ defer wg.Done();
... } pattern with the Go 1.25+ idiom wg.Go(func() { ... }) for goroutine
management: remove the explicit wg.Add and defer wg.Done lines and call
wg.Go(func() { _, err := s.Execute(context.Background(), ExecuteRequest{
SessionID: "s1", ToolNames: []string{"ping"}, WrappedJS: `async () => await
tools.ping()`, }); assert.NoError(t, err) }) so the WaitGroup handles starting
the goroutine; keep references to wg, s.Execute and ExecuteRequest unchanged.
In `@router/internal/codemode/server/execute_handler_test.go`:
- Around line 328-332: The method recordingPipeline.lastSpanContext is defined
but unused; either remove this dead code or add a test that asserts span context
propagation by invoking recordingPipeline.lastSpanContext after running a
pipeline operation that creates a span. Locate the recordingPipeline type and
the lastSpan field alongside the lastSpanContext() method, then either delete
the lastSpanContext method and its mutex use, or add a new test that runs the
pipeline (e.g., triggers Execute/Handle) and calls
recordingPipeline.lastSpanContext() to verify the returned trace.SpanContext
matches the expected span.
In `@router/internal/codemode/server/search_handler_test.go`:
- Around line 300-346: Replace manual wg.Add(1) / go func() { defer wg.Done()
... } blocks with the WaitGroup's Go method to reduce boilerplate: locate the
goroutine launches around handleSearch/searchToolRequest in
search_handler_test.go (the anonymous goroutines that call srv.handleSearch and
populate results) and change each pattern from "wg.Add(1); go func(...) { defer
wg.Done(); ... }(...)" to "wg.Go(func() { ... })" (remove the Add and defer
Done). Apply the same replacement for the other occurrences noted (the blocks
iterating prompts and other goroutines mentioned in the review).
In `@router/internal/codemode/server/search_handler.go`:
- Around line 224-248: The function extractGraphQLVariablesBlock miscounts
parentheses inside string literals; update its loop to track string state
(single-quote, double-quote, and GraphQL block string triple-quotes) and escape
sequences so '(' and ')' are ignored when inside any string literal.
Specifically, in the for-loop in extractGraphQLVariablesBlock maintain flags
like inSingle, inDouble, inBlockString and an escape flag to handle backslashes
and detect triple-quote entry/exit, and only increment/decrement depth when none
of those in-* flags are true; keep the same return behavior (trimmed substring
pointer) and function signature.
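The string-aware depth counting can be sketched as a small state machine. This is a simplified helper (GraphQL itself only uses double quotes, so single-quote tracking is omitted here even though the comment mentions it):

```go
package main

import "fmt"

// parenDepthDelta counts '(' and ')' in a GraphQL snippet while skipping
// anything inside double-quoted strings or triple-quoted block strings,
// honoring backslash escapes.
func parenDepthDelta(s string) int {
	depth := 0
	inString, inBlock, escaped := false, false, false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false
		case c == '\\' && (inString || inBlock):
			escaped = true
		case inBlock:
			if c == '"' && i+2 < len(s) && s[i+1] == '"' && s[i+2] == '"' {
				inBlock = false
				i += 2
			}
		case inString:
			if c == '"' {
				inString = false
			}
		case c == '"':
			if i+2 < len(s) && s[i+1] == '"' && s[i+2] == '"' {
				inBlock = true
				i += 2
			} else {
				inString = true
			}
		case c == '(':
			depth++
		case c == ')':
			depth--
		}
	}
	return depth
}

func main() {
	// The '(' inside the default-value string no longer skews the count.
	fmt.Println(parenDepthDelta(`query($q: String = "a(b") { f }`))
	// 0
}
```

In `extractGraphQLVariablesBlock` the same flags would gate the existing depth increments/decrements, leaving the return behavior and signature unchanged.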
- Around line 21-24: The file defines yokoOperationKindSubscription as a
hardcoded numeric constant (yokoOperationKindSubscription yokov1.OperationKind =
3) which can drift if the proto enum changes; replace the magic number by
referencing the generated proto enum symbol (use the enum value from the yokov1
package instead of 3) or add a compile-time assertion/readme: add a comment
pointing to the exact proto enum name and location and/or add a build-time check
(e.g., a const compile-time comparison between yokoOperationKindSubscription and
the generated enum value) so the code fails to compile if the proto value
changes; update references to yokoOperationKindSubscription accordingly.
In `@router/internal/codemode/server/server.go`:
- Around line 289-309: The cross-origin bypass call
AddInsecureBypassPattern("/{path...}") in handler() (where cop is a
CrossOriginProtection used in the mcp.NewStreamableHTTPHandler options) needs an
inline comment documenting why bypassing CORS for MCP streaming is necessary and
why it is safe; add a concise comment above the cop.AddInsecureBypassPattern
call referencing CrossOriginProtection/AddInsecureBypassPattern and mcp
streaming behavior, explaining the compatibility requirement (e.g., long-lived
stream/handshake constraints), any mitigations (endpoint is internal/
authenticated via session ID), and that this is an intentional, reviewed
decision for maintainers.
In `@router/internal/codemode/storage/memory_backend_test.go`:
- Around line 218-230: Replace the manual goroutine management inside the loop
(where goroutines is iterated and you call wg.Add(1); go func(i int){ defer
wg.Done(); ... }(i)) with sync.WaitGroup.Go by calling wg.Go(func() { ... }) so
you no longer call wg.Add or defer wg.Done; remove the closure parameter since
Go 1.22+ loop variable capture is safe and let the body call backend.Append(ctx,
"shared", []SessionOp{...}) and send its error to errs as before; update
references to wg, backend.Append, errs and the loop variable usage accordingly.
In `@router/internal/codemode/storage/redis_backend_test.go`:
- Around line 182-196: Replace the manual Add/Done pattern with WaitGroup.Go to
simplify the concurrent worker launch: use wg.Go to start each worker instead of
calling wg.Add(1) then launching a goroutine that defers wg.Done(); keep the
same body that builds ops (SessionOp creation loop) and calls
backend.Append(ctx, "session-1", ops), and still send the result into errs
channel; ensure wg variable remains the sync.WaitGroup used and that errs and
opsPerGoroutine/goroutines are unchanged.
In `@router/internal/codemode/storage/redis_backend.go`:
- Around line 345-349: setWithTTL performs a non-atomic Set followed by Expire
which can leave keys without TTL if Expire fails; change RedisBackend.setWithTTL
to call the client's Set with the expiration argument directly (use
b.client.Set(ctx, key, value, b.sessionTTL).Err()) and remove the separate
b.client.Expire call so the value is written with TTL atomically (refer to
RedisBackend.setWithTTL, b.client.Set, b.client.Expire, and b.sessionTTL).
- Around line 91-147: The Append retry loop currently retries indefinitely; add
a max retry limit (e.g., maxRetries constant) and an attempts counter inside the
loop (or use a for attempts := 0; attempts < maxRetries; attempts++ pattern) so
that after maxRetries failed Watch attempts you stop retrying, log and return a
clear error (e.g., ErrMaxAppendRetries or a wrapped error) instead of looping
forever; ensure you still check ctx.Err() each iteration, keep the existing
backoff logic with sleepWithContext(ctx, backoff), and reference the same
symbols (opsKey/bundleKey computation, b.readOps, tx.TxPipelined, backoff,
sleepWithContext) when implementing the early exit and error return.
In `@router/internal/codemode/yoko/client_test.go`:
- Around line 321-336: Replace the manual wg.Add(2) + two go func() { defer
wg.Done(); ... }() blocks with the Go 1.25+ idiom using wg.Go(func() { ... })
for each goroutine: keep the existing variables (wg, results, errs), preserve
the call sites client.Search(context.Background(), "session-1", ... ) and
client.Search(... "session-2", ... ), and retain the synchronization sequence
around indexStarted, time.Sleep(25*time.Millisecond), close(releaseIndex) and
wg.Wait(); this removes the explicit wg.Add and uses wg.Go to manage Add/Done
automatically while preserving behavior.
- Around line 364-379: The test uses manual goroutines with wg.Add and defer
wg.Done; replace those with sync.WaitGroup.Go calls to avoid the pattern issue:
remove explicit wg.Add(2) and the two go func() blocks and instead call
wg.Go(func() { results[0], errs[0] = client.Search(context.Background(),
"session-1", []string{"first"}) }) and similarly wg.Go for the "session-2" call
(keeping the <-indexStarted, time.Sleep and close(releaseIndex) logic intact);
ensure you reference the same results, errs slices and client.Search invocations
and keep wg.Wait() at the end.
In `@router/pkg/config/config.schema.json`:
- Around line 2459-2499: Update the query_generation schema so runtime
constraints are enforced: add an if/then that when query_generation.enabled is
true requires query_generation.endpoint; change auth.type to an enum (e.g.
"static" and "oauth") instead of free string; and add conditional sub-schemas
for auth using if auth.type == "static" then require "static_token", and if
auth.type == "oauth" then require "token_endpoint", "client_id", and
"client_secret" (keep additionalProperties: false). Reference the existing
query_generation object and its properties enabled, endpoint, and auth
(auth.type, static_token, token_endpoint, client_id, client_secret) when adding
these if/then/required clauses.
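A sketch of what those conditional clauses could look like inside the existing query_generation object (property names match the review; the exact nesting in config.schema.json may differ):

```json
{
  "query_generation": {
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "enabled": { "type": "boolean" },
      "endpoint": { "type": "string" },
      "auth": {
        "type": "object",
        "additionalProperties": false,
        "properties": {
          "type": { "enum": ["static", "oauth"] },
          "static_token": { "type": "string" },
          "token_endpoint": { "type": "string" },
          "client_id": { "type": "string" },
          "client_secret": { "type": "string" }
        },
        "allOf": [
          {
            "if": { "properties": { "type": { "const": "static" } }, "required": ["type"] },
            "then": { "required": ["static_token"] }
          },
          {
            "if": { "properties": { "type": { "const": "oauth" } }, "required": ["type"] },
            "then": { "required": ["token_endpoint", "client_id", "client_secret"] }
          }
        ]
      }
    },
    "if": { "properties": { "enabled": { "const": true } }, "required": ["enabled"] },
    "then": { "required": ["endpoint"] }
  }
}
```

Note the `"required": ["type"]` inside each `if`: without it, an auth object that omits `type` entirely would match every conditional and be forced to carry all credential fields.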
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 75bd35af-298a-4502-af56-1e827ad4e903
⛔ Files ignored due to path filters (6)
- demo/code-mode/mcp-stdio-proxy/go.sum is excluded by !**/*.sum
- demo/code-mode/yoko-mock/go.sum is excluded by !**/*.sum
- router-tests/go.sum is excluded by !**/*.sum
- router/gen/proto/wg/cosmo/code_mode/yoko/v1/yoko.pb.go is excluded by !**/*.pb.go, !**/gen/**
- router/gen/proto/wg/cosmo/code_mode/yoko/v1/yokov1connect/yoko.connect.go is excluded by !**/gen/**
- router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (101)
- Top level: .gitignore, Makefile
- demo/code-mode-connect/: README.md, router-config.yaml, start.sh
- demo/code-mode/: .gitignore, Makefile, README.md, graph.yaml, router-config.yaml, run_subgraphs_subset.sh, start.sh
- demo/code-mode/mcp-configs/: README.md, claude.desktop.json, claude.mcp.json, codex.toml, sample-prompts/01-04
- demo/code-mode/mcp-stdio-proxy/: go.mod, main.go, main_test.go
- demo/code-mode/yoko-mock/: .gitignore, README.md, go.mod, main.go, main_test.go, schema.graphql
- demo/pkg/subgraphs/employees/subgraph/schema.resolvers.go
- proto/wg/cosmo/code_mode/yoko/v1/yoko.proto
- router-tests/: code_mode_named_ops_test.go, go.mod, testenv/testenv.go
- router/core/: graph_server.go, router.go, router_config.go; router/go.mod
- router/internal/codemode/: deps.go; calltrace/, harness/, observability/, sandbox/ (including sandbox_preamble.js), server/, storage/, tsgen/, yoko/ — sources plus their tests
- router/pkg/config/: code_mode_config_test.go, code_mode_validation.go, config.go, config.schema.json, fixtures/full.yaml, testdata/config_defaults.json, testdata/config_full.json
@@ -0,0 +1,2 @@
[mcp_servers."yoko"]
url = "http://localhost:5027/mcp"
Use 127.0.0.1 instead of localhost for the demo MCP URL.
Using localhost can resolve to ::1 first on some systems; with an IPv4-only listener this causes confusing connection failures.
Suggested patch
[mcp_servers."yoko"]
-url = "http://localhost:5027/mcp"
+url = "http://127.0.0.1:5027/mcp"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
[mcp_servers."yoko"]
url = "http://127.0.0.1:5027/mcp"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/code-mode/mcp-configs/codex.toml` at line 2, Update the MCP URL in
codex.toml to use an IPv4 loopback address: change the url value (the "url" key
in demo/code-mode/mcp-configs/codex.toml) from "http://localhost:5027/mcp" to
use "127.0.0.1" so the demo connects over IPv4 rather than resolving to ::1.
func (s *yokoService) Index(ctx context.Context, req *connect.Request[yokov1.IndexRequest]) (*connect.Response[yokov1.IndexResponse], error) {
	schemaSDL := req.Msg.GetSchemaSdl()
	id := schemaID(schemaSDL)

	s.mu.Lock()
	if existing, ok := s.schemas[id]; ok {
		s.mu.Unlock()
		existing.mu.RLock()
		existingSession := existing.sessionID
		existing.mu.RUnlock()
		log.Printf("Index schema_id=%s reused dir=%s session_id=%s", id, existing.dir, existingSession)
		return connect.NewResponse(&yokov1.IndexResponse{SchemaId: id}), nil
	}
	s.mu.Unlock()

	dir, err := os.MkdirTemp("", "yoko-schema-"+id+"-")
	if err != nil {
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("create schema temp dir: %w", err))
	}
	if err := os.WriteFile(filepath.Join(dir, "schema.graphql"), []byte(schemaSDL), 0o600); err != nil {
		_ = os.RemoveAll(dir)
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("write schema.graphql: %w", err))
	}

	sessionID, err := s.runCodexIndex(ctx, dir)
	if err != nil {
		_ = os.RemoveAll(dir)
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("codex pre-warm: %w", err))
	}

	entry := &schemaEntry{dir: dir, sessionID: sessionID}
	s.mu.Lock()
	s.schemas[id] = entry
	s.mu.Unlock()

	log.Printf("Index schema_id=%s schema_sdl_size=%d schema_dir=%s session_id=%s rotate_after=%d", id, len(schemaSDL), dir, sessionID, s.rotateAfter)
	return connect.NewResponse(&yokov1.IndexResponse{SchemaId: id}), nil
}
Race condition in Index allows duplicate schema entries and temp dir leaks.
Between lines 157-166, the lock is released after checking s.schemas[id] but before re-acquiring it at line 184 to insert. Two concurrent Index calls with the same schema SDL could both pass the initial existence check, both create temp directories, and both run codex pre-warm. The second insert would overwrite the first entry, leaking its temp directory.
🔒 Proposed fix using double-check pattern
func (s *yokoService) Index(ctx context.Context, req *connect.Request[yokov1.IndexRequest]) (*connect.Response[yokov1.IndexResponse], error) {
schemaSDL := req.Msg.GetSchemaSdl()
id := schemaID(schemaSDL)
s.mu.Lock()
if existing, ok := s.schemas[id]; ok {
s.mu.Unlock()
existing.mu.RLock()
existingSession := existing.sessionID
existing.mu.RUnlock()
log.Printf("Index schema_id=%s reused dir=%s session_id=%s", id, existing.dir, existingSession)
return connect.NewResponse(&yokov1.IndexResponse{SchemaId: id}), nil
}
- s.mu.Unlock()
+ // Keep lock held to prevent concurrent Index for same schema
+ defer s.mu.Unlock()
dir, err := os.MkdirTemp("", "yoko-schema-"+id+"-")
if err != nil {
return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("create schema temp dir: %w", err))
}
if err := os.WriteFile(filepath.Join(dir, "schema.graphql"), []byte(schemaSDL), 0o600); err != nil {
_ = os.RemoveAll(dir)
return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("write schema.graphql: %w", err))
}
sessionID, err := s.runCodexIndex(ctx, dir)
if err != nil {
_ = os.RemoveAll(dir)
return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("codex pre-warm: %w", err))
}
entry := &schemaEntry{dir: dir, sessionID: sessionID}
- s.mu.Lock()
s.schemas[id] = entry
- s.mu.Unlock()
log.Printf("Index schema_id=%s schema_sdl_size=%d schema_dir=%s session_id=%s rotate_after=%d", id, len(schemaSDL), dir, sessionID, s.rotateAfter)
return connect.NewResponse(&yokov1.IndexResponse{SchemaId: id}), nil
}

Note: Holding the lock during runCodexIndex will serialize concurrent Index calls for different schemas. If that's unacceptable, consider using singleflight.Group keyed by schema ID to deduplicate concurrent calls while allowing parallelism for different schemas.
🤖 Prompt for AI Agents
In `@demo/code-mode/yoko-mock/main.go` around lines 153 - 190, The Index method
has a race where s.schemas[id] is checked, unlocked, and later written, allowing
duplicate work and temp-dir leaks; fix by applying a double-check: after
obtaining sessionID from s.runCodexIndex and before assigning s.schemas[id],
re-acquire s.mu, check if an entry for id now exists (use existing :=
s.schemas[id] / schemaEntry), and if it does, discard (remove) the newly created
dir and use the existing entry/sessionID for the response; otherwise insert the
new schemaEntry; alternatively replace with a singleflight.Group keyed by schema
id to dedupe concurrent runs while keeping parallelism for different schemas.
if opts.IsBaseGraph() && s.codeModeServer != nil {
	sdl, printErr := astprinter.PrintString(executor.ClientSchema)
	if printErr != nil {
		s.logger.Error("failed to reload MCP server", zap.Error(fmt.Errorf("failed to print Code Mode schema SDL: %w", printErr)))
	} else if mErr := s.codeModeServer.Reload(executor.ClientSchema, sdl); mErr != nil {
		s.logger.Error("failed to reload MCP server", zap.Error(mErr))
	}
}
Make Code Mode schema reload fail the mux build.
If SDL printing or codeModeServer.Reload fails here, the router still comes up even though Code Mode will be serving against stale or missing schema. That turns a deterministic startup failure into a runtime feature break. This should mirror the hard-fail behavior of the MCP reload block just above.
Proposed fix

```diff
 if opts.IsBaseGraph() && s.codeModeServer != nil {
 	sdl, printErr := astprinter.PrintString(executor.ClientSchema)
 	if printErr != nil {
-		s.logger.Error("failed to reload MCP server", zap.Error(fmt.Errorf("failed to print Code Mode schema SDL: %w", printErr)))
-	} else if mErr := s.codeModeServer.Reload(executor.ClientSchema, sdl); mErr != nil {
-		s.logger.Error("failed to reload MCP server", zap.Error(mErr))
+		return nil, fmt.Errorf("failed to print Code Mode schema SDL: %w", printErr)
+	}
+	if mErr := s.codeModeServer.Reload(executor.ClientSchema, sdl); mErr != nil {
+		return nil, fmt.Errorf("failed to reload Code Mode server: %w", mErr)
 	}
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/core/graph_server.go` around lines 1403 - 1410, The SDL print and
codeModeServer.Reload failures are currently only logged (via s.logger.Error) so
the router starts with a broken Code Mode; change this to hard-fail the mux
build just like the MCP block: when
astprinter.PrintString(executor.ClientSchema) returns printErr or when
s.codeModeServer.Reload(executor.ClientSchema, sdl) returns mErr,
propagate/return the error (or call the same fatal/exit path used by the MCP
reload) instead of only logging it so startup aborts; update the block that
checks opts.IsBaseGraph() && s.codeModeServer != nil to return the error
(wrapping printErr/mErr with context mentioning Code Mode SDL/Reload and
executor.ClientSchema) so the build fails deterministically.
```go
func (sm *sourceMap) lookup(generatedLine, generatedColumn int) (mapping, bool) {
	if generatedLine < 1 || generatedLine > len(sm.lines) {
		return mapping{}, false
	}
	line := sm.lines[generatedLine-1]
	if len(line) == 0 {
		return mapping{}, false
	}
	column0 := generatedColumn - 1
	best := line[0]
	for _, candidate := range line {
		if candidate.generatedColumn > column0 {
			break
		}
		best = candidate
	}
	return best, true
}
```
Avoid false-positive source map hits before first mapped column.
Line 153 seeds best with line[0], so a lookup for a column earlier than the first segment still returns a mapping. That can rewrite stack traces to the wrong source location.
Suggested fix

```diff
 func (sm *sourceMap) lookup(generatedLine, generatedColumn int) (mapping, bool) {
 	if generatedLine < 1 || generatedLine > len(sm.lines) {
 		return mapping{}, false
 	}
 	line := sm.lines[generatedLine-1]
 	if len(line) == 0 {
 		return mapping{}, false
 	}
 	column0 := generatedColumn - 1
+	if column0 < line[0].generatedColumn {
+		return mapping{}, false
+	}
 	best := line[0]
 	for _, candidate := range line {
 		if candidate.generatedColumn > column0 {
 			break
 		}
 		best = candidate
 	}
 	return best, true
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/sandbox/errors.go` around lines 144 - 160, The
lookup function in type sourceMap seeds best with line[0], causing columns
before the first mapped segment to return a false-positive mapping; change the
logic in lookup (function name: lookup, receiver: *sourceMap) to first check
whether the earliest segment's generatedColumn is greater than column0 and
return (mapping{}, false) if so, otherwise iterate and update best only when
candidate.generatedColumn <= column0 (do not pre-seed best with line[0]); this
ensures columns before the first segment correctly return no mapping.
```go
func copyAllowedHeaders(dst, src http.Header, allow map[string]struct{}) {
	for name, values := range src {
		canonical := strings.ToLower(http.CanonicalHeaderKey(name))
		if _, hop := hopByHopHeaders[canonical]; hop {
			continue
		}
		if _, ok := allow[canonical]; !ok {
			continue
		}
		for _, value := range values {
			dst.Add(name, value)
		}
	}
}
```
Filter Connection-declared hop-by-hop headers too
Current logic blocks the standard hop-by-hop names, but not headers dynamically listed in Connection (e.g. Connection: X-Foo). Those should also be dropped before forwarding.
💡 Suggested patch

```diff
 func copyAllowedHeaders(dst, src http.Header, allow map[string]struct{}) {
+	connectionScoped := map[string]struct{}{}
+	for _, value := range src.Values("Connection") {
+		for _, token := range strings.Split(value, ",") {
+			name := strings.ToLower(http.CanonicalHeaderKey(strings.TrimSpace(token)))
+			if name != "" {
+				connectionScoped[name] = struct{}{}
+			}
+		}
+	}
+
 	for name, values := range src {
 		canonical := strings.ToLower(http.CanonicalHeaderKey(name))
 		if _, hop := hopByHopHeaders[canonical]; hop {
 			continue
 		}
+		if _, hop := connectionScoped[canonical]; hop {
+			continue
+		}
 		if _, ok := allow[canonical]; !ok {
 			continue
 		}
 		for _, value := range values {
 			dst.Add(name, value)
 		}
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/sandbox/headers.go` around lines 31 - 43, The
copyAllowedHeaders function currently only checks static hopByHopHeaders but
ignores dynamic hop-by-hop names declared in the Connection header; update
copyAllowedHeaders to parse src.Get("Connection") (or src.Values("Connection")),
split on commas, canonicalize each token (use http.CanonicalHeaderKey then
strings.ToLower) and treat those names as additional hop-by-hop headers (merge
into a temporary set or check before copying). Keep existing checks for
hopByHopHeaders and allow map; ensure any header named in Connection is skipped
just like entries in hopByHopHeaders before dst.Add is called.
```go
func lowerFirst(value string) string {
	if value == "" {
		return value
	}
	runes := []rune(value)
	runes[0] = unicode.ToLower(runes[0])
	return string(runes)
}
```
Normalize the full first token, not only its first rune.
Current behavior can yield unexpected names (for example, all-caps input produces mixed-case artifacts like gET...). Lowercasing the full token first makes normalization more predictable.
Suggested patch

```diff
 func lowerFirst(value string) string {
 	if value == "" {
 		return value
 	}
-	runes := []rune(value)
+	runes := []rune(strings.ToLower(value))
 	runes[0] = unicode.ToLower(runes[0])
 	return string(runes)
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/naming.go` around lines 144 - 151, The
lowerFirst function currently only lowercases the first rune which produces
mixed-case artifacts for inputs like "GETValue"; update lowerFirst to lowercase
the entire initial token (the contiguous run of identifier characters) rather
than a single rune: in lowerFirst, iterate runes from the start to find the
token end (use unicode.IsLetter/unicode.IsDigit or unicode.IsMark as
appropriate), lowercase the substring comprising that token (e.g., via
strings.ToLower on that slice), then append the remainder and return; keep the
function name lowerFirst and preserve behavior for empty strings.
```go
func (r operationRenderer) renderOperation(op storage.SessionOp) (string, error) {
	if r.schema == nil {
		return "", fmt.Errorf("render op %q: schema is nil", op.Name)
	}

	opDoc, report := astparser.ParseGraphqlDocumentString(op.Body)
	if report.HasErrors() {
		return "", fmt.Errorf("render op %q: parse GraphQL operation: %s", op.Name, report.Error())
	}

	opRef, err := singleOperationRef(&opDoc)
	if err != nil {
		return "", fmt.Errorf("render op %q: %w", op.Name, err)
	}
```
Validate the operation against the schema before generating a signature.
Right now this only rejects parse errors. A document like query Q { health { id } } or an impossible fragment such as viewer { ... on Cat { name } } will still produce a TypeScript contract even though the operation can never execute successfully. Failing fast here avoids registering unusable tools and mismatched signatures.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/tsgen/graphql.go` around lines 17 - 30, The
renderOperation function currently only checks parse errors; after parsing opDoc
and before calling singleOperationRef you must validate the parsed document
against r.schema and return a clear error if validation fails. Add a schema
validation step (e.g., call the project’s GraphQL document validator with
r.schema and opDoc — place it after astparser.ParseGraphqlDocumentString and
before singleOperationRef in renderOperation) and on validation errors return
fmt.Errorf("render op %q: validate GraphQL operation: %s", op.Name, <validation
error details>); ensure you import/use the repository's validator API and
include the validation error details in the log/returned error.
```go
func (r operationRenderer) variablesType(opDoc *ast.Document, opRef int) (string, bool, error) {
	op := opDoc.OperationDefinitions[opRef]
	if !op.HasVariableDefinitions || len(op.VariableDefinitions.Refs) == 0 {
		return "{}", true, nil
	}

	fields := make([]tsProperty, 0, len(op.VariableDefinitions.Refs))
	varsOptional := true
	for _, varRef := range op.VariableDefinitions.Refs {
		name := opDoc.VariableDefinitionNameString(varRef)
		typeRef := opDoc.VariableDefinitionType(varRef)
		required := opDoc.Types[typeRef].TypeKind == ast.TypeKindNonNull

		typ, nullable, err := r.inputType(opDoc, typeRef)
		if err != nil {
			return "", false, err
		}
		if nullable {
			typ = writeNullable(typ)
		} else {
			varsOptional = false
		}

		fields = append(fields, tsProperty{name: name, typ: typ, optional: !required})
	}

	return writeInlineObject(fields), varsOptional, nil
}
```
Default-valued variables should remain optional in the generated vars shape.
Requiredness is derived only from TypeKindNonNull, but GraphQL callers may omit variables that have defaults even when the declared type is non-null. For example, query Q($limit: Int! = 10) should not become vars: { limit: number }. Please factor the variable definition's default value into both the property optionality and varsOptional.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/tsgen/graphql.go` around lines 61 - 87,
variablesType currently treats any NonNull variable as required; instead detect
whether a variable has a default and treat defaulted variables as optional.
Inside variablesType (and where you construct tsProperty for each varRef) call
the document helper that exposes a variable default (e.g. the same area that
used VariableDefinitionNameString/VariableDefinitionType — use
VariableDefinitionDefaultValue or HasVariableDefinitionDefaultValue equivalent)
and then: 1) compute requiredOnly := (type is NonNull AND no default present);
2) set tsProperty.optional = !requiredOnly; and 3) only flip varsOptional to
false when requiredOnly is true. Update the code paths in variablesType and the
tsProperty construction to use this default-aware logic.
```go
func (r operationRenderer) inputObjectType(inputObjectRef int) (string, error) {
	def := r.schema.InputObjectTypeDefinitions[inputObjectRef]
	fields := make([]tsProperty, 0, len(def.InputFieldsDefinition.Refs))
	for _, fieldRef := range def.InputFieldsDefinition.Refs {
		name := r.schema.InputValueDefinitionNameString(fieldRef)
		typeRef := r.schema.InputValueDefinitionType(fieldRef)
		required := r.schema.Types[typeRef].TypeKind == ast.TypeKindNonNull

		typ, nullable, err := r.inputType(r.schema, typeRef)
		if err != nil {
			return "", err
		}
		if nullable {
			typ = writeNullable(typ)
		}

		fields = append(fields, tsProperty{name: name, typ: typ, optional: !required})
	}

	return writeInlineObject(fields), nil
}
```
Recursive input objects will blow up this renderer.
This code always expands input objects inline, so schemas with common recursive filter shapes like input UserFilter { AND: [UserFilter!] } will recurse until stack overflow during bundle generation. Code Mode is supposed to work against arbitrary graphs, so this needs a visited-set/alias strategy before merge.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/tsgen/graphql.go` around lines 150 - 169, The
inputObjectType function currently always inlines input objects and will
infinitely recurse on recursive input types (e.g., InputObjectTypeDefinitions
with self-references). Add a visited/alias strategy: maintain a visited map from
inputObjectRef (int) to a generated type alias name (string) on the
operationRenderer (or pass a context map into inputObjectType); when first
visiting an inputObjectRef generate and record a unique alias (e.g.,
UserFilterInput), emit/collect a top-level alias declaration instead of fully
inlining nested structure, and for subsequent visits return the alias name
(respecting writeNullable) instead of recursing; update any call sites that rely
on inputObjectType to ensure collected aliases are emitted (use symbols:
inputObjectType, writeInlineObject, writeNullable, tsProperty,
InputObjectTypeDefinitions, InputFieldsDefinition).
```go
func ValidateMCPCodeMode(cfg *MCPCodeModeConfiguration, sessionStateless bool) error {
	if !cfg.Enabled {
		return nil
	}

	if !cfg.NamedOps.Enabled {
		return nil
	}

	// Storage backend selection: when ProviderID is set, the router resolves it
	// against the central storage_providers registry (Redis backend). Otherwise
	// the in-memory backend is used. The provider lookup error (unknown id) is
	// emitted by the router at startup, not here.

	// Named ops require stateful MCP sessions to work, but this intentionally
	// does not fail boot. The Code Mode runtime emits the warn log on first
	// reload so deployments can enable Code Mode before flipping session mode.
	_ = sessionStateless

	return nil
}
```
Reject named_ops when MCP sessions are stateless.
This validator already documents that mcp.code_mode.named_ops cannot work with stateless sessions, but it still returns nil and defers the failure to runtime. That unsupported combination should be rejected during config load.
Proposed fix

```diff
 package config
+
+import "errors"
+
 func ValidateMCPCodeMode(cfg *MCPCodeModeConfiguration, sessionStateless bool) error {
 	if !cfg.Enabled {
 		return nil
 	}
 	if !cfg.NamedOps.Enabled {
 		return nil
 	}
@@
-	// Named ops require stateful MCP sessions to work, but this intentionally
-	// does not fail boot. The Code Mode runtime emits the warn log on first
-	// reload so deployments can enable Code Mode before flipping session mode.
-	_ = sessionStateless
+	if sessionStateless {
+		return errors.New("mcp.code_mode.named_ops requires mcp.session.stateless=false")
+	}
 	return nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/config/code_mode_validation.go` around lines 3 - 22, The validator
ValidateMCPCodeMode currently documents but does not reject the unsupported
combination of cfg.NamedOps.Enabled with sessionStateless=true; change it to
return a clear error when cfg.Enabled && cfg.NamedOps.Enabled &&
sessionStateless are true (e.g., return fmt.Errorf("mcp.code_mode.named_ops
requires stateful MCP sessions but sessions are configured stateless")), and add
the fmt import if missing so the config load fails fast instead of deferring to
runtime; reference ValidateMCPCodeMode, cfg.NamedOps.Enabled and
sessionStateless to locate the change.
Previously a single non-serializable leaf (BigInt, function, symbol,
undefined, circular ref) caused the sandbox to drop the entire return
value and surface a NotSerializable error. Multi-source aggregations
that put raw federation errors or upstream payloads into a debug field
lost the rest of an expensive run.
The validator now mutates each bad leaf to the sentinel string
"<<non-serializable: KIND>>" and records {path, kind} in a new warnings
array on the response envelope. The successful result still flows
through. NotSerializable now only fires when JSON.stringify itself
throws after sanitization.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…down Moves the long inline tool/resource description string constants out of server.go into a new internal/codemode/server/descriptions sub-package, where each description lives in its own .md file embedded via go:embed. Lets the prose be edited as readable markdown without touching Go source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The proxy now retries the initial dial with exponential backoff and supervises the live session via session.Wait(); when the upstream disconnects it reconnects transparently so downstream clients (Claude Desktop, etc.) keep working across router restarts. KeepAlive pings make dead connections surface quickly instead of waiting for the next call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eState.wg WriteTimeout was 30s while executeTimeout is 120s, so net/http would cut off legitimately long code_mode_run_js responses mid-flight. Lift the write deadline above the configured execute timeout and switch the listener to ReadHeaderTimeout so a body upload doesn't share the 30s budget either. The unused sync.WaitGroup on executeState (and its matching state.wg.Wait() in Execute) was a no-op leftover; remove both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Actionable comments posted: 4
🧹 Nitpick comments (1)
router/internal/codemode/sandbox/sandbox_test.go (1)
623-634: ⚡ Quick win — Use `WaitGroup.Go` for these test workers.

This repo already standardized on `sync.WaitGroup.Go(func())`, and it removes the manual `Add`/`Done` bookkeeping in this loop.

Suggested refactor

```diff
 var wg sync.WaitGroup
 for range 5 {
-	wg.Add(1)
-	go func() {
-		defer wg.Done()
+	wg.Go(func() {
 		_, err := s.Execute(context.Background(), ExecuteRequest{
 			SessionID: "s1",
 			ToolNames: []string{"ping"},
 			WrappedJS: `async () => await tools.ping()`,
 		})
 		assert.NoError(t, err)
-	}()
+	})
 }
```

Based on learnings: In Go code (Go 1.25+), prefer using `sync.WaitGroup.Go(func())` to run a function in a new goroutine, letting the `WaitGroup` manage `Add`/`Done` automatically.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_test.go` around lines 623 - 634, The test spawns 5 worker goroutines using manual wg.Add(1)/defer wg.Done(); replace that pattern with the standardized sync.WaitGroup.Go usage to avoid manual bookkeeping: use wg.Go(func() { _, err := s.Execute(context.Background(), ExecuteRequest{SessionID: "s1", ToolNames: []string{"ping"}, WrappedJS: `async () => await tools.ping()`}); assert.NoError(t, err) }) inside the loop so the WaitGroup handles Add/Done automatically and you can remove the explicit Add/Done calls around s.Execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@router/internal/codemode/harness/envelope.go`:
- Around line 56-60: After previewEnvelope() returns the fallback object you
must ensure its serialized form respects the caller's maxResultBytes before
returning the ResultEnvelope; marshal the fallback using the same encoder/format
used for MCP, measure the byte length, and if it exceeds maxResultBytes replace
the Result with a scalar/truncated representation that fits (or apply the
existing truncate helper used elsewhere), set Truncated=true and return the
ResultEnvelope; apply the same fix to the analogous fallback handling around
previewEnvelope usage at the later block (lines ~177-195) so both paths enforce
maxResultBytes.
In `@router/internal/codemode/sandbox/execute.go`:
- Around line 24-32: The recover block incorrectly labels all panics as
"Timeout"; change it so you only set ErrorEnvelope.Name = "Timeout" when
ctx.Err() != nil, and otherwise set a generic runtime error name (e.g.,
"RuntimeError" or "InternalError") and include the recovered value in
ErrorEnvelope.Message; update the logic around errEnv construction inside the
deferred func that uses recover(), ensure ExecuteResult is set with the correct
errEnv and OutputSize via envelopeSize(nil, errEnv), and keep retErr = nil
behavior unchanged.
In `@router/internal/codemode/sandbox/sandbox.go`:
- Around line 128-135: The default retry behavior can cause duplicate GraphQL
POST mutations; change the safe defaults so retries are disabled unless
explicitly set: set defaultRetryAttempts from 3 to 0 and modify the
withDefaults() check that currently treats any value <= 0 as “unset” to only
treat negative values as unset (i.e., change the cfg.RetryAttempts <= 0 check to
cfg.RetryAttempts < 0) so an explicit 0 will be preserved and
retryClient.RetryMax (set in the HTTP client construction using
retryablehttp.NewClient) will correctly disable retries for POST mutations.
In `@router/internal/codemode/server/server_test.go`:
- Around line 460-467: The cleanup currently uses a non-blocking select on the
errs channel which can let the test exit before srv.mcpServer.Run returns;
replace the default branch with a bounded wait like startServer does: in the
t.Cleanup closure block wait on errs with a select that has two cases—one
receiving err from errs (if err != nil && !errors.Is(err, context.Canceled) call
require.NoError) and a time.After case using the same timeout value used by
startServer—to ensure the in-memory MCP server goroutine has a bounded time to
exit instead of returning immediately.
---
Nitpick comments:
In `@router/internal/codemode/sandbox/sandbox_test.go`:
- Around line 623-634: The test spawns 5 worker goroutines using manual
wg.Add(1)/defer wg.Done(); replace that pattern with the standardized
sync.WaitGroup.Go usage to avoid manual bookkeeping: use wg.Go(func() { _, err
:= s.Execute(context.Background(), ExecuteRequest{SessionID: "s1", ToolNames:
[]string{"ping"}, WrappedJS: `async () => await tools.ping()`});
assert.NoError(t, err) }) inside the loop so the WaitGroup handles Add/Done
automatically and you can remove the explicit Add/Done calls around s.Execute.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ce98376b-3811-4454-b89c-28be9473c652
📒 Files selected for processing (15)
- demo/code-mode/mcp-stdio-proxy/main.go
- demo/code-mode/mcp-stdio-proxy/main_test.go
- router/internal/codemode/harness/envelope.go
- router/internal/codemode/sandbox/execute.go
- router/internal/codemode/sandbox/host.go
- router/internal/codemode/sandbox/sandbox.go
- router/internal/codemode/sandbox/sandbox_test.go
- router/internal/codemode/sandbox/validation.go
- router/internal/codemode/server/descriptions/descriptions.go
- router/internal/codemode/server/descriptions/execute_source.md
- router/internal/codemode/server/descriptions/execute_tool.md
- router/internal/codemode/server/descriptions/persisted_ops_resource.md
- router/internal/codemode/server/descriptions/search_tool.md
- router/internal/codemode/server/server.go
- router/internal/codemode/server/server_test.go
✅ Files skipped from review due to trivial changes (5)
- router/internal/codemode/server/descriptions/execute_source.md
- router/internal/codemode/server/descriptions/descriptions.go
- router/internal/codemode/server/descriptions/search_tool.md
- router/internal/codemode/server/descriptions/execute_tool.md
- router/internal/codemode/server/descriptions/persisted_ops_resource.md
🚧 Files skipped from review as they are similar to previous changes (2)
- router/internal/codemode/sandbox/validation.go
- router/internal/codemode/server/server.go
```go
fallback, err := previewEnvelope(sandboxResult.Result)
if err != nil {
	return ResultEnvelope{}, err
}
return ResultEnvelope{Result: fallback, Warnings: sandboxResult.Warnings, Truncated: true, Error: nil}, nil
```
Keep the scalar fallback within maxResultBytes.
The fallback path marks the result as truncated, but it never enforces the caller's byte budget after previewEnvelope() marshals the preview object. With a small non-default maxResultBytes, this can still return an oversized MCP payload.
Also applies to: 177-195
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/harness/envelope.go` around lines 56 - 60, After
previewEnvelope() returns the fallback object you must ensure its serialized
form respects the caller's maxResultBytes before returning the ResultEnvelope;
marshal the fallback using the same encoder/format used for MCP, measure the
byte length, and if it exceeds maxResultBytes replace the Result with a
scalar/truncated representation that fits (or apply the existing truncate helper
used elsewhere), set Truncated=true and return the ResultEnvelope; apply the
same fix to the analogous fallback handling around previewEnvelope usage at the
later block (lines ~177-195) so both paths enforce maxResultBytes.
```go
defer func() {
	if r := recover(); r != nil {
		errEnv := &ErrorEnvelope{Name: "Timeout", Message: fmt.Sprintf("sandbox runtime panic: %v", r)}
		if ctx.Err() != nil {
			errEnv.Message = ctx.Err().Error()
		}
		execResult = ExecuteResult{OK: false, Error: errEnv, OutputSize: envelopeSize(nil, errEnv)}
		retErr = nil
	}
}()
```
Don't classify every recovered panic as Timeout.
This recover block also catches unrelated panics from our own code paths, but it always reports them as Timeout. That will hide real defects behind the wrong error class and make alerts/debugging misleading. Only map to Timeout when ctx.Err() != nil; otherwise surface a generic runtime error.
Suggested fix

```diff
 defer func() {
 	if r := recover(); r != nil {
-		errEnv := &ErrorEnvelope{Name: "Timeout", Message: fmt.Sprintf("sandbox runtime panic: %v", r)}
-		if ctx.Err() != nil {
-			errEnv.Message = ctx.Err().Error()
-		}
+		errEnv := &ErrorEnvelope{
+			Name:    "Error",
+			Message: fmt.Sprintf("sandbox runtime panic: %v", r),
+		}
+		if ctx.Err() != nil {
+			errEnv.Name = "Timeout"
+			errEnv.Message = ctx.Err().Error()
+		}
 		execResult = ExecuteResult{OK: false, Error: errEnv, OutputSize: envelopeSize(nil, errEnv)}
 		retErr = nil
 	}
 }()
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/sandbox/execute.go` around lines 24 - 32, The
recover block incorrectly labels all panics as "Timeout"; change it so you only
set ErrorEnvelope.Name = "Timeout" when ctx.Err() != nil, and otherwise set a
generic runtime error name (e.g., "RuntimeError" or "InternalError") and include
the recovered value in ErrorEnvelope.Message; update the logic around errEnv
construction inside the deferred func that uses recover(), ensure ExecuteResult
is set with the correct errEnv and OutputSize via envelopeSize(nil, errEnv), and
keep retErr = nil behavior unchanged.
```go
client := cfg.HTTPClient
if client == nil {
	retryClient := retryablehttp.NewClient()
	retryClient.RetryMax = cfg.RetryAttempts
	retryClient.RetryWaitMax = cfg.RetryCeiling
	retryClient.Logger = nil
	client = retryClient.StandardClient()
}
```
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "sandbox.go" | grep -E "router/internal/codemode"
🏁 Script executed:
wc -l router/internal/codemode/sandbox/sandbox.go
🏁 Script executed:
sed -n '120,180p' router/internal/codemode/sandbox/sandbox.go
🏁 Script executed:
grep -n "defaultRetryAttempts" router/internal/codemode/sandbox/sandbox.go
🏁 Script executed:
grep -n "const (" router/internal/codemode/sandbox/sandbox.go | head -1
🏁 Script executed:
sed -n '1,50p' router/internal/codemode/sandbox/sandbox.go
🏁 Script executed:
grep -r "mutations\|Mutation" router/internal/codemode/sandbox/ --include="*.go" | head -20
🏁 Script executed:
sed -n '160,175p' router/internal/codemode/sandbox/sandbox.go
🏁 Script executed:
grep -n "http\|POST\|Method" router/internal/codemode/sandbox/host.go | head -30
🏁 Script executed:
grep -A 10 "func.*GraphQL\|RouterGraphQLEndpoint" router/internal/codemode/sandbox/host.go | head -40
🏁 Script executed:
sed -n '140,160p' router/internal/codemode/sandbox/host.go
🌐 Web query:
retryablehttp library POST requests retry behavior idempotent
💡 Result:
The retryablehttp library refers to github.com/hashicorp/go-retryablehttp, a Go HTTP client that automatically retries requests on client errors (e.g., connection issues) or 5xx server responses (except 501), using exponential backoff. This behavior applies regardless of HTTP method, including non-idempotent POST requests.[1][2][3] For POST requests with bodies, the library supports "rewinding" the body (via io.ReadSeeker or buffering) to enable retries without data loss, explicitly noting support for POST/PUT etc.[1][2] The default retry policy (DefaultRetryPolicy) does not consider request method or idempotency; it checks only errors and status codes via baseRetryPolicy, which retries on most client errors and 429/5xx.[3][4] Users have requested skipping retries for non-idempotent methods like POST (Issue #169, open since 2022), as the CheckRetry func receives only resp and err, not req, making method-based decisions hard—especially for connection errors where resp is nil.[5] Custom CheckRetry can be implemented, but requires workarounds for request access. To avoid retrying non-idempotent POSTs, use separate clients (one with retries for idempotent methods, one without) or custom logic.[5] Idempotency must be handled server-side (e.g., idempotency keys) if retries occur on POSTs.[6][7][8]
Citations:
- 1: https://pkg.go.dev/github.com/hashicorp/go-retryablehttp
- 2: https://github.com/hashicorp/go-retryablehttp
- 3: https://github.com/hashicorp/go-retryablehttp/blob/main/client.go
- 4: https://github.com/hashicorp/go-retryablehttp/blob/v0.7.1/client.go
- 5: A way to skip retrying non-idempotent requests hashicorp/go-retryablehttp#169
- 6: https://oneuptime.com/blog/post/2026-01-07-go-retry-exponential-backoff/view
- 7: https://dev.to/gabrielanhaia/idempotency-at-scale-the-pattern-that-prevents-double-charging-42j8
- 8: https://dev.to/highflyer910/idempotency-keys-what-most-tutorials-dont-tell-you-1ncc
🏁 Script executed:
grep -B 5 -A 10 "OperationKindMutation" router/internal/codemode/sandbox/host.go | head -40
Default retries duplicate approved mutations on partial upstream success.
GraphQL mutations are sent as POST requests (line 146, host.go) via the shared HTTP client constructed with retryablehttp.NewClient() (lines 128-135). The retryablehttp library retries all failed requests by default—including non-idempotent POST—on connection errors and 5xx responses. When a mutation partially succeeds upstream (the mutation executes, but the response times out before reaching the client), the retry will execute the same mutation again, creating a duplicate.
The safe opt-out via RetryAttempts: 0 is unavailable because withDefaults() (line 168) unconditionally rewrites any value <= 0 back to defaultRetryAttempts = 3.
Suggested fix:
- Change `defaultRetryAttempts` from `3` to `0` (safe-by-default, no retries)
- Change the check from `cfg.RetryAttempts <= 0` to `cfg.RetryAttempts < 0` (allow explicit `0` to disable, only force defaults for negative values)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/sandbox/sandbox.go` around lines 128 - 135, The
default retry behavior can cause duplicate GraphQL POST mutations; change the
safe defaults so retries are disabled unless explicitly set: set
defaultRetryAttempts from 3 to 0 and modify the withDefaults() check that
currently treats any value <= 0 as “unset” to only treat negative values as
unset (i.e., change the cfg.RetryAttempts <= 0 check to cfg.RetryAttempts < 0)
so an explicit 0 will be preserved and retryClient.RetryMax (set in the HTTP
client construction using retryablehttp.NewClient) will correctly disable
retries for POST mutations.
t.Cleanup(func() {
	select {
	case err := <-errs:
		if err != nil && !errors.Is(err, context.Canceled) {
			require.NoError(t, err)
		}
	default:
	}
Wait for the in-memory MCP server goroutine to exit in cleanup.
This non-blocking default can let the test finish before srv.mcpServer.Run returns, which hides transport/protocol errors and can leave the goroutine running into later tests. The cleanup should do a bounded wait here, like startServer does.
Proposed fix
t.Cleanup(func() {
select {
case err := <-errs:
if err != nil && !errors.Is(err, context.Canceled) {
require.NoError(t, err)
}
- default:
+ case <-time.After(5 * time.Second):
+ require.FailNow(t, "in-memory MCP server did not stop")
}
})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/server/server_test.go` around lines 460 - 467, The
cleanup currently uses a non-blocking select on the errs channel which can let
the test exit before srv.mcpServer.Run returns; replace the default branch with
a bounded wait like startServer does: in the t.Cleanup closure block wait on
errs with a select that has two cases—one receiving err from errs (if err != nil
&& !errors.Is(err, context.Canceled) call require.NoError) and a time.After case
using the same timeout value used by startServer—to ensure the in-memory MCP
server goroutine has a bounded time to exit instead of returning immediately.
The Employee.startDate field carries @requiresScopes in the shared demo/pkg/subgraphs/employees schema; the router's authorizer is always active and rejects unauthenticated callers as soon as a scoped field is touched, breaking the code-mode demo.

The shared schema is exercised by router-tests/security/authentication_test.go, so it has to stay intact. Instead, prepare-schemas.sh copies the four code-mode subgraph schemas into demo/code-mode/schemas/ with @authenticated and @requiresScopes directive applications removed, and graph.yaml composes from those copies. The make compose target depends on prepare-schemas so the local copies always track upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operation identity is now the ShortSHA over the canonical body — eight
hex chars prefixed with 'o' so it's a valid JS identifier. The model
calls tools.<sha>(...) inside the sandbox, the search response surfaces
(sha, description) pairs plus a TS fragment that names each method by
sha, and storage dedupes by canonical body alone.
Previously, yoko regenerating an operation under the same document name
("getOrders" with v1 vs. v2 selection sets) silently overwrote the
first registration under NormalizeName-based dedup. Tying identity to
content makes that collision impossible — different bodies always get
different identifiers, identical bodies always converge.
DocumentName is preserved separately because the router's GraphQL
parser still needs the literal operation name from the body when
invoking against /graphql. NormalizeName, SuffixedName, and the
reserved-word table are gone; tests are rewritten to compute SHAs from
bodies so expectations stay self-checking. router-tests duplicates a
small codeModeShortSHA helper because it's a separate module and can't
import internal/codemode/storage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merges code-mode-v2 (#2825) and adds a per-sub-server code_mode_run_js MCP tool that lets an LLM compose multiple operations in one V8-sandboxed JS call instead of N MCP round-trips.

The new tool is auto-registered on every MultiServer sub-server that has at least one operation loaded. Each sub-server gets its own sandbox bound to its upstream URL; tools.<name>(vars) dispatches via the existing op catalog (no Yoko, no storage backend, no session state). Reuses the codemode/sandbox, codemode/harness, and codemode/tsgen packages from the merged PR. The description bundled inline includes the typed TS catalog plus shape constraints (single async-arrow root, no top-level await, no IIFE, sandbox console disabled, etc.) so the LLM lands on the right syntax on the first try.

Compile fixes from the merge:
- Add EnsureIndexed(ctx) error to yoko.Searcher and *yoko.Client
- Rename opsFragment -> newOpsFragment to match call sites/tests
- Update fakeYoko test fake with EnsureIndexed counter
- Resolve mcpServer field conflict in router_config.go (keep MultiServer type, add codeModeServer alongside)
Multi-topic snapshot of code-mode-v2 work in progress.

demo/code-mode:
- Include all non-EDFS demo subgraphs in the local federation: hobbies, products, test1, countries + the products_fg feature graph (under feature flag myff), in addition to the existing employees, family, availability, mood. Mirrors demo/graph-no-edg.yaml. The previous 4-subgraph set silently lacked Employee.hobbies (and Country, products, test1 fields), so code-mode-search-tools could never resolve prompts about those domains.
- Fix BSD sed bug in prepare-schemas.sh: the previous regex used `\b`, which macOS BSD sed treats as a literal `b`, so @authenticated stayed on type/enum/interface definitions. Switch to a portable (^|whitespace)/(non-word|EOL) anchor pair. Also extend the loop to cover all 9 subgraphs.
- start.sh + run_subgraphs_subset.sh: launch the 5 added subgraphs on ports 4003/4004/4006/4009/4010 with matching wait_url checks.
- README + Makefile comment: document the new subgraph set.

yoko proto + mock + client:
- Rewrite proto/wg/cosmo/code_mode/yoko/v1/yoko.proto and regenerate router/gen/proto/.../{yoko.pb.go, yokov1connect/yoko.connect.go}.
- Rewrite demo/code-mode/yoko-mock/main.go (and tests).
- Refresh router/internal/codemode/yoko/{client.go, client_test.go, searcher.go} against the new contract.
- Bump demo/code-mode/{mcp-stdio-proxy,yoko-mock}/go.{mod,sum}.

Connect demo:
- Refresh demo/code-mode-connect/{README.md, router-config.yaml, start.sh} against the same yoko contract.

Build / generate:
- Makefile: pass --include-imports to `buf generate` so transitive protobuf imports are regenerated alongside the yoko v1 schema.
- buf.yaml + buf.lock: pin the buf module deps now needed for that --include-imports run.
- router/gen/proto/buf/validate/validate.pb.go: regenerated buf validate import.

Misc:
- router/pkg/codemode/varschema: new package + tests.
- router/pkg/grpcconnector/grpcplugin/grpc_plugin.go: minor update.
- router/internal/codemode/sandbox/sandbox_test.go + router/internal/codemode/server/observability_handler_test.go: adjust tests for the refreshed client surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router/go.mod (1)
46-46: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Critical: OpenTelemetry SDK version vulnerable to PATH hijacking (CVE-2026-24051).
The `go.opentelemetry.io/otel/sdk` module is pinned to v1.28.0 via the replace directive (line 199), which is vulnerable to GHSA-9h8m-3fm2-qjrq / CVE-2026-24051 (PATH hijacking on macOS/Darwin). The required version in line 46 (v1.39.0) is also insufficient. Upgrade to v1.40.0 or later.
Based on learnings: in wundergraph/cosmo, `google.golang.org/grpc` v1.79.3 can transitively pull a vulnerable `go.opentelemetry.io/otel/sdk` version (GHSA-9h8m-3fm2-qjrq / CVE-2026-24051, PATH hijacking on macOS/Darwin). In every module `go.mod`, explicitly pin `go.opentelemetry.io/otel/sdk` to `v1.40.0` or later (or add/adjust the corresponding `require`/`replace` so the resolved version is >= v1.40.0). Do not rely on transitive resolution for this dependency.
🔒 Proposed fix to upgrade otel/sdk
  go.opentelemetry.io/otel/metric => go.opentelemetry.io/otel/metric v1.28.0
- go.opentelemetry.io/otel/sdk => go.opentelemetry.io/otel/sdk v1.28.0
+ go.opentelemetry.io/otel/sdk => go.opentelemetry.io/otel/sdk v1.40.0
  go.opentelemetry.io/otel/sdk/metric => go.opentelemetry.io/otel/sdk/metric v1.28.0
Also update line 46 if needed:
- go.opentelemetry.io/otel/sdk v1.39.0
+ go.opentelemetry.io/otel/sdk v1.40.0
Also applies to: 199-199
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/go.mod` at line 46, The go.mod currently pins go.opentelemetry.io/otel/sdk to a vulnerable version (replace directive) and also lists v1.39.0 in requires; explicitly require or replace go.opentelemetry.io/otel/sdk to v1.40.0 or later in this module's go.mod so the resolved version is >= v1.40.0 (do not rely on transitive resolution from grpc), update the existing require/replace entries for go.opentelemetry.io/otel/sdk to v1.40.0+ and remove or adjust any replace that pins it to v1.28.0 (ensure the module name go.opentelemetry.io/otel/sdk is updated consistently).
♻️ Duplicate comments (1)
demo/code-mode/yoko-mock/main.go (1)
163-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
`IndexSchema` still has a check-then-insert race.
Two concurrent calls for the same SDL can both miss the first lookup, both create temp dirs and pre-warm codex, and the later insert overwrites the earlier entry. That leaks the first temp dir/session and doubles an expensive external call.
Proposed fix
 entry := &schemaEntry{dir: dir, schema: schemaDoc, sessionID: sessionID}
 s.mu.Lock()
+if existing, ok := s.schemas[id]; ok {
+	s.mu.Unlock()
+	_ = os.RemoveAll(dir)
+	existing.mu.RLock()
+	existingSession := existing.sessionID
+	existing.mu.RUnlock()
+	log.Printf("IndexSchema schema_id=%s reused dir=%s session_id=%s", id, existing.dir, existingSession)
+	return connect.NewResponse(&yokov1.IndexSchemaResponse{SchemaId: id}), nil
+}
 s.schemas[id] = entry
 s.mu.Unlock()
Also applies to: 194-197
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/yoko-mock/main.go` around lines 163 - 172, IndexSchema has a check-then-insert race around s.schemas: two goroutines can both miss the lookup, each create resources, and then one overwrites the other. Fix by making the create-and-insert atomic under s.mu: when id is missing, allocate and insert a placeholder entry into s.schemas (e.g., a new schema struct with its own mutex or a sync.Once flag) before releasing s.mu, then perform expensive prep (temp dir + pre-warm) while other callers will see the placeholder and wait on its per-entry mutex/flag; update the placeholder with dir/sessionID and unlock its mutex when done. Apply the same pattern used in IndexSchema to the other occurrence referenced (lines 194-197).
🧹 Nitpick comments (2)
router/internal/codemode/sandbox/sandbox_test.go (1)
682-694: ⚡ Quick win
Prefer `sync.WaitGroup.Go` over manual `Add`/`Done` goroutine wiring.
Lines 682-694 use the manual pattern. In Go 1.25+, `wg.Go` is the preferred approach and reduces bookkeeping drift risk.
♻️ Proposed refactor
 var wg sync.WaitGroup
 for range 5 {
-	wg.Add(1)
-	go func() {
-		defer wg.Done()
+	wg.Go(func() {
 		_, err := s.Execute(context.Background(), ExecuteRequest{
 			SessionID: "s1",
 			ToolNames: []string{"ping"},
 			WrappedJS: `async () => await tools.ping()`,
 		})
 		assert.NoError(t, err)
-	}()
+	})
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_test.go` around lines 682 - 694, Replace the manual wg.Add/Done goroutine pattern with the Go 1.25 WaitGroup.Go helper: instead of calling wg.Add(1) and deferring wg.Done() inside the closure, call wg.Go(func() { ... }) so the WaitGroup handles the bookkeeping automatically; update the anonymous goroutine that invokes s.Execute with ExecuteRequest (SessionID "s1", ToolNames ["ping"], WrappedJS `async () => await tools.ping()`) to run inside wg.Go and keep the assert.NoError(t, err) check inside that function.demo/code-mode-connect/start.sh (1)
69-90: 💤 Low value
Unused `wait_url` function.
The `wait_url` function is defined but never called in this script. If it's kept for consistency with other demo start scripts or planned future use, consider adding a brief comment noting this.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode-connect/start.sh` around lines 69 - 90, The wait_url function is defined but never called; either remove it or explicitly mark it as intentionally unused by adding a brief comment above wait_url explaining it’s kept for consistency with other demo start scripts (or for future use), or if it should be used, wire it up where services are started and reference its signature (wait_url name url [timeout]) and the LOG_DIR variable so callers can report logs on timeout. Ensure the chosen fix updates any documentation/comments so the intent is clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@demo/code-mode/start.sh`:
- Around line 124-126: The script currently unconditionally removes "$PID_FILE",
risking orphaning a running demo; before rm -f "$PID_FILE" update start.sh to
detect if "$PID_FILE" exists, read the PID, and attempt to stop the tracked demo
(e.g., call the existing stop/cleanup logic or send a graceful SIGTERM to that
PID and wait for it to exit), only after confirming the old process is stopped
should you remove "$PID_FILE"; keep the mkdir -p "$LOG_DIR" and mkdir -p
"$GOCACHE_DIR" lines but move or guard the rm -f "$PID_FILE" so it runs after
the stop check and successful termination check.
- Around line 161-169: The script is backgrounding a pipeline so $! references
the tee/process group instead of the router; change the invocation so the router
is the direct backgrounded process (use process substitution), e.g. run
"$ROUTER_BIN" -config "$ROUTER_CONFIG" > >(tee "$LOG_DIR/router.log") 2>&1 &
then capture router_pid="$!" and call append_pid router "$router_pid" and wait
"$router_pid" so the router process is tracked directly; update the lines using
ROUTER_BIN, ROUTER_CONFIG, LOG_DIR, router_pid, append_pid and wait accordingly.
In `@router/internal/codemode/storage/memory_backend_test.go`:
- Around line 325-340: The test currently calls require.NoError inside the
worker goroutine after calling backend.Append, which only fails the goroutine
and won't stop the test; instead, capture the append error in a channel (e.g.,
make an errs chan error), have each goroutine send its err (and optionally the
resolved result) to that channel rather than calling require.NoError, then after
wg.Wait() drain the errs channel and call require.NoError for each error (and
assert results as needed) following the pattern used in redis_backend_test.go;
reference backend.Append, require.NoError, results channel, wg.Wait and the
worker goroutine to locate and change the logic.
In `@router/internal/codemode/storage/memory_backend.go`:
- Around line 125-129: Move the session.lastUsed update out of the appendedAny
conditional so the session's recency is refreshed whenever ops are processed
(even if they dedupe), but keep session.bundle = "" and session.bundleValid =
false only when appendedAny is true; specifically, in the code handling incoming
ops in memory_backend.go update session.lastUsed = b.now() unconditionally when
processing the batch, and leave bundle invalidation inside the if appendedAny
block (referencing session.lastUsed, appendedAny, session.bundle,
session.bundleValid and b.now()).
In `@router/internal/codemode/storage/naming.go`:
- Around line 33-34: CanonicalBody currently uses strings.Fields which collapses
whitespace inside string and block-string literals and can make distinct GraphQL
documents collide (affecting ShortSHA/deduping); replace the whitespace-based
approach in CanonicalBody with a GraphQL-aware canonicalizer: parse the body
into an AST using the project's GraphQL parser and re-print/serialize the AST
(or strip only GraphQL-ignored tokens) so that literals remain intact and only
syntactically-ignored spacing is normalized; update CanonicalBody to call the
parse+print flow (or a dedicated GraphQL canonicalize function) instead of
strings.Fields to preserve literal content.
In `@router/internal/codemode/storage/redis_backend.go`:
- Around line 189-203: ListNames currently reads entries via readOps without
refreshing the session TTL, allowing an active session to expire; update
ListNames to refresh the session TTL after a successful read (e.g., call the
same TTL-refresh helper used elsewhere or invoke b.client.Expire(ctx,
b.opsKey(sessionID), b.ttl)) so reads keep sessions alive. Apply the same change
to the cache-hit path in Bundle (the code around the Block starting at lines
211-220) so both ListNames and cache-hit Bundle refresh the session/ops key TTL
after successful reads.
- Around line 141-145: Bundle can cache a stale render because it reads opsKey
then later writes bundleKey without ensuring opsKey didn't change; fix by making
the read-and-write atomic: in Bundle (and other affected code paths around
opsKey/bundleKey) perform a conditional write that verifies opsKey hasn't
changed (use Redis WATCH/MULTI/EXEC or a single EVAL Lua script that reads
opsKey and only writes bundleKey if opsKey still matches the value you read)
instead of the current separate read then tx.TxPipelined Set/Expire/Del
sequence; ensure Append still invalidates bundleKey and that any
pipeline/transaction uses the same opsKey version check to avoid committing
stale bundles.
---
Outside diff comments:
In `@router/go.mod`:
- Line 46: The go.mod currently pins go.opentelemetry.io/otel/sdk to a
vulnerable version (replace directive) and also lists v1.39.0 in requires;
explicitly require or replace go.opentelemetry.io/otel/sdk to v1.40.0 or later
in this module's go.mod so the resolved version is >= v1.40.0 (do not rely on
transitive resolution from grpc), update the existing require/replace entries
for go.opentelemetry.io/otel/sdk to v1.40.0+ and remove or adjust any replace
that pins it to v1.28.0 (ensure the module name go.opentelemetry.io/otel/sdk is
updated consistently).
---
Duplicate comments:
In `@demo/code-mode/yoko-mock/main.go`:
- Around line 163-172: IndexSchema has a check-then-insert race around
s.schemas: two goroutines can both miss the lookup, each create resources, and
then one overwrites the other. Fix by making the create-and-insert atomic under
s.mu: when id is missing, allocate and insert a placeholder entry into s.schemas
(e.g., a new schema struct with its own mutex or a sync.Once flag) before
releasing s.mu, then perform expensive prep (temp dir + pre-warm) while other
callers will see the placeholder and wait on its per-entry mutex/flag; update
the placeholder with dir/sessionID and unlock its mutex when done. Apply the
same pattern used in IndexSchema to the other occurrence referenced (lines
194-197).
---
Nitpick comments:
In `@demo/code-mode-connect/start.sh`:
- Around line 69-90: The wait_url function is defined but never called; either
remove it or explicitly mark it as intentionally unused by adding a brief
comment above wait_url explaining it’s kept for consistency with other demo
start scripts (or for future use), or if it should be used, wire it up where
services are started and reference its signature (wait_url name url [timeout])
and the LOG_DIR variable so callers can report logs on timeout. Ensure the
chosen fix updates any documentation/comments so the intent is clear.
In `@router/internal/codemode/sandbox/sandbox_test.go`:
- Around line 682-694: Replace the manual wg.Add/Done goroutine pattern with the
Go 1.25 WaitGroup.Go helper: instead of calling wg.Add(1) and deferring
wg.Done() inside the closure, call wg.Go(func() { ... }) so the WaitGroup
handles the bookkeeping automatically; update the anonymous goroutine that
invokes s.Execute with ExecuteRequest (SessionID "s1", ToolNames ["ping"],
WrappedJS `async () => await tools.ping()`) to run inside wg.Go and keep the
assert.NoError(t, err) check inside that function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0fcb2a1d-9a7b-444e-a742-7c8741688b64
⛔ Files ignored due to path filters (7)
- `buf.lock` is excluded by `!**/*.lock`
- `demo/code-mode/mcp-stdio-proxy/go.sum` is excluded by `!**/*.sum`
- `demo/code-mode/yoko-mock/go.sum` is excluded by `!**/*.sum`
- `router/gen/proto/buf/validate/validate.pb.go` is excluded by `!**/*.pb.go`, `!**/gen/**`
- `router/gen/proto/wg/cosmo/code_mode/yoko/v1/yoko.pb.go` is excluded by `!**/*.pb.go`, `!**/gen/**`
- `router/gen/proto/wg/cosmo/code_mode/yoko/v1/yokov1connect/yoko.connect.go` is excluded by `!**/gen/**`
- `router/go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (38)
- Makefile
- buf.yaml
- demo/code-mode-connect/README.md
- demo/code-mode-connect/router-config.yaml
- demo/code-mode-connect/start.sh
- demo/code-mode/.gitignore
- demo/code-mode/Makefile
- demo/code-mode/README.md
- demo/code-mode/graph.yaml
- demo/code-mode/mcp-stdio-proxy/go.mod
- demo/code-mode/prepare-schemas.sh
- demo/code-mode/router-config.yaml
- demo/code-mode/run_subgraphs_subset.sh
- demo/code-mode/start.sh
- demo/code-mode/yoko-mock/go.mod
- demo/code-mode/yoko-mock/main.go
- demo/code-mode/yoko-mock/main_test.go
- proto/wg/cosmo/code_mode/yoko/v1/yoko.proto
- router-tests/code_mode_named_ops_test.go
- router/go.mod
- router/internal/codemode/sandbox/host.go
- router/internal/codemode/sandbox/sandbox_test.go
- router/internal/codemode/server/observability_handler_test.go
- router/internal/codemode/server/search_handler.go
- router/internal/codemode/server/search_handler_test.go
- router/internal/codemode/storage/memory_backend.go
- router/internal/codemode/storage/memory_backend_test.go
- router/internal/codemode/storage/naming.go
- router/internal/codemode/storage/naming_test.go
- router/internal/codemode/storage/redis_backend.go
- router/internal/codemode/storage/redis_backend_test.go
- router/internal/codemode/storage/types.go
- router/internal/codemode/yoko/client.go
- router/internal/codemode/yoko/client_test.go
- router/internal/codemode/yoko/searcher.go
- router/pkg/codemode/varschema/varschema.go
- router/pkg/codemode/varschema/varschema_test.go
- router/pkg/grpcconnector/grpcplugin/grpc_plugin.go
✅ Files skipped from review due to trivial changes (9)
- demo/code-mode/.gitignore
- demo/code-mode/README.md
- demo/code-mode/mcp-stdio-proxy/go.mod
- demo/code-mode-connect/README.md
- demo/code-mode/router-config.yaml
- router/pkg/codemode/varschema/varschema_test.go
- router/internal/codemode/server/observability_handler_test.go
- demo/code-mode-connect/router-config.yaml
- router-tests/code_mode_named_ops_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
- router/internal/codemode/storage/types.go
- demo/code-mode/yoko-mock/go.mod
- router/internal/codemode/storage/naming_test.go
- router/internal/codemode/yoko/client.go
- router/internal/codemode/server/search_handler.go
- router/internal/codemode/sandbox/host.go
- router/internal/codemode/yoko/client_test.go
mkdir -p "$LOG_DIR"
mkdir -p "$GOCACHE_DIR"
rm -f "$PID_FILE"
Stop the previous tracked demo before wiping the PID file.
Unconditionally deleting "$PID_FILE" here can orphan an already-running demo. If old subgraphs are still bound to 4001-4010, the new go run commands can fail immediately while wait_url still succeeds against the stale processes.
Proposed fix
mkdir -p "$LOG_DIR"
mkdir -p "$GOCACHE_DIR"
-rm -f "$PID_FILE"
+if [ -f "$PID_FILE" ]; then
+ echo "Existing Code Mode demo detected; stopping tracked processes first"
+ kill_pid_file
+fi
trap cleanup EXIT INT TERM
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/code-mode/start.sh` around lines 124 - 126, The script currently
unconditionally removes "$PID_FILE", risking orphaning a running demo; before rm
-f "$PID_FILE" update start.sh to detect if "$PID_FILE" exists, read the PID,
and attempt to stop the tracked demo (e.g., call the existing stop/cleanup logic
or send a graceful SIGTERM to that PID and wait for it to exit), only after
confirming the old process is stopped should you remove "$PID_FILE"; keep the
mkdir -p "$LOG_DIR" and mkdir -p "$GOCACHE_DIR" lines but move or guard the rm
-f "$PID_FILE" so it runs after the stop check and successful termination check.
| echo "Starting router in foreground" | ||
| echo "Router output is being teed to $LOG_DIR/router.log" | ||
| # Tee stdout+stderr so the user still sees live output AND we keep a persistent | ||
| # log for post-mortem debugging when the router exits unexpectedly. | ||
| "$ROUTER_BIN" -config "$ROUTER_CONFIG" 2>&1 | tee "$LOG_DIR/router.log" & | ||
| router_pid="$!" | ||
| append_pid router "$router_pid" | ||
| wait "$router_pid" |
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and view the relevant lines
fd -t f "start.sh" demo/
Repository: wundergraph/cosmo
Length of output: 117
🏁 Script executed:
# View the file context around lines 161-169
if [ -f "demo/code-mode/start.sh" ]; then
wc -l "demo/code-mode/start.sh"
sed -n '155,175p' "demo/code-mode/start.sh"
fi
Length of output: 662
🏁 Script executed:
#!/bin/bash
# Test bash behavior with pipes and backgrounding
# Does $! reference the tee process or the router process?
# Create a test directory
mkdir -p /tmp/bash_pid_test
cd /tmp/bash_pid_test
# Test 1: Current approach (pipe to tee with background)
echo "=== Test 1: pipe | tee & ==="
(sleep 100 | tee output.log &) &
last_pid=$!
sleep 0.2
# Check which process the PID refers to
ps -p $last_pid -o pid,cmd | head -2
# Test 2: Process substitution approach
echo ""
echo "=== Test 2: process substitution > (tee) & ==="
(sleep 100 > >(tee output2.log) &) &
last_pid=$!
sleep 0.2
ps -p $last_pid -o pid,cmd | head -2
# Test 3: Check pipeline job details
echo ""
echo "=== Test 3: Check pipeline members with jobs ==="
(sleep 100 | tee output3.log &)
sleep 0.2
jobs -l
# Clean up
cd /
rm -rf /tmp/bash_pid_test
Length of output: 477
🌐 Web query:
bash $! variable pipeline backgrounding behavior which process captured
💡 Result:
In Bash, `$!` expands to the process ID of the job most recently placed in the background. When a pipeline such as `cmd | tee log` is backgrounded, that is the PID of the last command in the pipeline (`tee`), not the first. To track the first command directly, background it with its output redirected into a process substitution (`cmd > >(tee log) &`), so `$!` refers to `cmd` itself.
Citations:
- 1: https://www.gnu.org/s/bash/manual/html_node/Special-Parameters.html
- 2: https://unix.stackexchange.com/questions/679783/reliable-way-to-get-pid-of-piped-background-process
- 3: https://www.gnu.org/software/bash/manual/html_node/Pipelines
- 4: https://www.gnu.org/s/bash/manual/html_node/Job-Control-Builtins.html
Track the router process directly instead of a backgrounded pipeline.
When "$ROUTER_BIN" ... | tee ... & is backgrounded, $! captures the PID of the last command in the pipeline (tee), not the router itself, so append_pid records the wrong process and wait monitors tee rather than the router. Use process substitution instead so the router is the direct backgrounded target.
Proposed fix
echo "Starting router in foreground"
echo "Router output is being teed to $LOG_DIR/router.log"
# Tee stdout+stderr so the user still sees live output AND we keep a persistent
# log for post-mortem debugging when the router exits unexpectedly.
-"$ROUTER_BIN" -config "$ROUTER_CONFIG" 2>&1 | tee "$LOG_DIR/router.log" &
+"$ROUTER_BIN" -config "$ROUTER_CONFIG" > >(tee "$LOG_DIR/router.log") 2>&1 &
router_pid="$!"
append_pid router "$router_pid"
wait "$router_pid"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/code-mode/start.sh` around lines 161 - 169, The script is backgrounding
a pipeline so $! references the tee/process group instead of the router; change
the invocation so the router is the direct backgrounded process (use process
substitution), e.g. run "$ROUTER_BIN" -config "$ROUTER_CONFIG" > >(tee
"$LOG_DIR/router.log") 2>&1 & then capture router_pid="$!" and call append_pid
router "$router_pid" and wait "$router_pid" so the router process is tracked
directly; update the lines using ROUTER_BIN, ROUTER_CONFIG, LOG_DIR, router_pid,
append_pid and wait accordingly.
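The `$!` behavior described above can be checked directly. This is a minimal sketch, assuming a Linux `ps` that supports `-o comm=` and `bash` on PATH for the process-substitution half:

```shell
# Background a pipeline: $! is the PID of the LAST pipeline member (cat),
# not the first command (sleep).
sleep 1 | cat &
pipeline_pid=$!
sleep 0.2   # give the shell a moment to fork/exec the pipeline members
pipeline_name=$(ps -p "$pipeline_pid" -o comm= | tr -d '[:space:]')
echo "pipeline \$! -> $pipeline_name"

# With process substitution (bash-only syntax, so run via bash -c here),
# the first command is backgrounded directly and $! is its PID.
direct_name=$(bash -c 'sleep 1 > >(cat) & sleep 0.2; ps -p "$!" -o comm=' | tr -d '[:space:]')
echo "process substitution \$! -> $direct_name"

wait 2>/dev/null || true
```

The first `ps` reports `cat`, the second `sleep`, which is exactly why the review suggests `> >(tee ...)` for tracking the router.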
| var wg sync.WaitGroup | ||
| results := make(chan []SessionOp, goroutines) | ||
| for i := range goroutines { | ||
| wg.Add(1) | ||
| go func(i int) { | ||
| defer wg.Done() | ||
| resolved, err := backend.Append(ctx, "shared", []SessionOp{{ | ||
| Name: sha, | ||
| Body: body, | ||
| Kind: OperationKindQuery, | ||
| Description: fmt.Sprintf("Shared %d", i), | ||
| }}) | ||
| require.NoError(t, err) | ||
| results <- resolved | ||
| }(i) |
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and read the test file
fd "memory_backend_test.go" router/
Length of output: 117
🏁 Script executed:
# Read the specific lines mentioned in the review
sed -n '320,345p' router/internal/codemode/storage/memory_backend_test.go | cat -n
Length of output: 764
🏁 Script executed:
# Get context of the full test function to understand the pattern
sed -n '315,350p' router/internal/codemode/storage/memory_backend_test.go | cat -n
Length of output: 1123
🏁 Script executed:
# Check for other uses of require/assert in goroutines in this file
rg -n "go func" router/internal/codemode/storage/memory_backend_test.go -A 15 | head -100
Length of output: 451
🏁 Script executed:
# Understand the testify require behavior - check documentation/imports
head -50 router/internal/codemode/storage/memory_backend_test.go | grep -E "^import|require|assert"
Length of output: 147
🏁 Script executed:
# Check if there are other concurrent test patterns in this file
rg -n "defer wg.Done()" router/internal/codemode/storage/memory_backend_test.go -B 5 -A 10 | head -80
Length of output: 511
🏁 Script executed:
# Look for how errors are handled in other goroutine tests in the project
rg -l "go func.*require\." . --type go | head -10
Length of output: 43
🏁 Script executed:
# Check the full test file to see if there are other concurrent tests
wc -l router/internal/codemode/storage/memory_backend_test.go
Length of output: 121
🏁 Script executed:
# Get a broader view of test patterns - look for other concurrent patterns
rg "require\." router/internal/codemode/storage/memory_backend_test.go | grep -c "require"
Length of output: 64
🏁 Script executed:
# Check if there are other goroutine+require patterns across the entire codebase
rg "go func.*{" . --type go -A 5 | rg "require\." | wc -l
Length of output: 165
🏁 Script executed:
# Look at the specific test to understand the complete behavior
sed -n '320,345p' router/internal/codemode/storage/memory_backend_test.go
Length of output: 582
🏁 Script executed:
# Check Go test best practices - look for error handling patterns in similar concurrent tests
rg "wg.Wait()" . --type go -B 10 | grep -E "(require\.|wg.Wait|errors|chan)" | head -20
Length of output: 1505
🏁 Script executed:
# Let's understand the exact issue - check how require.NoError behaves
# by looking at what FailNow does in the context of goroutines
rg "FailNow" . --type go | head -5
Length of output: 448
🏁 Script executed:
# Check if there's any documentation or comments about goroutine testing patterns
rg "goroutine.*require|require.*goroutine|non-main.*goroutine" . --type go
Length of output: 43
🏁 Script executed:
# Let's see the complete test function to verify the behavior
sed -n '308,370p' router/internal/codemode/storage/memory_backend_test.go
Length of output: 1618
🏁 Script executed:
# Search for how testify's require.NoError is documented to behave
# and check if there are any comments in the codebase about this pattern
rg "require\.NoError.*goroutine|goroutine.*require\.NoError" . --type go
Length of output: 43
🏁 Script executed:
# Check if there are any error-collection patterns used elsewhere in the codebase
rg "errs.*:=.*make.*error" . --type go -A 3 | head -30
Length of output: 1783
🏁 Script executed:
# Look for any discussion or comments about goroutine assertions in the file
grep -n "goroutine\|parallel\|concurrent" router/internal/codemode/storage/memory_backend_test.go
Length of output: 172
🏁 Script executed:
# Let's check the redis_backend_test.go pattern to see how it properly handles goroutine errors
rg -n "errs := make.*error" router/internal/codemode/storage/redis_backend_test.go -A 20 | head -50
Length of output: 679
🏁 Script executed:
# Check if the memory backend test continues to use results after potential goroutine failure
sed -n '340,360p' router/internal/codemode/storage/memory_backend_test.go
Length of output: 506
🏁 Script executed:
# Quick verification: confirm the redis_backend_test pattern asserts after wg.Wait()
sed -n '290,295p' router/internal/codemode/storage/redis_backend_test.go
Length of output: 140
Move require.NoError out of the worker goroutine.
Assertions in spawned goroutines only exit that goroutine on failure, not the test. If one append fails, the test continues draining the results channel without surfacing the error. Send err over a channel and assert after wg.Wait(), following the pattern in redis_backend_test.go.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/memory_backend_test.go` around lines 325 -
340, The test currently calls require.NoError inside the worker goroutine after
calling backend.Append, which only fails the goroutine and won't stop the test;
instead, capture the append error in a channel (e.g., make an errs chan error),
have each goroutine send its err (and optionally the resolved result) to that
channel rather than calling require.NoError, then after wg.Wait() drain the errs
channel and call require.NoError for each error (and assert results as needed)
following the pattern used in redis_backend_test.go; reference backend.Append,
require.NoError, results channel, wg.Wait and the worker goroutine to locate and
change the logic.
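The error-channel pattern the comment asks for can be sketched with a stand-in append function (`fakeAppend` and `runWorkers` are hypothetical names, not the repository's API): every goroutine sends its error over a buffered channel, and all checking happens on the main test goroutine after `wg.Wait()`.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeAppend stands in for backend.Append: it fails for odd worker IDs so the
// error-collection path is exercised.
func fakeAppend(worker int) error {
	if worker%2 == 1 {
		return fmt.Errorf("worker %d: append failed", worker)
	}
	return nil
}

// runWorkers fans out goroutines that report errors over a channel instead of
// asserting inside the goroutine, then counts failures after wg.Wait().
func runWorkers(goroutines int) int {
	var wg sync.WaitGroup
	errs := make(chan error, goroutines) // buffered: senders never block

	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Do NOT call require.NoError here: a failing assertion in a
			// spawned goroutine only exits that goroutine, not the test.
			errs <- fakeAppend(i)
		}(i)
	}

	wg.Wait()
	close(errs)

	// The main (test) goroutine is the right place for require.NoError.
	failed := 0
	for err := range errs {
		if err != nil {
			failed++
		}
	}
	return failed
}

func main() {
	fmt.Println("failures:", runWorkers(4)) // workers 1 and 3 fail
}
```

In the real test, the loop after `wg.Wait()` would call `require.NoError(t, err)` for each drained error, which fails the test on the main goroutine as intended.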
| if appendedAny { | ||
| session.lastUsed = b.now() | ||
| session.bundle = "" | ||
| session.bundleValid = false | ||
| } |
Refresh lastUsed even when every incoming op dedupes.
A repeated search that resolves entirely to existing ops leaves the session's recency unchanged here, so an actively used session can still age out via TTL/LRU. Only bundle invalidation needs to stay conditional.
Suggested patch
- if appendedAny {
- session.lastUsed = b.now()
- session.bundle = ""
- session.bundleValid = false
- }
+ session.lastUsed = b.now()
+ if appendedAny {
+ session.bundle = ""
+ session.bundleValid = false
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/memory_backend.go` around lines 125 - 129,
Move the session.lastUsed update out of the appendedAny conditional so the
session's recency is refreshed whenever ops are processed (even if they dedupe),
but keep session.bundle = "" and session.bundleValid = false only when
appendedAny is true; specifically, in the code handling incoming ops in
memory_backend.go update session.lastUsed = b.now() unconditionally when
processing the batch, and leave bundle invalidation inside the if appendedAny
block (referencing session.lastUsed, appendedAny, session.bundle,
session.bundleValid and b.now()).
| func CanonicalBody(body string) string { | ||
| return strings.Join(strings.Fields(body), " ") |
Use a GraphQL-aware canonicalizer for operation identity.
strings.Fields also rewrites whitespace inside string and block-string literals, so distinct documents can collapse to the same ShortSHA and dedupe as one stored op. That can surface the wrong operation later. Strip only GraphQL ignored tokens, or canonicalize via parse+print instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/naming.go` around lines 33 - 34,
CanonicalBody currently uses strings.Fields which collapses whitespace inside
string and block-string literals and can make distinct GraphQL documents collide
(affecting ShortSHA/deduping); replace the whitespace-based approach in
CanonicalBody with a GraphQL-aware canonicalizer: parse the body into an AST
using the project's GraphQL parser and re-print/serialize the AST (or strip only
GraphQL-ignored tokens) so that literals remain intact and only
syntactically-ignored spacing is normalized; update CanonicalBody to call the
parse+print flow (or a dedicated GraphQL canonicalize function) instead of
strings.Fields to preserve literal content.
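One way to realize the "strip only GraphQL ignored tokens" option without pulling in a parser: collapse whitespace only outside string and block-string literals. This is a rough sketch under that assumption (not the repository's code; commas and comments are left untouched):

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalBody normalizes whitespace in a GraphQL document while preserving
// the exact contents of "..." and """...""" literals, so documents that differ
// only inside strings no longer collapse to the same canonical form.
func canonicalBody(body string) string {
	var out strings.Builder
	i, n := 0, len(body)
	pendingSpace := false
	for i < n {
		switch {
		case strings.HasPrefix(body[i:], `"""`):
			end := strings.Index(body[i+3:], `"""`)
			flushSpace(&out, &pendingSpace)
			if end < 0 { // unterminated block string: copy the rest verbatim
				out.WriteString(body[i:])
				i = n
			} else {
				out.WriteString(body[i : i+3+end+3])
				i += 3 + end + 3
			}
		case body[i] == '"':
			// Copy a single-line string verbatim, honoring \" escapes.
			j := i + 1
			for j < n && body[j] != '"' {
				if body[j] == '\\' {
					j++
				}
				j++
			}
			if j < n {
				j++ // include the closing quote
			}
			flushSpace(&out, &pendingSpace)
			out.WriteString(body[i:j])
			i = j
		case body[i] == ' ' || body[i] == '\t' || body[i] == '\n' || body[i] == '\r':
			pendingSpace = true
			i++
		default:
			flushSpace(&out, &pendingSpace)
			out.WriteByte(body[i])
			i++
		}
	}
	return strings.TrimSpace(out.String())
}

// flushSpace emits at most one pending space, and never a leading one.
func flushSpace(out *strings.Builder, pending *bool) {
	if *pending && out.Len() > 0 {
		out.WriteByte(' ')
	}
	*pending = false
}

func main() {
	a := `query { hero(name: "a  b") { id } }`
	b := `query { hero(name: "a b") { id } }`
	fmt.Println(canonicalBody(a) == canonicalBody(b)) // false: string contents differ
}
```

A parse-and-print pass through the project's GraphQL library would be the more robust fix, since it also normalizes commas and comments; the sketch above only shows why literal-preserving normalization matters for ShortSHA identity.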
| _, err = tx.TxPipelined(ctx, func(pipe redis.Pipeliner) error { | ||
| pipe.Set(ctx, opsKey, payload, 0) | ||
| pipe.Expire(ctx, opsKey, b.sessionTTL) | ||
| pipe.Del(ctx, bundleKey) | ||
| return nil |
Bundle can cache a stale render across a concurrent Append.
Append invalidates bundleKey, but Bundle reads opsKey and later writes bundleKey without checking whether the ops changed in between. An overlapping append can therefore commit newer ops after the read and before the write, leaving an outdated bundle cached until TTL/schema rotation.
Also applies to: 225-263
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/redis_backend.go` around lines 141 - 145,
Bundle can cache a stale render because it reads opsKey then later writes
bundleKey without ensuring opsKey didn't change; fix by making the
read-and-write atomic: in Bundle (and other affected code paths around
opsKey/bundleKey) perform a conditional write that verifies opsKey hasn't
changed (use Redis WATCH/MULTI/EXEC or a single EVAL Lua script that reads
opsKey and only writes bundleKey if opsKey still matches the value you read)
instead of the current separate read then tx.TxPipelined Set/Expire/Del
sequence; ensure Append still invalidates bundleKey and that any
pipeline/transaction uses the same opsKey version check to avoid committing
stale bundles.
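The guard the comment describes (WATCH/MULTI/EXEC, or a Lua compare-and-set) boils down to "only write the bundle if the ops you read are still current". A minimal sketch of that idea against an in-memory stand-in for Redis, with hypothetical names, since the real fix would use the Redis client's transaction API:

```go
package main

import (
	"fmt"
	"sync"
)

// store is an in-memory stand-in for Redis: ops carries a version that Append
// bumps, mirroring how WATCH detects a concurrent write to opsKey.
type store struct {
	mu         sync.Mutex
	ops        string
	opsVersion int
	bundle     string
}

// Append updates ops, bumps the version, and invalidates the cached bundle.
func (s *store) Append(newOps string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ops = newOps
	s.opsVersion++
	s.bundle = ""
}

// readOps returns the ops plus the version observed at read time.
func (s *store) readOps() (string, int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.ops, s.opsVersion
}

// writeBundleIf caches the rendered bundle only if ops has not changed since
// the read, the same guard a WATCH/EXEC transaction or Lua script provides.
func (s *store) writeBundleIf(bundle string, seenVersion int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.opsVersion != seenVersion {
		return false // concurrent Append: discard the stale render
	}
	s.bundle = bundle
	return true
}

func main() {
	s := &store{}
	s.Append("op1")

	ops, ver := s.readOps()
	rendered := "bundle(" + ops + ")"

	// A concurrent Append lands between the read and the write,
	// so the stale render is rejected instead of cached.
	s.Append("op1 op2")
	fmt.Println(s.writeBundleIf(rendered, ver)) // false
}
```

In Redis terms, `opsVersion` is what WATCHing `opsKey` gives you implicitly: EXEC aborts if the watched key changed, and the caller re-reads and re-renders.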
| func (b *RedisBackend) ListNames(ctx context.Context, sessionID string) ([]string, error) { | ||
| if err := ctx.Err(); err != nil { | ||
| return nil, err | ||
| } | ||
| entries, err := b.readOps(ctx, b.client, b.opsKey(sessionID)) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| names := make([]string, 0, len(entries)) | ||
| for _, entry := range entries { | ||
| names = append(names, entry.Name) | ||
| } | ||
| return names, nil |
Read-only access does not keep Redis sessions alive.
ListNames and cache-hit Bundle return without touching TTL, so a session can expire while it's actively being used through those paths. This diverges from MemoryBackend, where reads update recency.
Also applies to: 211-220
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/storage/redis_backend.go` around lines 189 - 203,
ListNames currently reads entries via readOps without refreshing the session
TTL, allowing an active session to expire; update ListNames to refresh the
session TTL after a successful read (e.g., call the same TTL-refresh helper used
elsewhere or invoke b.client.Expire(ctx, b.opsKey(sessionID), b.ttl)) so reads
keep sessions alive. Apply the same change to the cache-hit path in Bundle (the
code around the Block starting at lines 211-220) so both ListNames and cache-hit
Bundle refresh the session/ops key TTL after successful reads.
The mood and availability subgraphs publish employee-update events to NATS as part of their mutation resolvers. Two failure modes broke the demo when NATS was not running (or not yet started):

1. cmd/all crashed at boot: subgraphs.New() eagerly created and started two NATS adapters and treated any failure as fatal. Without NATS, the entire process exited with "failed to start default nats adapter".
2. Mutations failed at runtime: mood.UpdateMood returned "no nats pubsub default provider found" when the adapter map was empty, and availability.UpdateAvailability nil-panicked on the unconditional NatsPubSubByProviderID["default"] lookup. Even when the data write succeeded (storage.Set runs first), clients saw a downstream error response.

Changes:
- subgraphs.go: extract a startNatsAdapter helper that logs and returns nil on failure. NATS adapter startup and JetStream stream provisioning are now best-effort.
- mood/availability resolvers: extract publishMoodEvent / publishAvailabilityEvent helpers that nil-check the adapter and log publish errors rather than returning them. Mutations always succeed if the local storage write succeeds.
- code-mode demo start.sh and run_subgraphs_subset.sh: switch from per-subgraph cmd/<name> processes to a single cmd/all invocation with explicit port flags. The individual cmd/<name> binaries pass nil for the NATS adapter map; cmd/all wires NATS up correctly when it is available, and now degrades gracefully when it is not.
- router-config.yaml: flip require_mutation_approval from true to false so the demo MCP client can run mutations end-to-end without an approval flow.
- README.md: document the optional NATS prerequisite (make edfs-infra-up) and explain why the demo runs cmd/all.

Trade-off: subscriptions on Employee.currentMood / isAvailable will not deliver updates while NATS is unreachable — direct queries still reflect the new state. The demo prioritizes "queries and mutations always work" over "subscriptions always work".
Verified end-to-end with NATS stopped via docker stop cosmo-dev-nats-1:
- cmd/all boots cleanly with logged warnings
- updateMood and updateAvailability mutations both return successful data responses; resolvers log per-publish skip warnings
- after docker start cosmo-dev-nats-1, the same mutations succeed with no resolver-level warnings (publishes go through)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
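The "log and return nil on failure" shape the commit describes for startNatsAdapter can be sketched as follows. The `connectNATS` function and `adapter` type are stand-ins, not the demo's real API; the point is only the best-effort control flow:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// adapter is a stand-in for the demo's NATS pubsub adapter.
type adapter struct{ url string }

// connectNATS simulates adapter startup; it fails when the broker is down.
func connectNATS(url string, brokerUp bool) (*adapter, error) {
	if !brokerUp {
		return nil, errors.New("connection refused")
	}
	return &adapter{url: url}, nil
}

// startNatsAdapter is best-effort: on failure it logs and returns nil instead
// of aborting boot, so callers must nil-check before publishing.
func startNatsAdapter(url string, brokerUp bool) *adapter {
	a, err := connectNATS(url, brokerUp)
	if err != nil {
		log.Printf("nats: adapter startup failed, continuing without pubsub: %v", err)
		return nil
	}
	return a
}

func main() {
	a := startNatsAdapter("nats://localhost:4222", false)
	fmt.Println(a == nil) // true: boot continues, publishes are skipped
}
```

The nil return is what makes the resolvers' nil-check in publishMoodEvent / publishAvailabilityEvent load-bearing: a missing adapter becomes a logged skip rather than a crash.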
Actionable comments posted: 3
🧹 Nitpick comments (1)
demo/code-mode/start.sh (1)
95-105: 💤 Low value: unused helper function start_background_root.
This function is defined but never called in the script. If it's intended for future use, consider adding a comment; otherwise, removing it keeps the script leaner.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/start.sh` around lines 95 - 105, The helper function start_background_root is defined but never used; either remove its definition to keep the script lean or add a short comment above start_background_root explaining it's intentionally kept for future use and when it should be invoked; locate the start_background_root function in the script and either delete the entire function block or add the explanatory comment and leave it in place so reviewers understand it's intentionally unused.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@demo/pkg/subgraphs/availability/subgraph/schema.resolvers.go`:
- Around line 18-33: The helper method publishAvailabilityEvent, currently
defined in a gqlgen-generated file, must be moved into a non-generated file
(e.g., create resolver_helpers.go or use resolver.go) in the same package so it
isn't removed by go generate; copy the exact function with the same receiver
(func (r *mutationResolver) publishAvailabilityEvent(ctx context.Context,
providerID, subject, payload string)) and body into that new file, ensure the
same imports (context, log, datasource/nats types) are present, remove the
duplicate from the generated file, and keep all call sites unchanged so
r.publishAvailabilityEvent is resolved at build time.
In `@demo/pkg/subgraphs/mood/subgraph/schema.resolvers.go`:
- Around line 18-33: The helper method publishMoodEvent is defined inside the
gqlgen-generated schema.resolvers.go and may be erased during codegen; move the
publishMoodEvent function (and any small helpers it uses) out of
schema.resolvers.go into a non-generated file (e.g., resolver_helpers.go or add
into resolver.go) within the same package so it’s preserved across go generate;
ensure the function signature (receiver *mutationResolver) and references to
r.NatsPubSubByProviderID and datasource/nats types remain unchanged and update
imports if needed.
In `@demo/pkg/subgraphs/subgraphs.go`:
- Around line 242-251: The code opens a NATS connection (nats.Connect) into
defaultConnection for JetStream provisioning but never closes it, leaking
connections; refactor the block so you first call nats.Connect(url) into
defaultConnection, handle the error, and immediately schedule closure (e.g.,
defer defaultConnection.Close() or ensure Close() is invoked in all
error/success paths) before calling jetstream.New(defaultConnection) and
CreateOrUpdateStream; ensure you reference the same defaultConnection and
defaultJetStream identifiers so the connection is always closed regardless of
whether jetstream.New or CreateOrUpdateStream succeed.
---
Nitpick comments:
In `@demo/code-mode/start.sh`:
- Around line 95-105: The helper function start_background_root is defined but
never used; either remove its definition to keep the script lean or add a short
comment above start_background_root explaining it's intentionally kept for
future use and when it should be invoked; locate the start_background_root
function in the script and either delete the entire function block or add the
explanatory comment and leave it in place so reviewers understand it's
intentionally unused.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ec0597fd-77f1-4c20-828b-1966f9a551dc
📒 Files selected for processing (7)
- demo/code-mode/README.md
- demo/code-mode/router-config.yaml
- demo/code-mode/run_subgraphs_subset.sh
- demo/code-mode/start.sh
- demo/pkg/subgraphs/availability/subgraph/schema.resolvers.go
- demo/pkg/subgraphs/mood/subgraph/schema.resolvers.go
- demo/pkg/subgraphs/subgraphs.go
✅ Files skipped from review due to trivial changes (1)
- demo/code-mode/router-config.yaml
// publishAvailabilityEvent emits an Employee-updated event for subscription
// consumers. Failures (missing adapter, broker down) are logged but never
// fail the mutation — local storage has already been updated.
func (r *mutationResolver) publishAvailabilityEvent(ctx context.Context, providerID, subject, payload string) {
	adapter := r.NatsPubSubByProviderID[providerID]
	if adapter == nil {
		log.Printf("availability: nats provider %q unavailable, skipping publish to %s", providerID, subject)
		return
	}
	err := adapter.Publish(ctx, &nats.PublishAndRequestEventConfiguration{
		Subject: subject,
	}, []datasource.StreamEvent{&nats.MutableEvent{Data: []byte(payload)}})
	if err != nil {
		log.Printf("availability: nats publish failed via %q to %s: %v", providerID, subject, err)
	}
}
Move helper method to a non-generated file to fix CI failure.
Same issue as the mood subgraph: publishAvailabilityEvent is defined in a gqlgen-generated file and gets removed during go generate. The CI error confirms this: r.publishAvailabilityEvent undefined.
Move this helper to resolver_helpers.go or resolver.go in this package.
🧰 Tools
🪛 GitHub Actions: Demo CI / 0_build_test.txt
[error] 20-20: validation failed: packages.Load: r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
🪛 GitHub Actions: Demo CI / build_test
[error] 20-20: Go generate failed: validation failed (packages.Load). subgraph/schema.resolvers.go:20:4: r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
🪛 GitHub Check: build_test
[failure] 21-21:
r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
[failure] 20-20:
r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/availability/subgraph/schema.resolvers.go` around lines 18
- 33, The helper method publishAvailabilityEvent, currently defined in a
gqlgen-generated file, must be moved into a non-generated file (e.g., create
resolver_helpers.go or use resolver.go) in the same package so it isn't removed
by go generate; copy the exact function with the same receiver (func (r
*mutationResolver) publishAvailabilityEvent(ctx context.Context, providerID,
subject, payload string)) and body into that new file, ensure the same imports
(context, log, datasource/nats types) are present, remove the duplicate from the
generated file, and keep all call sites unchanged so r.publishAvailabilityEvent
is resolved at build time.
| // publishMoodEvent emits an Employee-updated event for subscription consumers. | ||
| // Failures (missing adapter, broker down) are logged but never fail the | ||
| // mutation — local storage has already been updated. | ||
| func (r *mutationResolver) publishMoodEvent(ctx context.Context, providerID, subject, payload string) { | ||
| adapter := r.NatsPubSubByProviderID[providerID] | ||
| if adapter == nil { | ||
| log.Printf("mood: nats provider %q unavailable, skipping publish to %s", providerID, subject) | ||
| return | ||
| } | ||
| err := adapter.Publish(ctx, &nats.PublishAndRequestEventConfiguration{ | ||
| Subject: subject, | ||
| }, []datasource.StreamEvent{&nats.MutableEvent{Data: []byte(payload)}}) | ||
| if err != nil { | ||
| log.Printf("mood: nats publish failed via %q to %s: %v", providerID, subject, err) | ||
| } | ||
| } |
Move helper method to a non-generated file to fix CI failure.
The publishMoodEvent helper is placed in a gqlgen-generated file. During go generate, gqlgen regenerates schema.resolvers.go and only preserves recognized resolver implementations—this helper method is treated as "unknown code" and may be removed or misplaced, causing the build failure shown in CI.
Move this method to a separate file like resolver_helpers.go or add it to the existing resolver.go file in this package.
🧰 Tools
🪛 GitHub Actions: Demo CI / 0_build_test.txt
[error] 20-20: validation failed: packages.Load: r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
🪛 GitHub Actions: Demo CI / build_test
[error] 20-20: Go generate failed: validation failed (packages.Load). subgraph/schema.resolvers.go:20:4: r.publishAvailabilityEvent undefined (type *mutationResolver has no field or method publishAvailabilityEvent)
🪛 GitHub Check: build_test
[failure] 21-21:
r.publishMoodEvent undefined (type *mutationResolver has no field or method publishMoodEvent)
[failure] 20-20:
r.publishMoodEvent undefined (type *mutationResolver has no field or method publishMoodEvent)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/mood/subgraph/schema.resolvers.go` around lines 18 - 33,
The helper method publishMoodEvent is defined inside the gqlgen-generated
schema.resolvers.go and may be erased during codegen; move the publishMoodEvent
function (and any small helpers it uses) out of schema.resolvers.go into a
non-generated file (e.g., resolver_helpers.go or add into resolver.go) within
the same package so it’s preserved across go generate; ensure the function
signature (receiver *mutationResolver) and references to
r.NatsPubSubByProviderID and datasource/nats types remain unchanged and update
imports if needed.
if defaultConnection, err := nats.Connect(url); err != nil {
	log.Printf("nats: skipping jetstream stream provisioning (connect failed): %v", err)
} else if defaultJetStream, err := jetstream.New(defaultConnection); err != nil {
	log.Printf("nats: skipping jetstream stream provisioning (jetstream init failed): %v", err)
} else if _, err := defaultJetStream.CreateOrUpdateStream(ctx, jetstream.StreamConfig{
	Name:     "streamName",
	Subjects: []string{"employeeUpdated.>"},
}); err != nil {
	log.Printf("nats: skipping jetstream stream provisioning (CreateOrUpdateStream failed): %v", err)
}
NATS connection leak in JetStream provisioning.
defaultConnection is created for stream provisioning but never closed. This leaves an orphaned connection regardless of success or failure. Close the connection after provisioning completes.
🔧 Proposed fix

```diff
-if defaultConnection, err := nats.Connect(url); err != nil {
-	log.Printf("nats: skipping jetstream stream provisioning (connect failed): %v", err)
-} else if defaultJetStream, err := jetstream.New(defaultConnection); err != nil {
-	log.Printf("nats: skipping jetstream stream provisioning (jetstream init failed): %v", err)
-} else if _, err := defaultJetStream.CreateOrUpdateStream(ctx, jetstream.StreamConfig{
-	Name:     "streamName",
-	Subjects: []string{"employeeUpdated.>"},
-}); err != nil {
-	log.Printf("nats: skipping jetstream stream provisioning (CreateOrUpdateStream failed): %v", err)
-}
+if defaultConnection, err := nats.Connect(url); err != nil {
+	log.Printf("nats: skipping jetstream stream provisioning (connect failed): %v", err)
+} else {
+	defer defaultConnection.Close()
+	if defaultJetStream, err := jetstream.New(defaultConnection); err != nil {
+		log.Printf("nats: skipping jetstream stream provisioning (jetstream init failed): %v", err)
+	} else if _, err := defaultJetStream.CreateOrUpdateStream(ctx, jetstream.StreamConfig{
+		Name:     "streamName",
+		Subjects: []string{"employeeUpdated.>"},
+	}); err != nil {
+		log.Printf("nats: skipping jetstream stream provisioning (CreateOrUpdateStream failed): %v", err)
+	}
+}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
if defaultConnection, err := nats.Connect(url); err != nil {
	log.Printf("nats: skipping jetstream stream provisioning (connect failed): %v", err)
} else {
	defer defaultConnection.Close()
	if defaultJetStream, err := jetstream.New(defaultConnection); err != nil {
		log.Printf("nats: skipping jetstream stream provisioning (jetstream init failed): %v", err)
	} else if _, err := defaultJetStream.CreateOrUpdateStream(ctx, jetstream.StreamConfig{
		Name:     "streamName",
		Subjects: []string{"employeeUpdated.>"},
	}); err != nil {
		log.Printf("nats: skipping jetstream stream provisioning (CreateOrUpdateStream failed): %v", err)
	}
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/subgraphs.go` around lines 242 - 251, the code opens a
NATS connection (nats.Connect) into defaultConnection for JetStream
provisioning but never closes it, leaking connections. Refactor the block so
that nats.Connect(url) is called first, its error handled, and closure
scheduled immediately (e.g., defer defaultConnection.Close(), or ensure
Close() is invoked on every error and success path) before calling
jetstream.New(defaultConnection) and CreateOrUpdateStream. Keep the same
defaultConnection and defaultJetStream identifiers so the connection is always
closed regardless of whether jetstream.New or CreateOrUpdateStream succeeds.
Bundling unrelated writes produced tangled operations with mixed argument shapes; the search-tool description now requires one prompt per logical write and reserves combination for tightly-correlated cascades. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a second MCP server next to the existing per-operation one, exposing two generic tools instead of one-tool-per-operation:
- code_mode_search_tools: takes prompts, generates GraphQL ops, registers them in the session catalog, returns their TS signatures.
- code_mode_run_js: runs an async arrow against the session catalog inside a JS sandbox (V8 isolate), with tools.<name>(vars) bound to the registered ops.
Includes:
- router/internal/codemode: harness, sandbox, server, storage, tsgen, yoko (query-gen client), calltrace, observability.
- proto/wg/cosmo/code_mode/yoko/v1: Yoko query-gen Connect API.
- router/pkg/config: code_mode config block (auth, sandbox limits, query-generation endpoint, named-ops, mutation approval).
- demo/code-mode: local federation + Yoko mock + start scripts + MCP client config snippets (Claude Code, Claude Desktop, Codex).
- demo/code-mode-connect: alternate demo against an external yoko Connect supergraph (set YOKO_DIR).
- router-tests: end-to-end named-ops integration test.
Checklist
Open Source AI Manifesto
This project follows the principles of the Open Source AI Manifesto. Please ensure your contribution aligns with its principles.