test(router) Proof of SSE buffer issues#2765
Conversation
WalkthroughAdded a new Go test file introducing regression tests for GraphQL SSE subscription payload stability. Tests verify that received payloads remain unchanged after subsequent read operations and during burst message sequences using a custom subscription updater with httptest SSE server. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
router/core/graphql_sse_regression_test.go (1)
84-171: Extract common SSE test wiring to reduce duplication.Both tests repeat nearly identical server/client setup and completion waits; extracting a shared helper will make future SSE regression additions cheaper and less error-prone.
Also applies to: 173-269
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@router/core/graphql_sse_regression_test.go` around lines 84 - 171, The test repeats SSE server and client setup and wait logic (in TestGraphQLSubscriptionClient_SSESingleLinePayloadsRemainStableAfterNextRead and again at lines 173-269); extract a shared helper (eg. setupSSETest or runSSETest) that accepts the payload sequence and an updater (newAliasingSubscriptionUpdater) and returns the collected updates and any error; move common pieces: httptest.NewServer handler wiring (serverDone, flusher, headers, sending events), creating engineCtx/requestCtx and graphql_datasource.NewGraphQLSubscriptionClient with WithReadTimeout, launching client.Subscribe, and the completion/close waits into that helper, then call it from both tests to reduce duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@router/core/graphql_sse_regression_test.go`:
- Around line 91-115: The httptest handler is calling test assertions
(require.True / t.Fatal) inside its goroutine which can panic; instead capture
any handler-side errors and signals and send them over a channel to be asserted
in the main test goroutine. Modify the server handler created in
httptest.NewServer (the anonymous http.HandlerFunc) to replace require.True(t,
ok) and t.Fatal(...) with sending an error or boolean into a dedicated errCh (or
doneCh) and close/send on serverDone/updater-firstUpdate channels as needed,
then in the main test goroutine receive from errCh and perform the
require/assert (e.g., check the flusher ok, timeout errors) and fail the test
there. Ensure you reference the handler's flusher check and timeouts around
updater.firstUpdate and serverDone so all assertions happen in the main test
goroutine.
---
Nitpick comments:
In `@router/core/graphql_sse_regression_test.go`:
- Around line 84-171: The test repeats SSE server and client setup and wait
logic (in
TestGraphQLSubscriptionClient_SSESingleLinePayloadsRemainStableAfterNextRead and
again at lines 173-269); extract a shared helper (eg. setupSSETest or
runSSETest) that accepts the payload sequence and an updater
(newAliasingSubscriptionUpdater) and returns the collected updates and any
error; move common pieces: httptest.NewServer handler wiring (serverDone,
flusher, headers, sending events), creating engineCtx/requestCtx and
graphql_datasource.NewGraphQLSubscriptionClient with WithReadTimeout, launching
client.Subscribe, and the completion/close waits into that helper, then call it
from both tests to reduce duplication.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4ff9c56e-ae10-42f4-ac82-9f36594f9dba
📒 Files selected for processing (1)
router/core/graphql_sse_regression_test.go
| server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
| flusher, ok := w.(http.Flusher) | ||
| require.True(t, ok) | ||
|
|
||
| w.Header().Set("Content-Type", "text/event-stream") | ||
| w.Header().Set("Cache-Control", "no-cache") | ||
| w.Header().Set("Connection", "keep-alive") | ||
|
|
||
| _, _ = fmt.Fprintf(w, "event: next\ndata: %s\n\n", firstPayload) | ||
| flusher.Flush() | ||
|
|
||
| select { | ||
| case <-updater.firstUpdate: | ||
| case <-time.After(time.Second): | ||
| t.Fatal("timed out waiting for first SSE update to be observed") | ||
| } | ||
|
|
||
| _, _ = fmt.Fprintf(w, "event: next\ndata: %s\n\n", secondPayload) | ||
| flusher.Flush() | ||
|
|
||
| _, _ = fmt.Fprint(w, "event: complete\n\n") | ||
| flusher.Flush() | ||
|
|
||
| close(serverDone) | ||
| })) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Read-only verification: find FailNow-style assertions in handler goroutines.
rg -nP 'httptest\.NewServer\(http\.HandlerFunc\(func\(' router/core/graphql_sse_regression_test.go
rg -nP '\brequire\.[A-Za-z_]+\(\s*t\s*,' router/core/graphql_sse_regression_test.go
rg -nP '\bt\.Fatalf?\(' router/core/graphql_sse_regression_test.goRepository: wundergraph/cosmo
Length of output: 1280
🏁 Script executed:
cat -n router/core/graphql_sse_regression_test.go | sed -n '85,220p'Repository: wundergraph/cosmo
Length of output: 4772
Avoid require/t.Fatal inside httptest handler goroutines.
require.True(t, ok) at lines 93 and 191, and t.Fatal(...) at lines 105 and 203 run in handler goroutines spawned by httptest.NewServer. Calling these assertion/failure methods outside the main test goroutine will cause a runtime panic. Route handler errors to a channel and assert in the main test goroutine instead.
💡 Suggested fix pattern
+ handlerErr := make(chan error, 1)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
flusher, ok := w.(http.Flusher)
- require.True(t, ok)
+ if !ok {
+ select { case handlerErr <- fmt.Errorf("response writer does not implement http.Flusher"): default: }
+ return
+ }
...
- _, _ = fmt.Fprintf(w, "event: next\ndata: %s\n\n", firstPayload)
+ if _, err := fmt.Fprintf(w, "event: next\ndata: %s\n\n", firstPayload); err != nil {
+ select { case handlerErr <- fmt.Errorf("write first payload: %w", err): default: }
+ return
+ }
flusher.Flush()
...
- case <-time.After(time.Second):
- t.Fatal("timed out waiting for first SSE update to be observed")
+ case <-time.After(time.Second):
+ select { case handlerErr <- fmt.Errorf("timed out waiting for first SSE update"): default: }
+ return
}
...
}))
...
+ select {
+ case err := <-handlerErr:
+ require.NoError(t, err)
+ default:
+ }Also applies to: lines 189–215
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@router/core/graphql_sse_regression_test.go` around lines 91 - 115, The
httptest handler is calling test assertions (require.True / t.Fatal) inside its
goroutine which can panic; instead capture any handler-side errors and signals
and send them over a channel to be asserted in the main test goroutine. Modify
the server handler created in httptest.NewServer (the anonymous
http.HandlerFunc) to replace require.True(t, ok) and t.Fatal(...) with sending
an error or boolean into a dedicated errCh (or doneCh) and close/send on
serverDone/updater-firstUpdate channels as needed, then in the main test
goroutine receive from errCh and perform the require/assert (e.g., check the
flusher ok, timeout errors) and fail the test there. Ensure you reference the
handler's flusher check and timeouts around updater.firstUpdate and serverDone
so all assertions happen in the main test goroutine.
|
Hi @dbinnersley, would you mind testing against #2486? I've been working on a pretty massive overhaul of most of the subscriptions system in the router for a while now, I believe this is likely fixed there |
I ran the same tests with the updated api, and it does appear to be fixed in this branch. Is there any expected timeline on this one? |
|
@dbinnersley we're very close, it's at review stage but it's a large change and I can't provide a precise answer at this point if you'd like to run it yourself it does publish images for the PR in CI, just be warned about possible instability although it's unlikely this close to release |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Hi @dbinnersley, the new version has officially landed as a stable router release |
Summary by CodeRabbit
Checklist
Open Source AI Manifesto
This project follows the principles of the Open Source AI Manifesto. Please ensure your contribution aligns with its principles.