Skip to content

Feed prior PR conversation to the reviewer as context#29

Open
tarekziade wants to merge 1 commit into
mainfrom
serge-pr-conversation-context
Open

Feed prior PR conversation to the reviewer as context#29
tarekziade wants to merge 1 commit into
mainfrom
serge-pr-conversation-context

Conversation

@tarekziade

Copy link
Copy Markdown
Collaborator

Why

Today a Serge review only sees the PR title/description, the diff, and the single comment that triggered it (prepare_reviewbuild_user_prompt). It never fetches the PR's existing discussion. On PRs where the relevant context lives in the conversation — prior reviews, back-and-forth, earlier inline threads — the model reviews effectively blind.

What

  • GitHubClient: add list_issue_comments, list_review_comments, list_reviews. Extracted a shared _get_paginated helper and reused it for get_pr_files.
  • reviewer: _fetch_pr_conversation + _format_pr_conversation build a compact chronological transcript — top-level comments, review summaries (with APPROVE/REQUEST_CHANGES/COMMENT state), and inline threads tagged path:line.
    • Best-effort: a fetch failure logs and the review continues.
    • Budgeted newest-first to MAX_PR_CONVERSATION_CHARS, with a note for dropped entries.
    • The triggering comment is skipped (already shown separately), and empty/COMMENTED no-body reviews are dropped as noise.
  • prompts: rendered inside an UNTRUSTED PR CONVERSATION block, run through the existing _scrub_delimiters defang — PR comments are attacker-controllable, same trust posture as the title/body.
  • config: MAX_PR_CONVERSATION_CHARS (default 16000; 0 disables the fetch entirely).

Tests

New coverage for formatting/ordering, the newest-first budget + omitted note, trigger-comment skip, empty-review dropping, best-effort-on-error, and the disabled path; the prompt block + delimiter scrubbing; and the config knob. Full suite: 267 passing.

Notes

  • Scoped to the review path. The inline follow-up path (run_followup) still sees only its anchored thread — a candidate follow-up if useful.
  • Adds 3 GitHub API reads per review (paginated). Negligible vs. the LLM call.

🤖 Generated with Claude Code

Until now a review only saw the PR title/description, the diff, and the
single comment that triggered it — never the existing discussion. On PRs
where the relevant context lives in the conversation (prior reviews,
back-and-forth, earlier inline threads) the model was effectively blind.

- GitHubClient: add list_issue_comments / list_review_comments /
  list_reviews (extract a shared _get_paginated helper; reuse it for
  get_pr_files).
- reviewer: _fetch_pr_conversation + _format_pr_conversation build a
  compact chronological transcript (top-level comments, review summaries
  with state, inline threads tagged path:line). Best-effort — a fetch
  failure logs and the review continues. Budgeted newest-first; the
  triggering comment is skipped (already shown separately).
- prompts: render it in an UNTRUSTED, delimiter-scrubbed block, since PR
  comments are attacker-controllable like the title/body.
- config: MAX_PR_CONVERSATION_CHARS (default 16000; 0 disables).
- Tests for formatting/budget/skip/best-effort, the prompt block +
  scrubbing, and the config knob. Docs + .env.example updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tarekziade

Copy link
Copy Markdown
Collaborator Author

@askserge review this AI generated patch

@tarekziade tarekziade requested a review from sayakpaul June 24, 2026 07:16

@sergereview sergereview Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤗 Serge says:

This PR adds prior PR conversation as context for reviews. The implementation is clean, well-tested, and security-conscious: conversation is treated as untrusted input, scrubbed for delimiter injection, and budgeted newest-first. The GitHub API endpoints and pagination logic are correct, and best-effort error handling ensures a fetch failure won't block reviews. No issues found.

serge v0.1.0 · model: moonshotai/Kimi-K2.6 · 1 LLM turn · 0 tool calls · 24.1s · 11181 in / 4596 out tokens

Comment thread docs/configuration.md
| `MENTION_TRIGGER` | `mention_trigger` | `@askserge` | Phrase that triggers reviews. |
| `REVIEW_EVENT` | `review_event` | `COMMENT` | Fallback review event when the model omits one. |
| `MAX_DIFF_CHARS` | `max_diff_chars` | `200000` | Maximum diff characters sent to the LLM. |
| `MAX_PR_CONVERSATION_CHARS` | `max_pr_conversation_chars` | `16000` | Characters of prior PR conversation (issue comments, review summaries, inline review threads) fed to the model as untrusted context. Newest entries are kept; `0` disables the fetch. |

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16000 seems like a very small number here given the stuff it's supposed to include in the context? Also, should it be made optional? Because the this context might contain prompt injection attacks. So, one needs to be extra careful.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we have some protections in place like for the PR body, but it's indeed the same risk. Notice that we are still not executing any code and it's running inside bubblewrap.

for the length it's approxiamtely the last 3k but we could also use our compressor to pack more stuff/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants