add error analysis guide #2892

Merged: annabellscha merged 33 commits into main from update-error-analysis-blogpost on May 11, 2026

@annabellscha annabellscha commented May 5, 2026

Disclaimer: Experimental PR review

Greptile Summary

Adds a new cookbook guide on error analysis for LLM applications, covering open coding, failure taxonomy clustering, labeling, and deciding when to build evaluators versus fix prompts. The guide also registers the new page in meta.json.

  • A 360-line step-by-step guide is added, using a "Dad Tech Support" chatbot as a worked example through the full five-step error analysis process.
  • The guide integrates with the existing Langfuse annotation queue and scoring workflow and references the Claude Code Langfuse skill for an interactive walkthrough.

Confidence Score: 4/5

Safe to merge; the only flag is a draft note in the example data that should be cleaned up before the guide goes live.

The guide is well-structured and the process it describes is technically sound. The one thing worth fixing before publishing is the parenthetical in Step 4.2 that reveals the example bar chart is based on only 19 of 100 traces — readers may lose confidence in the example data if that note ships as-is.

content/guides/cookbook/error-analysis-llm-applications.mdx — specifically the Step 4.2 failure rates table and its incomplete-data caveat.

Important Files Changed

| Filename | Overview |
| --- | --- |
| content/guides/cookbook/error-analysis-llm-applications.mdx | New 360-line guide walking through a full error analysis workflow; one draft-style note about incomplete example data remains in the published Step 4.2 results section. |
| content/guides/cookbook/meta.json | New guide entry added at the top of the pages list; change is straightforward and correct. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Choose what to annotate\nTrace vs. GENERATION observation] --> B[Select ~100 representative traces\nby latency, cost, tags, multi-turn]
    B --> C[Create annotation queue\nwith open_coding + pass_fail_assessment]
    C --> D[Open code first 30-50 traces\nFree-text observations, no pre-defined categories]
    D --> E{New failure types\nstill appearing?}
    E -- Yes --> D
    E -- No --> F[Cluster into 5-10 named failure categories\nSplit by root cause, merge by same root cause]
    F --> G[Create boolean score configs per category\nNew queue with all 10 score configs]
    G --> H[Label all 100 traces]
    H --> I[Compute failure rates\nLangfuse Dashboard - Scores widget]
    I --> J{For each category:\nCan we just fix it?}
    J -- Yes --> K[Prompt / tool / code fix]
    J -- No --> L{Worth building\nan evaluator?}
    L -- Yes --> M[LLM-as-judge or code-based check]
    L -- No --> N[Monitor / defer]
    K & M & N --> O[Re-run after next\nprompt rewrite, model switch, or incident]
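
The "Compute failure rates" step in the flowchart can be sketched in plain Python. This is a minimal illustration only: the category names and the in-memory label format are hypothetical, and it does not use the Langfuse SDK or the guide's actual data.

```python
from collections import Counter


def failure_rates(labels):
    """Return the share of traces flagged per failure category.

    `labels` is a list with one dict per labeled trace, mapping a
    category name to True (failure present) or False (absent).
    """
    total = len(labels)
    if total == 0:
        return {}

    counts = Counter()
    for trace_labels in labels:
        for category, failed in trace_labels.items():
            if failed:
                counts[category] += 1

    # Include categories that never fired so they show up with a 0.0 rate.
    categories = {category for trace_labels in labels for category in trace_labels}
    rates = {category: counts[category] / total for category in categories}

    # Highest rate first; ties are broken alphabetically.
    return dict(sorted(rates.items(), key=lambda item: (-item[1], item[0])))


# Hypothetical boolean labels for four traces (not taken from the guide).
example_labels = [
    {"wrong_device_assumed": True, "too_much_jargon": False},
    {"wrong_device_assumed": False, "too_much_jargon": True},
    {"wrong_device_assumed": True, "too_much_jargon": False},
    {"wrong_device_assumed": False, "too_much_jargon": False},
]

rates = failure_rates(example_labels)
for category, rate in rates.items():
    print(f"{category}: {rate:.0%}")
```

Sorting by rate mirrors how a bar chart of failure categories would typically be read: the most frequent category is the first candidate for the "can we just fix it?" decision.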
Prompt To Fix All With AI
Fix the following code review issue, proposing a concise fix.

---

### Issue 1 of 1
content/guides/cookbook/error-analysis-llm-applications.mdx:275
**Incomplete-data caveat left in published guide**

The parenthetical "(from 19 labeled traces, rates will shift once all 100 are done)" reads as a draft note that was never removed. Readers following the guide will see a bar chart with rates derived from 19 of 100 traces and an explicit admission that the numbers are preliminary — undermining confidence in the example data. Either finish labeling all 100 traces before publishing, or replace this note with the final, stable numbers.

Reviews (1): Last reviewed commit: "add error analysis guide"
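
To make the concern about a 19-trace sample concrete, the sketch below computes a Wilson score interval for a failure rate at two sample sizes. The failure counts (5 of 19 and 26 of 100) are hypothetical numbers chosen to give a similar observed rate; they are not the guide's data, and the choice of the Wilson interval is an assumption made for illustration.

```python
import math


def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= failures <= n:
        raise ValueError("failures must be between 0 and n")

    p = failures / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts with a similar observed rate at both sample sizes.
for failures, n in [(5, 19), (26, 100)]:
    low, high = wilson_interval(failures, n)
    print(f"{failures}/{n}: observed {failures / n:.0%}, interval {low:.0%} to {high:.0%}")
```

With a similar observed rate, the interval at 19 traces is noticeably wider than at 100, which is the reason rates from a partial sample are better presented as illustrative than as final.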

vercel Bot commented May 5, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| langfuse-docs | Ready | Preview, Comment | May 11, 2026 0:44am |

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 5, 2026

github-actions Bot commented May 5, 2026

@claude review

@dosubot dosubot Bot added the documentation Improvements or additions to documentation label May 5, 2026

@review-notebook-app: Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@claude claude Bot left a comment

All prior feedback addressed (notebook source added, frontmatter complete, skill link fixed, illustrative-data caveat reworded).

Extended reasoning...

Overview

This documentation-only PR adds a new error-analysis cookbook guide. It includes a hand-written MDX, a corresponding source notebook in cookbook/, a _routes.json registration, a meta.json entry, and a small frameborder -> frameBorder JSX casing fix in a related blog post. No code paths, build scripts, or runtime behavior are changed.

Security risks

None. This is content-only — no auth, network, or data-handling code involved. External links go to langfuse.com pages (verified to exist) and to public references like hamel.dev.

Level of scrutiny

Light scrutiny is appropriate for a docs-only change. The earlier higher-stakes risk (a regen script silently deleting a hand-edited cookbook MDX with no notebook source) is now eliminated because a backing notebook was added and registered in cookbook/_routes.json.
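
The regeneration risk described above could be guarded against with a check along these lines. This is a hedged sketch, not the repository's tooling: the real schema of cookbook/_routes.json is not shown in this conversation, so the function works on plain lists, and `unbacked_pages` and the second slug in the example are hypothetical.

```python
def unbacked_pages(mdx_slugs, registered_notebooks):
    """Return cookbook page slugs that have no registered source notebook.

    `mdx_slugs` are page names without the .mdx extension;
    `registered_notebooks` are notebook file names such as "foo.ipynb".
    Both inputs are assumptions about how the data could be represented.
    """
    registered = {name.removesuffix(".ipynb") for name in registered_notebooks}
    return sorted(slug for slug in mdx_slugs if slug not in registered)


# Hypothetical inputs: one page with a backing notebook, one without.
pages = ["error-analysis-llm-applications", "hand-written-page-without-notebook"]
notebooks = ["error-analysis-llm-applications.ipynb"]

missing = unbacked_pages(pages, notebooks)
print(missing)
```

A page that shows up in the result is one a regeneration step could overwrite or delete, because nothing would regenerate it from a notebook.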

Other factors

All four issues I raised on earlier revisions have been addressed in the latest commit: backing notebook present, frontmatter has category: Evaluation and sidebarTitle, the Claude Code skill link now points at a path that actually resolves (content/docs/api-and-data-platform/features/agent-skill.mdx), and Step 4.2's draft-style parenthetical has been reworded to frame the partial-sample numbers as illustrative. The Greptile bot's confidence is 4/5 and the only outstanding flag from external review is the same illustrative-data caveat that has now been reworded. Bug hunting system found no bugs.

annabellscha and others added 2 commits May 11, 2026 10:52
Restores cookbook/error-analysis-llm-applications.ipynb and its
_routes.json entry. Reverts lib/cookbook_route_mapping.ts to main —
the showNotebookBanner field is no longer used. The standard
"this is a Jupyter notebook" banner will show for this page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels May 11, 2026
@annabellscha annabellscha added this pull request to the merge queue May 11, 2026
@dosubot dosubot Bot added the auto-merge This PR is set to be merged label May 11, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks May 11, 2026
@dosubot dosubot Bot removed the auto-merge This PR is set to be merged label May 11, 2026
@annabellscha annabellscha added this pull request to the merge queue May 11, 2026
@dosubot dosubot Bot added the auto-merge This PR is set to be merged label May 11, 2026
Merged via the queue into main with commit 64998b9 May 11, 2026
14 checks passed
@annabellscha annabellscha deleted the update-error-analysis-blogpost branch May 11, 2026 20:17
@dosubot dosubot Bot removed the auto-merge This PR is set to be merged label May 11, 2026