docs: add Streamlit integration cookbook #2821
Open
umutdinceryananer wants to merge 50 commits into langfuse:main from
Conversation
Anchors examples/streamlit-demo/ so subsequent commits can add the chat app, notebook sources, and README. Ignores venv, .env, __pycache__, and .ipynb_checkpoints for local dev. Closes #12
.env.example lists Langfuse credentials (public key, secret key, base URL for EU vs US region selection) and the OpenAI API key needed by the baseline chat app. Users copy to .env and fill in. Closes #14
Baseline Streamlit app using st.chat_input and st.chat_message, with a plain OpenAI call against gpt-4o-mini. Chat history persists across reruns via st.session_state. No Langfuse wiring yet — subsequent commits layer in tracing, sessions, feedback. Closes #15
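The history bookkeeping in the baseline app can be sketched without Streamlit at all — a minimal model, assuming a plain dict stands in for `st.session_state` and `fake_completion` stands in for the real `chat.completions.create` call (both names are illustrative, not from the app):

```python
# Sketch of the chat-history flow: a dict models st.session_state, which
# persists across Streamlit reruns; a stub replaces the OpenAI call.

def get_history(state: dict) -> list:
    return state.setdefault("messages", [])

def handle_turn(state: dict, user_text: str, complete) -> str:
    history = get_history(state)
    history.append({"role": "user", "content": user_text})
    reply = complete(history)  # the real app calls chat.completions.create here
    history.append({"role": "assistant", "content": reply})
    return reply

def fake_completion(history):
    # illustrative stub: echoes the last user message
    return f"echo: {history[-1]['content']}"

state = {}
handle_turn(state, "hello", fake_completion)
assert len(state["messages"]) == 2
```

On a real rerun, Streamlit re-executes the script top to bottom; only the `session_state`-backed history survives, which is why the append-then-render loop works.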
Covers prerequisites, venv setup, dependency install, env file creation, and the streamlit run command. Notes that Langfuse credentials stay as placeholders until later commits wire up tracing. Closes #16
Wraps get_client() + auth_check() in @st.cache_resource so the credential check runs once per session instead of on every Streamlit rerun. No tracing yet — this commit only verifies the client can authenticate against the configured Langfuse instance. Closes #17
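The run-once behavior `@st.cache_resource` provides can be modeled with `functools.cache` — a sketch of the memoization idea only, where `init_client` and the call counter are illustrative stand-ins for `get_client()` + `auth_check()`:

```python
# @st.cache_resource runs the initializer once and reuses the result on
# every rerun; functools.cache gives the same shape for a zero-arg init.
from functools import cache

calls = {"n": 0}

@cache
def init_client():
    calls["n"] += 1   # the expensive part: client construction + auth_check()
    return object()   # stands in for the authenticated Langfuse client

a, b = init_client(), init_client()   # two "reruns"
assert a is b and calls["n"] == 1     # second call reuses the cached resource
```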
Swap `from openai import OpenAI` for `from langfuse.openai import OpenAI`. Every chat.completions.create call now emits a trace + generation to Langfuse without any further code changes.
Extract the OpenAI call into `generate_reply` and decorate it with `@observe()`. The trace now has a `generate_reply` parent span with the auto-instrumented OpenAI generation nested as a child, giving a clean request-level grouping instead of a flat generation.
Register `client.flush` with `atexit` inside the cached initializer so any pending spans in the OpenTelemetry buffer are exported before the Streamlit process tears down. Prevents trace loss on Ctrl+C / hot-reload.
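The shutdown-flush pattern is just `atexit.register` on the client's flush method — sketched here with a fake client (`FakeClient` is a stand-in; the real code registers the Langfuse client's `flush`):

```python
import atexit

class FakeClient:
    def __init__(self):
        self.flushed = False
    def flush(self):
        self.flushed = True

client = FakeClient()
atexit.register(client.flush)    # flush runs when the process exits

# For this demo only: unregister and invoke directly instead of exiting.
atexit.unregister(client.flush)
client.flush()
assert client.flushed
```

Registering inside the cached initializer matters: it guarantees exactly one handler per client instance, rather than one per Streamlit rerun.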
Generate a uuid4 on first page load and keep it in Streamlit's session_state so it survives reruns within the same browser tab. Sets up the identifier that will be attached to Langfuse traces in the next commit.
Pass the session_state uuid into `generate_reply` and call `langfuse.update_current_trace(session_id=...)` inside the @observe span. All traces from one browser tab now group under a single session in the Langfuse Sessions view.
Langfuse SDK v4 removed `update_current_trace` on the client. Trace-level attributes (session_id, user_id, tags, metadata) are now set via the `propagate_attributes` context manager, which must wrap span creation so inheritance works correctly. Wrap the `generate_reply` call in `propagate_attributes(session_id=...)` so the @observe span inherits the session_id at creation time.
Adds a sidebar reset button that clears the chat history and generates a fresh session_id. Lets users start a new Langfuse session on demand within the same browser tab without refreshing the page.
The app relies on v4-only APIs (`propagate_attributes`, the v4 `@observe` and `get_client` pattern). The previous `>=3.0.0` pin would allow a fresh install to resolve to v3 and break at runtime. Bumping the minimum to v4 prevents that drift.
Return the Langfuse trace_id alongside the reply from `generate_reply` and store it on the message dict in `session_state`. Sets up the hook for the thumbs feedback UI, which needs a trace_id to attach scores to.
Render 👍 / 👎 buttons under each assistant message. Clicking either posts a numeric `user-feedback` score (1 or -1) to the message's trace via `langfuse.create_score` and marks the trace as scored in session_state so the buttons don't re-appear on rerun.
…iately
The new assistant message was only rendered inline with `st.markdown` and did not pass through the history loop where the thumbs buttons are drawn, so users had to send a second message before the first one showed feedback controls. Triggering `st.rerun()` after the append forces the history loop to redraw the new message with its buttons.
Use CATEGORICAL data type with "positive" / "negative" values instead of NUMERIC 1 / -1. Matches the common convention for binary feedback, renders as a clean label in the Langfuse UI (no trailing zeros) and groups naturally in score aggregations by category.
`create_score` buffers the event on the batch queue, so scores weren't appearing in the Langfuse dashboard until the process exited (when the atexit handler fired). Calling `flush` right after the score keeps the UX immediate: click thumbs, refresh Langfuse, see the score.
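The score-then-flush flow from the last two commits can be sketched with a buffering stand-in for the Langfuse client (`BufferedClient` is illustrative; the real `create_score` enqueues onto the SDK's batch exporter):

```python
# Sketch: buffer a CATEGORICAL user-feedback score, then flush so it is
# exported immediately instead of waiting for the atexit handler.
class BufferedClient:
    def __init__(self):
        self.queue, self.exported = [], []
    def create_score(self, trace_id, name, value, data_type):
        self.queue.append({"trace_id": trace_id, "name": name,
                           "value": value, "data_type": data_type})
    def flush(self):
        self.exported.extend(self.queue)
        self.queue.clear()

client = BufferedClient()
value = "positive"  # thumbs-up maps to "positive", thumbs-down to "negative"
client.create_score("trace-1", "user-feedback", value, "CATEGORICAL")
client.flush()  # without this, the score sits on the batch queue until exit
assert client.exported[0]["value"] == "positive"
```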
Convert `generate_reply` into a generator that yields delta tokens from a streaming chat completion, then render it with `st.write_stream` so users see the reply typed out rather than a sudden block. The @observe wrapper keeps the span open until the generator is exhausted, so the full streamed content is still captured as one Langfuse generation. The trace_id capture moves to a small list passed in by the caller because a generator can't return a second value.
Pass `stream_options={"include_usage": True}` so OpenAI sends a final
chunk with exact prompt/completion/total token counts. The langfuse
wrapper reports OpenAI's numbers instead of falling back to a local
tiktoken estimate, which keeps cost attribution accurate.
Guard the loop against the terminal usage chunk (which has empty
`choices`) so iteration doesn't IndexError on the last frame.
Adds "Your name" text input in sidebar, stores in st.session_state["user_id"]. Defaults to "anonymous" when blank. Closes #28
Adds anthropic and opentelemetry-instrumentation-anthropic to requirements.txt for upcoming multi-provider support. Closes #30
Sidebar dropdown selects model; routes to OpenAI or Anthropic SDK accordingly and tags traces with provider name. Closes #32
Adds the empty notebook shell with NOTEBOOK_METADATA, intro, "What is Streamlit?"/"What is Langfuse?" blockquotes, STEPS_START/STEPS_END markers, and LearnMore trailer. Steps will be filled in subsequent commits. Closes #33
Adds the empty notebook shell with NOTEBOOK_METADATA, intro, "What is Streamlit?"/"What is Langfuse?" blockquotes, STEPS_START/STEPS_END markers, and LearnMore trailer. Also registers the notebook in cookbook/_routes.json so the build script processes it. Steps will be filled in subsequent commits. Closes #33
Adds Step 1 markdown header and %pip install cell covering streamlit, langfuse, openai, anthropic, the OTel Anthropic instrumentor, and python-dotenv. Closes #34
Adds env var setup (Langfuse + OpenAI + Anthropic keys) and get_client() with auth_check() verification. Closes #35
Adds the baseline chat app — OpenAI client, st.session_state for message history, st.chat_input/chat_message UI. No Langfuse instrumentation yet. Closes #36
Layers Langfuse on the baseline: langfuse.openai drop-in for automatic LLM tracing, @observe-wrapped handler for one trace per turn, and @st.cache_resource init for the Langfuse client to survive Streamlit reruns. Closes #37
Generates a session_id with uuid, attaches it to traces via propagate_attributes, and adds a "New conversation" button that resets the message history and mints a fresh session. Closes #38
Switches the handler to a streaming generator with stream=True,
renders chunks via st.write_stream, and requests authoritative
token usage with stream_options={"include_usage": True}. Uses a
trace_holder list to surface the trace_id since generators can't
return a tuple.
Closes #40
Adds a "Your name" sidebar input that persists in st.session_state.user_id (defaults to "anonymous") and is attached to traces via propagate_attributes alongside the session_id. Closes #41
Adds Anthropic alongside OpenAI: AnthropicInstrumentor at startup so OpenTelemetry spans land under the same trace, a sidebar model selector, and provider-named tags via propagate_attributes for filtering in the Langfuse UI. Closes #42
Closes the cookbook with a tour of the Langfuse UI: traces, sessions, users, scores, tags, and shareable trace links. References screenshot assets at langfuse.com/images/cookbook/integration-streamlit/ to be uploaded with the upstream PR. Closes #43
Official color mark SVG from Streamlit's brand kit (https://streamlit.io/brand), ~1.7KB, aligned with the pattern used for other integration icons in this directory. Closes #46
Adds "streamlit" to content/integrations/other/meta.json so the Streamlit page appears in the left-nav under Integrations → Other, alphabetically between promptfoo and testable-minds. Closes #47
Assistant messages carry a trace_id field in st.session_state for
feedback scoring. OpenAI silently ignores the extra key but Anthropic
rejects the request with:
BadRequestError: messages.1.trace_id: Extra inputs are not permitted
Project the history down to {role, content} inside stream_reply before
handing it to either provider. Applied identically to the cookbook
notebook's Step 9 cell.
Closes #49
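The projection fix is a one-line comprehension — a sketch, with `to_provider_messages` as an illustrative name for the helper inside `stream_reply`:

```python
# Strip app-internal keys (trace_id) before sending history to a provider;
# Anthropic rejects unknown message fields, OpenAI silently ignores them.
def to_provider_messages(history):
    return [{"role": m["role"], "content": m["content"]} for m in history]

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello", "trace_id": "abc"},
]
clean = to_provider_messages(history)
assert all(set(m) == {"role", "content"} for m in clean)
```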
Captures referenced by integration_streamlit.ipynb Step 10:
- streamlit-trace.png: stream_reply trace detail with nested anthropic.chat generation, session/user/tag metadata, and a negative user-feedback score
- streamlit-session.png: session view with 3 chronological generations for user umut, including a positive score
- streamlit-users.png: users listing with 7 users, event counts, token and cost columns
Closes #48
Generated by scripts/update_cookbook_docs.sh from cookbook/integration_streamlit.ipynb (routed via _routes.json with isGuide=false). NOTEBOOK_METADATA converted to YAML frontmatter, STEPS markers converted to <Steps> component, MARKDOWN_COMPONENT converted to <LearnMore /> import. Closes #44
Switch from the "Step 1..10" walkthrough to a Gradio-style topical layout (Introduction, Setup, Implementation of Chat functions, Run Streamlit App, Explore data in Langfuse). Drops the repeated full-app snippets in favor of one consolidated app.py reference, cutting the rendered page from ~746 to ~415 lines. Replace the three custom retina-limited screenshots under public/images/cookbook/integration-streamlit/ with existing official Langfuse UI assets (docs/session.png, docs/users-list.png) and link to the Tracing docs + public demo project for the trace view.
The original README described the initial OpenAI-only baseline. The app has since grown to cover Langfuse tracing, sessions, user identification, streaming, user-feedback scores, and multi-provider routing — update the description, prerequisites, and how-it-works bullets to match the current state, and add a pointer back to the Streamlit integration guide as the canonical reference.
The file is a local per-user Claude Code permission cache and was tracked by accident in early branch commits. Untrack it and add the path to .gitignore so it cannot be re-committed.
@Automaticare is attempting to deploy a commit to the langfuse Team on Vercel. A member of the Team first needs to authorize it.
Addresses Greptile review feedback on langfuse#2821:
- P1: trace_holder[0] could IndexError if stream_reply raised before the append; fall back to trace_id = None so the UI stays alive and the scored-traces guard naturally hides the thumbs buttons on rows without a trace id.
- P2: the abbreviated init_langfuse snippet in the guide only printed the success path, making misconfigured credentials silent. Mirror the else branch from the full app.py so readers copying the snippet also see the auth-failure hint.
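The P1 guard is a one-liner; a sketch of the fallback pattern as described:

```python
# If the generator raised before appending its trace_id, trace_holder is
# empty and trace_holder[0] would IndexError; fall back to None instead.
trace_holder = []
trace_id = trace_holder[0] if trace_holder else None
assert trace_id is None  # UI stays alive; thumbs buttons are hidden for this row

trace_holder.append("trace-123")  # the happy path: generator appended its id
trace_id = trace_holder[0] if trace_holder else None
assert trace_id == "trace-123"
```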
Consolidates observe and propagate_attributes into the Setup imports rather than re-importing them mid-document. Adds a brief prose note so readers know where the names came from. Resolves Greptile P2 in both the notebook and the rendered mdx.
What
Adds a Streamlit integration cookbook covering end-to-end Langfuse
tracing in a streaming chat app — sessions, user identification,
user-feedback scores, and live routing between OpenAI and Anthropic
with provider-tagged traces.
- cookbook/integration_streamlit.ipynb
- content/integrations/other/streamlit.mdx
- cookbook/_routes.json (isGuide: false)
- content/integrations/other/meta.json
- public/images/integrations/streamlit_icon.svg
- examples/streamlit-demo/
Why
Streamlit is one of the most common ways Python devs ship LLM UIs, but
there was no first-party Langfuse walkthrough. The guide mirrors the
structure of the existing Gradio cookbook so readers can jump between
the two without re-learning the layout.
Covered features
- @observe decorator wrapping a generator-based streaming handler
- langfuse.openai drop-in + AnthropicInstrumentor for auto-tracing
- propagate_attributes(session_id, user_id, tags=[provider])
- st.session_state
- create_score with CATEGORICAL data type
- OpenAI (gpt-4o-mini) / Anthropic (claude-sonnet-4-6) switch via sidebar selector, with the provider tagged on every trace
Notes for maintainers
- … the way Gradio does, so the cookbook links out to examples/streamlit-demo/ as a one-command run. If the convention here is docs-only, I can move the code inline and drop the folder.
- Screenshots reuse existing docs assets (/images/docs/session.png, /images/docs/users-list.png) since the dashboard views are framework-agnostic. Happy to capture Streamlit-specific traces and host them under static.langfuse.com/cookbooks/streamlit/..., or link a public demo trace, if you'd prefer.
Test plan
- pnpm link-check passes for content/integrations/other/streamlit.mdx
- examples/streamlit-demo/app.py runs against Langfuse Cloud; traces, sessions, users, feedback scores, and provider tags all populate
- pnpm dev renders the integration page with all assets resolving

Disclaimer: Experimental PR review
Greptile Summary
This PR adds a first-party Streamlit integration cookbook covering end-to-end Langfuse tracing (sessions, user identification, user-feedback scores, multi-provider routing) along with a runnable companion app in examples/streamlit-demo/.

IndexError risk: trace_holder[0] in app.py (line 161) and the matching MDX snippet are accessed unconditionally. If stream_reply fails before its generator body executes, trace_holder is empty and the next st.rerun() will crash with IndexError. A guard (trace_holder[0] if trace_holder else None) fixes this safely.

Confidence Score: 4/5
Safe to merge after guarding the trace_holder access; everything else is documentation/style.
One P1 defect: unchecked index access on trace_holder will crash the app on any streaming failure. The two P2 findings (missing else branch in setup snippet, mid-document import) are minor docs inconsistencies. Fixing the P1 guard is a one-liner.
examples/streamlit-demo/app.py (line 161) and the matching block in content/integrations/other/streamlit.mdx
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as Browser Tab
    participant ST as Streamlit UI
    participant LF as Langfuse (@observe)
    participant OAI as OpenAI / Anthropic
    User->>ST: types prompt
    ST->>ST: append user message to session_state
    ST->>LF: propagate_attributes(session_id, user_id, tags)
    ST->>LF: stream_reply() [generator, @observe]
    LF->>LF: start trace, capture trace_id to trace_holder
    LF->>OAI: streaming API call (langfuse.openai or AnthropicInstrumentor)
    OAI-->>ST: stream chunks via st.write_stream
    ST->>ST: append assistant message + trace_id
    User->>ST: clicks thumbs up or thumbs down
    ST->>LF: create_score(trace_id, positive/negative, CATEGORICAL)
    LF-->>ST: flush and ack
```
Reviews (1): Last reviewed commit: "chore: remove stray .claude/settings.loc..." | Re-trigger Greptile
Context used:
Learned From
langfuse/langfuse-python#1387