Conversation
@cpsievert cpsievert commented Dec 10, 2025

Closes #128

Some of the Python changes here are a follow-up to #119

@cpsievert cpsievert requested a review from Copilot December 10, 2025 17:22


@cpsievert cpsievert marked this pull request as ready for review December 10, 2025 17:50
@cpsievert cpsievert requested a review from gadenbuie December 10, 2025 17:50
Co-authored-by: Garrick Aden-Buie <[email protected]>
Contributor

@gadenbuie gadenbuie left a comment


Good stuff! It's definitely a big step forward for the docs. I've been submitting feedback as I worked through things. I made it through the stuff that shows up in the Python diffs; I'll pick up with the R things in a bit, although I suspect there's some overlap and that some of the Python comments will directly translate to the R docs.

This commit addresses all 30 review comments from PR #162, implementing
comprehensive improvements to both R and Python documentation for consistency,
clarity, and better user experience.

## Capitalization Standardization

- Standardized use of "querychat" (lowercase) when referring to the package/product
  in prose throughout all documentation
- Maintained "QueryChat" (camel case) for Python class names in code examples
- Maintained "QueryChat" (camel case) when referring to class/instances in narrative
- Fixed overcorrections to ensure Python class name remains properly capitalized
- Files affected:
  - R: vignettes/tools.Rmd, vignettes/context.Rmd, README.md
  - Python: index.qmd, context.qmd, build.qmd, tools.qmd, models.qmd,
    greet.qmd, data-sources.qmd, _examples/*.py

## Grammar and Language Fixes

- Fixed "up vote" → "upvote" in both Python and R build documentation
- Removed unnecessary words: "In this case", "(safely)"
- Clarified LLM vs querychat roles: "The LLM generates SQL, querychat executes it"
- Improved sentence structure and flow throughout

## Content Improvements

### Introduction/README Changes (R & Python)
- Changed "For analysts" → "For users" (more inclusive)
- Rewrote developer section in second person for directness
- Made benefits more specific and less generic

### Python index.qmd Enhancements
- Fixed "VSCode" → "VS Code" (official branding)
- Mentioned Positron first, then VS Code
- Clarified that saving to file is optional (can run in console)
- Added QUERYCHAT_CLIENT environment variable example
- Simplified code example by removing explicit client parameter
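The QUERYCHAT_CLIENT environment variable mentioned above would typically be set in the shell before launching the app. The variable name comes from this PR; the provider/model string format used in the value below is an assumption, not something taken from the docs under review:

```shell
# Hypothetical value -- check the querychat docs for the exact format expected.
# Setting this lets examples omit an explicit client= parameter.
export QUERYCHAT_CLIENT="anthropic/claude-sonnet-4-5"
echo "$QUERYCHAT_CLIENT"
```

Putting the client choice in the environment is what allows the simplified code example to drop the explicit client parameter.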

### Context Documentation Restructuring (R & Python)
- Reorganized intro to be more linear:
  1. What querychat automatically gathers
  2. LLMs don't see actual data
  3. Three ways to customize system prompt
- Moved system prompt definition to footnote (Python) or parenthetical (R)
- Made it clearer that customization is optional enhancement

## Structural Improvements

### Python build.qmd Quarto Enhancements
- Extracted inline app code to separate, runnable files:
  - pkg-py/docs/_examples/titanic-dashboard.py
  - pkg-py/docs/_examples/multiple-datasets.py
- Replaced HTML <details>/<summary> with Quarto code-fold feature
- Used Quarto include syntax for cleaner documentation
- Apps can now be run and tested independently
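The code-fold plus include pattern described above might look roughly like this in a `.qmd` file. The file path mirrors the example file added in this PR; the exact cell options are a sketch (in particular, whether `code-fold` applies to a non-executable display block or needs to be set on an executable cell is a detail to verify against the Quarto docs):

````markdown
---
format:
  html:
    code-fold: true   # replaces the old <details>/<summary> wrappers
---

```{.python filename="titanic-dashboard.py"}
{{< include _examples/titanic-dashboard.py >}}
```
````

Because `{{< include >}}` pulls the file's contents in at render time, the same `titanic-dashboard.py` can also be run and tested independently (e.g. with `shiny run`).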

### Site Tagline
- Reverted docs/index.html tagline to original "Chat with your data in any language"
- Original is more inviting and covers both R/Python + multilingual LLM support
- Fixed capitalization in description text

## Files Changed

Modified (10):
- docs/index.html
- pkg-py/docs/build.qmd
- pkg-py/docs/context.qmd
- pkg-py/docs/data-sources.qmd
- pkg-py/docs/index.qmd
- pkg-py/docs/tools.qmd
- pkg-r/README.md
- pkg-r/vignettes/build.Rmd
- pkg-r/vignettes/context.Rmd
- pkg-r/vignettes/tools.Rmd

Added (2):
- pkg-py/docs/_examples/multiple-datasets.py
- pkg-py/docs/_examples/titanic-dashboard.py

Statistics: 12 files changed, 184 insertions(+), 179 deletions(-)

All changes maintain consistency between R and Python documentation while
respecting their different documentation systems (R Markdown vs Quarto).

While `querychat_app()` provides a quick way to start exploring data, integrating querychat into a bespoke Shiny app unlocks the full power of combining natural language data exploration with custom visualizations, layouts, and interactivity. This guide shows you how to integrate querychat into your own Shiny applications and leverage its reactive data outputs to create rich, interactive dashboards.
Contributor

I think it's a good idea to start with the simple template, but this article in general assumes you're starting from scratch to build a Shiny app that wraps querychat.

I think it'd be useful to talk about what kinds of apps make good querychat apps up front, which would also help people who have an existing app they want to bring querychat into.

This was the approach that I took in the Programming with LLMs workshop, some of my slides might help: https://posit-conf-2025.github.io/llm/slides/slides-10.html#/querychat

The general idea is to acknowledge that the best use case for a querychat-powered Shiny app is probably an app with a single data source plus a bunch of filters that combine to create a reactive data frame used in a lot of different places. (That's the idea with the first two diagrams, at least.)
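That architecture can be sketched in plain Python, with no Shiny or querychat specifics and all names hypothetical: several filter inputs combine into one derived dataset, and every downstream view reads from that single derivation. That derivation step is exactly the seam where a querychat-generated SQL filter could slot in.

```python
# Hypothetical sketch: one data source, several filters, and a single
# derived "reactive" dataset consumed by multiple views.
rows = [
    {"name": "Ann", "dept": "eng", "salary": 120},
    {"name": "Bob", "dept": "ops", "salary": 90},
    {"name": "Cy", "dept": "eng", "salary": 150},
]

def filtered(rows, dept=None, min_salary=0):
    """Combine all filter inputs into one derived dataset."""
    return [
        r for r in rows
        if (dept is None or r["dept"] == dept) and r["salary"] >= min_salary
    ]

# Multiple "views" all read from the same derived data, so swapping the
# hand-written filters for an LLM-generated SQL WHERE clause would update
# every view at once.
def view_count(data):
    return len(data)

def view_mean_salary(data):
    return sum(r["salary"] for r in data) / len(data)

data = filtered(rows, dept="eng")
print(view_count(data), view_mean_salary(data))  # prints: 2 135.0
```

Apps already shaped like this adopt querychat easily; apps with many independent data sources benefit less.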

Contributor Author

- Claude 4.5 Sonnet
- Google Gemini 3.0

In our testing, we've found that those models strike a good balance between accuracy and latency. Smaller/cheaper models like GPT-4o-mini are fine for simple queries but make surprising mistakes with more complex ones, and reasoning models like o3-mini slow down responses without providing meaningfully better results.
Contributor

Interesting, I would have said all the "fast" models are pretty good, at least for most tables, i.e. I haven't had bad experiences with gpt-4.1-mini (even gpt-4.1-nano is okay) or claude-haiku-4-5.

I guess I'd recommend encouraging people to try out the smaller faster models first and to switch if they don't work well for the data set. (Personally, I'd be turned off from even trying the smaller models by the language "make surprising mistakes with more complex ones".)

Contributor Author

@cpsievert cpsievert Dec 18, 2025

I didn't write this -- it is carried over from the current content. I also haven't done enough testing to have formed strong opinions about model recommendations. Do you want to take a stab at it?

Contributor Author

Took a minimal pass in 2a321a6


In our testing, we've found that those models strike a good balance between accuracy and latency. Smaller/cheaper models like GPT-4o-mini are fine for simple queries but make surprising mistakes with more complex ones, and reasoning models like o3-mini slow down responses without providing meaningfully better results.

We've also seen some decent results with frontier local models, but even if you have the compute to run the largest models, they still tend to lag behind the cloud-hosted options in terms of accuracy and speed.
Contributor

Could call out gpt-oss:20b maybe?

Contributor Author

@cpsievert cpsievert Dec 18, 2025

I personally have experienced enough "updating of default/recommended model" fatigue that I'd generally like to avoid giving recommendations that will become outdated in a few months


![](../reference/figures/quickstart-summary.png){alt="Screenshot of the querychat app with a summary statistic inlined in the chat." class="shadow rounded"}

## View the source
Contributor

Might need a new section for advanced use cases now that you can also use $client() to create a client with these tools with custom callbacks outside of a Shiny context

Contributor Author

Good point, but I'm not sure tools.Rmd is the best place for this? Anyway, since that was added after this PR was started, let's leave it for a follow-up (#174)?

Contributor

@gadenbuie gadenbuie left a comment

Okay, I've read through the R vignettes. Not carefully enough to have found every typo, but well enough to give general feedback. It's a huge improvement and I really like how you've organized the topics!

cpsievert and others added 7 commits December 17, 2025 17:39
Co-authored-by: Garrick Aden-Buie <[email protected]>
Apply similar wording improvements from R documentation to Python docs:
- Simplify "full system prompt" to "system prompt"
- Change "see" to "you can inspect" for clarity
- Change "many types" to "several different"
- Clarify database connections are to "a table in any database"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Apply text improvements across R and Python documentation:

- Reframe data privacy language: emphasize we're NOT sending raw data
  to LLM for complex math operations
- Soften tone around auto-generated greetings: describe as "downsides"
  rather than "slow, wasteful, and non-deterministic"
- Remove duplicate "under the hood" phrasing
- Add local model example (gpt-oss:20b)
- Use friendlier duckdb function in R (duckdb_read_csv)

Addresses unresolved feedback items #2, #4, #5, #7, #10 from PR #162

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Switch from querychat_app() to querychat() throughout vignettes:
- querychat() is more convenient and composable
- querychat_app() is a "programmatic dead-end"
- Add $app() calls with comments to show how to launch

Changes:
- data-sources.Rmd: Update text and all database examples
- greet.Rmd: Update greeting example
- models.Rmd: Update model specification examples

Addresses unresolved feedback items #3, #6, #8 from PR #162

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>


Development

Successfully merging this pull request may close these issues.

(R) Update website

3 participants