Skip to content

Use ClickHouse defaults matching out-of-the-box setup#1205

Merged
DZakh merged 9 commits into
mainfrom
claude/clickhouse-local-no-password-EOxK7
May 14, 2026
Merged

Use ClickHouse defaults matching out-of-the-box setup#1205
DZakh merged 9 commits into
mainfrom
claude/clickhouse-local-no-password-EOxK7

Conversation

@DZakh
Copy link
Copy Markdown
Member

@DZakh DZakh commented May 14, 2026

Changes ClickHouse configuration defaults to match the out-of-the-box ClickHouse setup, eliminating the need for credentials in local development while requiring explicit configuration in production.

Summary

Updates default ClickHouse credentials and database name to use the pre-existing default user (with no password) and default database that come with a fresh ClickHouse installation. This allows envio dev to work without additional setup, while envio start continues to require explicit environment variable configuration.

Key Changes

  • CLI defaults (docker_env.rs): Changed ch_password default from "testing" to "" (empty string) and ch_database default from "envio_indexer" to "default"
  • Password handling (Env.res): Added readAllowEmpty function to distinguish between unset environment variables (None) and explicitly set empty passwords (Some("")), allowing passwordless ClickHouse users
  • Database validation (PgStorage.res): Added ENVIO_CLICKHOUSE_DATABASE to required environment variable validation for envio start, ensuring users explicitly configure the database when not using defaults
  • Sink initialization (Sink.res): Clarified that database is not passed to the client constructor; each query qualifies the name explicitly or runs USE first
  • E2E tests: Added explicit ENVIO_CLICKHOUSE_DATABASE environment variable to test configurations
  • Test expectations: Updated error message to include ENVIO_CLICKHOUSE_DATABASE in the list of required variables

Implementation Details

The distinction between "unset" (None) and "set but empty" (Some("")) for the password is critical: ClickHouse supports passwordless users, so an empty string is a valid credential value that must be preserved and not conflated with an unset environment variable.

https://claude.ai/code/session_017sHyqsHsXQ7D3ZrxP5wcev

Summary by CodeRabbit

  • Bug Fixes

    • ClickHouse defaults updated: password defaults to empty and database defaults to "default"
    • Environment reader now distinguishes unset vs explicitly empty ClickHouse password
    • Startup/resume now validate and explicitly use the ClickHouse database value
    • Local teardown improved: network removal force-disconnects attached containers when necessary
  • Tests

    • E2E and unit tests updated to set/expect the shared ClickHouse database
  • Chores

    • Initialization/resume messages clarified to use "storage" terminology

Review Change Stack

claude added 2 commits May 14, 2026 12:45
So that the Playground at http://localhost:8123/play opens with no
credentials prompt, the managed-container defaults change to the
out-of-the-box ClickHouse setup: user=default, password=empty,
database=default. The Docker entrypoint treats user=default +
password=empty as a no-op, so the pre-existing default user with no
password is used. ENVIO_CLICKHOUSE_DATABASE becomes required at runtime
(no fallback to envio_indexer), so `envio start` users must supply all
four ENVIO_CLICKHOUSE_* vars explicitly. Empty passwords now pass
validation via a readAllowEmpty path that distinguishes unset from
set-but-empty.
`process.env[k]` already returns undefined for unset vars, so the
ternary was a no-op.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Preserves empty-string ClickHouse passwords, requires and validates an explicit ClickHouse database for storage, changes CLI local-dev ClickHouse defaults (password -> "" and database -> "default"), updates tests/E2E to set the database, adjusts binding log/messages wording, and improves Docker network teardown.

Changes

ClickHouse Environment Defaults and Database Validation

Layer / File(s) Summary
Empty-string password environment reader
packages/envio/src/Env.res
A new readAllowEmpty helper in Env.ClickHouse preserves empty-string passwords as Some("") instead of coercing them to None.
Storage and Sink database handling
packages/envio/src/PgStorage.res, packages/envio/src/Sink.res
ClickHouse storage initialization now explicitly reads and validates ENVIO_CLICKHOUSE_DATABASE in makeStorageFromEnv and passes the validated database to Sink.makeClickHouse; the ClickHouse client is no longer constructed with a database name.
CLI local-dev defaults and network teardown
packages/cli/src/docker_env.rs
CLI local-dev defaults updated: ENVIO_CLICKHOUSE_PASSWORD defaults to "" and ENVIO_CLICKHOUSE_DATABASE defaults to "default"; unit-test helper expectations adjusted; down() now inspects networks on 403, force-disconnects owned endpoints before retrying removal, and converts a persistent 403 into a warning.
E2E and test expectations
packages/e2e-tests/src/e2e/e2e.test.ts, scenarios/test_codegen/test/lib_tests/PgStorage_test.res, packages/e2e-tests/src/dependency-tests/install.test.ts
E2E indexer tests and dependency tests now set ENVIO_CLICKHOUSE_DATABASE for indexer runs; test assertion for missing required ClickHouse env vars now includes ENVIO_CLICKHOUSE_DATABASE.
Bindings logging and message wording
packages/envio/src/bindings/ClickHouse.res
Log and thrown messages updated to use “ClickHouse storage” wording and clarify database-not-found guidance during resume/initialize flows.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • enviodev/hyperindex#1115: Modifies ClickHouse Docker/dev environment wiring and propagation of managed ClickHouse credentials into the indexer subprocess.
  • enviodev/hyperindex#1110: Changes ClickHouse env parsing and PgStorage.makeStorageFromEnv wiring that this PR builds on.
  • enviodev/hyperindex#1056: Related E2E ClickHouse sink integration and test configuration edits affecting the same test paths.

Suggested reviewers

  • JonoPrest
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Use ClickHouse defaults matching out-of-the-box setup' clearly and concisely summarizes the main change: updating ClickHouse defaults to match a fresh installation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

User-facing terminology in config.yaml is `storage.clickhouse`, so the
runtime error messages now match.
A stray container (test harness, manual `docker run`) attached to
`envio-network` was making `envio local docker down` fail with a hard
error. Volume cleanup had already succeeded, so the user's actual ask
(nuke stale state) was met, but the non-zero exit obscured that.

Now: on 403 ("has active endpoints"), inspect the network, force-
disconnect every attached container, and retry the remove. If a fresh
container has already reattached, log a warning and continue rather
than erroring — the volumes are already gone.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/cli/src/docker_env.rs`:
- Around line 1223-1234: The current loop in inspect that unconditionally calls
docker.disconnect_network for every container_id can disconnect non-Envio
containers; change it to only disconnect endpoints that are provably Envio-owned
or stale: for each container_id from inspect.containers, query the container
metadata (e.g., via docker.inspect_container or equivalent) and check for an
Envio-specific label or name prefix (or other provenance marker your codebase
uses) before creating a NetworkDisconnectRequest and awaiting
disconnect_network; skip and do not force-disconnect any container that lacks
those Envio identifiers and instead leave it attached (optionally log/emit the
existing warning path). Ensure this logic ties into the existing
stop_and_remove() flow so you only target containers you created.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 23dc79c1-c07e-4839-99b4-3c68d38090cf

📥 Commits

Reviewing files that changed from the base of the PR and between 82e4350 and 8241a7b.

📒 Files selected for processing (1)
  • packages/cli/src/docker_env.rs

Comment thread packages/cli/src/docker_env.rs
Force-disconnecting every container attached to envio-network was
hostile to parallel dev sessions or any container the user had
intentionally hooked up. Gate the disconnect on CONFIG_HASH_LABEL so
we only touch Envio containers — anything else falls through to the
existing warning path, which logs and continues without nuking the
network.
test:deps runs after test:e2e and resumes from the Postgres checkpoint
written by the previous step. With the CLI's new default-DB
(\"default\"), the indexer would look for \"default.envio_checkpoints\"
while the data lives under \"envio_indexer\". Pin the database name
explicitly so both jobs target the same ClickHouse schema.
The 26.x ClickHouse entrypoint writes
\`/etc/clickhouse-server/users.d/default-user.xml\` whenever
\`CLICKHOUSE_USER\` or \`CLICKHOUSE_PASSWORD\` is set at all (even to
\"default\"/empty). The override removes the image-shipped \`default\`
user (configured with \`<no_password/>\`) and recreates it with
\`<password></password>\`, which then rejects empty-password basic auth
— breaking the very thing this PR set out to fix.

Forward each env var only when it diverges from the image's own
defaults so a vanilla \`envio dev\` leaves the pristine no-password
default user intact.
The 26.x ClickHouse image ships its own
\`/etc/clickhouse-server/users.d/default-user.xml\` that locks down the
\`default\` user with a password element — so even with no
CLICKHOUSE_USER/PASSWORD env vars set, empty-password Basic Auth
requests from the Playground and the indexer get rejected. Skipping
the env vars alone (the previous fix) isn't enough; we have to
override the image's own users.d/ file.

Write a tiny <no_password/> override to
\`<project>/.envio/clickhouse-users.d/default-user.xml\` and bind-mount
that directory at /etc/clickhouse-server/users.d when the user hasn't
customized any ClickHouse env vars. Only applied to envio dev's
managed container; external ClickHouse and user-customized setups go
through unchanged.
The previous fix wrote the override file into <project>/.envio/ and
bind-mounted it. That had several footguns: Docker Desktop file-sharing
requirements on macOS, container restart failures if the host file
disappears, ownership ambiguity with the codegen-owned .envio/ dir, and
hardcoded XML that was overly broad (whole-user replace).

This commit:

- Shrinks the override to the minimum needed — replacing only
  <networks> on the default user, leaving <no_password/>, profile,
  quota, and access_management inherited from the image's users.xml.
- Builds the override as an in-memory tar (via the `tar` crate) and
  PUTs it to /containers/.../archive between create and start, so the
  file lives in the container's writable layer with no host-side
  dependency.
- Bumps ch_config_hash with a v2 marker so containers from older CLI
  versions get recreated automatically.
- Adds a unit test that round-trips the tar to verify the entry path
  and content are what we expect.

Preserves the existing gate: the override is only injected when all
three ENVIO_CLICKHOUSE_USERNAME/PASSWORD/DATABASE values are at the
image's own defaults. If the user customizes any of them, we pass their
values through to the entrypoint untouched.
@DZakh DZakh merged commit e4e4dc4 into main May 14, 2026
8 checks passed
@DZakh DZakh deleted the claude/clickhouse-local-no-password-EOxK7 branch May 14, 2026 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants