Skip to content

fix(rest-postgres): idempotent DB init so a partial init can't silently drop databases#2623

Open
kirson-git wants to merge 1 commit into
NVIDIA:mainfrom
kirson-git:fix/rest-postgres-init-idempotent
Open

fix(rest-postgres): idempotent DB init so a partial init can't silently drop databases#2623
kirson-git wants to merge 1 commit into
NVIDIA:mainfrom
kirson-git:fix/rest-postgres-init-idempotent

Conversation

@kirson-git

Copy link
Copy Markdown

Problem

rest-api/deploy/kustomize/base/postgres/init-configmap.yaml runs once via docker-entrypoint-initdb.d under psql -v ON_ERROR_STOP=1. If any statement errors, psql aborts and the remaining CREATE DATABASE/CREATE USER statements are skipped, leaving a partial state — e.g. nico+keycloak created but temporal/temporal_visibility missing.

Because PostgreSQL only runs init scripts on an empty data dir, the script never re-runs, so the missing databases are silent and permanent. The Temporal install then fails its pre-install hook:

Error: failed pre-install: timed out waiting for the condition
FATAL: password authentication failed for user "temporal"
DETAIL: Role "temporal" does not exist.

Reproduced during a clean v0.10.3 bring-up.

Fix

Make every database/role creation idempotent:

  • SELECT … WHERE NOT EXISTS (…)\gexec guard for CREATE DATABASE
  • DO $$ … IF NOT EXISTS … $$ for roles

The script is now safe to re-run, and a single benign "already exists" error no longer aborts it and skips later databases.

Test

Re-applied against a partially-initialized postgres (had only nico+keycloak): all four DBs (nico, keycloak, temporal, temporal_visibility) present; Temporal install proceeds.

…ly drop databases

init-configmap.yaml runs once via docker-entrypoint-initdb.d under ON_ERROR_STOP.
If any statement errors, psql aborts and the remaining CREATE DATABASE/USER
statements are skipped, leaving a partial state (e.g. nico+keycloak created but
temporal/temporal_visibility missing). Postgres only runs init scripts on an empty
data dir, so it never re-runs -- the missing databases are silent and permanent, and
the Temporal install later fails with 'failed pre-install: timed out' /
FATAL: role "temporal" does not exist.

Make each database/role creation idempotent (\gexec existence guard for databases,
DO-block for roles) so the script is safe to re-run and a single benign error no
longer skips later databases.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: ekirson <ekirson@nvidia.com>
@kirson-git kirson-git requested a review from a team as a code owner June 16, 2026 12:57
@copy-pr-bot

copy-pr-bot Bot commented Jun 16, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Summary by CodeRabbit

Release Notes

  • Chores
    • Updated PostgreSQL database initialization to use conditional creation logic for databases and roles, ensuring safer re-execution and preventing failures from duplicate creation attempts.

Walkthrough

The PostgreSQL initialization ConfigMap SQL is rewritten to be idempotent across all four managed databases (nico, keycloak, temporal, temporal_visibility). Unconditional CREATE DATABASE and CREATE USER statements are replaced with conditional equivalents using WHERE NOT EXISTS ... \gexec and DO $$ ... CREATE ROLE IF NOT EXISTS ... $$ patterns, preserving grants, schema ownership, and extension setup.

Changes

Idempotent PostgreSQL Init Script

Layer / File(s) Summary
Conditional DB and role creation for all four databases
rest-api/deploy/kustomize/base/postgres/init-configmap.yaml
All four database/role blocks (nico, keycloak, temporal, temporal_visibility) are converted from unconditional CREATE DATABASE/CREATE USER DDL to conditional creation via WHERE NOT EXISTS ... \gexec and DO $$ CREATE ROLE IF NOT EXISTS $$, with grants, \c connects, schema ownership, and pg_trgm enablement preserved.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the primary change: making PostgreSQL database initialization idempotent to prevent silent failures from partial initialization.
Description check ✅ Passed The description comprehensively documents the problem, solution, and validation approach, directly addressing the idempotent DB init changes in the ConfigMap.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
rest-api/deploy/kustomize/base/postgres/init-configmap.yaml (1)

46-53: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing \c postgres at end of temporal_visibility section breaks consistency.

All preceding sections (nico, keycloak, temporal) conclude with \c postgres to restore the connection context to the default database. This section omits that epilogue, leaving the script terminated while connected to temporal_visibility.

While functionally benign for the current script (as this is the final section), the inconsistency introduces a latent defect: any future extension to this script would inherit an unexpected connection context, potentially causing silent misbehavior.

Suggested fix
     GRANT ALL ON SCHEMA public TO temporal;
     ALTER SCHEMA public OWNER TO temporal;
+    \c postgres
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/deploy/kustomize/base/postgres/init-configmap.yaml` around lines 46
- 53, The temporal_visibility section in the init-configmap.yaml file is missing
a `\c postgres` statement at the end to restore the connection context to the
default database, inconsistent with the preceding sections (nico, keycloak,
temporal) which all conclude with this epilogue. Add `\c postgres` after the
`ALTER SCHEMA public OWNER TO temporal;` line in the temporal_visibility section
to maintain consistency and prevent connection context issues if the script is
extended in the future.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@rest-api/deploy/kustomize/base/postgres/init-configmap.yaml`:
- Around line 46-53: The temporal_visibility section in the init-configmap.yaml
file is missing a `\c postgres` statement at the end to restore the connection
context to the default database, inconsistent with the preceding sections (nico,
keycloak, temporal) which all conclude with this epilogue. Add `\c postgres`
after the `ALTER SCHEMA public OWNER TO temporal;` line in the
temporal_visibility section to maintain consistency and prevent connection
context issues if the script is extended in the future.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3a1aefe3-e33f-4572-af3e-a065ed3ff2c6

📥 Commits

Reviewing files that changed from the base of the PR and between 2f2ba99 and 634b8e3.

📒 Files selected for processing (1)
  • rest-api/deploy/kustomize/base/postgres/init-configmap.yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant