Skip to content

[auto-bump] [no-release-notes] dependency by zachmu#2813

Open
coffeegoddd wants to merge 1 commit into
mainfrom
zachmu-0fda4fea
Open

[auto-bump] [no-release-notes] dependency by zachmu#2813
coffeegoddd wants to merge 1 commit into
mainfrom
zachmu-0fda4fea

Conversation

@coffeegoddd
Copy link
Copy Markdown
Contributor

An Automated Dependency Version Bump PR 👑

Initial Changes

The changes contained in this PR were produced by `go get`ing the dependency.

```bash
go get github.com/dolthub/[dependency]/go@[commit]
```

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Main PR
covering_index_scan_postgres 1771.35/s 1774.43/s +0.1%
groupby_scan_postgres 110.74/s 111.61/s +0.7%
index_join_postgres 611.53/s 621.83/s +1.6%
index_join_scan_postgres 771.27/s 778.29/s +0.9%
index_scan_postgres 22.71/s 22.81/s +0.4%
oltp_delete_insert_postgres 764.58/s 730.54/s -4.5%
oltp_insert 679.06/s 631.05/s -7.1%
oltp_point_select 2764.15/s 2856.47/s +3.3%
oltp_read_only 2778.30/s 2883.51/s +3.7%
oltp_read_write 2202.07/s 2121.76/s -3.7%
oltp_update_index 695.16/s 643.66/s -7.5%
oltp_update_non_index 712.90/s 653.48/s -8.4%
oltp_write_only 1682.62/s 1554.77/s -7.6%
select_random_points 1746.19/s 1779.11/s +1.8%
select_random_ranges 1074.95/s 1062.74/s -1.2%
table_scan_postgres 21.92/s 21.70/s -1.1%
types_delete_insert_postgres 728.86/s 692.79/s -5.0%
types_table_scan_postgres 9.54/s 9.41/s -1.4%

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 5, 2026

Main PR
Total 42090 42090
Successful 18218 18218
Failures 23872 23872
Partial Successes1 5303 5303
Main PR
Successful 43.2834% 43.2834%
Failures 56.7166% 56.7166%

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@itoqa
Copy link
Copy Markdown

itoqa Bot commented Jun 5, 2026

Ito Test Report ✅

13 test cases ran. 3 additional findings, 10 passed.

Across 13 total tests, 10 passed and 3 failed, with strong coverage confirming stable planner/prepared-statement behavior, correct provider-wrapping semantics, successful first-run default database creation, and core replication paths (WAL apply, full-config startup, and reconnect exactly-once behavior). The most important findings were three high-impact, code-backed defects: bootstrap credential overrides via DOLTGRES_USER/DOLTGRES_PASSWORD can be ignored because auth is initialized before rebinding to the configured data directory, replication Stop() has a shutdown race that can still apply and persist WAL after shutdown is requested, and mid-transaction replay failures can leave partial replica transaction state because rollback/session cleanup is not guaranteed before retry.

✅ Passed (10)
Category Summary Screenshot
Bootstrap Started on an empty data directory and verified the default postgres database was created on first startup. BOOTSTRAP-1
Bootstrap Verified retry behavior after replication init failure and confirmed default DB invariants were preserved on restart. BOOTSTRAP-3
Planner Analyzer failure path stayed stable and subsequent parameterized execution succeeded (prep-ok) without session desync. PLANNER-1
Planner Bind-type inference without explicit parameter OIDs succeeded and returned prep-ok with no bind-type error. PLANNER-2
Planner Explicit parameter OID prepared-statement flow completed successfully and returned expected value 42. PLANNER-3
Provider Invalid provider input failed fast and no SQL listener started. PROVIDER-1
Provider Wrapped provider catalog and relation-resolution semantics behaved as expected. PROVIDER-2
Replication Transactional INSERT/UPDATE/DELETE replay converged to expected committed replica state. REPLICATION-1
Replication Replication worker started with complete config and replicated probe writes successfully. REPLICATION-2
Replication Reconnect after upstream restart preserved exactly-once effects for validated commits. REPLICATION-4
ℹ️ Additional Findings (3)

These findings are unrelated to the current changes but were observed during testing.

Category Summary Screenshot
Bootstrap ⚠️ DOLTGRES_USER/DOLTGRES_PASSWORD overrides did not replace default credentials on clean startup. BOOTSTRAP-2
Replication ⚠️ Stop handling can still apply WAL work after shutdown is requested. REPLICATION-3
Replication ⚠️ Mid-transaction replay errors can leave inconsistent recovery semantics. REPLICATION-5
⚠️ Environment override credentials ignored on bootstrap
  • What failed: The old postgres/password credentials remained valid while adminuser/strongpass was rejected, but expected behavior is that only the override credentials authenticate.
  • Impact: Initial credential bootstrap can bind to the wrong auth store, causing configured admin credentials to fail while stale/default credentials continue working. This breaks a core authentication workflow on first-run server provisioning.
  • Steps to reproduce:
    1. Set DOLTGRES_USER=adminuser and DOLTGRES_PASSWORD=strongpass.
    2. Start doltgres with a fresh data directory.
    3. Attempt login with both postgres/password and adminuser/strongpass.
  • Stub / mock context: The run initially used deterministic local auth/startup bypass edits during early bootstrap checks, then this case was revalidated from a clean source state with an isolated data directory and explicit credential override inputs.
  • Code analysis: I traced startup/auth initialization through server/server.go, server/initialization/initialization.go, and server/auth/database.go. Authentication is initialized before dEnv is rebound to the configured data_dir, and auth persistence reads/writes using that early environment path, which plausibly causes credential state to come from the wrong location.
  • Why this is likely a bug: The code initializes auth state from a pre-rebind environment and only later switches to the configured data_dir, so first-run credential bootstrap can legitimately persist/read from the wrong auth location.

Relevant code:

server/server.go (lines 90-92)

func runServer(ctx context.Context, cfg *servercfg.DoltgresConfig, dEnv *env.DoltEnv, protocolListenerFactory server.ProtocolListenerFunc) (*svcs.Controller, error) {
	initialization.Initialize(dEnv, cfg)

server/server.go (lines 132-139)

// Reload the dolt environment with the correct data dir that was specified in the configuration.
// This initial dEnv instance is loaded for the current working directory.
dataDirFs, err := dEnv.FS.WithWorkingDir(ssCfg.DataDir())
if err != nil {
	return nil, err
}
dEnv = env.Load(ctx, dEnv.GetUserHomeDir, dataDirFs, doltdb.LocalDirDoltDB, dEnv.Version)

server/auth/database.go (lines 150-161)

if cfg != nil && len(cfg.AuthFilePath()) > 0 {
	authFileName = cfg.AuthFilePath()
}
fileSystem = dEnv.FS
authData, err := fileSystem.ReadFile(authFileName)
if os.IsNotExist(err) {
	dbInitDefault()
	if err = dbInitCreateAuthDirectory(authFileName); err != nil {
		panic(err)
	}
	if err = fileSystem.WriteFile(authFileName, globalDatabase.serialize(), 0644); err != nil {
		panic(err)
	}
⚠️ Replication stop can still apply WAL after shutdown signal
  • What failed: Shutdown is not externally atomic: if message receipt wins the select, message processing and WAL apply/write can proceed even after a stop request, violating expected stop boundary behavior.
  • Impact: Replication can apply changes after operators believe shutdown has begun, which risks duplicate-or-skip behavior across restart boundaries. This affects correctness of a core replication workflow and has no reliable operator-side workaround.
  • Steps to reproduce:
    1. Start logical replication with active WAL traffic.
    2. Trigger Stop() while copy data messages are arriving.
    3. Observe whether apply and WAL write paths still run after shutdown was requested.
  • Stub / mock context: Authentication and authorization checks were bypassed to keep startup deterministic, and a default public role was ensured during local initialization. This bug finding came from direct source-path analysis of replication shutdown logic, not from mocked WAL payloads.
  • Code analysis: I reviewed the replication loop and stop handshake in server/logrepl/replication.go and found a race where Stop() sends on r.stop, but StartReplication may take the message branch first and continue into processMessage and WAL position persistence before honoring shutdown.
  • Why this is likely a bug: The stop channel handshake does not prevent in-flight message processing from completing and persisting state after stop is requested, so shutdown does not enforce an atomic externally visible cutoff.

Relevant code:

server/logrepl/replication.go (lines 296-305)

var msgAndErr rcvMsg
select {
case <-r.stop:
    cancel()
    return errShutdownRequested
case <-ctx.Done():
    cancel()
    return nil
case msgAndErr = <-receiveMsgChan:
    cancel()
}

server/logrepl/replication.go (lines 350-367)

committed, err := r.processMessage(xld, state)
if err != nil {
    return handleErrWithRetry(err, true)
}

if committed {
    state.lastWrittenLSN = state.currentTransactionLSN
    log.Printf("Writing LSN %s to file\n", state.lastWrittenLSN.String())
    err := r.writeWALPosition(state.lastWrittenLSN)
    if err != nil {
        return err
    }
}

server/logrepl/replication.go (lines 411-415)

log.Print("stopping replication...")
r.stop <- struct{}{}
// wait for the channel to be closed, acknowledging that the replicator has stopped
<-r.stop
⚠️ Replay errors can leave partial replica transaction state
  • What failed: On mid-transaction replay errors, code returns without issuing rollback/state cleanup, while retry logic resets only the primary stream connection; this leaves recovery semantics unable to guarantee atomic replay outcomes.
  • Impact: A failed replay can leave partial transaction effects or uncertain transaction state on the replica, undermining transactional consistency guarantees. This directly impacts reliability of replicated data for core workloads.
  • Steps to reproduce:
    1. Run replication on multi-statement transactional WAL.
    2. Force a replay failure in the middle of transaction apply.
    3. Allow retry/reconnect and compare final replica state against all-or-nothing transaction expectations.
  • Stub / mock context: Authentication and authorization checks were bypassed to keep startup deterministic, and a default public role was ensured during local initialization. The bug confirmation relied on source-level replay error handling paths rather than synthetic replication response mocks.
  • Code analysis: I inspected processMessage and connection retry logic in server/logrepl/replication.go; DML replay failures return immediately with no rollback call, and retry handling closes/recreates only primaryConn, not the replica transaction session.
  • Why this is likely a bug: Error paths lack compensating rollback or replica session reset before retry, so the implementation cannot preserve transaction-level all-or-nothing guarantees after mid-apply failures.

Relevant code:

server/logrepl/replication.go (lines 581-593)

err = r.replicateQuery(state.replicaConn, "START TRANSACTION")
if err != nil {
    return false, err
}
case *pglogrepl.CommitMessage:
    log.Printf("CommitMessage: %v", logicalMsg)
    err = r.replicateQuery(state.replicaConn, "COMMIT")
    if err != nil {
        return false, err
    }
    state.processMessages = false

server/logrepl/replication.go (lines 636-639)

err = r.replicateQuery(state.replicaConn, fmt.Sprintf("INSERT INTO %s.%s (%s) VALUES (%s)", rel.Namespace, rel.RelationName, columnStr.String(), valuesStr.String()))
if err != nil {
    return false, err
}

server/logrepl/replication.go (lines 211-223)

handleErrWithRetry := func(err error, incrementErrorCount bool) error {
    if err != nil {
        if incrementErrorCount {
            connErrCnt++
        }
        if connErrCnt < maxConsecutiveFailures {
            log.Printf("Error: %v. Retrying", err)
            if primaryConn != nil {
                _ = primaryConn.Close(context.Background())
            }
            primaryConn = nil
            return nil
        }
    }

Commit: 9a7e8fb

View Full Run


Tell us how we did: Give Ito Feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants