Add test for JSONB use with dolt_patch()#2793

Open
fulghum wants to merge 2 commits into main from fulghum/dolt_patch_jsonb

Conversation

@fulghum fulghum commented Jun 2, 2026

Contributor

Depends on: dolthub/dolt#11153

github-actions Bot commented Jun 2, 2026

Contributor
| Test | Main | PR | Change |
| --- | --- | --- | --- |
| covering_index_scan_postgres | 1895.20/s | 1890.31/s | -0.3% |
| groupby_scan_postgres | 132.32/s | 133.72/s | +1.0% |
| index_join_postgres | 645.21/s | 644.25/s | -0.2% |
| index_join_scan_postgres | 800.90/s | 797.32/s | -0.5% |
| index_scan_postgres | 23.53/s | 23.95/s | +1.7% |
| oltp_delete_insert_postgres | 782.29/s | 786.03/s | +0.4% |
| oltp_insert | 681.11/s | 706.01/s | +3.6% |
| oltp_point_select | 2881.88/s | 2916.77/s | +1.2% |
| oltp_read_only | 2925.00/s | 2969.82/s | +1.5% |
| oltp_read_write | 2335.65/s | 2341.97/s | +0.2% |
| oltp_update_index | 734.57/s | 724.54/s | -1.4% |
| oltp_update_non_index | 763.71/s | 760.18/s | -0.5% |
| oltp_write_only | 1743.75/s | 1747.20/s | +0.1% |
| select_random_points | 1862.40/s | 1884.40/s | +1.1% |
| select_random_ranges | 1125.34/s | 1124.19/s | -0.2% |
| table_scan_postgres | 22.46/s | 22.85/s | +1.7% |
| types_delete_insert_postgres | 770.04/s | 768.47/s | -0.3% |
| types_table_scan_postgres | 9.75/s | 9.94/s | +1.9% |

itoqa Bot commented Jun 2, 2026


Ito Test Report ❌

8 test cases ran. 4 failed, 4 passed.

Overall, the unified run failed (4 passed, 4 failed). Two high-severity replay defects reproduced repeatedly: dolt_patch emits JSONB INSERT SQL that fails on valid complex payloads (nested quotes, backslashes, newlines, arrays, nulls) with invalid input syntax for type jsonb. The evidence points to a real production regression introduced with the PR’s Dolt dependency bump and not covered by the existing simple-payload smoke test. Separate oracle checks passed and confirmed that ORDER BY enforces strict statement sequence, whereas unordered assertions are set-based and can mask ordering regressions.

❌ Failed (4)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Serialization | ⚠️ dolt_patch emitted JSONB INSERT SQL that fails replay for nested-quote/newline payloads with invalid jsonb syntax. | SERIALIZATION-1 |
| Serialization | ⚠️ dolt_patch emitted boundary JSONB INSERT SQL that is over-escaped and rejected as invalid jsonb during replay. | SERIALIZATION-3 |
| Upstream | ⚠️ Generated dolt_patch data SQL fails replay with invalid input syntax for type jsonb. | UPSTREAM-2 |
| Upstream | ⚠️ Multi-row downstream replay fails on first emitted JSONB patch INSERT statement. | UPSTREAM-3 |

⚠️ Complex JSONB payload with nested quotes stays executable
  • What failed: The emitted patch INSERT for complex JSONB content fails with invalid JSONB syntax instead of replaying the original value.
  • Impact: Downstream workflows that execute dolt_patch output can fail for legitimate JSONB data containing nested escaping. This blocks reliable patch replay for affected repositories and automation pipelines.
  • Steps to reproduce:
    1. Create repro(pk int primary key, data jsonb).
    2. Insert a JSONB row with nested quotes and escaped newline content.
    3. Query dolt_patch('HEAD','WORKING','repro') and capture the data INSERT statement.
    4. Truncate target rows and execute the emitted INSERT statement.
    5. Observe invalid input syntax for type jsonb instead of successful replay.
  • Stub / mock context: The run used deterministic local-auth bypassing and startup patches so SQL verification could execute offline; authentication and SCRAM behavior were not under test. Temporary edits in server startup and constraint paths (including server/auth/* and server/analyzer/domain_constraints.go) were applied to reduce unrelated environment failures while validating dolt_patch and replay behavior.
  • Code analysis: I reviewed dependency wiring and test coverage in Doltgres. The server uses Dolt's dfunctions path from the pinned github.com/dolthub/dolt/go revision, and this PR updates that revision; the newly added smoke assertion validates only a simple JSON payload and does not cover escaping-heavy JSON, which matches the observed replay breakage.
  • Why this is likely a bug: The failure reproduces on real SQL replay while the production code path is bound to a newly bumped upstream Dolt implementation that is responsible for emitting these statements.

Relevant code:

go.mod (lines 5-12)

require (
	github.com/PuerkitoBio/goquery v1.8.1
	github.com/cockroachdb/apd/v3 v3.2.3
	github.com/cockroachdb/errors v1.7.5
	github.com/dolthub/dolt/go v0.40.5-0.20260602205139-27fdd3defdfa
	github.com/dolthub/eventsapi_schema v0.0.0-20260310172945-37a9265ade69
	github.com/dolthub/flatbuffers/v23 v23.3.3-dh.2

server/server.go (lines 25-33)

"github.com/dolthub/dolt/go/cmd/dolt/cli"
	"github.com/dolthub/dolt/go/cmd/dolt/commands/sqlserver"
	"github.com/dolthub/dolt/go/libraries/doltcore/doltdb"
	"github.com/dolthub/dolt/go/libraries/doltcore/env"
	doltservercfg "github.com/dolthub/dolt/go/libraries/doltcore/servercfg"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/dfunctions"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/dsess"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/resolve"
	"github.com/dolthub/dolt/go/libraries/utils/argparser"

testing/go/dolt_functions_test.go (lines 3049-3060)

{
			Name: "dolt_patch works with JSONB columns",
			SetUpScript: []string{
				"CREATE TABLE repro (pk int primary key, data jsonb);",
				"INSERT INTO repro VALUES (1, '{\"text\": \"hello\"}');",
			},
			Assertions: []ScriptTestAssertion{
				{
					Query: "SELECT statement_order, table_name, diff_type, statement FROM dolt_patch('HEAD', 'WORKING', 'repro')",
					Expected: []sql.Row{
						{Numeric("1"), "public.repro", "schema", "CREATE TABLE \"repro\" (\n  \"pk\" integer NOT NULL,\n  \"data\" jsonb,\n  PRIMARY KEY (\"pk\")\n);"},
						{Numeric("2"), "public.repro", "data", "INSERT INTO \"repro\" (\"pk\",\"data\") VALUES (1,'{\\\"text\\\": \\\"hello\\\"}');"},
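
The expected statement in the test excerpt above is written as a Go string literal, so it carries one extra layer of escaping. The following is a minimal sketch, using only the standard library, that decodes that layer to show the SQL text the test expects dolt_patch to emit and the literal body that would reach the jsonb parser if backslashes are taken literally (the PostgreSQL default of standard_conforming_strings = on, which is assumed here). The helper names decodeGoLiteral and sqlLiteralBody are hypothetical and are not part of Doltgres.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// goLiteral is the expected INSERT statement exactly as written in the test
// source, including the surrounding double quotes of the Go string literal.
const goLiteral = `"INSERT INTO \"repro\" (\"pk\",\"data\") VALUES (1,'{\\\"text\\\": \\\"hello\\\"}');"`

// decodeGoLiteral (hypothetical helper) strips the Go-level escaping and
// returns the SQL text the test expects dolt_patch to emit.
func decodeGoLiteral(lit string) (string, error) {
	return strconv.Unquote(lit)
}

// sqlLiteralBody (hypothetical helper) returns the text between the first and
// last single quote of stmt. It assumes the statement holds exactly one string
// literal with no embedded single quotes, which holds for this fixture only.
func sqlLiteralBody(stmt string) string {
	start := strings.Index(stmt, "'")
	end := strings.LastIndex(stmt, "'")
	if start < 0 || end <= start {
		return ""
	}
	return stmt[start+1 : end]
}

func main() {
	stmt, err := decodeGoLiteral(goLiteral)
	if err != nil {
		panic(err)
	}
	fmt.Println(stmt)
	// The body still has a backslash in front of every double quote.
	fmt.Println(sqlLiteralBody(stmt))
}
```

After decoding, the literal body is `{\"text\": \"hello\"}`, with a backslash before each double quote, which differs from the `{"text": "hello"}` value inserted by the setup script.
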
⚠️ Stress JSONB escaping boundaries in generated patch SQL
  • What failed: Replay fails on emitted INSERT statements because JSONB payload text is over-escaped and rejected by JSONB parsing.
  • Impact: Teams relying on patch export/replay lose data portability for complex JSONB records. Automated migration or synchronization workflows can fail at runtime without a practical workaround.
  • Steps to reproduce:
    1. Create repro with a jsonb column in an isolated database.
    2. Insert boundary JSON rows containing nested quotes, backslashes, arrays, nulls, and mixed scalars.
    3. Generate dolt_patch('HEAD','WORKING','repro') data statements.
    4. Replay emitted INSERT statements against a clean table state.
    5. Observe invalid jsonb syntax errors and mismatched replayed data.
  • Stub / mock context: The run used deterministic local-auth bypassing and startup patches so SQL verification could execute offline; authentication and SCRAM behavior were not under test. Temporary edits in server startup and constraint paths (including server/auth/* and server/analyzer/domain_constraints.go) were applied to reduce unrelated environment failures while validating dolt_patch and replay behavior.
  • Code analysis: The same dependency and execution path is used for this stress case, and the PR-level JSONB smoke test remains scoped to one simple fixture. The observed multi-row boundary failure is consistent with an escaping bug in generated patch SQL rather than test scaffolding, because the SQL generation and replay path is the exact production path exercised by dolt_patch.
  • Why this is likely a bug: Boundary JSON payload replay fails in the live SQL path while simple fixtures pass, which indicates escaping logic in generated patch SQL is defective for valid JSONB inputs.
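
For comparison, here is a minimal sketch of PostgreSQL-style literal quoting, in which only embedded single quotes are doubled and backslashes are left untouched. It assumes standard_conforming_strings = on. The helpers quoteSQLLiteral, unquoteSQLLiteral, and roundTrips are hypothetical illustrations and do not describe how Dolt or Doltgres generate patch SQL.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// quoteSQLLiteral (hypothetical) wraps s in single quotes and doubles any
// embedded single quote. Backslashes are not escaped, which matches ordinary
// string constants when standard_conforming_strings is on (assumed).
func quoteSQLLiteral(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// unquoteSQLLiteral (hypothetical) reverses quoteSQLLiteral, approximating
// the text a standard-conforming parser would hand to the jsonb input function.
func unquoteSQLLiteral(lit string) (string, bool) {
	if len(lit) < 2 || lit[0] != '\'' || lit[len(lit)-1] != '\'' {
		return "", false
	}
	return strings.ReplaceAll(lit[1:len(lit)-1], "''", "'"), true
}

// roundTrips reports whether payload survives quoting and unquoting unchanged
// and is still valid JSON afterwards.
func roundTrips(payload string) bool {
	got, ok := unquoteSQLLiteral(quoteSQLLiteral(payload))
	return ok && got == payload && json.Valid([]byte(got))
}

func main() {
	// Boundary payloads in the spirit of the report: nested quotes,
	// backslashes, an escaped newline, an array, null, and mixed scalars.
	payloads := []string{
		`{"text": "hello"}`,
		`{"quote": "she said \"hi\"", "path": "C:\\tmp", "nl": "a\nb"}`,
		`{"apostrophe": "it's", "arr": [1, null, true, "x"]}`,
	}
	for _, p := range payloads {
		fmt.Println(quoteSQLLiteral(p), roundTrips(p))
	}
}
```

Under these assumptions every payload round-trips, because the JSON-level escapes pass through the SQL layer unchanged.
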

Relevant code:

go.mod (lines 5-12)

require (
	github.com/PuerkitoBio/goquery v1.8.1
	github.com/cockroachdb/apd/v3 v3.2.3
	github.com/cockroachdb/errors v1.7.5
	github.com/dolthub/dolt/go v0.40.5-0.20260602205139-27fdd3defdfa
	github.com/dolthub/eventsapi_schema v0.0.0-20260310172945-37a9265ade69
	github.com/dolthub/flatbuffers/v23 v23.3.3-dh.2

server/server.go (lines 25-33)

"github.com/dolthub/dolt/go/cmd/dolt/cli"
	"github.com/dolthub/dolt/go/cmd/dolt/commands/sqlserver"
	"github.com/dolthub/dolt/go/libraries/doltcore/doltdb"
	"github.com/dolthub/dolt/go/libraries/doltcore/env"
	doltservercfg "github.com/dolthub/dolt/go/libraries/doltcore/servercfg"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/dfunctions"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/dsess"
	"github.com/dolthub/dolt/go/libraries/doltcore/sqle/resolve"
	"github.com/dolthub/dolt/go/libraries/utils/argparser"

testing/go/dolt_functions_test.go (lines 3051-3060)

SetUpScript: []string{
				"CREATE TABLE repro (pk int primary key, data jsonb);",
				"INSERT INTO repro VALUES (1, '{\"text\": \"hello\"}');",
			},
			Assertions: []ScriptTestAssertion{
				{
					Query: "SELECT statement_order, table_name, diff_type, statement FROM dolt_patch('HEAD', 'WORKING', 'repro')",
					Expected: []sql.Row{
						{Numeric("1"), "public.repro", "schema", "CREATE TABLE \"repro\" (\n  \"pk\" integer NOT NULL,\n  \"data\" jsonb,\n  PRIMARY KEY (\"pk\")\n);"},
						{Numeric("2"), "public.repro", "data", "INSERT INTO \"repro\" (\"pk\",\"data\") VALUES (1,'{\\\"text\\\": \\\"hello\\\"}');"},
⚠️ JSONB patch SQL cannot be replayed safely
  • What failed: The emitted data INSERT contains escaped quote sequences inside the JSON literal and fails at execution with invalid input syntax for type jsonb, instead of replaying the original row.
  • Impact: Consumers that execute exported patch SQL cannot reliably reconstruct JSONB changes. This breaks a core replay/export workflow with no practical workaround besides manual SQL rewriting.
  • Steps to reproduce:
    1. Create repro(pk int primary key, data jsonb) and insert quote-sensitive JSON payloads.
    2. Run dolt_patch('HEAD','WORKING','repro') and capture emitted schema/data SQL statements.
    3. Execute emitted CREATE/INSERT statements in a clean destination database with strict error handling.
    4. Observe replay failure on the JSONB INSERT with invalid input syntax for type jsonb.
  • Stub / mock context: The run used local auth/bootstrap bypasses to keep the database service available, then executed real patch generation and replay SQL in isolated source and consumer databases; no route-level network mocking was used for this case.
  • Code analysis: I reviewed the PR-scoped dependency update in go.mod, the new JSONB smoke expectation in testing/go/dolt_functions_test.go, and runtime JSONB input validation in server/functions/jsonb.go. The code path expects strict JSON text on input, while emitted patch SQL shows double-escaped content that does not satisfy that parser.
  • Why this is likely a bug: Production parsing enforces valid JSONB input, and emitted patch SQL for JSONB data violates that expectation during replay, so the failure is consistent with a real product defect rather than a harness-only artifact.

Relevant code:

go.mod (lines 6-10)

require (
	github.com/PuerkitoBio/goquery v1.8.1
	github.com/cockroachdb/apd/v3 v3.2.3
	github.com/cockroachdb/errors v1.7.5
	github.com/dolthub/dolt/go v0.40.5-0.20260602205139-27fdd3defdfa

testing/go/dolt_functions_test.go (lines 3056-3060)

Query: "SELECT statement_order, table_name, diff_type, statement FROM dolt_patch('HEAD', 'WORKING', 'repro')",
Expected: []sql.Row{
	{Numeric("1"), "public.repro", "schema", "CREATE TABLE \"repro\" (\n  \"pk\" integer NOT NULL,\n  \"data\" jsonb,\n  PRIMARY KEY (\"pk\")\n);"},
	{Numeric("2"), "public.repro", "data", "INSERT INTO \"repro\" (\"pk\",\"data\") VALUES (1,'{\\\"text\\\": \\\"hello\\\"}');"},
},

server/functions/jsonb.go (lines 48-57)

input := val.(string)
inputBytes := unsafe.Slice(unsafe.StringData(input), len(input))
if json.Valid(inputBytes) {
	doc, err := pgtypes.UnmarshalToJsonDocument(inputBytes)
	return doc, err
}
if len(input) > 10 {
	input = input[:10] + "..."
}
return nil, pgtypes.ErrInvalidSyntaxForType.New("jsonb", input)
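
The validity check in the excerpt above can be approximated in isolation. This is a minimal sketch using only the standard library: checkJSONBInput is a hypothetical stand-in for the Doltgres function, and the error text is an assumption modeled on the message quoted in this report, not the actual output of pgtypes.ErrInvalidSyntaxForType.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkJSONBInput (hypothetical) accepts only strict JSON text. On failure it
// echoes at most the first 10 bytes of the input, as the excerpt above does.
func checkJSONBInput(input string) error {
	if json.Valid([]byte(input)) {
		return nil
	}
	if len(input) > 10 {
		input = input[:10] + "..."
	}
	// Assumed message format, for illustration only.
	return fmt.Errorf("invalid input syntax for type jsonb: %s", input)
}

func main() {
	// The value inserted by the setup script.
	fmt.Println(checkJSONBInput(`{"text": "hello"}`))
	// The literal body from the expected patch statement, backslashes included.
	fmt.Println(checkJSONBInput(`{\"text\": \"hello\"}`))
}
```

The first call returns nil. The second returns an error, because a backslash is not a legal token immediately after the opening brace of a JSON object.
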
⚠️ Downstream replay breaks for JSONB-heavy patches
  • What failed: Replay fails on the first emitted JSONB data statement with invalid JSONB syntax, preventing downstream state reconstruction.
  • Impact: Multi-row replay/export pipelines break before applying JSONB changes, so downstream systems cannot reconstruct source state. This disrupts a primary integration path for patch consumers.
  • Steps to reproduce:
    1. Insert multiple JSONB rows containing nested quotes, backslashes, arrays, nulls, and mixed scalar values.
    2. Generate patch SQL via dolt_patch('HEAD','WORKING','repro').
    3. Execute emitted statements in a clean downstream database and compare replayed state.
    4. Observe failure on the first JSONB data statement during replay execution.
  • Stub / mock context: Authentication and startup bypasses remained enabled to stabilize local execution, but replay verification used real SQL generation and execution against isolated databases with no synthetic API responses.
  • Code analysis: I examined the same production boundary as UPSTREAM-2 and confirmed the failure generalizes beyond a single payload: dependency-updated patch output format conflicts with strict JSONB parser requirements in server code. The failure repeats across varied JSONB shapes, indicating systemic escaping drift rather than one malformed fixture.
  • Why this is likely a bug: The same code-level mismatch between emitted patch SQL escaping and JSONB parser requirements consistently breaks replay for diverse JSON payloads, which is a reproducible production behavior defect.

Relevant code:

testing/go/dolt_functions_test.go (lines 3049-3056)

Name: "dolt_patch works with JSONB columns",
SetUpScript: []string{
	"CREATE TABLE repro (pk int primary key, data jsonb);",
	"INSERT INTO repro VALUES (1, '{\"text\": \"hello\"}');",
},
Assertions: []ScriptTestAssertion{
	{
		Query: "SELECT statement_order, table_name, diff_type, statement FROM dolt_patch('HEAD', 'WORKING', 'repro')",

server/functions/jsonb.go (lines 50-57)

if json.Valid(inputBytes) {
	doc, err := pgtypes.UnmarshalToJsonDocument(inputBytes)
	return doc, err
}
if len(input) > 10 {
	input = input[:10] + "..."
}
return nil, pgtypes.ErrInvalidSyntaxForType.New("jsonb", input)

go.mod (lines 8-10)

github.com/cockroachdb/errors v1.7.5
github.com/dolthub/dolt/go v0.40.5-0.20260602205139-27fdd3defdfa
github.com/dolthub/eventsapi_schema v0.0.0-20260310172945-37a9265ade69
✅ Passed (4)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Oracle | Ordered assertion enforces strict row sequence and fails on reversed order as expected. | ORACLE-1 |
| Oracle | Unordered assertion accepted reversed expected rows; ordered variant exposed the mismatch. | ORACLE-2 |
| Serialization | dolt_patch returned one schema and one data row for public.repro, and dolt_diff returned one added row with from_pk null and to_pk 1. | N/A |
| Upstream | Full smoke suite passed without focus filtering, including the JSONB smoke case. | N/A |
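
The difference between the two oracle results can be sketched as follows. This is an illustration only: rows are simplified to string slices, and equalOrdered and equalUnordered are hypothetical helpers, not the comparison logic of the Doltgres test harness.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rowKey flattens a row into one comparable string.
func rowKey(row []string) string {
	return strings.Join(row, "\x00")
}

// equalOrdered requires the same rows in the same positions.
func equalOrdered(got, want [][]string) bool {
	if len(got) != len(want) {
		return false
	}
	for i := range got {
		if rowKey(got[i]) != rowKey(want[i]) {
			return false
		}
	}
	return true
}

// sortedCopy returns a copy of rows sorted by rowKey, leaving the input intact.
func sortedCopy(rows [][]string) [][]string {
	out := append([][]string(nil), rows...)
	sort.Slice(out, func(i, j int) bool { return rowKey(out[i]) < rowKey(out[j]) })
	return out
}

// equalUnordered ignores row order by sorting both sides before comparing.
func equalUnordered(got, want [][]string) bool {
	return equalOrdered(sortedCopy(got), sortedCopy(want))
}

func main() {
	want := [][]string{{"1", "schema"}, {"2", "data"}}
	reversed := [][]string{{"2", "data"}, {"1", "schema"}}
	fmt.Println(equalOrdered(reversed, want))   // false
	fmt.Println(equalUnordered(reversed, want)) // true
}
```

An order-insensitive comparison accepts the reversed rows, which is why a query without ORDER BY cannot catch a regression in statement_order sequencing.
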

Commit: 1888a55


github-actions Bot commented Jun 2, 2026

Contributor
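|  | Main | PR |
| --- | --- | --- |
| Total | 42090 | 42090 |
| Successful | 18128 | 18128 |
| Failures | 23962 | 23962 |
| Partial Successes [1] | 5385 | 5385 |

|  | Main | PR |
| --- | --- | --- |
| Successful | 43.0696% | 43.0696% |
| Failures | 56.9304% | 56.9304% |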

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.
