Commit 034b324

Add schema migration support
Add schema migration generation, application, upload, download, and check support for SQLite and PostgreSQL. Introduce the cloudsync_alter_* API, pending migration tracking, raw SQL migration operations, createTable payload support, and block-level LWW/augment migration commands. Update libcurl and Apple native network layers, add SQLite/PostgreSQL/cross-engine tests, refresh examples and documentation, and remove the old begin/commit alter API.
1 parent c8897a2 commit 034b324

31 files changed

Lines changed: 8135 additions & 482 deletions

API.md

Lines changed: 447 additions & 18 deletions
Large diffs are not rendered by default.

Makefile

Lines changed: 15 additions & 1 deletion
@@ -210,10 +210,14 @@ $(TEST_TARGET): $(TEST_OBJ)
 	$(CC) $(filter-out $(patsubst $(DIST_DIR)/%$(EXE),$(BUILD_TEST)/%.o, $(filter-out $@,$(TEST_TARGET))), $(TEST_OBJ)) -o $@ $(T_LDFLAGS)

 # Object files
+$(BUILD_RELEASE)/fractional_indexing.o: $(FI_DIR)/fractional_indexing.c
+	$(CC) $(CFLAGS) -Wno-sign-compare -O3 -fPIC -c $< -o $@
 $(BUILD_RELEASE)/%.o: %.c
 	$(CC) $(CFLAGS) -O3 -fPIC -c $< -o $@
 $(BUILD_TEST)/sqlite3.o: $(SQLITE_DIR)/sqlite3.c
 	$(CC) $(CFLAGS) -DSQLITE_DQS=0 -DSQLITE_CORE -c $< -o $@
+$(BUILD_TEST)/fractional_indexing.o: $(FI_DIR)/fractional_indexing.c
+	$(CC) $(T_CFLAGS) -Wno-sign-compare -c $< -o $@
 $(BUILD_TEST)/%.o: %.c
 	$(CC) $(T_CFLAGS) -c $< -o $@

@@ -237,6 +241,15 @@ e2e: $(TARGET) $(DIST_DIR)/integration$(EXE)
 	fi; \
 	./$(DIST_DIR)/integration$(EXE)

+cross-dialect-migration-test: $(TARGET)
+	PG_DOCKER_DB_HOST="$(PG_DOCKER_DB_HOST)" \
+	PG_DOCKER_DB_PORT="$(PG_DOCKER_DB_PORT)" \
+	PG_DOCKER_DB_NAME="$(PG_DOCKER_DB_NAME)" \
+	PG_DOCKER_DB_USER="$(PG_DOCKER_DB_USER)" \
+	PG_DOCKER_DB_PASSWORD="$(PG_DOCKER_DB_PASSWORD)" \
+	SQLITE3="$(SQLITE3)" \
+	./test/schema_migration_cross_dialect.sh
+
 OPENSSL_TARBALL = $(OPENSSL_DIR)/$(OPENSSL_VERSION).tar.gz

 $(OPENSSL_TARBALL):
@@ -456,6 +469,7 @@ help:
 	@echo " clean - Remove built files"
 	@echo " test [COVERAGE=true] - Test the extension with optional coverage output"
 	@echo " unittest - Run only unit tests (test/unit.c)"
+	@echo " cross-dialect-migration-test - Test schema migrations between SQLite and PostgreSQL"
 	@echo " help - Display this help message"
 	@echo " xcframework - Build the Apple XCFramework"
 	@echo " aar - Build the Android AAR package"
@@ -466,4 +480,4 @@ help:
 # Include PostgreSQL extension targets
 include docker/Makefile.postgresql

-.PHONY: all clean test unittest e2e extension help version xcframework aar
+.PHONY: all clean test unittest e2e cross-dialect-migration-test extension help version xcframework aar

docker/Makefile.postgresql

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ PG_CORE_SRC = \
 	src/pk.c \
 	src/utils.c \
 	src/lz4.c \
+	src/migration.c \
 	src/block.c \
 	modules/fractional-indexing/fractional_indexing.c


docker/postgresql/docker-compose.debug.yml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ services:
       POSTGRES_DB: cloudsync_test
     ports:
       - "5432:5432"
+    command: ["postgres", "-c", "listen_addresses=*"]
     ulimits:
       core: -1
     cap_add:

docs/internal/schema-migrations.md

Lines changed: 309 additions & 0 deletions
@@ -0,0 +1,309 @@
# CloudSync Schema Migrations

This document describes the implemented schema migration flow for SQLite Sync.
Schema migrations can originate from an authorized SQLite client or from the
cloud database, and the same payload can be applied to SQLite and PostgreSQL.

## Goals

- Allow schema changes to originate either from a SQLite client or from the cloud database.
- Support an empty SQLite client database that creates its synchronized tables during first sync.
- Keep row CRDT payloads binary and focused on data, while using a separate schema-migration protocol.
- Support SQLite/SQLiteCloud and PostgreSQL backends.
- Keep every database coherent when a schema migration or data sync fails.
- Preserve a raw SQL escape hatch for migrations that cannot be expressed portably.

## Core Model

Schema is not a CRDT. Schema migrations are serialized in one ordered log per
`database_id`, and the backend decides whether a migration may be proposed or
applied based on the API key used for the request.

The extension stores applied migrations locally in `cloudsync_migrations`, and
client-originated migrations waiting for upload in `cloudsync_pending_migration`.
Pending alter operations are not stored in a table: they live in the current
CloudSync context until `cloudsync_alter_apply()` or `cloudsync_alter_clear()`.

The local migration applier is atomic. `cloudsync_migration_apply()` opens a
savepoint, validates the JSON payload, applies every operation, updates the
schema hash, records the migration id, and rolls everything back on failure.
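
The savepoint pattern described above can be illustrated outside the extension. The following is a minimal sketch using Python's standard `sqlite3` module, not the code in `migration.c`. The `applied_migrations` table and the `statements` payload field are hypothetical stand-ins chosen for the example, and the sketch assumes that a migration id that is already recorded is simply skipped.

```python
import json
import sqlite3


def apply_migration(conn, payload_json):
    """Apply all statements in one savepoint; return False if already applied."""
    conn.execute("SAVEPOINT migration_apply")
    try:
        payload = json.loads(payload_json)  # malformed JSON fails before any DDL
        migration_id = payload["migrationId"]
        seen = conn.execute(
            "SELECT 1 FROM applied_migrations WHERE id = ?", (migration_id,)
        ).fetchone()
        if seen is None:
            # Hypothetical field: statements already rendered as SQL text.
            for statement in payload["statements"]:
                conn.execute(statement)
            conn.execute(
                "INSERT INTO applied_migrations (id) VALUES (?)", (migration_id,)
            )
    except Exception:
        # Undo every statement run since the savepoint, then drop the savepoint.
        conn.execute("ROLLBACK TO migration_apply")
        conn.execute("RELEASE migration_apply")
        raise
    conn.execute("RELEASE migration_apply")
    return seen is None


def open_demo_db():
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN/COMMIT,
    # so the savepoint is the only transaction boundary in play.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE applied_migrations (id TEXT PRIMARY KEY)")
    return conn
```

If the second of two statements fails, the table created by the first one disappears with the rollback and no migration id is recorded, which is the coherence property the paragraph above describes.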

## Public SQL Workflow

Applications should build migrations with declarative SQL functions. They do
not need to write JSON.

```sql
SELECT cloudsync_alter_create_table('notes');
SELECT cloudsync_alter_add_column('notes', 'id', 'text', false);
SELECT cloudsync_alter_add_primary_key('notes', 'id');
SELECT cloudsync_alter_add_column('notes', 'title', 'text', false, '');
SELECT cloudsync_alter_add_column('notes', 'body', 'text', false, '');
SELECT cloudsync_alter_add_column('notes', 'updated_at', 'timestamp', false, '1970-01-01T00:00:00Z');
SELECT cloudsync_alter_augment_table('notes', 'CLS', 1);
SELECT cloudsync_alter_set_block_lww('notes', 'body', char(10));
SELECT cloudsync_alter_apply();
```

`cloudsync_alter_apply()` applies the queued migration locally and stores the
generated payload in `cloudsync_pending_migration`. After that, an authorized
client uploads it with:

```sql
SELECT cloudsync_network_migration_upload();
SELECT cloudsync_network_sync();
```

The zero-argument upload form uploads the next pending local migration and marks
it uploaded only after the backend returns valid JSON. The one-argument form is
still available for custom backends or tests:

```sql
SELECT cloudsync_network_migration_upload(:json_payload);
```

While a local migration is pending upload, `cloudsync_network_send_changes()`
returns an error instead of sending row changes. This prevents data produced
with a new local schema from reaching a server that has not accepted that schema.
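
The two rules above (mark a migration uploaded only after a valid JSON response, and refuse to send row changes while one is pending) can be sketched as follows. This is an illustrative model, not the extension's network layer: the `pending_demo` table, its columns, and the `post` and `send` callables are hypothetical.

```python
import json
import sqlite3


def upload_next_pending(conn, post):
    """Upload the oldest pending migration; return False when none is pending."""
    row = conn.execute(
        "SELECT id, payload FROM pending_demo "
        "WHERE uploaded = 0 ORDER BY rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    migration_id, payload = row
    response = post(payload)  # hypothetical transport callable
    json.loads(response)      # raises ValueError if the reply is not JSON
    # Only reached when the backend answered with parseable JSON.
    conn.execute("UPDATE pending_demo SET uploaded = 1 WHERE id = ?", (migration_id,))
    return True


def send_changes(conn, send):
    """Refuse to send row changes while a migration still awaits upload."""
    (pending,) = conn.execute(
        "SELECT count(*) FROM pending_demo WHERE uploaded = 0"
    ).fetchone()
    if pending:
        raise RuntimeError("schema migration pending upload; row changes not sent")
    return send()
```

A failed or non-JSON response leaves the row marked as not uploaded, so a later retry picks up the same migration and row changes stay blocked in the meantime.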

## Declarative API

The same SQL API is exposed by the SQLite and PostgreSQL extensions:

- `cloudsync_alter_create_table(table)`
- `cloudsync_alter_add_column(table, column, logical_type, nullable)`
- `cloudsync_alter_add_column(table, column, logical_type, nullable, default_value)`
- `cloudsync_alter_add_column_sqlite(table, column, type_sql, nullable)`
- `cloudsync_alter_add_column_sqlite(table, column, type_sql, nullable, default_sql)`
- `cloudsync_alter_add_column_postgresql(table, column, type_sql, nullable)`
- `cloudsync_alter_add_column_postgresql(table, column, type_sql, nullable, default_sql)`
- `cloudsync_alter_add_primary_key(table, column)`
- `cloudsync_alter_augment_table(table)`
- `cloudsync_alter_augment_table(table, algorithm)`
- `cloudsync_alter_augment_table(table, algorithm, init_flags)`
- `cloudsync_alter_set_block_lww(table, column)`
- `cloudsync_alter_set_block_lww(table, column, delimiter)`
- `cloudsync_alter_set_column(table, column, key, value)`
- `cloudsync_alter_set_filter(table, filter_expr)`
- `cloudsync_alter_set_filter_sqlite(table, filter_expr)`
- `cloudsync_alter_set_filter_postgresql(table, filter_expr)`
- `cloudsync_alter_drop_column(table, column)`
- `cloudsync_alter_rename_column(table, from_name, to_name)`
- `cloudsync_alter_sql(sql)`
- `cloudsync_alter_sqlite(sql)`
- `cloudsync_alter_postgresql(sql)`
- `cloudsync_alter_preview()`
- `cloudsync_alter_apply()`
- `cloudsync_alter_clear()`
- `cloudsync_alter_clear(table)`

`cloudsync_alter_preview()` returns the generated JSON without applying it.
`cloudsync_alter_clear()` discards queued in-memory operations.

The dialect override functions are optional. Use them when the portable logical
type or default is not precise enough:

```sql
SELECT cloudsync_alter_add_column('notes', 'metadata', 'json', false, '{}');
SELECT cloudsync_alter_add_column_sqlite('notes', 'metadata', 'TEXT', false, '''{}''');
SELECT cloudsync_alter_add_column_postgresql('notes', 'metadata', 'JSONB', false, '''{}''::jsonb');
```

The override default is a SQL fragment for that dialect, not a plain value. The
portable `default_value` argument is optional, and its serialization is inferred
from the logical column type.

Raw SQL functions are an escape hatch for migration steps that do not have a
portable command yet. `cloudsync_alter_sql()` runs on every engine, while
`cloudsync_alter_sqlite()` and `cloudsync_alter_postgresql()` are emitted as
dialect-specific raw SQL and skipped by the other engine. They run in queue
order with the structured operations and cannot contain transaction-control
statements.
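
The raw SQL rules above (queue order, per-dialect skipping, and no transaction control) might be modeled roughly as below. This is a sketch under assumptions: the `dialect` key on a `rawSql` operation is a hypothetical field name, and the statement check is deliberately naive (it splits on `;` and does not understand string literals).

```python
import re

# Statements that would interfere with the applier's own savepoint.
TXN_CONTROL = re.compile(
    r"^\s*(BEGIN|COMMIT|ROLLBACK|SAVEPOINT|RELEASE|START\s+TRANSACTION)\b",
    re.IGNORECASE,
)


def queue_raw_sql(queue, sql, dialect=None):
    """Append a raw SQL op; dialect=None means 'run on every engine'."""
    for statement in sql.split(";"):
        if TXN_CONTROL.match(statement):
            raise ValueError("raw SQL must not contain transaction control")
    queue.append({"op": "rawSql", "sql": sql, "dialect": dialect})


def ops_for_engine(queue, engine):
    """Keep queue order; drop raw SQL that targets a different engine."""
    return [op for op in queue if op.get("dialect") in (None, engine)]
```

Structured operations carry no `dialect` key in this model, so they pass the filter on both engines and keep their position relative to the raw SQL steps.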

`cloudsync_begin_alter()` and `cloudsync_commit_alter()` still exist as internal
C primitives used while replaying migrations on already-augmented tables. They
are not public SQL APIs.

## Payload Format

The network payload is JSON. This is intentional even though row sync uses a
binary encoder: schema payloads must be audited, authorized, inspected by a
backend service, and sometimes hand-produced by server tooling. User-facing APIs
generate the JSON automatically, so application code does not need to construct
it directly.

Generated client payloads omit `baseSchemaHash` and `targetSchemaHash` because
raw SQLite and PostgreSQL schema hashes are not necessarily portable across
dialects. Manual payloads may include those fields; when present,
`cloudsync_migration_apply()` enforces them and rolls back on mismatch.

Example generated V1 payload:

```json
{
  "type": "cloudsync.schema.migration",
  "formatVersion": 1,
  "migrationId": "0197097c-8b35-7c11-8ed4-4e59ddfdb928",
  "requiredCapabilities": ["schema:write"],
  "ops": [
    {
      "op": "createTable",
      "table": "notes",
      "columns": [
        {"name": "id", "type": "text", "nullable": false, "primaryKey": true},
        {"name": "body", "type": "text", "nullable": false, "default": {"type": "text", "value": ""}}
      ]
    },
    {"op": "augmentTable", "table": "notes", "algorithm": "CLS", "initFlags": 1},
    {"op": "setBlockLww", "table": "notes", "column": "body", "delimiter": "\n"}
  ]
}
```
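
Server tooling that hand-produces payloads may want a structural check before submitting them. The following is a sketch of such a check, based only on the fields visible in the example payload above. The exact validation rules in `cloudsync_migration_apply()` are not reproduced here, and only the pre-apply comparison of `baseSchemaHash` is modeled (how `targetSchemaHash` is checked is left out).

```python
import json


def validate_payload(text, current_schema_hash=None):
    """Parse a migration payload and run basic structural checks."""
    payload = json.loads(text)  # raises ValueError on malformed JSON
    if payload.get("type") != "cloudsync.schema.migration":
        raise ValueError("unexpected payload type")
    if payload.get("formatVersion") not in (1, 2):
        raise ValueError("unsupported formatVersion")
    if not isinstance(payload.get("migrationId"), str):
        raise ValueError("migrationId must be a string")
    ops = payload.get("ops")
    if not isinstance(ops, list):
        raise ValueError("ops must be a list")
    for op in ops:
        if not isinstance(op, dict) or "op" not in op:
            raise ValueError("every operation needs an 'op' field")
    # The hash guard is optional; it is enforced only when the payload has it.
    base = payload.get("baseSchemaHash")
    if base is not None and base != current_schema_hash:
        raise ValueError("baseSchemaHash does not match the local schema")
    return payload


EXAMPLE = json.dumps({
    "type": "cloudsync.schema.migration",
    "formatVersion": 1,
    "migrationId": "0197097c-8b35-7c11-8ed4-4e59ddfdb928",
    "requiredCapabilities": ["schema:write"],
    "ops": [{"op": "augmentTable", "table": "notes", "algorithm": "CLS", "initFlags": 1}],
})
```

Calling `validate_payload(EXAMPLE)` returns the parsed dictionary, while a payload that carries a `baseSchemaHash` different from the hash passed in is rejected before anything is applied.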

## Version 1

Version 1 contains additive and bootstrap operations:

- `createTable`: create a table from logical column definitions.
- `addColumn`: add a nullable column or a `NOT NULL` column with a default value.
- `augmentTable`: call the same internal path as `cloudsync_init()`.
- `setBlockLww`: configure block-level LWW and materialize block metadata.
- `setColumn`: set a CloudSync column setting.
- `setFilter`: set a row filter, with optional dialect-specific filters.

Creating a synchronized table requires both `createTable` and `augmentTable`.
`setBlockLww` must run after the table is augmented and after the target column
exists.
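
The ordering constraint above can be checked with a single pass over the operation list. This is an illustrative sketch rather than the extension's validator. It assumes the payload contains the whole history of the table, and the `column` field on `addColumn` is a hypothetical name, since the document only shows the JSON shape of `createTable`, `augmentTable`, and `setBlockLww`.

```python
def check_v1_order(ops):
    """Raise ValueError if setBlockLww precedes augmentTable or its column."""
    columns = {}       # table name -> set of column names seen so far
    augmented = set()  # tables that have passed through augmentTable
    for index, op in enumerate(ops):
        kind, table = op["op"], op.get("table")
        if kind == "createTable":
            names = {col["name"] for col in op["columns"]}
            columns.setdefault(table, set()).update(names)
        elif kind == "addColumn":
            columns.setdefault(table, set()).add(op["column"])  # hypothetical field
        elif kind == "augmentTable":
            augmented.add(table)
        elif kind == "setBlockLww":
            if table not in augmented:
                raise ValueError(f"op {index}: {table} is not augmented yet")
            if op["column"] not in columns.get(table, set()):
                raise ValueError(f"op {index}: column {op['column']} does not exist yet")
```

The operation order in the generated example payload (`createTable`, `augmentTable`, `setBlockLww`) passes this check, while moving `setBlockLww` ahead of `augmentTable` does not.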

## Version 2

Version 2 is implemented for authorized non-additive changes:

- `dropColumn`
- `renameColumn`
- `rebuildTableSync`
- `rawSql` in V2/destructive payloads

Generated payloads containing `dropColumn` or `renameColumn` use
`formatVersion: 2` and include `schema:destructive` in `requiredCapabilities`.
The backend must enforce this capability from the API key; the payload field is
for audit and policy clarity, not authentication.

The declarative raw SQL functions also emit V2/destructive payloads, even when
the SQL is intended to be additive, because the extension cannot safely infer
the behavioral impact of arbitrary SQL.
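
How a generator might pick the payload header from the queued operations can be sketched as below. This is an assumption-laden illustration: the document says V2 payloads include `schema:destructive` but does not say whether other capabilities are listed alongside it, so the sketch lists only that one.

```python
# Operation names the document lists under Version 2.
V2_OPS = {"dropColumn", "renameColumn", "rebuildTableSync", "rawSql"}


def payload_header(ops):
    """Choose formatVersion and requiredCapabilities from the queued ops."""
    if any(op["op"] in V2_OPS for op in ops):
        # Any non-additive or raw SQL step promotes the whole payload to V2.
        return {"formatVersion": 2, "requiredCapabilities": ["schema:destructive"]}
    return {"formatVersion": 1, "requiredCapabilities": ["schema:write"]}
```

A single `rawSql` step is enough to promote an otherwise additive migration, which mirrors the conservative rule described above.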

`rebuildTableSync` uses `cloudsync_cleanup(..., is_migration = true)` so the
table sync metadata is rebuilt without resetting the database-wide CloudSync
site identity or schema history. The `ddl` and `blockLww` fields are validated
before cleanup/reinit so malformed payloads fail without partially changing the
table.

Version 3 orchestration is deliberately not implemented. Rolling expand/contract
migrations, payload translation across schema epochs, and long-running backfills
belong to a future protocol layer.

## Logical Type Mapping

Portable payloads use logical types and let `migration.c` render backend SQL:

- `text` -> SQLite `TEXT`, PostgreSQL `TEXT`
- `uuid` -> SQLite `TEXT`, PostgreSQL `UUID`
- `integer` -> SQLite `INTEGER`, PostgreSQL `BIGINT`
- `real` -> SQLite `REAL`, PostgreSQL `DOUBLE PRECISION`
- `numeric` -> SQLite `NUMERIC`, PostgreSQL `NUMERIC`
- `blob` -> SQLite `BLOB`, PostgreSQL `BYTEA`
- `boolean` -> SQLite `INTEGER`, PostgreSQL `BOOLEAN`
- `json` -> SQLite `TEXT`, PostgreSQL `JSONB`
- `timestamp` -> SQLite `TEXT`, PostgreSQL `TIMESTAMPTZ`

Use dialect override functions when a migration needs exact SQL types or
database-specific default expressions.
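
The mapping table above translates directly into a lookup. The sketch below shows one way a renderer could turn a logical column into an `ALTER TABLE` statement. The type pairs come from the list above, but the statement shape and the naive identifier quoting are assumptions and may differ from what `migration.c` emits.

```python
# Logical type -> (SQLite type, PostgreSQL type), as listed in the mapping.
LOGICAL_TYPES = {
    "text": ("TEXT", "TEXT"),
    "uuid": ("TEXT", "UUID"),
    "integer": ("INTEGER", "BIGINT"),
    "real": ("REAL", "DOUBLE PRECISION"),
    "numeric": ("NUMERIC", "NUMERIC"),
    "blob": ("BLOB", "BYTEA"),
    "boolean": ("INTEGER", "BOOLEAN"),
    "json": ("TEXT", "JSONB"),
    "timestamp": ("TEXT", "TIMESTAMPTZ"),
}


def render_add_column(table, column, logical_type, nullable, engine, default_sql=None):
    """Render ADD COLUMN for 'sqlite' or 'postgresql' (illustrative only)."""
    sqlite_type, postgresql_type = LOGICAL_TYPES[logical_type]
    sql_type = sqlite_type if engine == "sqlite" else postgresql_type
    parts = [f'ALTER TABLE "{table}" ADD COLUMN "{column}" {sql_type}']
    if not nullable:
        parts.append("NOT NULL")
    if default_sql is not None:
        # default_sql is a SQL fragment, already quoted for the target dialect.
        parts.append(f"DEFAULT {default_sql}")
    return " ".join(parts)
```

For example, `render_add_column("notes", "metadata", "json", False, "postgresql", "'{}'::jsonb")` yields a `JSONB NOT NULL` column, while the same call with `"sqlite"` and `"'{}'"` yields a `TEXT` column.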

## Backend Protocol

Schema endpoints live beside the existing data endpoints:

- `POST /v2/cloudsync/databases/{databaseId}/{siteId}/schema/check`
- `POST /v2/cloudsync/databases/{databaseId}/{siteId}/schema/upload`
- `GET /v2/cloudsync/databases/{databaseId}/{siteId}/schema/download`
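
A client helper that builds these three requests could look like the sketch below. The path template and HTTP methods are taken from the list above. The base URL, the helper name, and the decision to percent-encode the identifiers are assumptions made for the example.

```python
from urllib.parse import quote

# HTTP method for each schema action, as listed above.
SCHEMA_ACTIONS = {"check": "POST", "upload": "POST", "download": "GET"}


def schema_endpoint(base_url, database_id, site_id, action):
    """Return (method, url) for a schema action; raises KeyError if unknown."""
    method = SCHEMA_ACTIONS[action]
    path = "/v2/cloudsync/databases/{}/{}/schema/{}".format(
        quote(database_id, safe=""),  # encode so ids cannot add path segments
        quote(site_id, safe=""),
        action,
    )
    return method, base_url.rstrip("/") + path
```

With the placeholder host `https://example.invalid`, the `download` action for database `db1` and site `site-a` resolves to a `GET` on `/v2/cloudsync/databases/db1/site-a/schema/download`.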

Recommended backend log fields:

- `database_id`
- `schema_version` or `schema_epoch`
- `migration_id`
- `source`: `client` or `server`
- `author_site_id`
- `payload`
- `payload_hash`
- `required_capabilities`
- `authorized_by_key_id`
- `status`: `pending`, `applied`, `rejected`, `failed`
- `created_at`, `applied_at`
- `error`

API key classes:

- `sync`: send/receive data and download already-approved migrations.
- `schema:write`: propose/upload/apply V1 migrations.
- `schema:destructive`: propose/upload/apply V2 destructive migrations.

Normal application clients should use `sync` keys. A schema-capable key is
required for every schema change, including V1 additive changes.
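
Because the payload's `requiredCapabilities` field is informational, a backend has to derive the needed key class from the operations themselves. The following is a minimal sketch of that check, assuming key classes are available as a set of strings. The function name is hypothetical, and whether a `schema:destructive` key also implies `schema:write` is not stated in this document, so the sketch treats the two classes as independent.

```python
# Operation names the document lists as Version 2 (non-additive).
V2_OPS = {"dropColumn", "renameColumn", "rebuildTableSync", "rawSql"}


def key_may_upload(key_classes, payload):
    """Decide from the ops, not from the payload's own capability claim."""
    destructive = any(op.get("op") in V2_OPS for op in payload["ops"])
    needed = "schema:destructive" if destructive else "schema:write"
    # requiredCapabilities is deliberately ignored: it is for audit only.
    return needed in key_classes
```

A payload that claims only `schema:write` but contains `dropColumn` is therefore refused for a `schema:write` key, and a plain `sync` key cannot upload any migration.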

## Sync Flow

Client-originated migration:

1. Application queues operations with `cloudsync_alter_*`.
2. Application calls `cloudsync_alter_apply()`.
3. Extension applies the migration locally and writes `cloudsync_pending_migration`.
4. Application or `cloudsync_network_sync()` uploads the pending migration.
5. Backend authorizes the API key, applies the payload to the cloud database,
   records the migration, and returns success.
6. Client sends row changes after the pending migration is uploaded.

Server-originated migration:

1. Backend applies and records a migration.
2. Client calls `cloudsync_network_sync()` or `cloudsync_network_migration_check()`.
3. The network layer downloads the migration when the local schema is missing or stale.
4. `cloudsync_migration_apply()` applies it locally.
5. Data download/retry continues on the new schema.

Empty client first sync:

1. Empty SQLite client calls `cloudsync_network_init(database_id)`.
2. Sync checks schema before returning from the empty local send phase.
3. Backend returns a schema snapshot or migration chain.
4. The client creates tables, augments them, applies block LWW, then downloads data.

## Failure Semantics

- Migrations are atomic per database connection.
- A malformed JSON payload is rejected before DDL is applied.
- A migration id is idempotent through `cloudsync_migrations`.
- Explicit hash guards are enforced when present.
- Raw SQL runs inside the same savepoint as portable operations.
- Row changes are not uploaded while `cloudsync_pending_migration` contains an unuploaded migration.
- V2 migrations should be blocked by the backend when stale/offline clients may still upload incompatible old-epoch payloads, unless the backend has an explicit rejection or translation policy.

## Tests

SQLite coverage is in `test/unit.c` and the mock network tests in
`test/integration.c`.

PostgreSQL coverage is in `test/postgresql/31_alter_table_sync.sql` and
`test/postgresql/52_schema_migrations.sql`.

Cross-dialect coverage is in `test/schema_migration_cross_dialect.sh` and can
be run with:

```sh
make cross-dialect-migration-test
```

The cross-dialect test covers SQLite-generated migrations applied to PostgreSQL,
PostgreSQL-generated migrations with dialect overrides applied to SQLite, and
generic plus dialect-specific raw SQL in both directions.

examples/README.md

Lines changed: 7 additions & 1 deletion
@@ -12,6 +12,12 @@ This directory contains comprehensive examples demonstrating SQLite Sync in vari
 - Offline scenarios and network synchronization
 - Perfect for understanding core sync mechanics

+### [schema-migrations/](./schema-migrations/)
+**Schema Migrations**
+- Client-to-server and server-to-client migration examples
+- Demonstrates the `cloudsync_alter_*` API, generated pending migrations, table creation, `cloudsync_init`, block-level LWW, dialect overrides, raw SQL escape hatches, and V2 rebuild payloads
+- Shows how schema-capable API keys fit into migration upload/download
+
 ### [sport-tracker-app/](./sport-tracker-app/)
 **Advanced Web App - Production Patterns**
 - React/TypeScript web application with Vite
@@ -46,4 +52,4 @@ Each example includes detailed setup instructions, code explanations, and securi

 ---

-**Note**: For generic extension loading guides please refer to the [SQLite Extension Guide](https://github.com/sqliteai/sqlite-extensions-guide) repository
+**Note**: For generic extension loading guides please refer to the [SQLite Extension Guide](https://github.com/sqliteai/sqlite-extensions-guide) repository
