-
Notifications
You must be signed in to change notification settings - Fork 17
Iceberg TRUNCATE: clean up object-storage files when REST catalog commit fails after write #1609
Description
Summary
During TRUNCATE TABLE on the Iceberg engine (REST catalog path), object storage can receive new artifacts (empty manifest list and updated table metadata.json) before the catalog is updated. If catalog->updateMetadata then fails (network blip, transient REST catalog error, etc.), those files are never removed. Unlike the mutation write path, truncate has no cleanup()-style rollback, so failures can leave orphaned objects that accumulate across retries (manifest list paths embed a snapshot id and are not overwritten by a later success).
Severity / impact
- Medium (audit classification): Storage leak / clutter; no immediate data corruption if catalog never pointed at the failed metadata, but operational and cost impact under repeated failures.
Expected behavior
On any failure after writing truncate artifacts to object storage but before (or during) a successful catalog commit, remove the objects written in that attempt—consistent with Mutations.cpp writeMetadataFiles behavior.
Actual behavior
Truncate writes proceed; if updateMetadata fails afterward, no cleanup runs. Orphaned files remain indefinitely.
Technical anchor
| Item | Detail |
|---|---|
| Code | IcebergMetadata::truncate in src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp (sequence after writeMessageToFile / around catalog commit) |
| Comparison | src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp — writeMetadataFiles uses a cleanup() lambda and catch (...) { cleanup(); throw; } (~lines 400–530 per audit; verify in current tree) |
| Variables to clean (audit names; confirm in code) | storage_manifest_list_name, storage_metadata_name (any other paths written in the same truncate transaction should be included) |
Steps to reproduce (conceptual)
- Iceberg table with REST catalog.
- Trigger
TRUNCATE TABLEwhile forcingcatalog->updateMetadatato fail after manifest list and metadata JSON are written to object storage (injection, fault injection, or simulated network/catalog error). - Observe leftover objects in the table’s storage prefix that are not referenced by the catalog.
Regression / test direction
- Inject a failure after metadata write before catalog commit; assert no orphaned files remain on object storage after the exception.
- Optionally assert catalog state unchanged / table still consistent from a reader’s perspective.
References
- PR introducing truncate (context): Altinity/ClickHouse#1529 — author noted this as follow-up work; not blocking merge of truncate itself.
- Audit note: PR review thread (AI audit by claude-4.6-sonnet-medium-thinking, Mar 2026) — Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE.