Skip to content

Iceberg TRUNCATE: clean up object-storage files when REST catalog commit fails after write #1609

@Selfeer

Description

@Selfeer

Summary

During TRUNCATE TABLE on the Iceberg engine (REST catalog path), object storage can receive new artifacts (empty manifest list and updated table metadata.json) before the catalog is updated. If catalog->updateMetadata then fails (network blip, transient REST catalog error, etc.), those files are never removed. Unlike the mutation write path, truncate has no cleanup()-style rollback, so failures can leave orphaned objects that accumulate across retries (manifest list paths embed a snapshot id and are not overwritten by a later success).

Severity / impact

  • Medium (audit classification): Storage leak / clutter; no immediate data corruption if catalog never pointed at the failed metadata, but operational and cost impact under repeated failures.

Expected behavior

On any failure after writing truncate artifacts to object storage but before (or during) a successful catalog commit, remove the objects written in that attempt—consistent with Mutations.cpp writeMetadataFiles behavior.

Actual behavior

Truncate writes proceed; if updateMetadata fails afterward, no cleanup runs. Orphaned files remain indefinitely.

Technical anchor

Item Detail
Code IcebergMetadata::truncate in src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp (sequence after writeMessageToFile / around catalog commit)
Comparison src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cppwriteMetadataFiles uses a cleanup() lambda and catch (...) { cleanup(); throw; } (~lines 400–530 per audit; verify in current tree)
Variables to clean (audit names; confirm in code) storage_manifest_list_name, storage_metadata_name (any other paths written in the same truncate transaction should be included)

Steps to reproduce (conceptual)

  1. Iceberg table with REST catalog.
  2. Trigger TRUNCATE TABLE while forcing catalog->updateMetadata to fail after manifest list and metadata JSON are written to object storage (injection, fault injection, or simulated network/catalog error).
  3. Observe leftover objects in the table’s storage prefix that are not referenced by the catalog.

Regression / test direction

  • Inject a failure after metadata write before catalog commit; assert no orphaned files remain on object storage after the exception.
  • Optionally assert catalog state unchanged / table still consistent from a reader’s perspective.

References

  • PR introducing truncate (context): Altinity/ClickHouse#1529 — author noted this as follow-up work; not blocking merge of truncate itself.
  • Audit note: PR review thread (AI audit by claude-4.6-sonnet-medium-thinking, Mar 2026) — Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions