@DAlperin DAlperin commented Feb 5, 2026

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

Snapshot batches can contain millions of rows, causing the DeltaWriter's
seen_rows HashMap to grow without bound and consume excessive memory.

For snapshots, disable position delete tracking by setting max_seen_rows=0.
All deletes will use equality deletes instead, eliminating the memory
overhead at the cost of slightly slower reads (acceptable for snapshots).

Normal post-snapshot batches continue using position deletes as usual.
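
To illustrate the trade-off, here is a minimal sketch of a writer that falls back to equality deletes when row tracking is disabled. `SketchDeltaWriter`, `Delete`, and the method names are hypothetical and are not the iceberg-rust API; only `seen_rows` and `max_seen_rows` come from the description above.

```rust
use std::collections::HashMap;

/// The two kinds of delete a writer can emit (hypothetical type).
#[derive(Debug, PartialEq)]
enum Delete {
    /// Points at an exact row in a known data file.
    Position { file: String, row: u64 },
    /// Matches rows by key at read time.
    Equality { key: String },
}

/// Hypothetical stand-in for a delta writer; not the real DeltaWriter.
struct SketchDeltaWriter {
    /// Upper bound on tracked rows; 0 disables tracking entirely.
    max_seen_rows: usize,
    /// Key -> (data file, row position) for rows written so far.
    seen_rows: HashMap<String, (String, u64)>,
}

impl SketchDeltaWriter {
    fn new(max_seen_rows: usize) -> Self {
        Self { max_seen_rows, seen_rows: HashMap::new() }
    }

    /// Record where a row was written, unless the map is at capacity.
    fn insert(&mut self, key: &str, file: &str, row: u64) {
        let has_room = self.seen_rows.len() < self.max_seen_rows;
        if self.seen_rows.contains_key(key) || has_room {
            self.seen_rows.insert(key.to_string(), (file.to_string(), row));
        }
    }

    /// Emit a position delete if the row was tracked, else an equality delete.
    fn delete(&mut self, key: &str) -> Delete {
        match self.seen_rows.remove(key) {
            Some((file, row)) => Delete::Position { file, row },
            None => Delete::Equality { key: key.to_string() },
        }
    }
}
```

With `max_seen_rows` set to 0 the map never grows, so every delete in this sketch becomes an equality delete.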

Requires iceberg-rust rev 1b01c099, which adds the option to disable
position delete tracking.

For fresh sinks, the catch-up batch incorrectly started from
Timestamp::minimum() instead of as_of, causing it to cover a range
where no data exists.

Use max(resume_upper, as_of) as the batch lower bound to handle both:
- Fresh sinks: start from as_of (where data actually begins)
- Resuming sinks: start from resume_upper (where we left off)
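
A minimal sketch of the lower-bound rule above. It assumes timestamps are plain `u64` values and that a fresh sink reports `resume_upper` as the minimum timestamp (0 here); the `catch_up_lower` name is hypothetical.

```rust
/// Simplified timestamp; the real type is assumed to be totally ordered.
type Timestamp = u64;

/// Lower bound of the catch-up batch: never earlier than `as_of`,
/// and never earlier than where a resuming sink left off.
fn catch_up_lower(resume_upper: Timestamp, as_of: Timestamp) -> Timestamp {
    std::cmp::max(resume_upper, as_of)
}
```

For a fresh sink, `catch_up_lower(0, 100)` yields 100 (the `as_of`); for a sink that already committed up to 250, `catch_up_lower(250, 100)` yields 250.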

Add debug! and trace! logging at key points to help diagnose issues:
- Batch description minting (catch-up and future batches)
- Waiting for first batch description before processing data
- Batch descriptions received by write operator
- Stashed rows (trace level) and periodic stash size warnings
- Batch closing with frontier positions
- Files written per batch

This will help debug snapshot processing issues and frontier advancement.

Track max observed timestamps before init to synthesize an upper when a
bounded input closes, and exit cleanly once the frontier is empty after init.
Start minting once the frontier reaches as_of/resume_upper instead of waiting
past them. Close write batches when the input frontier reaches the batch upper,
and rescan only when the batch or frontier advances.

Ensure inactive mint workers drop the table-ready capability so downstream
operators do not block waiting for a ready signal.
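
The frontier handling described above can be sketched roughly as follows. This is a simplification under stated assumptions: timestamps are `u64`, an empty (closed) frontier is modeled as `None`, and the synthesized upper is taken to be one past the largest observed timestamp. The type and function names are hypothetical.

```rust
/// `None` models an empty frontier, i.e. the input has closed.
type Frontier = Option<u64>;

/// Remembers the largest timestamp seen so far (hypothetical helper).
#[derive(Default)]
struct MaxTimestampTracker {
    max_seen: Option<u64>,
}

impl MaxTimestampTracker {
    fn observe(&mut self, ts: u64) {
        self.max_seen = Some(self.max_seen.map_or(ts, |m| m.max(ts)));
    }

    /// Upper to use when a bounded input closes: one past the largest
    /// observed timestamp (an assumption), or `None` if nothing was seen.
    fn synthesized_upper(&self) -> Option<u64> {
        self.max_seen.and_then(|m| m.checked_add(1))
    }
}

/// A batch can close once the input frontier has reached its upper,
/// or once the input has closed entirely.
fn batch_is_closed(input_frontier: Frontier, batch_upper: u64) -> bool {
    match input_frontier {
        None => true,
        Some(f) => f >= batch_upper,
    }
}
```

Note the `>=` in `batch_is_closed`: the batch closes when the frontier reaches the upper, not only once it has moved past it.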

@DAlperin force-pushed the dov/iceberg-improvements branch from afecddf to fa9d85f on February 5, 2026 16:26
@DAlperin force-pushed the dov/iceberg-improvements branch 2 times, most recently from 836f684 to b01656a on February 5, 2026 17:37
@DAlperin force-pushed the dov/iceberg-improvements branch 3 times, most recently from e3d2ace to edcbf1e on February 7, 2026 22:33

Switch from using REST catalog for S3 Tables connections to the native
S3TablesCatalog implementation from iceberg-rust.

Changes:
- Add iceberg-catalog-s3tables and aws-sdk-s3tables dependencies
- Update connect_s3tables() to use S3TablesCatalogBuilder with pre-configured
  aws-sdk-s3tables client for S3 Tables API calls
- For static credentials: pass access key/secret as FileIO properties
- For AssumeRole: use CustomAwsCredentialLoader to provide the full
  credential chain (ambient → jump role → user role with external ID)
- Update load_or_create_table() to recognize S3 Tables NotFoundException
  error format ("The specified table does not exist")
- Update workspace to use iceberg-rust rev 1b3541c6 which includes:
  - with_file_io_extension() method for S3TablesCatalog
  - debug tracing for update_table performance tracking

This ensures S3 Tables connections properly propagate auth configuration
for both the control plane (S3 Tables API) and data plane (S3 object access).
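
As a rough illustration of the not-found detection mentioned in the changes above, the check could look like the sketch below. It assumes the error is available as a plain message string; the function name is hypothetical and this is not the actual load_or_create_table() code.

```rust
/// Message text quoted in the change description for a missing table.
const S3_TABLES_NOT_FOUND: &str = "The specified table does not exist";

/// Returns true if an error message looks like the S3 Tables
/// NotFoundException for a missing table.
fn is_table_not_found(error_message: &str) -> bool {
    error_message.contains(S3_TABLES_NOT_FOUND)
}
```

Matching on message text is brittle, so a check like this is best kept in one place where it can be updated if the error format changes.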

@DAlperin force-pushed the dov/iceberg-improvements branch from edcbf1e to 72d6254 on February 7, 2026 23:29