spec: clarify partition value when writing data files #14925

@kevinjqliu

Description

Apache Iceberg version

None

Query engine

None

Please describe the bug 🐞

Relates to #14914

The "Writing data files" section of the spec states:
"""
All columns must be written to data files even if they introduce redundancy with metadata stored in manifest files (e.g. columns with identity partition transforms).
"""

However, the "Column Projection" section of the spec allows the partition value to be missing from the data files for identity transforms. In that case the reader can project the identity partition values, which supports the Hive migration use case.

We should clarify the "Writing data files" section and mention this particular edge case.
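To make the edge case concrete, here is a minimal sketch of how a reader might resolve a column that is absent from a data file. It is not taken from any Iceberg implementation. The function name `project_row` and the dict-based row and partition representations are hypothetical, and the sketch deliberately leaves out the other fallbacks a real reader would apply (name mapping and default values).

```python
from typing import Any


def project_row(
    file_row: dict[int, Any],
    read_field_ids: list[int],
    identity_partition_values: dict[int, Any],
) -> dict[int, Any]:
    """Resolve each requested field id for one row read from a data file.

    file_row: field id -> value, for columns physically present in the file.
    identity_partition_values: source field id -> partition value, taken from
        the manifest entry, for identity-transformed partition fields only.
    (Both structures are hypothetical simplifications for illustration.)
    """
    projected: dict[int, Any] = {}
    for field_id in read_field_ids:
        if field_id in file_row:
            # The column was written to the data file: use the stored value.
            projected[field_id] = file_row[field_id]
        elif field_id in identity_partition_values:
            # The column is missing from the file, but an identity partition
            # value exists in metadata: project it (Hive migration case).
            projected[field_id] = identity_partition_values[field_id]
        else:
            # Simplification: no other fallback is modeled, so return null.
            projected[field_id] = None
    return projected
```

For example, a file migrated from Hive that stores only field 1, with field 2 as an identity partition column, would be read with `project_row({1: "a"}, [1, 2, 3], {2: "2024-01-01"})`. Field 2 comes from the partition metadata and field 3 resolves to `None`.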

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Metadata

Labels

Specification, docs
