Apache Iceberg version
None
Query engine
None
Please describe the bug 🐞
Relates to #14914
The "Writing data files" section of the spec states:
"""
All columns must be written to data files even if they introduce redundancy with metadata stored in manifest files (e.g. columns with identity partition transforms).
"""
However, the "Column Projection" section of the spec allows the partition value to be missing from data files for identity transforms. In that case the reader projects the identity partition value from the partition metadata in the manifest, which supports the Hive migration use case.
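To illustrate the reader-side behavior in question, here is a minimal Python sketch. It is not Iceberg's implementation: `project_row`, its parameters, and the dict-based row representation are hypothetical names made up for this example, and it only models the identity-partition fallback, ignoring any other projection rules.

```python
from typing import Any, Dict, Iterable


def project_row(
    file_row: Dict[int, Any],
    requested_field_ids: Iterable[int],
    identity_partition_values: Dict[int, Any],
) -> Dict[int, Any]:
    """Resolve requested field IDs for one row (hypothetical helper).

    file_row: field ID -> value actually stored in the data file.
    identity_partition_values: field ID -> value taken from the manifest's
        partition struct, for fields with an identity transform only.
    """
    projected: Dict[int, Any] = {}
    for field_id in requested_field_ids:
        if field_id in file_row:
            # The column was written to the data file: read it directly.
            projected[field_id] = file_row[field_id]
        elif field_id in identity_partition_values:
            # The column is missing from the file, but an identity partition
            # value exists in manifest metadata: project it instead.
            projected[field_id] = identity_partition_values[field_id]
        else:
            # Simplification for this sketch: anything else resolves to null.
            projected[field_id] = None
    return projected
```

For example, a file migrated from Hive might store only field 1, while field 2 (the partition column) is recovered from the manifest: `project_row({1: "a"}, [1, 2], {2: "2024-01-01"})`.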
We should clarify the "Writing data files" section and mention this particular edge case.
Willingness to contribute