[Improvement] Fix Lance partition statistics writes failing due to table_id vector type mismatch #10603

### What would you like to be improved?

`LancePartitionStatisticStorage` declares the `table_id` Arrow column as a 64-bit integer, but `createFragmentMetadata()` retrieves that column as a `UInt8Vector` and writes the `Long` table ID into it.

Relevant locations: `LancePartitionStatisticStorage.java`, lines 114 and 388.

This makes the Lance-backed partition statistics update path inconsistent with its own schema and can fail at runtime when statistics are written.

### How should we improve?

Use the Arrow vector type that matches the declared schema for table_id, such as BigIntVector, instead of UInt8Vector.
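To illustrate the failure mode, here is a minimal, dependency-free sketch. It does not use Arrow or Gravitino classes: `Vec`, `SignedInt64Vec`, `UnsignedInt64Vec`, `newRoot`, and `typedVector` are all hypothetical stand-ins invented for this example. It assumes the schema declares `table_id` as a signed 64-bit integer, which is what the suggestion of `BigIntVector` implies. In Arrow's Java API the numeric suffix of `UInt8Vector` counts bytes, so it is an unsigned 64-bit vector and a sibling class of `BigIntVector`. The mismatch would therefore most likely surface as a `ClassCastException` on lookup, whatever the table ID value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in model of a schema-driven column lookup; not Arrow or Gravitino code. */
final class VectorTypeMismatchSketch {

  /** Hypothetical stand-in for Arrow's FieldVector. */
  interface Vec {}

  /** Hypothetical stand-in for BigIntVector (signed 64-bit values). */
  static final class SignedInt64Vec implements Vec {
    private final long[] values = new long[4];

    void set(int index, long value) {
      values[index] = value;
    }

    long get(int index) {
      return values[index];
    }
  }

  /** Hypothetical stand-in for UInt8Vector (unsigned 64-bit values). */
  static final class UnsignedInt64Vec implements Vec {}

  /** Models the schema: the concrete class of table_id is fixed when the root is built. */
  static Map<String, Vec> newRoot() {
    Map<String, Vec> root = new LinkedHashMap<>();
    root.put("table_id", new SignedInt64Vec());
    return root;
  }

  /** Mirrors the reported bug: an unchecked downcast to the wrong sibling class. */
  static boolean buggyLookupFails(Map<String, Vec> root) {
    try {
      UnsignedInt64Vec wrong = (UnsignedInt64Vec) root.get("table_id");
      return wrong == null; // not reached when table_id is a SignedInt64Vec
    } catch (ClassCastException e) {
      return true;
    }
  }

  /** Checked lookup that names the expected and actual classes when they differ. */
  static <T extends Vec> T typedVector(Map<String, Vec> root, String name, Class<T> expected) {
    Vec vec = root.get(name);
    if (!expected.isInstance(vec)) {
      String actual = vec == null ? "missing" : vec.getClass().getSimpleName();
      throw new IllegalStateException(
          "Column " + name + ": expected " + expected.getSimpleName() + " but found " + actual);
    }
    return expected.cast(vec);
  }

  /** The fixed path: request the class that matches the declared column type. */
  static long writeAndReadTableId(long tableId) {
    SignedInt64Vec ids = typedVector(newRoot(), "table_id", SignedInt64Vec.class);
    ids.set(0, tableId);
    return ids.get(0);
  }
}
```

In the real class the fix is the one-line type change proposed above. A checked lookup like `typedVector` is only an optional design choice that would make any future schema drift fail with a descriptive message.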

Here's a unit test that exercises the write path and can help verify the fix:
```java
  @Test
  public void testUpdateStatisticsWithLargeTableId() throws Exception {
    PartitionStatisticStorageFactory factory = new LancePartitionStatisticStorageFactory();
    String metalakeName = "metalake";
    MetadataObject metadataObject =
        MetadataObjects.of(
            Lists.newArrayList("catalog", "schema", "table"), MetadataObject.Type.TABLE);

    // Stub the entity store so that the table resolves to ID 256, then inject it into GravitinoEnv.
    EntityStore entityStore = mock(EntityStore.class);
    TableEntity tableEntity = mock(TableEntity.class);
    when(entityStore.get(any(), any(), any())).thenReturn(tableEntity);
    when(tableEntity.id()).thenReturn(256L);
    FieldUtils.writeField(GravitinoEnv.getInstance(), "entityStore", entityStore, true);

    String location = Files.createTempDirectory("lance_stats_large_table_id").toString();
    Map<String, String> properties = Maps.newHashMap();
    properties.put("location", location);

    LancePartitionStatisticStorage storage =
        (LancePartitionStatisticStorage) factory.create(properties);
    try {
      Map<String, StatisticValue<?>> statistics = Maps.newHashMap();
      statistics.put("statistic0", StatisticValues.stringValue("value0"));

      storage.updateStatistics(
          metalakeName,
          Lists.newArrayList(
              MetadataObjectStatisticsUpdate.of(
                  metadataObject,
                  Lists.newArrayList(
                      PartitionStatisticsModification.update("partition0", statistics)))));

      List<PersistedPartitionStatistics> listedStats =
          storage.listStatistics(
              metalakeName,
              metadataObject,
              PartitionRange.between(
                  "partition0",
                  PartitionRange.BoundType.CLOSED,
                  "partition0",
                  PartitionRange.BoundType.CLOSED));

      Assertions.assertEquals(1, listedStats.size());
      Assertions.assertEquals("partition0", listedStats.get(0).partitionName());
      Assertions.assertEquals(1, listedStats.get(0).statistics().size());
      Assertions.assertEquals("statistic0", listedStats.get(0).statistics().get(0).name());
      Assertions.assertEquals("value0", listedStats.get(0).statistics().get(0).value().value());
    } finally {
      // Close the storage before removing the dataset files it may still hold open.
      storage.close();
      FileUtils.deleteDirectory(new File(location + "/" + tableEntity.id() + ".lance"));
    }
  }
```
