Add ReadFileError and DecodeColumnError wrappers to provide better context for decoding errors #5602

@wjones127

Currently, errors that happen when decoding a column are missing information about:

  1. Which column failed to decode
  2. Which file was being read

This context is available higher up in the call stack than the point where the main error is generated. We can attach it by adding wrapper errors at those higher levels.

This approach is described nicely in https://sabrinajewson.org/blog/errors.

Basically, we want to do things like:

enum DecodeError {
    NumberOutOfRange,
    ...
}

struct DecodeFieldError {
    field_name: String,
    field_id: i32,
    offset: u64,
    source: DecodeError,
}

fn decode(field: Field, data: Bytes) -> Result<_, DecodeFieldError> {
    ...
}

enum ReadFileErrorKind {
    IO(ObjectStoreError),
    Decode(DecodeFieldError),
}

struct ReadFileError {
    path: Path,
    kind: ReadFileErrorKind,
}

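fn read_file(path: Path) -> Result<_, ReadFileError> {
    // Assumes read_data fails with an ObjectStoreError; wrap it with the file path.
    let data = read_data(&path).map_err(|err: ObjectStoreError| {
        ReadFileError { path: path.clone(), kind: ReadFileErrorKind::IO(err) }
    })?;
    for field in data.schema() {
        // Propagate with `?` so the wrapped error is not silently dropped.
        decode(field, data).map_err(|err: DecodeFieldError| {
            ReadFileError { path: path.clone(), kind: ReadFileErrorKind::Decode(err) }
        })?;
    }
    ...
}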
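
For the layers to show up as a chain, each wrapper would presumably implement `std::error::Error` and hand back its inner error from `source()`. The following is a minimal sketch of that for the two decode types, not the actual implementation: the `Display` wording is an assumption, and `decode_age` is a hypothetical stand-in for a real decoder that always fails.

```rust
use std::error::Error;
use std::fmt;

/// Innermost error: what went wrong while decoding a value.
#[derive(Debug)]
enum DecodeError {
    NumberOutOfRange,
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecodeError::NumberOutOfRange => write!(f, "number out of range"),
        }
    }
}

// Leaf error: no underlying cause, so the default `source()` (None) is kept.
impl Error for DecodeError {}

/// Wrapper that records which field failed and at which row offset.
#[derive(Debug)]
struct DecodeFieldError {
    field_name: String,
    field_id: i32,
    offset: u64,
    source: DecodeError,
}

impl fmt::Display for DecodeFieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Describe only this layer; the cause is reachable through `source()`.
        write!(
            f,
            "failed to read field '{}' (id: {}) for row offset {}",
            self.field_name, self.field_id, self.offset
        )
    }
}

impl Error for DecodeFieldError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Hypothetical decoder used only for illustration: it always fails.
fn decode_age(offset: u64) -> Result<i32, DecodeFieldError> {
    Err(DecodeFieldError {
        field_name: "age".to_string(),
        field_id: 10,
        offset,
        source: DecodeError::NumberOutOfRange,
    })
}
```

Keeping each `Display` message limited to its own layer avoids printing the cause twice when a reporter walks the chain.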

And then the error stack would look like:

Error: failed to read file 'data.lance'

Caused by:
    0: failed to read field 'age' (id: 10) for row offset 10
    1: number out of range
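
A report like the one above can be produced by walking `Error::source()` from the outermost error inward. The sketch below shows one way to do that. `render_chain` and the generic `Context` wrapper are hypothetical names introduced only for this example; `Context` stands in for the typed wrappers proposed here so the block stays short.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical generic layer: a message plus an optional underlying cause.
#[derive(Debug)]
struct Context {
    message: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl Context {
    fn new(message: &str, source: Option<Box<dyn Error + 'static>>) -> Self {
        Context { message: message.to_string(), source }
    }
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

/// Render the outermost error, then every cause, numbered from 0.
fn render_chain(err: &dyn Error) -> String {
    let mut out = format!("Error: {err}");
    let mut cause = err.source();
    let mut index = 0;
    while let Some(inner) = cause {
        if index == 0 {
            // Emit the header once, and only if there is at least one cause.
            out.push_str("\n\nCaused by:");
        }
        out.push_str(&format!("\n    {index}: {inner}"));
        cause = inner.source();
        index += 1;
    }
    out
}
```

Because the walker depends only on the `Error` trait, it would work unchanged with `ReadFileError` and `DecodeFieldError` once they implement `source()`.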
