Skip to content

validate protobuf message index bounds on deserialize#2271

Open
uwezkhan wants to merge 1 commit into
confluentinc:masterfrom
uwezkhan:protobuf-msgidx-bounds
Open

validate protobuf message index bounds on deserialize#2271
uwezkhan wants to merge 1 commit into
confluentinc:masterfrom
uwezkhan:protobuf-msgidx-bounds

Conversation

@uwezkhan

Copy link
Copy Markdown

What

The protobuf deserializer picks the writer message type by walking the message-index array from the Confluent wire framing into the parsed FileDescriptorProto. SchemaId._read_index_array caps the array length but not the individual index values, and those values are zigzag varints, so a crafted message can carry a negative or out-of-range index. _get_message_desc_proto then indexes desc.message_type / desc.nested_type with the value directly.

Before: a negative index (for example uvarint 0x01, which zigzag-decodes to -1) wraps around and resolves to a different message type in the same file, so the payload is decoded against the wrong descriptor. An out-of-range positive index raises a bare IndexError.

After: an index outside 0 <= index < len(...) raises SerializationError with the index and the available count, matching the bound check the confluent-kafka-go reference (toMessageDesc) already applies. Valid indexes behave the same as today.

Tradeoff: a message that previously decoded against an unintended message type now fails fast instead of returning a mis-typed object. That is the point of the change, but it does convert a silent wrong-result into a surfaced error.

Checklist

  • Contains customer facing changes? Including API/behavior changes
  • Did you add sufficient unit test and/or integration test coverage for this PR?

References

JIRA:

Test & Review

Added test_message_index_in_range and test_message_index_out_of_range to tests/schema_registry/_async/test_proto.py (and the generated _sync counterpart). The out-of-range cases [-1], [2], [0, -1], [0, 5] reproduce the issue on master and pass after the change. _sync files were regenerated with tools/unasync.py.

Open questions / Follow-ups

None.

@uwezkhan uwezkhan requested review from a team and Matthew Seal (MSeal) as code owners June 10, 2026 06:04
@confluent-cla-assistant

Copy link
Copy Markdown

🎉 All Contributor License Agreements have been signed. Ready to merge.
✅ uwezkhan
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant