Skip to content

init/aws-nitro: support partial read/write ethernet packet forwarding#616

Open
jakecorrenti wants to merge 1 commit intocontainers:mainfrom
jakecorrenti:fix-nitro-partial-read-write
Open

init/aws-nitro: support partial read/write ethernet packet forwarding#616
jakecorrenti wants to merge 1 commit intocontainers:mainfrom
jakecorrenti:fix-nitro-partial-read-write

Conversation

@jakecorrenti
Copy link
Copy Markdown
Member

Ensure the requested number of bytes is forwarded to/from the host vsock providing the network access and the guest TAP device routing the application network traffic.

To write this code I used Cursor and the claude-4.6-opus-high model.

Reproducer

To reproduce the issue, have the examples/nitro.c example execute the following:

TOKEN=$(curl -sf -X PUT 'http://169.254.169.254/latest/api/token' \
          -H 'X-aws-ec2-metadata-token-ttl-seconds: 21600')
echo "IMDSv2 token acquired"
ROLE=$(curl -sf 'http://169.254.169.254/latest/meta-data/iam/security-credentials/' \
              -H "X-aws-ec2-metadata-token: $TOKEN")
echo "--- IAM role: $ROLE ---"
curl -sf "http://169.254.169.254/latest/meta-data/iam/security-credentials/$ROLE" \
                -H "X-aws-ec2-metadata-token: $TOKEN" || { echo "error: failed to fetch credentials for role: $ROLE" >&2; exit 1; }

Without the above, about 90% of the time the final curl command will take about 15 seconds before failing because the client connection is closed. With the new diff, it should work 100% of the time.

@jakecorrenti
Copy link
Copy Markdown
Member Author

cc @tylerfanelli

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch 2 times, most recently from 7a1bbf4 to 0f318f1 Compare April 2, 2026 15:31
@jakecorrenti jakecorrenti marked this pull request as ready for review April 2, 2026 15:32
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces read_exact and write_all utility functions to ensure robust I/O operations by handling partial reads/writes and EINTR signals. These utilities are integrated into the packet forwarding logic between the vsock and TAP device. Feedback highlights a critical buffer overflow risk where incoming packet lengths are not validated against the MTU. Additionally, improvements are suggested for read_exact to set errno on partial reads and for write_all to handle zero-length writes to avoid infinite loops.

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 0f318f1 to a7b5239 Compare April 2, 2026 15:57
@jakecorrenti jakecorrenti marked this pull request as draft April 2, 2026 15:57
@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from a7b5239 to 59b5885 Compare April 2, 2026 16:21
@jakecorrenti jakecorrenti marked this pull request as ready for review April 2, 2026 16:22
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 59b5885 to 1f0c166 Compare April 2, 2026 16:23
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the reliability of the TAP-to-vsock forwarding mechanism by implementing read_exact and write_all functions to handle partial reads/writes and system call interruptions. It also updates the buffer allocation to include the Ethernet header length and adds validation for incoming frame sizes. Review feedback suggests using standard exit codes instead of negative values, replacing hardcoded sizes with sizeof(uint32_t), and using ntohl for network-to-host conversion. Additionally, a concern was raised regarding the use of write_all on packet-oriented TAP devices, which could lead to fragmented packets.

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 1f0c166 to 49c412e Compare April 2, 2026 20:29
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the reliability of the AWS Nitro TAP-to-vsock forwarder by introducing read_exact and write_all helper functions to handle partial I/O and signal interruptions. It also updates buffer allocation to include Ethernet headers and adds validation for incoming frame sizes. Review feedback recommends standardizing exit codes to EXIT_FAILURE instead of errno and refining error checks to explicitly handle partial writes and unexpected EOF during frame reads.

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 49c412e to 1ffa237 Compare April 2, 2026 21:57
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the AWS Nitro TAP-to-vsock forwarder by implementing robust I/O helpers for partial transfers and EINTR handling, adjusting buffer sizes for Ethernet headers, and adding frame validation. Review feedback suggests using fixed-width uint32_t types for protocol headers to ensure portability and recommends handling EINTR for TAP device reads to maintain consistency.

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 1ffa237 to dd79d2d Compare April 2, 2026 22:26
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the robustness of network packet forwarding in net_tap_afvsock.c by implementing helper functions for complete I/O operations and handling signal interruptions. Key changes include correcting buffer allocations for Ethernet headers, fixing byte-order handling for frame lengths, and enhancing error management. Feedback suggests further improving portability and type safety by using uint32_t for variables used in network size exchanges to avoid platform-dependent issues.

@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from dd79d2d to de85041 Compare April 2, 2026 22:37
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the reliability of network packet forwarding between the TAP device and vsock by introducing read_exact and write_all helpers to handle partial I/O and EINTR signals. It also expands the internal buffer to accommodate the Ethernet header. Feedback indicates that the MTU value sent to the host may be insufficient as it excludes the Ethernet header, potentially causing buffer overflows or stream corruption on the host side. Furthermore, a lack of error handling for TAP device reads could result in an infinite busy-loop if a persistent error occurs.

@jakecorrenti jakecorrenti marked this pull request as draft April 2, 2026 22:54
@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from de85041 to 396a9f5 Compare April 3, 2026 05:11
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

@jakecorrenti jakecorrenti marked this pull request as ready for review April 3, 2026 05:12
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the robustness of the Nitro enclave network proxy by implementing helper functions for exact reads and writes, enhancing error handling, and ensuring proper memory management. It also updates the communication protocol to account for the Ethernet header length. Review feedback highlights an inconsistency in endianness during the initial MTU exchange, recommending the use of network byte order to match the subsequent packet framing.

Ensure the total requested number of bytes is forwarded to/from the host
vsock providing the network access and the guest TAP device routing the
application network traffic.

To write this code I used Cursor and the claude-4.6-opus-high model.

Signed-off-by: Jake Correnti <jakecorrenti+github@proton.me>
@jakecorrenti jakecorrenti force-pushed the fix-nitro-partial-read-write branch from 396a9f5 to 5f8024a Compare April 3, 2026 05:23
@jakecorrenti
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the network TAP-to-vsock forwarding mechanism by implementing robust I/O handling and cross-endian compatibility. Key improvements include the introduction of read_exact and write_all helper functions to manage partial reads/writes and system interrupts, the use of network byte order for frame size communication, and the addition of proper memory cleanup on error paths to prevent leaks. Feedback was provided regarding type consistency, specifically suggesting the use of uint32_t for the frame length variable to match the return type of network byte order conversion functions.

}

unsigned int len = htonl(sz);
unsigned int len = ntohl(sz);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the sz variable, len should be defined as uint32_t to match the expected return type of ntohl and ensure consistent behavior when comparing against eth_frame_size. This ensures the code is explicit and portable, aligning with best practices for network protocol implementation.

            uint32_t len = ntohl(sz);
References
  1. Prioritize code readability and focused commits over micro-optimizations. Refactoring for minor efficiency gains may be rejected if it harms readability or expands the scope of a change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants