CLDSRV-873: return checksum in HeadObject and GetObject#6117
Conversation
| const responseHeaders = collectResponseHeaders(objMD, corsHeaders, | ||
| verCfg); | ||
|
|
||
| if (checksumMode === 'ENABLED') { |
There was a problem hiding this comment.
PR title says "return checksum in HeadObject and GetObject" but only HeadObject is implemented. lib/api/objectGet.js has no changes — it does not validate x-amz-checksum-mode or set checksum response headers. Either add GetObject support or update the title.
— Claude Code
Review by Claude Code |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
... and 2 files with indirect coverage changes @@ Coverage Diff @@
## improvement/CLDSRV-872-store-checksum-in-object-metadata #6117 +/- ##
============================================================================================
+ Coverage 84.69% 84.72% +0.02%
============================================================================================
Files 207 207
Lines 13540 13549 +9
============================================================================================
+ Hits 11468 11479 +11
+ Misses 2072 2070 -2
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
7d130cc to
8e51413
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds end-to-end checksum handling: computing/validating checksums on incoming PUT/UploadPart streams, persisting them in object metadata, and returning checksum headers in responses (notably for HeadObject when checksum mode is enabled).
Changes:
- Add checksum parsing/validation utilities and a new
ChecksumTransformstream to compute digests while streaming uploads. - Store computed checksum data into metadata (including defaulting to CRC64NVME when no checksum is provided) and return checksum response headers for
PutObjectandHeadObject(x-amz-checksum-mode: ENABLED). - Expand unit/functional test coverage for checksum headers, trailer parsing, and rejection paths.
Reviewed changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| yarn.lock | Pins arsenal to a specific commit on the checksum-metadata branch. |
| package.json | Updates arsenal dependency to a git branch providing checksum metadata support. |
| lib/services.js | Persists checksum into ObjectMD and returns checksum in metadataStoreObject result. |
| lib/routes/routeBackbeat.js | Explicitly ignores computed checksum from dataStore for Backbeat data-only writes. |
| lib/auth/streamingV4/trailingChecksumTransform.js | Parses trailing checksum trailer and emits a trailer event for downstream validation. |
| lib/auth/streamingV4/ChecksumTransform.js | New transform to compute per-algorithm digests and validate expected/trailer checksums. |
| lib/api/objectPut.js | Returns x-amz-checksum-* response header when checksum is available from storage result. |
| lib/api/objectHead.js | Adds x-amz-checksum-mode validation and returns checksum headers when enabled. |
| lib/api/apiUtils/object/storeObject.js | Integrates prepareStream checksum pipeline, validates checksum post-put, and deletes data on checksum failure. |
| lib/api/apiUtils/object/prepareStream.js | Builds stream pipeline (V4 / trailing trailer parsing / checksum transform) and returns { error, stream }. |
| lib/api/apiUtils/object/createAndStoreObject.js | Adds checksum handling for zero-byte objects (computes and stores empty-body checksum). |
| lib/api/apiUtils/integrity/validateChecksums.js | Adds trailer-related checksum error types, header parsing (getChecksumDataFromHeaders), and error mapping. |
| tests/unit/auth/TrailingChecksumTransform.js | Refactors and expands trailer parsing tests; updates expectations for new error behavior. |
| tests/unit/auth/ChecksumTransform.js | New unit tests for checksum computation and trailer/non-trailer validation logic. |
| tests/unit/api/utils/metadataMockColdStorage.js | Exports baseMd for reuse in checksum-mode unit tests. |
| tests/unit/api/objectPut.js | Adds unit tests asserting checksum stored in metadata and returned in response headers. |
| tests/unit/api/objectHead.js | Adds unit tests for checksum-mode response headers across algorithms. |
| tests/unit/api/apiUtils/object/storeObject.js | New tests for dataStore() checksum validation/cleanup and callback correctness. |
| tests/unit/api/apiUtils/object/prepareStream.js | New tests validating prepareStream return shape and pipeline behavior for multiple modes. |
| tests/unit/api/apiUtils/integrity/validateChecksums.js | Adds unit tests for new header parsing and error mapping helpers. |
| tests/multipleBackend/routes/routeBackbeat.js | Adds checksum validation expectations for Backbeat data write routes. |
| tests/functional/raw-node/test/trailingChecksums.js | Updates trailing checksum digest constant in raw-node functional test. |
| tests/functional/raw-node/test/routes/routeMetadata.js | Adds end-to-end verification that checksum is stored in metadata after PutObject. |
| tests/functional/raw-node/test/checksumPutObjectUploadPart.js | New extensive raw-node functional suite covering checksum rejection and trailer scenarios. |
| tests/functional/aws-node-sdk/test/object/putVersion.js | Updates metadata comparison exclusions to include checksum. |
| tests/functional/aws-node-sdk/test/object/objectHead.js | Adds SDK-level HeadObject checksum-mode tests and CRT CRC64NVME wiring. |
| tests/functional/aws-node-sdk/test/object/mpuVersion.js | Adds workaround for MPU restore path that does not yet persist checksums. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const checksumMode = request.headers['x-amz-checksum-mode']; | ||
| if (checksumMode !== undefined && checksumMode !== 'ENABLED') { | ||
| log.debug('invalid x-amz-checksum-mode', { checksumMode }); | ||
| return callback(errors.InvalidArgument); | ||
| } |
There was a problem hiding this comment.
PR title mentions returning checksums for both HeadObject and GetObject, but this PR only adds x-amz-checksum-mode handling to HEAD. GET currently does not parse x-amz-checksum-mode nor return x-amz-checksum-* / x-amz-checksum-type headers, so the PR scope/title may be inconsistent.
| "@hapi/joi": "^17.1.1", | ||
| "@smithy/node-http-handler": "^3.0.0", | ||
| "arsenal": "git+https://github.com/scality/Arsenal#8.3.6", | ||
| "arsenal": "git+https://github.com/scality/Arsenal#improvement/ARSN-557-add-checksum-to-object-metadata", |
There was a problem hiding this comment.
package.json now points arsenal to a moving git branch name. Even though yarn.lock pins a commit today, using a branch here makes installs without the lockfile (or future lockfile regenerations) non-reproducible. Prefer pinning to a specific commit hash or a released tag/version once available.
|
|
||
| _transform(chunk, encoding, callback) { | ||
| const input = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk); | ||
| this.hash.update(input, encoding); |
There was a problem hiding this comment.
ChecksumTransform._transform passes the encoding argument into hash.update(). Some of the checksum hash implementations used here (e.g., @aws-sdk/crc64-nvme-crt / @aws-crypto crc32*) typically accept a single Buffer/Uint8Array argument and may throw if given an extra parameter. Safer is to always call update with the Buffer only (or only pass encoding when the original chunk is a string).
| this.hash.update(input, encoding); | |
| this.hash.update(input); |
| if (this.trailerChecksumValue) { | ||
| // Trailer found in the body but no x-amz-trailer in the headers. | ||
| return { | ||
| error: ChecksumError.TrailerUnexpected, | ||
| details: { name: this.trailerChecksumName, val: this.trailerChecksumValue }, |
There was a problem hiding this comment.
In validateChecksum(), the non-trailer path checks if (this.trailerChecksumValue) to detect an unexpected trailer. This is a truthiness check, so an empty-string trailer value would be treated as “no trailer” and the request could be accepted incorrectly. Use an explicit !== undefined (or similar) check so trailer presence is detected reliably.
| }; | ||
| } | ||
|
|
||
| const trailer = headers['x-amz-trailer']; |
There was a problem hiding this comment.
getChecksumDataFromHeaders() assumes headers['x-amz-trailer'] is a string and calls .startsWith()/.slice() directly. If the header is present but not a string (e.g., an array or other type from a proxy/normalizer), this will throw and turn a client error into a 500. Add a type check and return a TrailerNotSupported (or InvalidRequest) checksum error when the value is not a string.
| const trailer = headers['x-amz-trailer']; | |
| const trailer = headers['x-amz-trailer']; | |
| if (typeof trailer !== 'string') { | |
| return { error: ChecksumError.TrailerNotSupported, details: { value: trailer } }; | |
| } |
No description provided.