Support For Large File Upload #277
Open
PujaDeshmukh17 wants to merge 15 commits into
- Add ReadAheadStream: parallel read-ahead queue (4×20MB = 80MB max), backpressure, exponential-backoff retry, client-disconnect detection
- handler/index.js: route uploads >400MB to chunked CMIS appendContent path; single-chunk path unchanged for files <=400MB; inline cleanup of incomplete documents retries up to 3 times (2s/4s/8s backoff)
- util/index.js: add getContentLength() to detect content size from Buffer, Readable stream, or size-bearing objects
- index.cds: add sap.sdm.OrphanCleanupQueue entity to persist objectIds of documents that could not be cleaned up inline
- persistence/index.js: enqueueOrphan / dequeueOrphan / getAllOrphans
- sdm.js: read Content-Length header and pass contentLength + orphan queue callbacks into attachment data; reconciliation job runs on server startup (cds.on served) to delete any persisted orphans
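The inline cleanup described in the commit message above (retry with 2s/4s/8s backoff, then fall back to the orphan queue) could look roughly like the sketch below. This is not the PR's code: `cleanupWithBackoff`, its parameters, and the injectable `wait` option are hypothetical names, and the sketch assumes "retries up to 3 times" means one immediate attempt followed by three delayed retries.

```javascript
// Hypothetical helper, not taken from the PR.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// deleteDocument(objectId) and enqueueOrphan(objectId, error) are caller-supplied
// async callbacks; their names and signatures are assumptions for illustration.
async function cleanupWithBackoff(objectId, deleteDocument, enqueueOrphan, opts = {}) {
  const { retries = 3, baseDelayMs = 2000, wait = sleep } = opts;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    // The first attempt runs immediately; retries 1, 2, 3 wait 2s, 4s, 8s.
    if (attempt > 0) await wait(baseDelayMs * 2 ** (attempt - 1));
    try {
      await deleteDocument(objectId);
      return { deleted: true, attempts: attempt + 1 };
    } catch (err) {
      lastError = err;
    }
  }
  // Every attempt failed: persist the id so a later reconciliation run can retry.
  await enqueueOrphan(objectId, lastError);
  return { deleted: false, attempts: retries + 1 };
}
```

Making `wait` injectable keeps the backoff schedule testable without real delays; a caller would normally leave it at the default.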
Root cause (from CF logs, exit 137 OOM on 1GB file):
- uploadLargeFileInChunks called streamToBuffer() which collected the entire 1GB content into a single Buffer before ReadAheadStream could process it, negating the chunked upload design entirely.

Fixes:
- Remove streamToBuffer() call; pass req.data.content directly to ReadAheadStream. The Buffer is already in memory from CDS body parsing, so no second copy is needed.
- ReadAheadStream._preloadChunks: add zero-copy Buffer fast path using buf.slice() references instead of allocUnsafe+copy for each chunk. Queue holds ≤4 slice refs (no extra memory) rather than 4×20MB copies.
- ReadAheadStream.startReading: poll for first chunk explicitly so _loadNextChunk is only called once the queue has data (avoids race).
- package.json: raise CDS body_parser limit to 2gb so the request body is not rejected before reaching the upload handler.

Note: the CF app manifest memory quota must also be raised to ≥2.5GB to accommodate the 1GB body buffer + Node.js overhead + 80MB queue.
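To illustrate the zero-copy fast path mentioned in the fixes above, here is a minimal sketch of cutting a Buffer into chunk-sized views. `sliceBuffer` is a hypothetical name, not the PR's `_preloadChunks`. The sketch uses `buf.subarray()`, which for a Node.js Buffer returns a view over the same memory just like `buf.slice()` does, and is the variant Node.js currently recommends.

```javascript
// Hypothetical generator: yields views over `buf` without copying any bytes.
function* sliceBuffer(buf, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  for (let offset = 0; offset < buf.length; offset += chunkSize) {
    // subarray() shares memory with `buf`; the last view may be shorter.
    yield buf.subarray(offset, Math.min(offset + chunkSize, buf.length));
  }
}

// Usage: a 10-byte buffer cut into 4-byte views gives lengths 4, 4, 2.
const example = Buffer.alloc(10);
const views = [...sliceBuffer(example, 4)];
console.log(views.map((v) => v.length));
```

Because each view references the original allocation, holding four of them in a queue costs no extra chunk memory, but the full source Buffer stays alive as long as any view is reachable.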
…ap-js/sdm into SDMEXT-largeFileUpload-feature
const config = {
  verbose: true,
  testTimeout: 100000,
  forceExit: true,
 * chunked upload did not complete successfully and could not be deleted inline.
 * A reconciliation job on server startup retries deletion of each entry.
 */
entity sap.sdm.OrphanCleanupQueue {
Contributor
Why do we need this entity?
  "lint": "npx eslint --fix . --no-cache"
},
"cds": {
  "server": {
Contributor
Does this limit uploads to 2 GB?
console.log(`[orphan-queue] Reconciling ${orphans.length} orphaned SDM document(s)...`);

for (const orphan of orphans) {
Contributor
Can we document how this works somewhere in the wiki?
Describe your changes
This PR adds support for uploading files larger than 400 MB to SAP Document Management (SDM) without causing out-of-memory errors.
What changed:
lib/handler/index.js — createAttachment now routes files >400 MB through a new chunked upload path: creates an empty placeholder document in SDM first, then streams the file in 20 MB chunks via appendContentStream. Files ≤400 MB continue through the existing single-POST path unchanged.
lib/ReadAheadStream.js (new) — A bounded read-ahead buffer that pre-loads up to 4 chunks (80 MB max) while the previous chunk is being uploaded, improving throughput. Handles both Buffer and Readable stream inputs, with exponential backoff retry for transient read errors and clean handling of client disconnects.
lib/sdm.js + lib/persistence/index.js — An orphan queue tracks the SDM objectId of any in-progress chunked upload. If the upload fails mid-way, the incomplete document is deleted with retry backoff; if cleanup also fails, the orphan queue entry survives so a startup reconciliation job can clean it up later.
lib/util/index.js — Added getContentLength helper to determine file size when contentLength is not set by the caller.
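The size detection and the 400 MB routing decision described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `planUpload` is a hypothetical name, this `getContentLength` is a simplified stand-in for the real helper, sizes are assumed to be binary megabytes (1 MB = 1024 × 1024 bytes), and a stream with no known size is assumed to return `null` so the caller can fall back to the Content-Length header.

```javascript
const { Readable } = require('node:stream');

// Assumed values taken from the PR description; the binary-MB reading is a guess.
const CHUNKED_THRESHOLD = 400 * 1024 * 1024; // files above this go chunked
const CHUNK_SIZE = 20 * 1024 * 1024;         // 20 MB per appended chunk

// Simplified stand-in for the helper: Buffer length, or a numeric `size`
// property on the object, otherwise null (size unknown without reading).
function getContentLength(content) {
  if (Buffer.isBuffer(content)) return content.length;
  if (content && typeof content.size === 'number') return content.size;
  return null;
}

// Hypothetical routing decision: which path to use and how many chunks to send.
function planUpload(contentLength) {
  const chunked = contentLength > CHUNKED_THRESHOLD;
  return {
    chunked,
    chunks: chunked ? Math.ceil(contentLength / CHUNK_SIZE) : 1,
  };
}

// Usage: a plain Readable carries no size, so the caller needs another source.
console.log(getContentLength(Readable.from(['abc'])));
console.log(planUpload(1024 ** 3));
```

Note that a file of exactly 400 MB stays on the single-POST path, matching the ">400 MB" and "≤400 MB" wording in the description.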
Any documentation
Tests were performed with file sizes of 500 MB, 1 GB, and 2 GB.

Type of change
Please delete options that are not relevant.
Checklist before requesting a review
Upload Screenshots/lists of the scenarios tested