Support For Large File Upload #277

Open: PujaDeshmukh17 wants to merge 15 commits into develop from SDMEXT-largeFileUpload-feature

PujaDeshmukh17 (Contributor) commented Jun 17, 2026

Describe your changes

This PR adds support for uploading files larger than 400 MB to SAP Document Management (SDM) without causing out-of-memory errors.

What changed:

  • lib/handler/index.js — createAttachment now routes files >400 MB through a new chunked upload path: creates an empty placeholder document in SDM first, then streams the file in 20 MB chunks via appendContentStream. Files ≤400 MB continue through the existing single-POST path unchanged.

  • lib/ReadAheadStream.js (new) — A bounded read-ahead buffer that pre-loads up to 4 chunks (80 MB max) while the previous chunk is being uploaded, improving throughput. Handles both Buffer and Readable stream inputs, with exponential backoff retry for transient read errors and clean handling of client disconnects.

  • lib/sdm.js + lib/persistence/index.js — An orphan queue tracks the SDM objectId of any in-progress chunked upload. If the upload fails mid-way, the incomplete document is deleted with retry backoff; if cleanup also fails, the orphan queue entry survives so a startup reconciliation job can clean it up later.

  • lib/util/index.js — Added getContentLength helper to determine file size when contentLength is not set by the caller.
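
For illustration, the routing rule from the first bullet can be sketched as follows. This is a minimal sketch, not the code in lib/handler/index.js: `planUpload` is a hypothetical helper name, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the description does not say which unit it uses.

```javascript
// Sizes taken from the PR description; binary megabytes are an assumption.
const MB = 1024 * 1024;
const LARGE_FILE_THRESHOLD = 400 * MB;
const CHUNK_SIZE = 20 * MB;

// Hypothetical helper: decides which upload path a file of the given size takes.
function planUpload(contentLength) {
  if (!Number.isFinite(contentLength) || contentLength < 0) {
    throw new RangeError("contentLength must be a non-negative number");
  }
  // Files at or below the threshold keep the existing single-POST path.
  if (contentLength <= LARGE_FILE_THRESHOLD) {
    return { path: "single-post", chunks: 1 };
  }
  // Larger files are appended in fixed-size chunks; the last chunk may be shorter.
  return { path: "chunked", chunks: Math.ceil(contentLength / CHUNK_SIZE) };
}

console.log(planUpload(400 * MB)); // { path: 'single-post', chunks: 1 }
console.log(planUpload(500 * MB)); // { path: 'chunked', chunks: 25 }
```

Under these assumptions, a 1 GB file (1024 MB) would be sent as 52 chunks, the last one shorter than 20 MB.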

Any documentation

Tests were performed with file sizes of 500 MB, 1 GB, and 2 GB.
[Image: Screenshot 2026-06-17 at 1.11.14 PM]

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist before requesting a review

  • I have tested the functionality on my cloud environment.
  • I have provided sufficient automated/unit tests for the code.
  • I have increased or maintained the test coverage.
  • I have run integration tests on my cloud environment.
  • I have checked the Black Duck portal for any vulnerabilities after my commit.

Upload Screenshots/lists of the scenarios tested

  • I have uploaded screenshots or added lists of the scenarios tested in the description

- Add ReadAheadStream: parallel read-ahead queue (4×20MB = 80MB max),
  backpressure, exponential-backoff retry, client-disconnect detection
- handler/index.js: route uploads >400MB to chunked CMIS appendContent
  path; single-chunk path unchanged for files <=400MB; inline cleanup
  of incomplete documents retries up to 3 times (2s/4s/8s backoff)
- util/index.js: add getContentLength() to detect content size from
  Buffer, Readable stream, or size-bearing objects
- index.cds: add sap.sdm.OrphanCleanupQueue entity to persist objectIds
  of documents that could not be cleaned up inline
- persistence/index.js: enqueueOrphan / dequeueOrphan / getAllOrphans
- sdm.js: read Content-Length header and pass contentLength + orphan
  queue callbacks into attachment data; reconciliation job runs on
  server startup (cds.on served) to delete any persisted orphans
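
The inline cleanup and orphan-queue fallback described above could look roughly like the sketch below. It is an illustration only: `cleanupIncompleteDocument`, `deleteFn`, and the `sleep` option are hypothetical names, and reading "retries up to 3 times (2s/4s/8s backoff)" as one initial attempt followed by three delayed retries is an assumption. Only `enqueueOrphan` is a name taken from the commit message, and its real signature may differ.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: try to delete an incomplete SDM document, retrying
// with exponential backoff, and persist the objectId if every attempt fails.
async function cleanupIncompleteDocument(objectId, deleteFn, enqueueOrphan, options = {}) {
  const { maxRetries = 3, baseDelayMs = 2000, sleep = defaultSleep } = options;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (attempt > 0) {
      // Waits 2s, 4s, 8s before retries 1, 2, 3 with the default settings.
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
    try {
      await deleteFn(objectId);
      return { deleted: true, attempts: attempt + 1 };
    } catch (err) {
      // Swallow the error and fall through to the next retry.
    }
  }

  // Inline cleanup failed: leave the entry for the startup reconciliation job.
  await enqueueOrphan(objectId);
  return { deleted: false, attempts: maxRetries + 1 };
}
```

Passing `sleep` as an option keeps the backoff testable without real delays.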

Root cause (from CF logs, exit 137 OOM on 1GB file):
- uploadLargeFileInChunks called streamToBuffer() which collected the
  entire 1GB content into a single Buffer before ReadAheadStream could
  process it, negating the chunked upload design entirely.

Fixes:
- Remove streamToBuffer() call; pass req.data.content directly to
  ReadAheadStream — the Buffer is already in memory from CDS body
  parsing, no second copy needed.
- ReadAheadStream._preloadChunks: add zero-copy Buffer fast path using
  buf.slice() references instead of allocUnsafe+copy for each chunk.
  Queue holds ≤4 slice refs (no extra memory) rather than 4×20MB copies.
- ReadAheadStream.startReading: poll for first chunk explicitly so
  _loadNextChunk is only called once the queue has data (avoids race).
- package.json: raise CDS body_parser limit to 2gb so the request body
  is not rejected before reaching the upload handler.

Note: the CF app manifest memory quota must also be raised to ≥2.5GB
to accommodate the 1GB body buffer + Node.js overhead + 80MB queue.
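
The zero-copy fast path mentioned in the fixes can be illustrated with a small sketch. `sliceChunks` is a hypothetical name, not the actual `_preloadChunks` code, and it uses `buf.subarray()` rather than `buf.slice()`, since the Node.js documentation deprecates `Buffer#slice` in favor of `subarray`; both return views over the same memory.

```javascript
// Hypothetical helper: yield fixed-size views over an in-memory Buffer
// without copying any bytes.
function* sliceChunks(buf, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  for (let offset = 0; offset < buf.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    yield buf.subarray(offset, offset + chunkSize);
  }
}

const body = Buffer.alloc(50);
const chunks = [...sliceChunks(body, 20)];
console.log(chunks.map((c) => c.length)); // [ 20, 20, 10 ]

// Each chunk is a view: writing through it changes the original Buffer.
chunks[0][0] = 7;
console.log(body[0]); // 7
```

Because every chunk references the original allocation, queueing several of them adds no per-chunk copies, but the full request body stays in memory until all views are released.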

Comment threads marked Fixed: lib/ReadAheadStream.js (1), lib/handler/index.js (2), test/lib/handler/index.test.js (3), test/lib/util/index.test.js (1)

Comment thread jest.config.js
const config = {
verbose: true,
testTimeout: 100000,
forceExit: true,

Review comment (Contributor): Why do we need this?

Comment thread index.cds
* chunked upload did not complete successfully and could not be deleted inline.
* A reconciliation job on server startup retries deletion of each entry.
*/
entity sap.sdm.OrphanCleanupQueue {

Review comment (Contributor): Why do we need this entity?

Comment thread package.json
"lint": "npx eslint --fix . --no-cache"
},
"cds": {
"server": {

Review comment (Contributor): Does this limit uploads to 2 GB?

Comment thread lib/sdm.js

console.log(`[orphan-queue] Reconciling ${orphans.length} orphaned SDM document(s)...`);

for (const orphan of orphans) {

Review comment (Contributor): Can we document how this works somewhere in the wiki?
