Benchmark S3ThreadPoolExecutor vs S3AioExecutor before defaulting to async #685
Open
Summary
Compare the performance of S3ThreadPoolExecutor (sync, current default) and S3AioExecutor (async, new) to validate the switch to AioS3FileSystem as the default in v3.30.0.
Background
PR #684 introduced the S3Executor strategy pattern, replacing hardcoded ThreadPoolExecutor usage with a pluggable interface. This eliminates thread-in-thread nesting when aio cursors use S3FileSystem. Before making AioS3FileSystem the default for async paths, we need empirical performance data.
Related:
- #684: Add S3Executor strategy pattern for async S3 operations
- #644: Comprehensive cursor benchmark with memory and performance metrics
Benchmark Scope
Scenarios
| Scenario | Description |
|---|---|
| Query result fetch | AioS3FSCursor fetch performance (small/medium/large result sets) |
| Large file read | Multipart range read via _fetch_range |
| Large file write | Multipart upload via commit |
| Parallel copy | _copy_object_with_multipart_upload |
Metrics
- Wall-clock time (latency)
- Throughput (MB/s)
- Concurrency behavior under varying `max_workers`
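As a starting point for the latency and throughput metrics, a minimal timing helper might look like the sketch below. It uses only the standard library; `measure` and `fake_read` are hypothetical names, and `fake_read` is a stand-in for a real S3 operation, not part of the project's API.

```python
import statistics
import time
from typing import Callable, Dict


def measure(fn: Callable[[], int], repeats: int = 5) -> Dict[str, float]:
    """Time fn() several times; fn must return the number of bytes it moved."""
    durations = []
    nbytes = 0
    for _ in range(repeats):
        start = time.perf_counter()
        nbytes = fn()
        durations.append(time.perf_counter() - start)
    median_s = statistics.median(durations)
    return {
        "median_s": median_s,
        "min_s": min(durations),
        # Throughput in MB/s (1 MB = 1,000,000 bytes), based on the median run.
        "throughput_mb_s": (nbytes / 1_000_000) / median_s if median_s > 0 else float("inf"),
    }


def fake_read() -> int:
    # Hypothetical stand-in for a real S3 read: allocate 8 MiB and wait 10 ms.
    payload = bytes(8 * 1024 * 1024)
    time.sleep(0.01)
    return len(payload)


if __name__ == "__main__":
    print(measure(fake_read, repeats=3))
```

Reporting the median rather than the mean keeps a single slow request from dominating the result; the real benchmark would replace `fake_read` with each scenario from the table above.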
Comparison
- S3FileSystem + S3ThreadPoolExecutor (sync baseline)
- AioS3FileSystem + S3AioExecutor (async candidate)
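The shape of this comparison can be sketched with the standard library alone, before wiring in the real executors. The example below is only an illustration under stated assumptions: `fetch_sync` and `fetch_async` simulate an I/O-bound chunk fetch with a fixed delay, and none of the names correspond to the project's actual classes. It sweeps a worker limit for a thread pool and for an `asyncio` semaphore and records wall-clock time for each.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

CHUNKS = 16      # number of simulated range requests (assumed value)
DELAY_S = 0.01   # simulated per-request latency (assumed value)


def fetch_sync(i: int) -> int:
    time.sleep(DELAY_S)  # stand-in for a blocking S3 call
    return i


async def fetch_async(i: int) -> int:
    await asyncio.sleep(DELAY_S)  # stand-in for a non-blocking S3 call
    return i


def run_threaded(max_workers: int) -> float:
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch_sync, range(CHUNKS)))
    elapsed = time.perf_counter() - start
    assert results == list(range(CHUNKS))
    return elapsed


def run_async(max_workers: int) -> float:
    async def main() -> list:
        # The semaphore caps in-flight requests, mirroring max_workers.
        sem = asyncio.Semaphore(max_workers)

        async def bounded(i: int) -> int:
            async with sem:
                return await fetch_async(i)

        return await asyncio.gather(*(bounded(i) for i in range(CHUNKS)))

    start = time.perf_counter()
    results = asyncio.run(main())
    elapsed = time.perf_counter() - start
    assert list(results) == list(range(CHUNKS))
    return elapsed


def sweep(worker_counts=(1, 4, 16)) -> list:
    return [
        {"max_workers": n, "threaded_s": run_threaded(n), "async_s": run_async(n)}
        for n in worker_counts
    ]


if __name__ == "__main__":
    for row in sweep():
        print(row)
```

With simulated delays the two paths should behave similarly; the real benchmark is needed precisely because actual S3 calls, connection pooling, and event-loop overhead may change that picture.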
Acceptance Criteria
- Benchmark script(s) covering the scenarios above
- Results showing no significant regression for the async path relative to the sync baseline
- Summary with recommendation for v3.30.0 default switch