If you have a bucket s3://mybucket which has prefixes which you are allowed to write into via IAM roles, e.g. s3://mybucket/allowed_dir, you cannot currently use S3FileSystem::CreateDir(path, recursive=true) because a call of HeadBucket.
For e.g.:
(base) ray@ryan-test-head-wkzmw:~$ python
Python 3.12.9 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 18:56:27) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> import pyarrow.fs as fs
>>> s3 = fs.S3FileSystem()
>>> s3.create_dir("test-bucket/pepperr/test", recursive=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pyarrow/_fs.pyx", line 638, in pyarrow._fs.FileSystem.create_dir
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
OSError: When testing for existence of bucket 'test-bucket': AWS Error ACCESS_DENIED during HeadBucket operation: No response body.
In this case, the bucket exists, but the user is not allowed to do the HEAD operation on the root bucket itself. Nonetheless, the user is able to write/do things in this location, so they should not be blocked by this.
>>> boto3.client("s3").put_object(Bucket="test-bucket", Key="pepperr/test/probe.txt", Body=b"hello")
{'ResponseMetadata': {'RequestId': 'FXJ3YG658VKQ8725', 'HostId': '37ezZQHJ0KzbBAQEU6uuDYWIIBa9XpNqs/E6neTpT/8XDpTIh7TE6hBXrHJDT+19STrC6QRcUKlp1zqeIMlLfCnvU/lPvtCp', 'HTTPStatusCode': 200, ....}
I think the check comes in here on line 3177; if recursive=True, then you hit a code block which checks if the bucket exists:
|
Status S3FileSystem::CreateDir(const std::string& s, bool recursive) { |
|
ARROW_ASSIGN_OR_RAISE(auto path, S3Path::FromString(s)); |
|
|
|
if (path.key.empty()) { |
|
// Create bucket |
|
return impl_->CreateBucket(path.bucket); |
|
} |
|
|
|
ARROW_ASSIGN_OR_RAISE(auto backend, impl_->GetBackend()); |
|
|
|
FileInfo file_info; |
|
if (recursive) { |
|
// Ensure bucket exists |
|
ARROW_ASSIGN_OR_RAISE(bool bucket_exists, impl_->BucketExists(path.bucket)); |
|
if (!bucket_exists) { |
|
RETURN_NOT_OK(impl_->CreateBucket(path.bucket)); |
|
} |
|
|
|
auto key_i = path.key_parts.begin(); |
Whereas this needs to actually be more permissive; the S3 call should be issued rather than performing an explicit check for bucket existence, and failure reported up the stack.
Component(s)
C++
If you have a bucket
s3://mybucketwhich has prefixes which you are allowed to write into via IAM roles, e.g.s3://mybucket/allowed_dir, you cannot currently useS3FileSystem::CreateDir(path, recursive=true)because a call of HeadBucket.For e.g.:
In this case, the bucket exists, but the user is not allowed to do the HEAD operation on the root bucket itself. Nonetheless, the user is able to write/do things in this location, so they should not be blocked by this.
I think the check comes in here on line 3177; if recursive=True, then you hit a code block which checks if the bucket exists:
arrow/cpp/src/arrow/filesystem/s3fs.cc
Lines 3162 to 3180 in ebaaf07
Whereas this needs to actually be more permissive; the S3 call should be issued rather than performing an explicit check for bucket existence, and failure reported up the stack.
Component(s)
C++