Skip to content

Refactor: decouple COS and auth from FleetHandler#2108

Merged
Tansito merged 11 commits into
mainfrom
refactor-code-engine-auth
May 15, 2026
Merged

Refactor: decouple COS and auth from FleetHandler#2108
Tansito merged 11 commits into
mainfrom
refactor-code-engine-auth

Conversation

@ElePT
Copy link
Copy Markdown
Collaborator

@ElePT ElePT commented May 11, 2026

Summary

FleetHandler was doing too much: auth setup, COS credential resolution, and CE API calls. This PR splits those responsibilities apart. build_ce_auth is a new shared factory in clients.py that returns a ready (ApiClient, IBMCloudClientProvider). On the COS side, build_cos_client(project) is a standalone factory that fetches HMAC credentials from the CE secret and returns a ready JobCOS. FleetsRunner initializes the two independently via _get_handler() and _get_cos().

As part of this PR work I found that the swagger Configuration class used a shallow-copy singleton, so multiple FleetHandler instances were mutating each other's IAM token dict. This is now fixed by assigning fresh dicts instead of mutating. FleetHandler now just takes the pre-built ApiClient and stays a pure CE API wrapper

Details and comments

I am leaving the PR as a draft because I am having issues testing it on the demo branch, I will make it ready for review once I have verified that it works.

The goal of this and the follow-up PR is to enable the following flow at job creation time (rough implementation) without unnecessarily calling the FleetsRunner or the FleetHandler:

code_engine_project = select_ce_project(compute_profile)
cos = build_cos_client(code_engine_project)
cos.upload_fileobj(fileobj=..., bucket_name=code_engine_project.cos_bucket_user_data_name, key=f"jobs/{job.id}/arguments.json")

ElePT added 4 commits May 11, 2026 11:48
… for both. Fixes a subtle bug where Configuration's TypeWithDefault metaclass returns a shallow copy.
…re-built ce_api_client (ApiClient) instead of building one internally from IBMCloudClientProvider. JobCOS is decoupled from FleetHandler: it now wraps a COSClient directly and no longer holds a reference to the handler. build_cos_client() is a standalone factory that resolves HMAC credentials from the CE secret and returns a ready JobCOS.
@ElePT ElePT marked this pull request as ready for review May 12, 2026 13:30
@ElePT ElePT requested a review from a team as a code owner May 12, 2026 13:30
Copy link
Copy Markdown
Member

@Tansito Tansito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @ElePT in essence I think it's the correct path to follow! A couple of minor comments that doesn't change the implementation but I think they are worth it:

  • We are building factories in the __init__ files to avoid local imports like the one we have in build_ce_auth. You can use get_runner or get_arguments_storage as an example.
  • I would like to propose change build_ce_auth and build_cos_client names by get_* to maintain consistency along all our factories.
  • Maybe not for this PR, totally up to you here, I saw that in build_ce_auth we are returning a Tuple. Maybe it's better to return an object in this case as it can be confusing that a Factory returns a tuple here (it's also something we can do later if we want to move forward with this, no problem, it almost a refactor).

@ElePT
Copy link
Copy Markdown
Collaborator Author

ElePT commented May 15, 2026

Suggestions applied @Tansito, including the refactoring you suggested (returning a CEAuth object instead of a tuple). Let me know what you think!

@Tansito Tansito self-requested a review May 15, 2026 13:06
Copy link
Copy Markdown
Member

@Tansito Tansito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty cool thank you much @ElePT 🎉

@Tansito Tansito added this pull request to the merge queue May 15, 2026
Merged via the queue into main with commit 167a27e May 15, 2026
12 checks passed
@Tansito Tansito deleted the refactor-code-engine-auth branch May 15, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants