
Python: feat(python): support both base64 and url image uploads in Anthropi…#14096

Open
aaarc wants to merge 2 commits into microsoft:main from aaarc:python-feat-anthrpoic-image-support


@aaarc aaarc commented Jun 17, 2026


should resolve Issue #12944

Description
The _format_user_message function in python/semantic_kernel/connectors/ai/anthropic/services/utils.py has been updated to correctly parse base64 image data and format it into the expected Anthropic API structure.

Implementation Details:
  • Iterates through message.items to check for ImageContent.
  • If no ImageContent is found, it returns the standard text-only dictionary, preserving existing behavior.
  • If ImageContent with item.data is present, it unpacks the message items into a list of dictionaries formatted to meet the Anthropic Vision API specifications.

Note: This implementation targets both base64 and URL image uploads.
Resolves: #12944
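
For readers without the diff open, the behavior described above can be sketched roughly as follows. This is not the PR's code: `TextItem`, `ImageItem`, and `format_user_message` are hypothetical stand-ins for Semantic Kernel's `TextContent`, `ImageContent`, and `_format_user_message`, so that the sketch runs without the library installed. It assumes image data is held as raw bytes and base64-encodes it explicitly.

```python
import base64
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins for TextContent and ImageContent (assumptions,
# not the Semantic Kernel classes).
@dataclass
class TextItem:
    text: str


@dataclass
class ImageItem:
    data: Optional[bytes] = None  # raw image bytes
    uri: Optional[str] = None     # remote image URL
    mime_type: str = "image/png"


def format_user_message(items: list) -> dict[str, Any]:
    """Build an Anthropic-style user message from text and image items."""
    if not any(isinstance(item, ImageItem) for item in items):
        # Text-only fast path: keep the content as a plain string.
        text = "".join(i.text for i in items if isinstance(i, TextItem))
        return {"role": "user", "content": text}

    blocks: list[dict[str, Any]] = []
    for item in items:
        if isinstance(item, TextItem):
            blocks.append({"type": "text", "text": item.text})
        elif isinstance(item, ImageItem) and item.data:
            encoded = base64.b64encode(item.data).decode("utf-8")
            blocks.append({
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": item.mime_type,
                    "data": encoded,
                },
            })
        elif isinstance(item, ImageItem) and item.uri:
            blocks.append({
                "type": "image",
                "source": {"type": "url", "url": item.uri},
            })
    return {"role": "user", "content": blocks}
```

A message with no images keeps the plain string form, so existing text-only callers would see no change in this sketch.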

Motivation and Context
Please help reviewers and future users, providing the following information:

  1. Why is this change required?

To allow image uploads (in user messages only) for Anthropic models.

  2. What problem does it solve?

Images can now be uploaded to Claude models through the Anthropic connector.

  3. What scenario does it contribute to?

A chatbot application built with Semantic Kernel can now pass image uploads to Anthropic LLMs.

  4. If it fixes an open issue, please link to the issue here.

This should resolve issue #12944.

Contribution Checklist
[Y] The code builds clean without any errors or warnings
[Y] The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
[Y] All unit tests pass, and I have added new tests where possible
[Y] I didn't break anyone 😄

Modified version of PR #14061, which is now closed.

…user messages

Update _format_user_message in anthropic/services/utils.py to now handle both base64 image bytes and image URLs

should resolve Issue microsoft#12944
Copilot AI review requested due to automatic review settings June 17, 2026 18:28
@aaarc aaarc requested a review from a team as a code owner June 17, 2026 18:28

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for formatting multimodal (text + image) user messages for the Anthropic connector.

Changes:

  • Import ImageContent and detect image items in user messages.
  • Build Anthropic-compatible content blocks for text, base64 images, and URL images.
  • Log a warning for unsupported item types during formatting.


Comment thread python/semantic_kernel/connectors/ai/anthropic/services/utils.py
Comment thread python/semantic_kernel/connectors/ai/anthropic/services/utils.py Outdated
Comment thread python/semantic_kernel/connectors/ai/anthropic/services/utils.py Outdated
Comment thread python/semantic_kernel/connectors/ai/anthropic/services/utils.py Outdated
Comment thread python/semantic_kernel/connectors/ai/anthropic/services/utils.py Outdated
@moonbox3 moonbox3 added the python Pull requests for the Python Semantic Kernel label Jun 17, 2026
@github-actions github-actions Bot changed the title feat(python): support both base64 and url image uploads in Anthropi… Python: feat(python): support both base64 and url image uploads in Anthropi… Jun 17, 2026
…user messages

Made simple fixes to better follow PEP 8 guidelines

should resolve Issue microsoft#12944

@github-actions github-actions Bot left a comment


Automated Code Review

Reviewers: 5 | Confidence: 89%

✓ Correctness

The core logic for handling base64 and URL image uploads is sound — the data/uri branching is correct, data_string returns the right base64 encoding, and the Anthropic API message format matches the documented spec. The existing review thread covers the main style/cleanup issues well. I found no high-severity correctness bugs but identified one medium-severity issue: the mime_type fallback expression is redundant and may mask the fact that BinaryContent.default_mime_type is "text/plain", which ImageContent does not override, potentially sending an incorrect media_type to Anthropic for images without an explicit mime_type.

✓ Security Reliability

The PR adds image support for the Anthropic connector. From a security/reliability perspective, there are two concerns: (1) an ImageContent with base64 data but no explicitly set mime_type will send 'text/plain' as the media_type to Anthropic's Vision API, which only accepts image/* types, causing an API rejection at runtime; (2) edge cases where ImageContent items have neither data nor uri can produce an empty content list sent to the API, causing an unhandled failure. Several style/convention issues were already flagged in prior review comments and are not repeated here.

✓ Test Coverage

This PR adds ImageContent support (base64 and URL) to the Anthropic _format_user_message function but includes zero tests for the new functionality. No new test file was created, and the existing test_anthropic_chat_completion.py has no coverage for image handling. This is a significant gap: there are at least five new code paths (text-only fast path, base64 image, URL image, mixed content, unsupported-item warning) and none are tested. Other connectors in the same codebase (Google AI, Azure AI Inference, Bedrock) all have dedicated utils test files that cover their format_user_message with image content, establishing a clear project convention that this PR does not follow.

✗ Failure Modes

The PR introduces a concrete crash path: when ImageContent is constructed with raw bytes (e.g., ImageContent(data=b'\x89PNG...', mime_type='image/png')), content.data is truthy so line 43 enters the base64 branch, but content.data_string internally calls DataUri._data_str() which, because data_format is None for raw-bytes construction (binary_content.py:82-83), falls through to self.data_bytes.decode('utf-8') (data_uri.py:178), raising UnicodeDecodeError on binary image data. This is a realistic user path (e.g., reading a file with open('img.png','rb').read()). The fix is to explicitly base64-encode content.data instead of relying on data_string.

✓ Design Approach

I found one blocking design issue in the new Anthropic image path: it assumes every ImageContent with readable bytes is already base64-encoded, but this repo also treats raw-byte ImageContent instances as valid. That means some supported ImageContent inputs will now be serialized into an invalid Anthropic base64 payload instead of being encoded or rejected explicitly.

Flagged Issues

  • content.data_string on line 43 raises UnicodeDecodeError when ImageContent is constructed with raw bytes (e.g., ImageContent(data=b'\x89PNG...', mime_type='image/png')), because DataUri._data_str() (data_uri.py:178) attempts self.data_bytes.decode('utf-8') when data_format is not 'base64' — which is the case for raw-bytes construction (binary_content.py:82-83 creates DataUri with data_format=None). Use base64.b64encode(content.data).decode('utf-8') instead.

Suggestions

  • The ternary content.mime_type if content.mime_type else content.default_mime_type on line 43 is redundant: BinaryContent.mime_type already falls back to default_mime_type when unset (binary_content.py:156-160), so the else branch is unreachable. Simplify to content.mime_type. Additionally, ImageContent inherits default_mime_type = "text/plain" from BinaryContent, which is invalid for Anthropic's Vision API (only image/jpeg, image/png, image/gif, image/webp are accepted). Consider overriding default_mime_type in ImageContent to a proper image type, or validating that mime_type starts with "image/" before sending.

Automated review by aaarc's agents

Comment on lines +42 to +43
elif isinstance(content, ImageContent):
if (content.data):


content.data_string will raise UnicodeDecodeError when ImageContent is constructed with raw image bytes (e.g., ImageContent(data=open('img.png','rb').read(), mime_type='image/png')).

Root cause: BinaryContent.__init__ (binary_content.py:82-83) creates DataUri(data_bytes=data, data_format=None, ...) for raw bytes. Then data_string calls DataUri._data_str() (data_uri.py:175-178), which — because data_format is None — falls through to self.data_bytes.decode('utf-8'), crashing on binary image data. The repo has a valid test case for ImageContent(data=b"test_data", mime_type="image/jpeg") without data_format="base64" (test_image_content.py:35-38), confirming this is a supported construction path.

Use base64.b64encode(content.data).decode('utf-8') instead, which correctly handles both raw-bytes and data-URI-originated ImageContent.

Suggested change
elif isinstance(content, ImageContent):
if (content.data):
elif isinstance(content, ImageContent) and (content.data):
content_items.append({"type":"image","source":{"type": "base64","data":base64.b64encode(content.data).decode("utf-8"),"media_type":content.mime_type if content.mime_type else content.default_mime_type}})
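
The failure mode described in this comment can be reproduced with the standard library alone. The snippet below is a small illustration, not code from the PR: it takes the first eight bytes of a PNG file, shows that decoding them as UTF-8 fails, and shows that `base64.b64encode` handles the same bytes.

```python
import base64

# PNG signature bytes; 0x89 is not a valid UTF-8 start byte.
raw = b"\x89PNG\r\n\x1a\n"

try:
    raw.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    # This is the error the review expects from the data_string path.
    decode_failed = True

# Explicit base64 encoding works for arbitrary binary data.
encoded = base64.b64encode(raw).decode("utf-8")

print(decode_failed)  # True
print(encoded)        # iVBORw0KGgo=
```

Encoding explicitly means the result does not depend on how the content object was constructed.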

if isinstance(content, TextContent):
content_items.append({"type": "text", "text": content.text})
elif isinstance(content, ImageContent):
if (content.data):


The ternary content.mime_type if content.mime_type else content.default_mime_type is a no-op: BinaryContent.mime_type (binary_content.py:156-160) already falls back to default_mime_type when unset, so the else branch is unreachable. More critically, default_mime_type is "text/plain" (binary_content.py:46), which is invalid for Anthropic's Vision API — it only accepts image/jpeg, image/png, image/gif, and image/webp. Images created without an explicit mime_type will cause an API rejection at runtime with a confusing error. Simplify to content.mime_type and consider validating that it starts with "image/" or overriding default_mime_type in ImageContent.

Suggested change
if (content.data):
content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type}})
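
One way to act on the validation suggestion is to check the media type before building the request. The sketch below is an assumption about how that could look, not part of the PR: `require_image_media_type` is a hypothetical helper, and the allowed set is taken from the four types named in the comment above.

```python
from typing import Optional

# Media types the review lists as accepted for images.
ALLOWED_IMAGE_MEDIA_TYPES = frozenset(
    {"image/jpeg", "image/png", "image/gif", "image/webp"}
)


def require_image_media_type(mime_type: Optional[str]) -> str:
    """Return mime_type if it is an accepted image type, else raise."""
    if mime_type not in ALLOWED_IMAGE_MEDIA_TYPES:
        raise ValueError(
            f"Unsupported image media type {mime_type!r}; "
            f"expected one of {sorted(ALLOWED_IMAGE_MEDIA_TYPES)}"
        )
    return mime_type
```

Failing early with a clear message would surface a "text/plain" default in the caller's own stack trace rather than as an API rejection.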

"role": "user",
"content": message.content,
}
if not any(isinstance(item,ImageContent) for item in message.items):


Missing test coverage for all new code paths. This PR adds five new branches (text-only fast path, base64 image, URL image, mixed content, unsupported-item warning) with zero tests. Other connectors in this repo (e.g., tests/unit/connectors/ai/google_ai/services/test_google_ai_utils.py:36-61) test their format_user_message for both text-only and image scenarios — please add equivalent tests for Anthropic in a new test_anthropic_utils.py file.

for content in message.items:
if isinstance(content, TextContent):
content_items.append({"type": "text", "text": content.text})
elif isinstance(content, ImageContent):


Untested base64 path: The base64 image branch accesses content.data_string and content.mime_type/content.default_mime_type, but no test verifies the output dict structure matches the Anthropic Vision API format ({"type": "image", "source": {"type": "base64", "data": ..., "media_type": ...}}). A test should construct an ImageContent(data=..., mime_type="image/png"), pass it through _format_user_message, and assert the exact dict shape and values.

content_items.append({"type": "text", "text": content.text})
elif isinstance(content, ImageContent):
if (content.data):
content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type if content.mime_type else content.default_mime_type}})


Untested URL path: The URL image branch is not covered by any test. A test should construct an ImageContent(uri="https://example.com/img.png"), pass it through _format_user_message, and assert the output matches {"type": "image", "source": {"type": "url", "url": "https://example.com/img.png"}}.
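
The two requested tests could take roughly the following shape. This is a sketch only: `image_block` is a hypothetical stand-in for the image branch of `_format_user_message`, used so the tests run without Semantic Kernel. Real tests would build `ImageContent` objects and call the connector's function instead.

```python
import base64
from typing import Any, Optional


def image_block(
    data: Optional[bytes] = None,
    uri: Optional[str] = None,
    mime_type: str = "image/png",
) -> dict[str, Any]:
    """Hypothetical stand-in for the image branch under test."""
    if data:
        return {
            "type": "image",
            "source": {
                "type": "base64",
                "data": base64.b64encode(data).decode("utf-8"),
                "media_type": mime_type,
            },
        }
    if uri:
        return {"type": "image", "source": {"type": "url", "url": uri}}
    raise ValueError("image needs either data or uri")


def test_base64_image_block() -> None:
    raw = b"\x89PNG"
    assert image_block(data=raw, mime_type="image/png") == {
        "type": "image",
        "source": {
            "type": "base64",
            "data": base64.b64encode(raw).decode("utf-8"),
            "media_type": "image/png",
        },
    }


def test_url_image_block() -> None:
    assert image_block(uri="https://example.com/img.png") == {
        "type": "image",
        "source": {"type": "url", "url": "https://example.com/img.png"},
    }
```

Asserting on the whole dict, rather than individual keys, catches stray or missing fields in the payload.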

@github-actions


Flagged issue

content.data_string on line 43 raises UnicodeDecodeError when ImageContent is constructed with raw bytes (e.g., ImageContent(data=b'\x89PNG...', mime_type='image/png')), because DataUri._data_str() (data_uri.py:178) attempts self.data_bytes.decode('utf-8') when data_format is not 'base64' — which is the case for raw-bytes construction (binary_content.py:82-83 creates DataUri with data_format=None). Use base64.b64encode(content.data).decode('utf-8') instead.


Source: automated DevFlow PR review
