Python: feat(python): support both base64 and url image uploads in Anthropi…#14096
aaarc wants to merge 2 commits
Conversation
…user messages Update _format_user_message in anthropic/services/utils.py to handle both base64 image bytes and image URLs. Should resolve Issue microsoft#12944
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds support for formatting multimodal (text + image) user messages for the Anthropic connector.
Changes:
- Import ImageContent and detect image items in user messages.
- Build Anthropic-compatible content blocks for text, base64 images, and URL images.
- Log a warning for unsupported item types during formatting.
…user messages Modified and made simple fixes to better follow PEP 8 guidelines. Should resolve Issue microsoft#12944
Automated Code Review
Reviewers: 5 | Confidence: 89%
✓ Correctness
The core logic for handling base64 and URL image uploads is sound — the data/uri branching is correct, data_string returns the right base64 encoding, and the Anthropic API message format matches the documented spec. The existing review thread covers the main style/cleanup issues well. I found no high-severity correctness bugs but identified one medium-severity issue: the mime_type fallback expression is redundant and may mask the fact that BinaryContent.default_mime_type is "text/plain", which ImageContent does not override, potentially sending an incorrect media_type to Anthropic for images without an explicit mime_type.
✓ Security Reliability
The PR adds image support for the Anthropic connector. From a security/reliability perspective, there are two concerns: (1) an ImageContent with base64 data but no explicitly set mime_type will send 'text/plain' as the media_type to Anthropic's Vision API, which only accepts image/* types, causing an API rejection at runtime; (2) edge cases where ImageContent items have neither data nor uri can produce an empty content list sent to the API, causing an unhandled failure. Several style/convention issues were already flagged in prior review comments and are not repeated here.
✓ Test Coverage
This PR adds ImageContent support (base64 and URL) to the Anthropic _format_user_message function but includes zero tests for the new functionality. No new test file was created, and the existing test_anthropic_chat_completion.py has no coverage for image handling. This is a significant gap: there are at least five new code paths (text-only fast path, base64 image, URL image, mixed content, unsupported-item warning) and none are tested. Other connectors in the same codebase (Google AI, Azure AI Inference, Bedrock) all have dedicated utils test files that cover their format_user_message with image content, establishing a clear project convention that this PR does not follow.
✗ Failure Modes
The PR introduces a concrete crash path: when ImageContent is constructed with raw bytes (e.g., ImageContent(data=b'\x89PNG...', mime_type='image/png')), content.data is truthy so line 43 enters the base64 branch, but content.data_string internally calls DataUri._data_str() which, because data_format is None for raw-bytes construction (binary_content.py:82-83), falls through to self.data_bytes.decode('utf-8') (data_uri.py:178), raising UnicodeDecodeError on binary image data. This is a realistic user path (e.g., reading a file with open('img.png','rb').read()). The fix is to explicitly base64-encode content.data instead of relying on data_string.
✓ Design Approach
I found one blocking design issue in the new Anthropic image path: it assumes every ImageContent with readable bytes is already base64-encoded, but this repo also treats raw-byte ImageContent instances as valid. That means some supported ImageContent inputs will now be serialized into an invalid Anthropic base64 payload instead of being encoded or rejected explicitly.
Flagged Issues
- content.data_string on line 43 raises UnicodeDecodeError when ImageContent is constructed with raw bytes (e.g., ImageContent(data=b'\x89PNG...', mime_type='image/png')), because DataUri._data_str() (data_uri.py:178) attempts self.data_bytes.decode('utf-8') when data_format is not 'base64', which is the case for raw-bytes construction (binary_content.py:82-83 creates DataUri with data_format=None). Use base64.b64encode(content.data).decode('utf-8') instead.
Suggestions
- The ternary content.mime_type if content.mime_type else content.default_mime_type on line 43 is redundant: BinaryContent.mime_type already falls back to default_mime_type when unset (binary_content.py:156-160), so the else branch is unreachable. Simplify to content.mime_type. Additionally, ImageContent inherits default_mime_type = "text/plain" from BinaryContent, which is invalid for Anthropic's Vision API (only image/jpeg, image/png, image/gif, image/webp are accepted). Consider overriding default_mime_type in ImageContent to a proper image type, or validating that mime_type starts with "image/" before sending.
Automated review by aaarc's agents
elif isinstance(content, ImageContent):
    if (content.data):
content.data_string will raise UnicodeDecodeError when ImageContent is constructed with raw image bytes (e.g., ImageContent(data=open('img.png','rb').read(), mime_type='image/png')).
Root cause: BinaryContent.__init__ (binary_content.py:82-83) creates DataUri(data_bytes=data, data_format=None, ...) for raw bytes. Then data_string calls DataUri._data_str() (data_uri.py:175-178), which — because data_format is None — falls through to self.data_bytes.decode('utf-8'), crashing on binary image data. The repo has a valid test case for ImageContent(data=b"test_data", mime_type="image/jpeg") without data_format="base64" (test_image_content.py:35-38), confirming this is a supported construction path.
Use base64.b64encode(content.data).decode('utf-8') instead, which correctly handles both raw-bytes and data-URI-originated ImageContent.
Suggested change:
- elif isinstance(content, ImageContent):
-     if (content.data):
+ elif isinstance(content, ImageContent) and (content.data):
+     content_items.append({"type":"image","source":{"type": "base64","data":base64.b64encode(content.data).decode("utf-8"),"media_type":content.mime_type if content.mime_type else content.default_mime_type}})
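The decode failure described in this comment can be reproduced with the standard library alone, without Semantic Kernel. A minimal sketch, assuming the raw bytes start with the PNG file signature (its first byte, 0x89, is not a valid UTF-8 start byte); the variable names are illustrative only:

```python
import base64

# The 8-byte signature that opens every PNG file.
raw_png_header = b"\x89PNG\r\n\x1a\n"

# Decoding raw image bytes as UTF-8 fails, which mirrors what happens when
# bytes that were never base64-encoded are treated as text.
try:
    raw_png_header.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Explicit base64 encoding yields ASCII-only text that is safe to embed in JSON.
encoded = base64.b64encode(raw_png_header).decode("utf-8")
print(decode_failed, encoded)
```

The encoded string round-trips back to the original bytes with base64.b64decode, so nothing is lost by encoding explicitly.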
if isinstance(content, TextContent):
    content_items.append({"type": "text", "text": content.text})
elif isinstance(content, ImageContent):
    if (content.data):
The ternary content.mime_type if content.mime_type else content.default_mime_type is a no-op: BinaryContent.mime_type (binary_content.py:156-160) already falls back to default_mime_type when unset, so the else branch is unreachable. More critically, default_mime_type is "text/plain" (binary_content.py:46), which is invalid for Anthropic's Vision API — it only accepts image/jpeg, image/png, image/gif, and image/webp. Images created without an explicit mime_type will cause an API rejection at runtime with a confusing error. Simplify to content.mime_type and consider validating that it starts with "image/" or overriding default_mime_type in ImageContent.
if (content.data):
    content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type}})
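One way to act on the validation suggestion is an explicit allow-list check before the image block is built. A minimal sketch, assuming the four media types named in the comment above are the complete accepted set; validate_image_media_type and SUPPORTED_IMAGE_MEDIA_TYPES are hypothetical names, not part of Semantic Kernel or the Anthropic SDK:

```python
from typing import Optional

# Media types the review comment lists as accepted for images (assumption:
# this list is complete).
SUPPORTED_IMAGE_MEDIA_TYPES = frozenset(
    {"image/jpeg", "image/png", "image/gif", "image/webp"}
)


def validate_image_media_type(mime_type: Optional[str]) -> str:
    """Return mime_type unchanged if it is in the allow-list, else raise."""
    if mime_type not in SUPPORTED_IMAGE_MEDIA_TYPES:
        raise ValueError(
            f"Unsupported image media type {mime_type!r}; "
            f"expected one of {sorted(SUPPORTED_IMAGE_MEDIA_TYPES)}"
        )
    return mime_type
```

Failing early with a ValueError would surface a "text/plain" default at the call site instead of as an API rejection later.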
    "role": "user",
    "content": message.content,
}
if not any(isinstance(item,ImageContent) for item in message.items):
Missing test coverage for all new code paths. This PR adds five new branches (text-only fast path, base64 image, URL image, mixed content, unsupported-item warning) with zero tests. Other connectors in this repo (e.g., tests/unit/connectors/ai/google_ai/services/test_google_ai_utils.py:36-61) test their format_user_message for both text-only and image scenarios — please add equivalent tests for Anthropic in a new test_anthropic_utils.py file.
for content in message.items:
    if isinstance(content, TextContent):
        content_items.append({"type": "text", "text": content.text})
    elif isinstance(content, ImageContent):
Untested base64 path: The base64 image branch accesses content.data_string and content.mime_type/content.default_mime_type, but no test verifies the output dict structure matches the Anthropic Vision API format ({"type": "image", "source": {"type": "base64", "data": .., "media_type": ...}}). A test should construct an ImageContent(data=.., mime_type="image/png"), pass it through _format_user_message, and assert the exact dict shape and values.
    content_items.append({"type": "text", "text": content.text})
elif isinstance(content, ImageContent):
    if (content.data):
        content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type if content.mime_type else content.default_mime_type}})
Untested URL path: The URL image branch is not covered by any test. A test should construct an ImageContent(uri="https://example.com/img.png"), pass it through _format_user_message, and assert the output matches {"type": "image", "source": {"type": "url", "url": "https://example.com/img.png"}}.
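The two untested image branches could be pinned down with assertions on the exact dictionary shape. The sketch below is not the connector's code: build_image_block is a hypothetical stand-in that mirrors the shapes quoted in these comments, so the test pattern can run without Semantic Kernel installed. A real test would construct ImageContent and call _format_user_message instead.

```python
import base64
import unittest
from typing import Any, Dict, Optional


def build_image_block(
    data: Optional[bytes] = None,
    uri: Optional[str] = None,
    mime_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the image branch under review."""
    if data:
        return {
            "type": "image",
            "source": {
                "type": "base64",
                "data": base64.b64encode(data).decode("utf-8"),
                "media_type": mime_type,
            },
        }
    if uri:
        return {"type": "image", "source": {"type": "url", "url": uri}}
    # Neither data nor uri: reject instead of emitting an empty block.
    raise ValueError("image item has neither data nor uri")


class ImageBlockShapeTests(unittest.TestCase):
    def test_base64_branch(self) -> None:
        block = build_image_block(data=b"test_data", mime_type="image/png")
        expected = {
            "type": "image",
            "source": {
                "type": "base64",
                "data": "dGVzdF9kYXRh",
                "media_type": "image/png",
            },
        }
        self.assertEqual(block, expected)

    def test_url_branch(self) -> None:
        block = build_image_block(uri="https://example.com/img.png")
        expected = {
            "type": "image",
            "source": {"type": "url", "url": "https://example.com/img.png"},
        }
        self.assertEqual(block, expected)

    def test_image_without_data_or_uri_is_rejected(self) -> None:
        with self.assertRaises(ValueError):
            build_image_block()
```

Comparing whole dictionaries rather than individual keys means an extra or missing field fails the test, which is the point of asserting the exact shape.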
Flagged issue
Source: automated DevFlow PR review
should resolve Issue #12944
Description
The _format_user_message function in python/semantic_kernel/connectors/ai/anthropic/services/utils.py has been updated to correctly parse base64 image data and format it into the expected Anthropic API structure.
Implementation Details:
- Iterates through message.items to check for ImageContent.
- If no ImageContent is found, it returns the standard text-only dictionary, preserving existing behavior.
- If ImageContent with item.data is present, it unpacks the message items into a list of dictionaries formatted to meet the Anthropic Vision API specifications.
Note: This implementation targets both base64 and url image uploads.
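The flow described in these implementation details can be sketched at the message level as follows. This is an illustration under stated assumptions, not the code in utils.py: Text and Image are hypothetical stand-ins for TextContent and ImageContent, and format_user_message takes the plain text and the item list as separate arguments to stay self-contained.

```python
import base64
import logging
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


@dataclass
class Text:
    """Stand-in for TextContent (hypothetical, illustration only)."""
    text: str


@dataclass
class Image:
    """Stand-in for ImageContent (hypothetical, illustration only)."""
    data: Optional[bytes] = None
    uri: Optional[str] = None
    mime_type: Optional[str] = None


def format_user_message(content: str, items: List[Any]) -> Dict[str, Any]:
    # Fast path: no image items, so keep the plain-string content unchanged.
    if not any(isinstance(item, Image) for item in items):
        return {"role": "user", "content": content}

    blocks: List[Dict[str, Any]] = []
    for item in items:
        if isinstance(item, Text):
            blocks.append({"type": "text", "text": item.text})
        elif isinstance(item, Image) and item.data:
            # Encode explicitly so raw bytes are handled as well.
            encoded = base64.b64encode(item.data).decode("utf-8")
            source = {"type": "base64", "data": encoded, "media_type": item.mime_type}
            blocks.append({"type": "image", "source": source})
        elif isinstance(item, Image) and item.uri:
            blocks.append({"type": "image", "source": {"type": "url", "url": item.uri}})
        else:
            logger.warning("Skipping unsupported item: %s", type(item).__name__)
    return {"role": "user", "content": blocks}
```

For example, format_user_message("look", [Text("look"), Image(uri="https://example.com/img.png")]) returns a content list with one text block followed by one URL image block, while a text-only call returns the original string content.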
Resolves: #12944
Motivation and Context
Please help reviewers and future users, providing the following information:
To allow image uploads (in user messages only) for Anthropic models.
You can now upload images with Claude models.
If you have a chatbot application built with Semantic Kernel, Anthropic LLMs can now take image uploads.
This should resolve issue #12944.
Contribution Checklist
[Y ] The code builds clean without any errors or warnings
[Y ] The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
[Y ] All unit tests pass, and I have added new tests where possible
[Y] I didn't break anyone 😄
Modified version of PR #14061, which is now closed.