@@ -10,6 +10,7 @@
from semantic_kernel.contents.function_call_content import FunctionCallContent
from semantic_kernel.contents.function_result_content import FunctionResultContent
from semantic_kernel.contents.text_content import TextContent
+from semantic_kernel.contents.image_content import ImageContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.functions.kernel_function_metadata import KernelFunctionMetadata

@@ -21,6 +22,7 @@
from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings



Comment thread
aaarc marked this conversation as resolved.
def _format_user_message(message: ChatMessageContent) -> dict[str, Any]:
    """Format a user message to the expected object for the Anthropic client.

@@ -30,10 +32,23 @@ def _format_user_message(message: ChatMessageContent) -> dict[str, Any]:
    Returns:
        The formatted user message.
    """
-    return {
-        "role": "user",
-        "content": message.content,
-    }
+    if not any(isinstance(item,ImageContent) for item in message.items):

Contributor

Missing test coverage for all new code paths. This PR adds five new branches (text-only fast path, base64 image, URL image, mixed content, unsupported-item warning) with zero tests. Other connectors in this repo (e.g., tests/unit/connectors/ai/google_ai/services/test_google_ai_utils.py:36-61) test their format_user_message for both text-only and image scenarios — please add equivalent tests for Anthropic in a new test_anthropic_utils.py file.

+        return {"role": "user","content": message.content}
+    else:
+        content_items: list[dict[str, Any]] = []
+        for content in message.items:
+            if isinstance(content, TextContent):
+                content_items.append({"type": "text", "text": content.text})
+            elif isinstance(content, ImageContent):

Contributor

Untested base64 path: The base64 image branch accesses content.data_string and content.mime_type/content.default_mime_type, but no test verifies that the output dict structure matches the Anthropic Vision API format ({"type": "image", "source": {"type": "base64", "data": ..., "media_type": ...}}). A test should construct an ImageContent(data=..., mime_type="image/png"), pass it through _format_user_message, and assert the exact dict shape and values.
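
The dict shape this comment asks a test to pin down can be sketched without importing the connector. In the minimal sketch below, `base64_image_block` is a hypothetical stand-in that only mirrors what the base64 branch is expected to emit. A real test would build an `ImageContent` and call `_format_user_message` instead.

```python
import base64
from typing import Any


def base64_image_block(data: bytes, media_type: str) -> dict[str, Any]:
    """Hypothetical stand-in for the base64 branch of _format_user_message."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            # Encode the raw bytes; the API expects base64 text, not bytes.
            "data": base64.b64encode(data).decode("utf-8"),
            "media_type": media_type,
        },
    }


block = base64_image_block(b"test_data", "image/png")

# Assert the exact keys and values, not just that a dict came back.
assert block["type"] == "image"
assert set(block["source"]) == {"type", "data", "media_type"}
assert block["source"]["type"] == "base64"
assert block["source"]["media_type"] == "image/png"
# Round-trip the payload to confirm it is valid base64 of the input bytes.
assert base64.b64decode(block["source"]["data"]) == b"test_data"
```

Asserting the full shape, rather than only the presence of an image item, is what would catch a missing `media_type` or a wrongly encoded payload.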

+                if (content.data):
Comment on lines +42 to +43

Contributor

content.data_string will raise UnicodeDecodeError when ImageContent is constructed with raw image bytes (e.g., ImageContent(data=open('img.png','rb').read(), mime_type='image/png')).

Root cause: BinaryContent.__init__ (binary_content.py:82-83) creates DataUri(data_bytes=data, data_format=None, ...) for raw bytes. Then data_string calls DataUri._data_str() (data_uri.py:175-178), which — because data_format is None — falls through to self.data_bytes.decode('utf-8'), crashing on binary image data. The repo has a valid test case for ImageContent(data=b"test_data", mime_type="image/jpeg") without data_format="base64" (test_image_content.py:35-38), confirming this is a supported construction path.

Use base64.b64encode(content.data).decode('utf-8') instead, which correctly handles both raw-bytes and data-URI-originated ImageContent.

Suggested change
-            elif isinstance(content, ImageContent):
-                if (content.data):
+            elif isinstance(content, ImageContent) and (content.data):
+                content_items.append({"type":"image","source":{"type": "base64","data":base64.b64encode(content.data).decode("utf-8"),"media_type":content.mime_type if content.mime_type else content.default_mime_type}})
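
The failure mode described in this comment can be reproduced with the standard library alone. The sketch below uses the 8-byte PNG signature as sample data (an assumption for illustration, not taken from the PR) and contrasts decoding raw bytes as UTF-8 with base64-encoding them first.

```python
import base64

# The 8-byte signature that starts every PNG file. The leading 0x89 byte
# is not a valid UTF-8 start byte, so decoding it as text fails.
raw_png_header = b"\x89PNG\r\n\x1a\n"

try:
    raw_png_header.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# base64.b64encode accepts arbitrary bytes and returns ASCII-only bytes,
# so the following decode("utf-8") is always safe.
encoded = base64.b64encode(raw_png_header).decode("utf-8")

print(decoded_ok)  # False
print(encoded)     # iVBORw0KGgo=
```

This only demonstrates the byte-level behaviour. Whether `content.data` always holds raw bytes in the connector is stated by the comment above and is not checked here.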

Contributor

The ternary content.mime_type if content.mime_type else content.default_mime_type is a no-op: BinaryContent.mime_type (binary_content.py:156-160) already falls back to default_mime_type when unset, so the else branch is unreachable. More critically, default_mime_type is "text/plain" (binary_content.py:46), which is invalid for Anthropic's Vision API — it only accepts image/jpeg, image/png, image/gif, and image/webp. Images created without an explicit mime_type will cause an API rejection at runtime with a confusing error. Simplify to content.mime_type and consider validating that it starts with "image/" or overriding default_mime_type in ImageContent.

Suggested change
                if (content.data):
                    content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type}})
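
One way to act on the validation suggestion is to check the media type before building the request. This is a minimal sketch: `validate_image_media_type` and `SUPPORTED_IMAGE_MEDIA_TYPES` are hypothetical names, and the accepted set is taken from the list in the comment above.

```python
from typing import Optional

# Media types the review comment lists as accepted by Anthropic's Vision API.
SUPPORTED_IMAGE_MEDIA_TYPES = frozenset(
    {"image/jpeg", "image/png", "image/gif", "image/webp"}
)


def validate_image_media_type(media_type: Optional[str]) -> str:
    """Hypothetical helper: return media_type unchanged, or raise ValueError."""
    if media_type not in SUPPORTED_IMAGE_MEDIA_TYPES:
        supported = ", ".join(sorted(SUPPORTED_IMAGE_MEDIA_TYPES))
        raise ValueError(
            f"Unsupported image media type {media_type!r}; expected one of: {supported}"
        )
    return media_type


# A supported type passes through unchanged.
png_type = validate_image_media_type("image/png")

# The "text/plain" fallback is rejected locally instead of by the API.
try:
    validate_image_media_type("text/plain")
    rejected = False
except ValueError:
    rejected = True
```

Failing early with a message that names the offending media type is easier to diagnose than a rejection returned by the API at request time.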

+                    content_items.append({"type":"image","source":{"type": "base64","data":content.data_string,"media_type":content.mime_type if content.mime_type else content.default_mime_type}})

Contributor

Untested URL path: The URL image branch is not covered by any test. A test should construct an ImageContent(uri="https://example.com/img.png"), pass it through _format_user_message, and assert the output matches {"type": "image", "source": {"type": "url", "url": "https://example.com/img.png"}}.
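
The expected output for a message that mixes text with a URL image can be sketched as follows. `user_message_with_image_url` is a hypothetical stand-in that mirrors the text and URL branches of the PR. A real test would pass a `ChatMessageContent` through `_format_user_message`.

```python
from typing import Any


def user_message_with_image_url(text: str, url: str) -> dict[str, Any]:
    """Hypothetical stand-in mirroring the text and URL branches of the PR."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image", "source": {"type": "url", "url": url}},
        ],
    }


message = user_message_with_image_url(
    "What is in this image?", "https://example.com/img.png"
)

# Compare against the full expected structure so item order is checked too.
expected = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {
            "type": "image",
            "source": {"type": "url", "url": "https://example.com/img.png"},
        },
    ],
}
assert message == expected
```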

+                elif (content.uri):
+                    content_items.append({"type":"image","source":{"type":"url","url":f"{content.uri}"}})
+            else:
+                logger.warning(
+                    "Unsupported item type in User message while formatting chat history for Anthropic AI"
+                    f" Inference: {type(content)}")
+        return {"role": "user","content": content_items}


def _format_assistant_message(message: ChatMessageContent) -> dict[str, Any]: