[BUG] Fix image not serializable issue in Graph and Swarm #5

@JackYPCOnline

Description

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

123

Python Version

123

Operating System

123

Installation Method

pip

Steps to Reproduce

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.30.0

Python Version

3.13

Operating System

Ubuntu 24.04.4 LTS on WSL2

Installation Method

pip

Steps to Reproduce

  1. Initialize a multi-node agent graph (using GraphBuilder) with S3SessionManager enabled for conversation persistence (builder.set_session_manager(s3_session_manager))
  2. Create a multimodal prompt containing an inline PDF document using the format expected by the LiteLLM proxy endpoint:
    prompt = [
        {"text": "Analyze this PDF"},
        {
            "document": {
                "format": "pdf",
                "name": "document.pdf",
                "source": {
                    "bytes": pdf_bytes  # Raw binary content
                },
            }
        },
    ]
  3. Invoke graph.stream_async(prompt, config) with a session ID
  4. Allow the graph to attempt to persist its state via S3SessionManager

Expected Behavior

The graph should successfully execute the multimodal request and persist the conversation state to S3, with binary payloads either:

  • Automatically encoded (e.g., base64) before JSON serialization, or
  • Excluded from the persisted state if they are unused after the model invocation

Actual Behavior

Graph execution fails with:

TypeError: Object of type bytes is not JSON serializable

The error originates from S3SessionManager attempting to serialize the entire graph state using json.dumps(), which cannot handle Python bytes objects.

Stack trace context: the failing line in <hidden>/strands/session/s3_session_manager.py is:

content = json.dumps(data, indent=2, ensure_ascii=False)
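
The failure can be reproduced without Strands or S3, using only the standard library. The sketch below assumes the persisted state still contains a message shaped like the prompt above; the `state` dict and its `messages` key are hypothetical stand-ins, not the structure S3SessionManager actually writes.

```python
import json

# Hypothetical stand-in for graph state that still holds the inline PDF bytes.
pdf_bytes = b"%PDF-1.7 example"
state = {
    "messages": [
        {"text": "Analyze this PDF"},
        {
            "document": {
                "format": "pdf",
                "name": "document.pdf",
                "source": {"bytes": pdf_bytes},
            }
        },
    ]
}

try:
    # Same call shape as the line quoted from s3_session_manager.py.
    json.dumps(state, indent=2, ensure_ascii=False)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```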

Additional Context

Problem Statement

There is a fundamental incompatibility between:

  1. Model adapter requirement: The LiteLLM proxy's OpenAI-compatible formatter has a hardcoded dependency on document.source.bytes (raw Python bytes) and does not support document.source.s3Location.uri references. The Strands SDK OpenAI/LiteLLM adapter layer inherits this constraint.
  2. Session persistence requirement: S3SessionManager assumes all graph state is JSON-serializable and does not pre-process binary payloads before calling json.dumps().

How We Discovered This

  • Attempt 1 (S3-backed document reference): Tried using document.source.s3Location.uri to avoid embedding bytes, but the LiteLLM proxy's request formatter raised KeyError: 'bytes' because it expects and immediately base64-encodes inline bytes.
  • Attempt 2 (Inline bytes): Provided the expected document.source.bytes format, which fixed the LiteLLM proxy formatter error but immediately exposed the session persistence incompatibility when S3SessionManager attempted to serialize the multi-node graph state.
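
The KeyError in Attempt 1 is what one would expect from a formatter that indexes `source["bytes"]` unconditionally. The sketch below is a hypothetical illustration of that pattern, not the LiteLLM proxy's or the Strands adapter's actual code; `format_document_source` and the S3 URI are invented for the example.

```python
import base64


def format_document_source(source: dict) -> str:
    """Hypothetical formatter: assumes inline bytes and base64-encodes them."""
    return base64.b64encode(source["bytes"]).decode("utf-8")


inline_source = {"bytes": b"%PDF-1.7 example"}
s3_source = {"s3Location": {"uri": "s3://example-bucket/document.pdf"}}

# Inline bytes are encoded without problems.
encoded = format_document_source(inline_source)

try:
    # An S3 reference has no "bytes" key, so the lookup fails.
    format_document_source(s3_source)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(missing_key)
```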

Current Constraint

Multimodal requests and S3SessionManager cannot currently be used together in the same agent graph when a request contains binary document attachments. This is particularly problematic with LiteLLM proxy endpoints, which provide no alternative document source format (S3 URIs are not supported).

Possible Solution

Option A: Pre-process binary payloads in S3SessionManager
Modify the session manager to detect and encode non-JSON-serializable types (e.g., bytes) before calling json.dumps():

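import base64
import json

def make_json_serializable(obj):
    """Recursively replace bytes values with a tagged base64 string."""
    if isinstance(obj, bytes):
        return {"__bytes_b64__": base64.b64encode(obj).decode("utf-8")}
    if isinstance(obj, dict):
        return {key: make_json_serializable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [make_json_serializable(item) for item in obj]
    return obj

# Before serialization (graph_state is the state the session manager is about to persist)
data = make_json_serializable(graph_state)
content = json.dumps(data, indent=2, ensure_ascii=False)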
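
Option A also needs a matching decode step when the session is read back, otherwise later graph steps would see a tagged dict instead of bytes. A minimal round-trip sketch, assuming the `__bytes_b64__` tag from the snippet above; `restore_bytes` is a hypothetical helper name, and none of this reflects how S3SessionManager is actually implemented.

```python
import base64
import json


def make_json_serializable(obj):
    """Recursively replace bytes values with a tagged base64 string."""
    if isinstance(obj, bytes):
        return {"__bytes_b64__": base64.b64encode(obj).decode("utf-8")}
    if isinstance(obj, dict):
        return {key: make_json_serializable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [make_json_serializable(item) for item in obj]
    return obj


def restore_bytes(obj: dict):
    """object_hook for json.loads: turn tagged dicts back into bytes."""
    if set(obj) == {"__bytes_b64__"}:
        return base64.b64decode(obj["__bytes_b64__"])
    return obj


state = {"document": {"format": "pdf", "source": {"bytes": b"%PDF-1.7 example"}}}

content = json.dumps(make_json_serializable(state), indent=2, ensure_ascii=False)
restored = json.loads(content, object_hook=restore_bytes)

print(restored == state)
```

One caveat of this tagging scheme: a dict in the state that legitimately consists of the single key `__bytes_b64__` would also be decoded to bytes on load, so the tag name has to be one that cannot collide with real data.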

Option B: Exclude binary fields from persisted state
Implement a state serialization filter that strips or replaces binary document content before persistence, then reconstructs it from the request context if needed for subsequent graph steps.
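
A minimal sketch of such a filter, assuming the binary content is never needed again after persistence. `strip_binary` and the marker format are hypothetical; reconstructing the bytes from the request context is not shown.

```python
import json


def strip_binary(obj):
    """Return a copy of obj with each bytes value replaced by a short marker."""
    if isinstance(obj, (bytes, bytearray)):
        return f"<{len(obj)} bytes omitted>"
    if isinstance(obj, dict):
        return {key: strip_binary(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [strip_binary(item) for item in obj]
    return obj


pdf_bytes = b"%PDF-1.7 example"
state = {
    "document": {
        "format": "pdf",
        "name": "document.pdf",
        "source": {"bytes": pdf_bytes},
    }
}

# The in-memory state keeps its bytes; only the persisted copy is filtered.
persistable = strip_binary(state)
content = json.dumps(persistable, indent=2, ensure_ascii=False)

print(content)
```

Because the filter builds a copy, the live graph state still carries the original bytes for the current invocation; only a session restored from storage would see the marker.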

Option C: Support S3-backed document sources in the LiteLLM proxy (upstream)
This would require the LiteLLM proxy to support document.source.s3Location.uri as a valid document source format instead of requiring inline base64-encoded bytes. However, this is a limitation of the LiteLLM proxy itself (not Strands), so the more practical fix for Strands users would be Option A or B.

Related Issues

No response

Expected Behavior

n/a

Actual Behavior

n/a

Additional Context

We need to evaluate whether to exclude images from the persisted state or to serialize and deserialize them. What are the trade-offs, and which approach should we take?
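
One input to that decision is storage cost: base64 encoding stores 4 output characters for every 3 input bytes, so serialized images grow by roughly a third, while excluding them costs almost nothing but loses the content on restore. A small sketch of the size difference, using a placeholder payload rather than a real image:

```python
import base64

# Placeholder payload: 3,000,000 zero bytes standing in for an image.
image_bytes = bytes(3_000_000)
encoded = base64.b64encode(image_bytes).decode("ascii")

raw_size = len(image_bytes)
encoded_size = len(encoded)
overhead = encoded_size / raw_size

print(raw_size, encoded_size, round(overhead, 2))
```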

Possible Solution

No response

Related Issues

No response


Labels

bug (Something isn't working)
