Usage for image generation is incorrect (and causes error in LiteLLM) #7354

@meffmadd

LocalAI version:
v3.7.0-gpu-nvidia-cuda-12

Environment, CPU architecture, OS, and Version:
Linux local-ai-59687cddd-m5mv8 5.15.0-133-generic

Describe the bug
The image generation endpoint should return input_tokens, output_tokens, and input_tokens_details in the usage object of its response. See: https://platform.openai.com/docs/api-reference/images/object#images-object-usage

Because these fields are missing, the OpenAI SDK parses the response incorrectly and LiteLLM fails with a validation error:

pydantic_core._pydantic_core.ValidationError: 3 validation errors for ImageResponse
usage.input_tokens
  Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/int_type
usage.input_tokens_details
  Input should be a valid dictionary or instance of ImageUsageInputTokensDetails [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/model_type
usage.output_tokens
  Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.11/v/int_type

The OpenAI SDK raises no error, but the parsed usage object is incorrect: usage=Usage(input_tokens=None, input_tokens_details=None, output_tokens=None, total_tokens=0, prompt_tokens=0, completion_tokens=0)
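
To make the gap concrete, here is a minimal sketch that checks a raw usage payload against the fields listed in the OpenAI images reference linked above. The helper name missing_usage_fields and the sample payload are assumptions for illustration only: the payload is inferred from the SDK output above, not captured from a LocalAI response.

```python
# Fields of the image `usage` object per the OpenAI API reference linked above.
REQUIRED_USAGE_FIELDS = (
    "input_tokens",
    "input_tokens_details",
    "output_tokens",
    "total_tokens",
)


def missing_usage_fields(usage: dict) -> list[str]:
    """Return required usage fields that are absent or null (hypothetical helper)."""
    return [name for name in REQUIRED_USAGE_FIELDS if usage.get(name) is None]


# Assumed payload, inferred from the SDK repr above (not a captured response).
observed = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

print(missing_usage_fields(observed))
```

For the assumed payload, the reported fields are the same three that appear in the pydantic validation errors.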

To Reproduce
OpenAI example:

from openai import OpenAI

# Point the SDK at the LocalAI instance.
client = OpenAI(base_url="http://localhost:9000/v1", api_key="not set")

prompt = """
A children's book drawing of a veterinarian using a stethoscope to 
listen to the heartbeat of a baby otter.
"""

result = client.images.generate(
    model="flux-schnell",
    prompt=prompt,
    response_format="b64_json",
)

print(result)

LiteLLM:

from litellm import image_generation

response = image_generation(prompt="A cute baby sea otter", model="openai/flux-schnell", api_key="not set", api_base="http://localhost:9000/v1")

print(response)

Expected behavior
The image generation response should contain a correctly shaped usage dict. The values are currently 0 anyway, so returning the expected fields with zero values would be enough.
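
As a rough sketch of the expected shape, the snippet below builds an image-style usage dict from chat-style counters. The helper name to_image_usage is hypothetical, and the mapping (prompt_tokens to input_tokens, completion_tokens to output_tokens, all input tokens counted as text tokens) is an assumption, not something LocalAI or LiteLLM defines.

```python
def to_image_usage(prompt_tokens: int = 0, completion_tokens: int = 0) -> dict:
    """Build an image-style usage dict from chat-style counters (hypothetical helper)."""
    return {
        "input_tokens": prompt_tokens,
        # Assumption: a text-only prompt, so every input token is a text token.
        "input_tokens_details": {"text_tokens": prompt_tokens, "image_tokens": 0},
        "output_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


# With the current all-zero counters, every field is present and set to 0.
print(to_image_usage())
```

With all counters at 0, this yields integers for input_tokens and output_tokens and a dict for input_tokens_details, which are the three things the validation errors above complain about.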
