
[FEATURE] Improve computer vision for all models #656

Description

@edenreich

Summary

Look into https://github.com/AmrDab/clawdcursor, which would allow us to classify screenshots, and check whether that works well with DeepSeek.

This should be much faster than relying on Opus models, or on any vision model at all.

Decoupling vision from the models is the right architectural decision for computer use.
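
To make the decoupling concrete, here is a minimal sketch of what a vendor-neutral boundary could look like. Everything in it is an assumption for illustration: `UIElement`, `ScreenClassifier`, `StubClassifier`, and `describe_screen` are hypothetical names, and none of this reflects the actual API of clawdcursor or of this project. The idea is only that a classifier turns a screenshot into structured elements, and a text-only model receives a plain-text description instead of the image.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class UIElement:
    """One on-screen element found by a classifier (hypothetical shape)."""
    label: str
    role: str
    x: int
    y: int
    width: int
    height: int


class ScreenClassifier(Protocol):
    """Anything that turns raw screenshot bytes into structured elements."""

    def classify(self, screenshot: bytes) -> list[UIElement]: ...


class StubClassifier:
    """Stand-in returning fixed elements; a real one would wrap an external tool."""

    def classify(self, screenshot: bytes) -> list[UIElement]:
        return [
            UIElement("Submit", "button", 100, 200, 80, 30),
            UIElement("Email", "textbox", 100, 150, 200, 30),
        ]


def describe_screen(classifier: ScreenClassifier, screenshot: bytes) -> str:
    """Render classified elements as plain text for a text-only model prompt."""
    lines = []
    for i, el in enumerate(classifier.classify(screenshot), start=1):
        # Report the center point, which is where a click would normally land.
        cx = el.x + el.width // 2
        cy = el.y + el.height // 2
        lines.append(f"{i}. {el.role} '{el.label}' at ({cx}, {cy})")
    return "\n".join(lines)


print(describe_screen(StubClassifier(), b""))
```

With a boundary like this, swapping the classifier would not require changing the model, and swapping the model would not require vision support from the vendor.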

Acceptance Criteria

  • Computer use does not rely on vendor-specific vision
  • There is a clear path toward implementing it
  • A follow-up ticket is created with the clear path (if applicable). If the open source project mentioned above cannot be used, the reasons are documented in that follow-up ticket

Metadata


Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Projects

    Status
    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
