UI-TARS 7B

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it builds on the UI-TARS framework with reinforcement...

Capabilities:

Input
  • Text input
  • Image input (vision)
  • File input (PDF)
  • Audio input
  • Video input
Output
  • Text output
  • Image output
  • Audio output
Pricing & availability:
  • OpenRouter: $0.1 / $0.2 per M (input / output)
Details:
  • Provider: ByteDance
  • Context window: 128K
  • Input price: $0.1/M
  • Output price: $0.2/M
  • Open weights: Yes
  • Knowledge cutoff: Jan 2025
  • Released: Jul 2025
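
To make the listed prices concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. It assumes that "/M" means per one million tokens, which the listing does not spell out. The helper name `estimate_cost_usd` is hypothetical and is not part of any provider's API.

```python
from decimal import Decimal

# Prices copied from the listing; "per M" is ASSUMED to mean per 1,000,000 tokens.
INPUT_PRICE_PER_M = Decimal("0.1")   # USD per million input tokens
OUTPUT_PRICE_PER_M = Decimal("0.2")  # USD per million output tokens
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 20,000-token prompt (e.g. screenshots plus instructions)
    # and a 1,000-token reply.
    cost = estimate_cost_usd(20_000, 1_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under that assumption, a 20,000-token prompt with a 1,000-token reply comes to $0.0022. `Decimal` is used instead of floats so that small per-request amounts stay exact when summed over many calls.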