Cursor

CursorBench 3.1

Coding

Ambiguous, multi-file tasks from real Cursor sessions that test codebase understanding, bug finding, planning, and code review.

Details:
  • Category: Coding
  • Created by: Cursor
  • Models tested: 10
  • Configs tested: 28
  • Leader: Claude Fable 5
  • Top score: 72.9%

Updated June 2026