Blog

Latest updates and announcements from the MCPMark team.

Published on Friday, June 12 2026
NEWS

Introducing MCPMark Verified

MCPMark Verified is a stabilized, version-pinned subset of MCPMark's standard tasks for more reliable and reproducible MCP evaluation.
Published on Wednesday, September 10 2025
NEWS

MCP benchmark Leaderboard 2025-09-10 Update

Explore the latest MCPMark leaderboard update featuring top MCP benchmark models like Qwen-3-Max, Grok-Code-Fast-1, and Kimi-K2-0905. Discover their tool-use capabilities, success rates, and cost efficiency for real-world MCP applications.
Published on Tuesday, August 26 2025
NEWS

Introducing MCPMark: a comprehensive and challenging MCP Benchmark

Introducing MCPMark, a comprehensive MCP benchmark to stress-test AI models on MCP tasks. It features 127 expert-crafted samples and diverse environments like Notion, GitHub, and Postgres. Explore detailed leaderboards, cost analysis, and rigorous task design for real-world AI evaluation.
Published on Tuesday, August 19 2025
NEWS

Introducing EVAL SYS

This MCP benchmark is an EVAL SYS initiative, a collaboration between LobeHub and NUS TRAIL.